
Improving ClaudIA Even More

Advanced step-by-step guide on how to further improve the retention and quality results of ClaudIA
By Fabrício Rissetto and 2 others
• 4 articles

How to Increase Retention in 3 Steps

Step 1: Analyze the current metric and your goal

The first step is to identify the current retention level and how much it needs to improve to reach the desired level. To do this, access the Dashboard through the Cloud Humans Hub and look for the graph "% of retention/transfer by time aggregation" in the "Retention Metrics" tab. After adding filters such as date and aggregation, you will be able to see the current retention levels.

Step 2: Understand the main reasons for transfer

Knowing the current retention levels and the desired goal, the next step is to understand the main reasons why ClaudIA is transferring the service. Understanding them will allow you to map the opportunities with the greatest impact on retention. To do this, access the graph "Proportion of transfer reasons in N2 tickets" in the "Retention Metrics" tab. In it, each color represents a different reason why ClaudIA transferred a service, such as "Eddie Transferred" or "Claudia used N2 content" - all reasons and their details can be found directly on the Dashboard, just above the graph.

Generally, one or two transfer reasons account for the majority of escalated tickets. These reasons should be prioritized for improvements, since they will have the greatest impact on retention. To identify them, map the transfer reason (the color in the graph) with the highest percentage of tickets. In the example above, the main transfer reasons are "Claudia used N2 content" and "Claudia detected transfer action". Additionally, you can analyze the escalated tickets for each reason through the "Number of tickets/Transfer Reason" table right below, on the same tab.

Step 3: Implement improvement actions

After understanding the main reasons for transfer, you must understand their nature to start implementing improvements. Different reasons will require different strategies.
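The analysis in Steps 1 and 2 boils down to two aggregations over the ticket base. The sketch below illustrates them on hypothetical data (the `tickets` list, its field names, and the function names are invented for this example; in practice the Dashboard computes these numbers for you):

```python
from collections import Counter

# Hypothetical ticket data: each ticket is either retained (transfer_reason is
# None) or escalated to N2 with one of the reasons named in this article.
tickets = [
    {"id": 1, "transfer_reason": None},
    {"id": 2, "transfer_reason": "Claudia used N2 content"},
    {"id": 3, "transfer_reason": None},
    {"id": 4, "transfer_reason": "Claudia detected transfer action"},
    {"id": 5, "transfer_reason": "Claudia used N2 content"},
]

def retention_rate(tickets):
    """Step 1: share of tickets resolved without a transfer."""
    retained = sum(1 for t in tickets if t["transfer_reason"] is None)
    return retained / len(tickets)

def transfer_reason_proportions(tickets):
    """Step 2: proportion of each transfer reason among N2 tickets."""
    counts = Counter(t["transfer_reason"] for t in tickets if t["transfer_reason"])
    total = sum(counts.values())
    return {reason: n / total for reason, n in counts.items()}

print(retention_rate(tickets))  # 0.4 -> 40% retention
print(round(transfer_reason_proportions(tickets)["Claudia used N2 content"], 2))  # 0.67
```

The reason with the highest proportion (here, "Claudia used N2 content") is the one to prioritize in Step 3.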
For example, if the main N2 reason is Error in Eddie's call, the containment strategy will involve adjustments to the flow of the specific Eddie. If transfers are occurring due to Claudia used N2 content, ClaudIA's content base should be reviewed. Below, we detail the main suggested improvement actions by transfer reason.

Improvements involving content

Claudia used N2 content: among the contents returned for ClaudIA to respond with, she chose an N2 content to answer the customer. To reduce this type of transfer, reduce the number of N2 sections. With a smaller proportion of N2 sections - alongside N1 content that can resolve the customer's issues - there is a lower likelihood of an N2 section being used and, consequently, fewer escalated tickets. There are a few strategies to do this while maintaining service quality:

(i) Exclude N2 sections that are not being used correctly

You can exclude sections that are not being used correctly, as well as duplicate sections. Refer to this article to learn best practices for creating and maintaining content, ensuring that sections are triggered correctly.

(ii) Transform N2 sections into N1

We know that N2 sections are sometimes necessary, especially when an internal process must be executed. However, in some cases an N2 section can be adapted into a step-by-step N1, allowing the customer to perform the actions themselves. Using an "order status" section as an example, the adaptation could look like this:

Before (N2 section)
- Title: Where is my order?
- Response: To check the status of your order, I will forward you to an agent.

After (N1 section)
- Title: Where is my order?
- Response: To check the status of your order, access the link (www.orderstatuslink.com) and enter the tracking code received in your purchase email. After that, you will be able to see where your order is and its estimated delivery date.

(iii) Transform N2 sections into INTERACTIVE

If it is not possible to exclude N2 sections or convert them into N1, you can make them INTERACTIVE, meaning they trigger Eddies that can retain the tickets. Using the order status example with an INTERACTIVE section, the customer could send the tracking code directly, and ClaudIA (through an Eddie) would return the order status in real time. For more information on how to build Eddies, see the articles in this folder.

Claudia detected transfer action: the customer was transferred after ClaudIA sent a message indicating that she would transfer the service to an agent/support. These cases tend to occur when an N2 section's response contains a phrase indicating that the service will be transferred, for example, "I will transfer you to an agent." That is, these cases also represent situations where ClaudIA escalated a ticket through an N2 section. For appropriate strategies in these situations, consult the guidance above for the transfer reason Claudia used N2 content.

Improvements involving Content and/or Feature

Customer Asked for Human: the customer insisted on speaking to a human and ClaudIA escalated. To reduce this type of transfer, reduce the need for a user to ask for a human in the first place. Users usually request contact with an agent for two reasons: (1) they do not trust that the AI can resolve their issue - these cases can generally be identified when a customer asks for a human right at the beginning of the service, before stating their issue; or (2) the AI is indeed unable to assist them.
To mitigate these cases, we recommend two main strategies:

(i) Activate "AI Selling"

"AI Selling" is a feature that, when a customer requests to speak to a human, sends a message "selling ClaudIA" instead of transferring automatically, telling the customer they can describe their issue and the AI will try to resolve it. If the customer asks for a human again in their next message, ClaudIA will transfer the service. In other words, with this feature a customer needs to ask twice to speak to a human before being transferred. Tickets transferred to N2 by this feature will have their reason recorded as Multiprompt handover. To activate this feature and set the "AI Selling" message, contact the Cloud Humans team.

(ii) Identify cases where the AI did not resolve the customer's issue, and create or improve sections based on the observed problems

This strategy is useful when ClaudIA attempted to resolve the customer's issue but failed, which led the customer to ask for a human. These cases tend to occur because the available content (N1 sections) was not useful to the customer. To identify the N1 sections that were not effective and triggered the need for human service, access the graph "Content used before the customer asked for a human" in the "Retention Metrics" tab. After locating the main ineffective sections, determine what can be improved in their content so that it resolves the customers' issues.

Improvements involving Eddie

Eddie Transferred: the customer was transferred within a flow inside an Eddie. To reduce this type of transfer, decrease the number of "Forward to Human [N2]" cards in Eddie flows, that is, reduce the likelihood of a service ending up in a branch of the flow that will be escalated.
To do this, we recommend (1) identifying the Eddies that escalate services the most, (2) identifying branches that end with "Forward to Human [N2]", and (3) understanding what needs to be done to transform N2 cards into N1 cards. To locate the Eddies that escalate the most, access the graph "Volume of tickets in Eddie" in the "General Information - Live Projects" tab of the Dashboard in the Hub. Transforming an N2 card into an N1 card can be as simple as adjusting the text preceding the card and updating the integration with ClaudIA. However, some situations may require building APIs and other back-office infrastructure. If you have questions about the feasibility of creating an N1 branch within an N2 Eddie, we suggest contacting your technology team.

Other transfer reasons (under construction)
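The "AI Selling" behavior described earlier is essentially a two-strikes rule. The sketch below illustrates that logic only; the function name, message text, and reason strings other than "Multiprompt handover" are invented for the example, not the real implementation:

```python
# Illustrative "selling" message (the real one is configured with Cloud Humans).
AI_SELLING_MESSAGE = (
    "I can try to help you right here! Tell me what you need, "
    "and if I can't resolve it I'll bring in a human agent."
)

def handle_human_request(prior_human_requests: int, ai_selling_enabled: bool):
    """Decide what to do when the customer asks for a human.

    Returns a (action, detail) pair: either ("reply", <message>) to keep the
    ticket, or ("transfer", <reason>) to escalate it to N2.
    """
    if not ai_selling_enabled:
        # Feature off: the first request already escalates.
        return ("transfer", "Customer asked for human")
    if prior_human_requests >= 1:
        # Second ask with the feature on: escalate, logged as Multiprompt handover.
        return ("transfer", "Multiprompt handover")
    # First ask with the feature on: send the "selling" message and retain.
    return ("reply", AI_SELLING_MESSAGE)

action, detail = handle_human_request(prior_human_requests=0, ai_selling_enabled=True)
print(action)  # reply
action, detail = handle_human_request(prior_human_requests=1, ai_selling_enabled=True)
print(action, "-", detail)  # transfer - Multiprompt handover
```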

Last updated on Aug 12, 2025

Test Cases in ClaudIA: What They Are, How to Use Them, and Why They Make a Difference

https://www.loom.com/share/8355ed9aee974f8a9174bc9a7f47cd3e

What are Test Cases?

Test Cases are a feature that allows you to test changes in ClaudIA quickly, safely, and reproducibly. They simulate real interactions based on historical tickets and check whether ClaudIA responds as expected after changes to sections or prompts.

What are they for?

You can use Test Cases to:
- Correct inappropriate behaviors (e.g., incomplete responses);
- Ensure that changes in sections do not break other responses;
- Measure the impact of an adjustment before deploying it to production;
- Test new flows, responses, or instructions more quickly.

Example of Use

Problem: ClaudIA is using the correct section, but omitting part of the content in the response to the customer.

Solution with a Test Case:
1. Tag the tickets with this problem.
2. Create a Test Case with these tickets.
3. Make adjustments to the sections (e.g., an instruction to always send the complete content).
4. Run the Test Case and verify that the problem was resolved.
5. Ensure that no other desired behavior was affected.

How to Create a Test Case (Step by Step)
1. Access a ticket with an inappropriate response.
2. Click on the 🧪 icon next to ClaudIA's message.
3. Click on “New” to create a new Test Case and give it a descriptive name (e.g., “Incomplete Response”).
4. Configure:
   - Checkbox “Reuse returned sections”: check it to use the same sections from the time of the original ticket; uncheck it to test with new sections or recent changes.
   - Number of executions per ticket (e.g., 5x) to ensure consistency.
5. Choose the type of verification:
   - LLM: uses a prompt to evaluate the response.
   - Embeddings: compares the original response with the current one by vector distance.
   - Regex: checks whether certain words should or should not appear.
Here is a link explaining Regex.

Available Test Types

LLM
- Main use: evaluate whether the new response meets rules defined via prompt.
- Example: “Does the response contain 100% of the text from the section?”

Embeddings
- Main use: compare the similarity between the original response and the new one.
- Example: vector distance less than 0.2.

Regex
- Main use: ensure that ClaudIA uses (or avoids) certain words.
- Example: must not contain “sorry”.

How to Add More Tickets to a Test Case
1. Access the second (or third, fourth…) ticket with the same problem.
2. Click on the 🧪 icon.
3. Go to the “Existing” tab and select the already created Test Case.
4. Save.

How to Interpret the Results

After running the Test Case:
- ✅ Passed: ClaudIA's response follows the rule.
- ❌ Failed: the response is still incorrect or incomplete.

You can see:
- Which executions passed or failed.
- The complete history of the Test Case's performance.

Ensuring No Other Responses Were Affected

After running a specific test, click “Run All” at the top of the Test Cases page. This executes all existing cases and checks whether any of your changes broke other flows.

How to Schedule Periodic Executions

You can schedule your Test Cases to run automatically:
- Every 6 hours
- Daily
- Weekly

This helps detect breaks caused by unforeseen changes.

And what about the A/B Test function?

If you want to test without impacting production, use the A/B Test mode. In it, you can run Test Cases with a new version of the prompt or section without changing ClaudIA's behavior for end customers. To enable it, contact the Cloud Humans team.

TL;DR (Final Summary)
- Test Cases are automated tests to ensure ClaudIA is responding correctly.
- They help fix problems faster and more safely.
- You can create, edit, rerun, and schedule tests based on real tickets.
- You can test different types of validation (LLM, regex, embeddings).
- Use the “Run All” button to ensure nothing broke after adjustments.
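The Regex and Embeddings test types can be illustrated with a small sketch. The function names here are hypothetical, and in production the vectors would come from an embedding model rather than being hard-coded:

```python
import re

def regex_check(response: str, forbidden: str = r"\bsorry\b") -> bool:
    """Regex-type test: pass only if the forbidden pattern does not appear."""
    return re.search(forbidden, response, re.IGNORECASE) is None

def cosine_distance(a, b):
    """Embeddings-type test metric: 1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return 1 - dot / (norm(a) * norm(b))

# Regex type: the response must not contain "sorry".
print(regex_check("I'm sorry, I can't help with that."))  # False -> test fails
print(regex_check("Here is your order status."))          # True  -> test passes

# Embeddings type: pass when the distance between the original response's
# vector and the new one is below the threshold (e.g., 0.2). These toy
# 3-dimensional vectors stand in for real embedding vectors.
original, new = [1.0, 0.0, 1.0], [0.9, 0.1, 1.0]
print(cosine_distance(original, new) < 0.2)               # True -> test passes
```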

Last updated on Aug 12, 2025

How to Improve ClaudIA's Quality

In this tutorial, you will learn how to analyze the quality of ClaudIA's responses (CSAT), covering the metrics available on the dashboard, the impact of tags and sections, and ticket auditing. It also addresses common errors in ClaudIA, their causes, and solutions, helping you optimize responses and improve the customer experience. Below are three videos, followed by a textual explanation of how to improve ClaudIA's quality.

Best practices for auditing, CSAT, and analysis of audits:
https://youtu.be/sRt_eA-qqJ8
https://youtu.be/7V0gmlEP5xo
https://youtu.be/_caaHdoMUuk

Best Practices in Auditing

1. What should I analyze when auditing a ticket?
- Check whether the AI's response is correct and complete.
- If the response is correct, record a positive vote.
- If there is an error, identify the cause:
  - Did the AI use the wrong section?
  - Was the response incomplete?
  - Was a qualification step needed before responding?
- When auditing the messages, use the following explanation of the options:
  1. Should have used another section to respond: the AI used an incorrect information source and should have consulted a more appropriate or relevant section of the available content to answer the customer correctly.
  2. The response was incomplete: the AI partially answered the request, failing to address important points or those requested by the customer in the original message.
  3. Needed to clarify before responding: the AI misinterpreted or guessed the customer's intention without first seeking clarification, which compromised the accuracy of the response.
  4. Should not have clarified: the AI asked an unnecessary clarification question when there was already sufficient information to formulate a useful response.
  5. Hallucination: the AI included information that was not present in any of the available sources, inventing data or making unsupported claims.
  6. Should have clarified differently: the AI tried to clarify the customer's doubt, but the question it asked was confusing, generic, or poorly directed, making it difficult to continue the conversation.
  7. Should have escalated due to lack of content: the AI did not have enough basis in the current content to respond correctly, and should therefore have forwarded the service to another flow or to a human.
  8. Problem with IDS content: the content used by the AI (IDS information) was incorrect, outdated, or poorly structured, harming the quality of the response.
  9. Other: issues outside the common patterns, such as technical errors, inappropriate language, confusion in the service logic, or unexpected AI behavior.

2. How to identify an incomplete response?
- Compare the given response with the available content.
- If the correct section was used but the response did not address all the necessary information, mark the error as an incomplete response.
- Use the Prompt Led Section functionality to avoid this problem in the future.

3. What to do if the AI does not have the necessary content?
- Check whether there is a relevant section for the doubt.
- If not, create new content to cover this case.
- Record the error as "missing content" in the error detail.

4. How to audit numerical responses or calculations?
- Confirm whether the AI used the correct section to perform the calculation.
- If the calculation is correct, mark the response as valid.
- If the AI could have used another, more appropriate section, adjust the audit.

5. How to know if the AI should have transferred a service to N2?
- Check whether the AI had all the information needed to resolve the issue at the N1 level.
- If the response was satisfactory and complete within N1, there is no need for a transfer.
- Otherwise, the audit should indicate that the AI should have forwarded the ticket to N2.

6. How to handle cases of service abandonment?
- If the service flow worked correctly and the audit did not point out errors, record the case as a correct flow.
- If there are improvement points, note them for future adjustments.

7. What to do if the AI has no information on a topic?
- If the customer asks about something not in the knowledge base, record the error as "missing content" in the error detail.
- If relevant content exists but was not used, adjust the section title to ensure better information retrieval.

8. How to deal with generic or uninformative responses?
- If the AI answered vaguely or did not fully clarify the customer's doubt, it should ideally ask for more information before responding.
- If necessary, adjust the content to include more specific questions and improve the contextualization of the response.

9. When should I modify a content title?
- If the AI did not find a relevant section but similar content exists, it may be necessary to adjust the title to facilitate retrieval. Follow the best practices for content creation and maintenance available at this link.

Quality Analysis of ClaudIA (CSAT)

1. What is the quality analysis of ClaudIA?
The quality analysis evaluates the performance of ClaudIA's responses, identifying improvement opportunities based on the adjustments we can make and the improvements you can also implement.

2. How to view CSAT data on the dashboard?
On ClaudIA's dashboard, you can access:
- CSAT overview: a broad analysis of the quality of responses.
- Comparison between humans and ClaudIA: responses given by ClaudIA versus tickets escalated for human assistance.
- CSAT by tag: ClaudIA's performance on different topics.
- Historical data for the last five weeks: lets you track the evolution of response quality over time.
- Table view: an aggregated display by tag, showing the total number of responses and the respective CSAT.

3. How to analyze the impact of tags and Eddies on CSAT?
A specific visualization on the dashboard lets you see whether a particular Eddie or tag is negatively impacting the CSAT. This helps identify whether a specific content is harming the quality of responses.

4. How to understand the impact of the sections used on CSAT?
We can cross-reference the sections used in conversations with customers against the CSAT results. This lets us check whether a particular section is associated with a negative CSAT, helping identify which contents are negatively affecting the customer experience.

5. How to visualize overall CSATs?
CSATs are categorized into:
- Positive (scores 4 and 5)
- Negative and neutral (scores 1, 2, and 3)
Tickets from each category can be analyzed to understand whether ClaudIA performed below expectations or the customer was dissatisfied with the response.

6. What to do after analyzing CSAT?
After this analysis, we can audit the tickets to identify the main reasons for ClaudIA's errors and act on them. This audit is addressed in the next step of ClaudIA's continuous improvement.

Auditing and Quality of ClaudIA

1. What can we learn from auditing tickets?
Auditing CSAT tickets and other tickets allows us to identify errors and areas for improvement in ClaudIA. With this analysis, we can reduce failures and optimize the responses provided by the AI.

2. How to visualize ClaudIA's overall quality?
On the general quality tab of the dashboard, you can see:
- The total percentage of errors from ClaudIA
- The classification of errors (Error 1, 2, or 3)
- An analysis of correct and incorrect tags
- The topics and tags with the most negative impact
The provided links give access to error details for the selected tag.

3. How to identify the contents that generate the most errors?
The "error percentage by content" section of the dashboard shows which contents are most prone to generating errors in ClaudIA. This information is useful for prioritizing corrections.

4. What does the "breakdown by message sent" mean?
This section details the errors made by ClaudIA when generating messages. For each response, ClaudIA receives a set of 15 to 20 contents (TopK) and, based on them, constructs the response. Errors can occur in different ways:
- Correct section outside TopK: the ideal response was not in the received set.
- Wrong choice: ClaudIA received the correct section but incorrectly chose another.
- Incomplete response: ClaudIA used the right section but omitted critical parts.

5. How to analyze errors by section used?
The table presents:
- The analyzed ticket
- The section used in the response
- The correct section that should have been used
- A description of the errors made

6. What can errors indicate?
- Confusion between similar sections: sections with similar titles may hinder ClaudIA's correct choice.
  - Solution: review the section titles following the best practices for content maintenance and creation available at this link, or use the Prompt Led Section functionality available at this link to better specify the use of each section.
- Incomplete responses: ClaudIA omits critical information.
  - Solution: adjust the sections or use Prompt Led Section to reinforce the importance of complete content.
- Sections that do not appear in TopK: the correct content is not among the 15 to 20 most relevant.
  - Solution: adjust the titles so they rank better for the relevant customer searches.

7. How do these insights help in improving ClaudIA?
With the detailed analysis of errors and the contents used, it is possible to:
- Improve the accuracy of responses
- Reduce classification errors
- Ensure more complete responses
- Adjust titles to improve indexing and relevance
- Use the Prompt Led Section functionality to enhance the choice of correct sections

This way, we can optimize ClaudIA's performance and enhance the customer experience. Access the mind map of best practices here!
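The TopK behavior described above, and why adjusting a section title changes retrieval, can be illustrated with a toy sketch. The real system ranks sections by embedding similarity; here a simple word-overlap score stands in for it, and the titles, query, and function names are invented for the example:

```python
import re
from collections import Counter

def _tokens(text: str) -> Counter:
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"\w+", text.lower()))

def similarity(a: str, b: str) -> float:
    """Toy word-overlap score standing in for real embedding similarity."""
    wa, wb = _tokens(a), _tokens(b)
    overlap = sum((wa & wb).values())
    return overlap / max(sum(wa.values()), sum(wb.values()))

def top_k_sections(query: str, titles: list[str], k: int = 3) -> list[str]:
    """Return the k section titles ranked most similar to the customer's message."""
    return sorted(titles, key=lambda t: similarity(query, t), reverse=True)[:k]

titles = [
    "Where is my order?",
    "How to change my password",
    "Refund policy for damaged items",
    "Shipping costs and delivery times",
]
query = "where is my order and when will it arrive"
top = top_k_sections(query, titles, k=2)
print(top[0])  # Where is my order?
print("Where is my order?" in top)  # True -> the correct section reached TopK
```

If the section had been titled, say, "Tracking information", it would score zero overlap with this query and fall outside TopK — which is exactly the failure mode the "adjust titles" solution addresses.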

Last updated on Aug 12, 2025