K Pro
How do we measure/prevent hallucinations?
We monitor tool selection daily using a metric such as Tool Call Accuracy (TCA), which tracks the percentage of times the agent identifies and calls the correct tool. This helps us quickly identify systemic issues where an agent "hallucinates" the need for a specific tool when another is more suitable, or fails to call any tool when one is needed.

Parameter Monitoring: Once a tool is called, the correct parameters must be passed to it. Incorrect or missing parameters can lead to wrong actions, which then surface as hallucinations in the agent's final response. We monitor the accuracy and completeness of the parameters passed to tools with automated tests run on a set of evaluation questions.
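To make the two checks concrete, here is a minimal sketch of how Tool Call Accuracy and the parameter scores could be computed over an evaluation set. It is an illustration under stated assumptions, not our production monitoring code: the `EvalCase` schema, the function names, the tool names (`get_weather`, `search_orders`), and the exact definitions of "completeness" and "accuracy" are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class EvalCase:
    """One evaluation question (hypothetical schema, assumed for this sketch)."""
    question: str
    expected_tool: Optional[str]  # None means no tool should be called
    actual_tool: Optional[str]    # None means the agent called no tool
    expected_params: dict[str, Any] = field(default_factory=dict)
    actual_params: dict[str, Any] = field(default_factory=dict)


def tool_call_accuracy(cases: list[EvalCase]) -> float:
    """Fraction of cases where the agent chose the expected tool (or correctly chose none)."""
    if not cases:
        return 0.0
    correct = sum(1 for c in cases if c.actual_tool == c.expected_tool)
    return correct / len(cases)


def parameter_scores(case: EvalCase) -> tuple[float, float]:
    """Return (completeness, accuracy) for one case.

    Assumed definitions:
      completeness = share of expected parameters that were passed at all
      accuracy     = share of expected parameters passed with the expected value
    Extra, unexpected parameters are ignored in this sketch.
    """
    expected = case.expected_params
    if not expected:
        return 1.0, 1.0
    present = [k for k in expected if k in case.actual_params]
    correct = [k for k in present if case.actual_params[k] == expected[k]]
    return len(present) / len(expected), len(correct) / len(expected)


# Illustrative evaluation set (made-up questions and tools).
cases = [
    # Right tool, but one expected parameter is missing.
    EvalCase("Weather in Paris in Celsius?", "get_weather", "get_weather",
             {"city": "Paris", "unit": "celsius"}, {"city": "Paris"}),
    # Wrong tool selected.
    EvalCase("Where is my order 123?", "search_orders", "get_weather",
             {"order_id": "123"}, {"city": "123"}),
    # No tool needed and none called.
    EvalCase("Say hello.", None, None),
    # A tool was needed but none was called.
    EvalCase("Weather in Rome?", "get_weather", None, {"city": "Rome"}, {}),
]

tca = tool_call_accuracy(cases)
print(f"Tool Call Accuracy: {tca:.0%}")

# Parameter scores are only meaningful when the expected tool was actually called.
for c in cases:
    if c.expected_tool is not None and c.actual_tool == c.expected_tool:
        completeness, accuracy = parameter_scores(c)
        print(f"{c.question!r}: completeness={completeness:.2f}, accuracy={accuracy:.2f}")
```

In this sketch, parameter scores are computed only for cases where the expected tool was called, so that a tool selection error is not counted a second time as a parameter error. Whether to penalize extra parameters, or to compare values more loosely than strict equality, is a design choice the sketch leaves open.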