Hallucinations / Correctness Evaluation Criteria
AI agents can be configured to identify hallucinations and evaluate correctness. You can read about how the AI agents conduct evaluations against the defined evaluation criteria here.
The following table describes the evaluation categories and their definitions. The AI agents assign each data point to one of these categories. The strength of agents lies in conducting semantic evaluations that require complex reasoning and content understanding.
| Evaluation Categories | Definitions |
|---|---|
| Correct | The information in the generated answer semantically matches the correct answer. |
| Partially correct | The generated answer partially matches the ground truth answer, or contains the correct answer along with additional information. |
| No answer | The generated response does not provide any information that matches the content of the ground truth, effectively offering no answer. |
| Incorrect or Hallucination | The generated response provides information that does not match the ground truth, showing clear discrepancies in factual content or details. This includes both complete fabrications and minor inaccuracies that lead to a fundamental misalignment with the ground truth. |
You can easily customize these categories and definitions in the Lighthouz Eval Agent framework to suit your use case.
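To make the four categories concrete, here is a minimal illustrative sketch of the decision logic in Python. This is not the Lighthouz Eval Agent implementation: the real agents perform semantic evaluation with complex reasoning, whereas this sketch approximates the distinctions with naive string comparison purely to show how a generated answer maps to one of the categories. The function name `categorize` is hypothetical.

```python
def categorize(generated: str, ground_truth: str) -> str:
    """Map a generated answer to one of the four evaluation categories.

    A crude string-based stand-in for the semantic matching the
    AI agents perform; for demonstration only.
    """
    gen = generated.strip().lower()
    truth = ground_truth.strip().lower()

    # No answer: the response offers no information at all.
    if not gen or gen in {"i don't know", "no answer", "n/a"}:
        return "No answer"

    # Correct: the answer matches the ground truth.
    if gen == truth:
        return "Correct"

    # Partially correct: the correct answer is present,
    # along with additional information.
    if truth in gen:
        return "Partially correct"

    # Otherwise: the content diverges from the ground truth.
    return "Incorrect or Hallucination"
```

For example, `categorize("Paris", "Paris")` yields `Correct`, while `categorize("Paris, a city founded by the Gauls", "Paris")` yields `Partially correct` because the correct answer appears alongside extra information. A real agent would of course recognize paraphrases and factual equivalence that string matching cannot.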