Helpfulness Evaluation Criteria
AI agents can measure the helpfulness of responses. You can read about how the agents conduct evaluations against the defined evaluation criteria here.
The following table lists the evaluation categories and their definitions. The AI agents assign each data point to exactly one of these categories. The strength of the agents lies in semantic evaluation, which requires complex reasoning and an understanding of the content.
Evaluation Categories | Definitions |
---|---|
Not Helpful | The generated response is completely unrelated, lacks coherence, and fails to provide any meaningful information. |
Somewhat Helpful | The generated response bears some relevance but remains largely superficial and unclear, addressing only the peripheral aspects of the user's needs. |
Moderately Helpful | The generated response is mostly relevant and clear, covering the basic aspects of the query, but lacks depth and comprehensive elucidation. |
Helpful | The generated response is on-point, detailed, and well-articulated, offering valuable information and clarifications that meet the user's primary needs and enhance understanding. |
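
The four categories in the table form an ordered scale, which makes them easy to represent and aggregate in code. The following is a minimal sketch, not part of the Lighthouz API: the names `Helpfulness`, `parse_label`, and `summarize` are hypothetical, and it assumes the agent's verdict arrives as a plain label string matching the table.

```python
from collections import Counter
from enum import IntEnum


class Helpfulness(IntEnum):
    """Ordered helpfulness categories from the table (hypothetical encoding)."""
    NOT_HELPFUL = 0
    SOMEWHAT_HELPFUL = 1
    MODERATELY_HELPFUL = 2
    HELPFUL = 3


# Map the human-readable labels in the table to enum members.
LABELS = {
    "not helpful": Helpfulness.NOT_HELPFUL,
    "somewhat helpful": Helpfulness.SOMEWHAT_HELPFUL,
    "moderately helpful": Helpfulness.MODERATELY_HELPFUL,
    "helpful": Helpfulness.HELPFUL,
}


def parse_label(label: str) -> Helpfulness:
    """Convert a label string to a category, rejecting anything not in the table."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(label.strip().lower().split())
    if key not in LABELS:
        raise ValueError(f"Unknown helpfulness category: {label!r}")
    return LABELS[key]


def summarize(labels: list[str]) -> dict[str, int]:
    """Count how many data points fall into each category, in scale order."""
    counts = Counter(parse_label(label) for label in labels)
    return {category.name: counts.get(category, 0) for category in Helpfulness}
```

Using an integer-backed enum keeps the ordering explicit, so results can be compared (for example, `Helpfulness.HELPFUL > Helpfulness.SOMEWHAT_HELPFUL`) or filtered by a minimum level.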
You can customize the categories and definitions for your use case in the Lighthouz Eval Agent framework.
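
To illustrate what a customized set of categories might look like, here is a small sketch that renders category definitions as a plain-text rubric. It is an assumption-laden example, not the framework's own configuration format: `build_rubric` and the two support-oriented categories are hypothetical and only show the shape of the data (a category name paired with a definition).

```python
def build_rubric(categories: dict[str, str]) -> str:
    """Render category definitions as a plain-text rubric, one category per line."""
    if len(categories) < 2:
        raise ValueError("A rubric needs at least two categories")
    lines = [f"- {name}: {definition}" for name, definition in categories.items()]
    return "Assign exactly one category:\n" + "\n".join(lines)


# Hypothetical two-level rubric for a customer-support use case.
custom_categories = {
    "Unresolved": "The response does not address the customer's issue.",
    "Resolved": "The response fully addresses the customer's issue.",
}
rubric = build_rubric(custom_categories)
```

Keeping each definition short and mutually exclusive makes it easier for an evaluator, human or agent, to pick exactly one category per data point.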