The Evaluator Library provides a comprehensive collection of pre-built quality checks designed to systematically assess AI outputs. Each evaluator comes with a predefined input and output schema. When using an evaluator, you’ll need to map your data to its input schema.

Evaluator Types

Character Count

Analyze response length and verbosity to ensure outputs meet specific length requirements.

Character Count Ratio

Measure the ratio of output characters to input characters to assess response proportionality and expansion.

Word Count

Ensure appropriate response detail level by tracking the total number of words in outputs.

Word Count Ratio

Measure the ratio of output words to input words to compare input/output verbosity and expansion patterns.
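
As a rough illustration, the four count-based evaluators above reduce to simple string arithmetic. The sketch below assumes plain-string input and output values; the field names in each evaluator's actual input schema may differ.

```python
# Minimal sketches of the count-based checks, assuming plain strings.

def character_count(output: str) -> int:
    return len(output)

def word_count(output: str) -> int:
    return len(output.split())

def character_count_ratio(input_text: str, output: str) -> float:
    # Ratio of output characters to input characters; guard against empty input.
    return len(output) / max(len(input_text), 1)

def word_count_ratio(input_text: str, output: str) -> float:
    return len(output.split()) / max(len(input_text.split()), 1)

print(word_count_ratio("Summarize this paragraph.", "A short summary."))  # 1.0
```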

Answer Relevancy

Verify responses address the query to ensure AI outputs stay on topic and remain relevant.

Faithfulness

Detect hallucinations and verify facts to maintain accuracy and truthfulness in AI responses.

PII Detection

Identify personal information exposure to protect user privacy and ensure data security compliance.
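
As a rough approximation of what a PII check does, the sketch below flags emails and US-style phone numbers with regular expressions. The patterns are illustrative only; a production detector would combine many more patterns and typically an NER model.

```python
import re

# Hedged, regex-only sketch of a PII check for emails and US phone numbers.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def detect_pii(text: str) -> dict[str, list[str]]:
    # Return each PII category found along with the matching substrings.
    return {label: pattern.findall(text)
            for label, pattern in PII_PATTERNS.items()
            if pattern.search(text)}

print(detect_pii("Contact jane.doe@example.com or 555-123-4567."))
```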

Profanity Detection

Flag inappropriate language use to maintain content quality standards and professional communication.
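
A minimal profanity check can be sketched as a wordlist lookup. The blocklist below is a deliberately mild, hypothetical placeholder; real detectors use far larger lexicons or trained classifiers.

```python
# Hypothetical, deliberately mild blocklist for illustration only.
BLOCKLIST = {"damn", "hell"}

def contains_profanity(text: str) -> bool:
    # Normalize case and strip surrounding punctuation before matching.
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return not BLOCKLIST.isdisjoint(words)

print(contains_profanity("Well, damn."))  # True
```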

Secrets Detection

Monitor for credential and key leaks to prevent accidental exposure of sensitive information.
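
A secrets check can be approximated by matching well-known credential formats. The patterns below cover a few common ones and are illustrative; real scanners track many more providers and add entropy heuristics.

```python
import re

# Hedged sketch matching a few well-known credential formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def detect_secrets(text: str) -> list[str]:
    # Return the names of any secret categories found in the text.
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

print(detect_secrets("key=AKIAABCDEFGHIJKLMNOP"))  # ['aws_access_key_id']
```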

SQL Validation

Validate SQL queries to ensure proper syntax and structure in database-related AI outputs.
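
One way to approximate SQL validation locally is to run the query through a SQL parser. The sketch below assumes the sqlglot package; the evaluator itself may use a different, dialect-aware parser.

```python
import sqlglot
from sqlglot.errors import ParseError

def is_valid_sql(query: str, dialect: str = "postgres") -> bool:
    # A query is considered valid if the parser accepts it for the given dialect.
    try:
        sqlglot.parse_one(query, read=dialect)
        return True
    except ParseError:
        return False

print(is_valid_sql("SELECT id, name FROM users WHERE active = TRUE"))  # True
print(is_valid_sql("SELEC id FROM users"))                             # False (typo in keyword)
```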

JSON Validation

Validate JSON responses to ensure proper formatting and structure in API-related outputs.
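
JSON validation reduces to attempting to parse the response. The sketch below checks parseability only; an evaluator may additionally compare the parsed value against an expected schema.

```python
import json

def is_valid_json(text: str) -> bool:
    # A response passes if it parses as JSON.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json('{"status": "ok", "items": [1, 2, 3]}'))  # True
print(is_valid_json("{'status': 'ok'}"))                      # False (single quotes)
```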

Regex Validation

Validate regex patterns to ensure correct regular expression syntax and functionality.
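
Regex validation can be sketched as a compile check, optionally followed by exercising the pattern against sample strings to confirm it behaves as intended.

```python
import re

def is_valid_regex(pattern: str) -> bool:
    # A pattern is considered valid if it compiles without error.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(is_valid_regex(r"^\d{4}-\d{2}-\d{2}$"))  # True
print(is_valid_regex(r"([a-z"))                # False (unterminated character set)
```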

Placeholder Regex

Validate placeholder regex patterns to ensure proper template and variable replacement structures.

Semantic Similarity

Measure semantic similarity between expected and actual responses to assess content alignment.
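
A common way to compute semantic similarity is cosine similarity over sentence embeddings. The sketch below assumes the sentence-transformers package and the all-MiniLM-L6-v2 model; the evaluator's actual embedding backend may differ.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_similarity(expected: str, actual: str) -> float:
    # Embed both texts and return their cosine similarity (higher = closer).
    emb = model.encode([expected, actual], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

score = semantic_similarity("The cat sat on the mat.", "A cat is sitting on a mat.")
print(round(score, 2))
```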

Agent Goal Accuracy

Assess whether an agent achieved its intended goal to ensure AI systems accomplish their objectives effectively.

Topic Adherence

Validate topic adherence to ensure responses stay focused on the specified subject matter.

Measure Perplexity

Measure text perplexity from token log probabilities (logprobs) to assess the predictability and coherence of generated text.
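
Given per-token log probabilities, perplexity is the exponential of the negative mean logprob; lower values indicate more predictable text. A minimal sketch, assuming the logprobs have already been extracted from the model response:

```python
import math

def perplexity(logprobs: list[float]) -> float:
    # exp of the negative mean per-token log probability.
    return math.exp(-sum(logprobs) / len(logprobs))

# Hypothetical per-token logprobs, e.g. as returned alongside a model's completion.
print(round(perplexity([-0.1, -0.5, -2.3, -0.05]), 2))  # ~2.09
```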