One of the key features of Traceloop is the ability to monitor the quality of your LLM outputs. It helps you detect hallucinations and regressions in the quality of your models and prompts.

To start monitoring your LLM outputs, make sure you’ve installed OpenLLMetry and configured it to send data to Traceloop. If you haven’t done that yet, you can follow the instructions in the Getting Started guide. Next, if you’re not using a framework like LangChain or LlamaIndex, make sure to annotate your workflows and tasks, as in the sketch below.
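
A minimal sketch of such annotations, assuming the traceloop-sdk Python package, the OpenAI Python client, and a TRACELOOP_API_KEY set in the environment (the app, workflow, task, and model names here are illustrative, not prescribed):

```python
from openai import OpenAI

from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import task, workflow

# Initialize OpenLLMetry so spans are exported to Traceloop
# (the API key is read from the TRACELOOP_API_KEY environment variable).
Traceloop.init(app_name="joke_app")

client = OpenAI()


@task(name="generate_joke")
def generate_joke(topic: str) -> str:
    # Traced as a task span nested under the enclosing workflow span.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Tell me a joke about {topic}"}],
    )
    return completion.choices[0].message.content


@workflow(name="joke_workflow")
def joke_workflow(topic: str) -> str:
    return generate_joke(topic)


if __name__ == "__main__":
    print(joke_workflow("observability"))
```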

You can then define any of the following monitors to track the quality of your LLM outputs.

Semantic Metrics

  • QA Relevancy: Assesses the relevance of an answer generated by a model with respect to a question. This is especially useful when running RAG pipelines.
  • Faithfulness: Checks whether some generated content was inferred or deduced from a given context. Relevant for RAG pipelines, entity extraction, summarization, and many other text-related tasks.
  • Text Quality: Evaluates the overall readability and coherence of text.
  • Grammar Correctness: Checks for grammatical errors in generated texts.
  • Redundancy Detection: Identifies repetitive content.
  • Focus Assessment: Measures whether a given paragraph focuses on a single subject or “jumps” between multiple ones.

Syntactic Text Metrics

  • Text Length: Checks if the length of the generated text is within a given range (constant or relative to an input); see the sketch after this list.
  • Word Count: Checks if the number of words in the generated text is within a given range.

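A rough sketch of what these bounds amount to, in plain Python rather than Traceloop’s monitor configuration (the completion text and thresholds are made-up examples):

```python
# Hypothetical completion and arbitrary example bounds.
completion = "Observability for LLM applications starts with tracing every call."

MIN_CHARS, MAX_CHARS = 20, 500
MIN_WORDS, MAX_WORDS = 5, 100

text_length_ok = MIN_CHARS <= len(completion) <= MAX_CHARS
word_count_ok = MIN_WORDS <= len(completion.split()) <= MAX_WORDS

print(f"text length ok: {text_length_ok}, word count ok: {word_count_ok}")
```
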
Safety Metrics

  • PII Detection: Identifies personally identifiable information in generated texts or input prompts.
  • Secret Detection: Identifies secrets and API keys in generated texts or input prompts.
  • Toxicity Detection: Identifies toxic content in generated texts or input prompts.

Structural Metrics

  • Regex Validation: Ensures that the output of a model matches a given regular expression (see the sketch after this list).
  • SQL Validation: Ensures SQL queries are syntactically correct.
  • JSON Schema Validation: Ensures that the output of a model matches a given JSON schema.
  • Code Validation: Ensures that the output of a model is valid code in a given language.
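
To make the regex and JSON schema checks concrete, here is a conceptual sketch using Python’s re module and the jsonschema package; it only illustrates the kind of condition these monitors enforce and is not Traceloop’s monitor API (the output, pattern, and schema are invented for the example):

```python
import json
import re

from jsonschema import ValidationError, validate

# Hypothetical model output that is expected to be a JSON object.
model_output = '{"name": "Ada Lovelace", "email": "ada@example.com"}'

# Regex Validation: the raw output must match a given pattern,
# here "looks like a single JSON object".
assert re.fullmatch(r"\s*\{.*\}\s*", model_output, flags=re.DOTALL)

# JSON Schema Validation: the parsed output must conform to a given schema.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
    },
    "required": ["name", "email"],
}

try:
    validate(instance=json.loads(model_output), schema=schema)
    print("output conforms to the schema")
except ValidationError as err:
    print(f"schema violation: {err.message}")
```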