Result Overview

All experiments are logged in the Traceloop platform. Each experiment is executed through the SDK.

Experiment Runs

An experiment can be run multiple times against different datasets and tasks. All runs are logged in the Traceloop platform to enable easy comparison.
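As a rough illustration of what comparing runs can look like, the sketch below summarizes each run by the mean of its per-row scores. It does not use the Traceloop SDK: `RunSummary`, `compare_runs`, the run ids, the dataset names, and the scores are all hypothetical and invented for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunSummary:
    """Hypothetical summary of one experiment run (not a Traceloop SDK type)."""
    run_id: str
    dataset: str
    scores: list[float]  # one evaluator score per dataset row (made-up values)


def compare_runs(runs: list[RunSummary]) -> dict[str, float]:
    """Return the mean score of each run, keyed by run id."""
    return {run.run_id: mean(run.scores) for run in runs}


runs = [
    RunSummary("run-1", "dataset-a", [1.0, 0.0, 1.0, 1.0]),
    RunSummary("run-2", "dataset-b", [0.5, 0.5, 1.0, 0.0]),
]
averages = compare_runs(runs)
print(averages)
```

In practice the platform performs this comparison; the sketch only shows the idea of lining up runs side by side on a shared metric.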

Experiment Tasks

An experiment run is made up of multiple tasks, where each task represents the experiment flow applied to a single dataset row. Each task log captures:

Task input – the data taken from the dataset row.
Task outputs – the results produced by running the task, which are then passed as input to the evaluator.
Evaluator results – the evaluator’s assessment based on the task outputs.
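The relationship between the three logged fields can be sketched as a small loop: one task per dataset row, with the task outputs passed to the evaluator. This is a minimal illustration under stated assumptions, not the Traceloop SDK; `TaskLog`, `run_experiment`, `word_count_task`, and `short_answer_evaluator` are hypothetical names, and the rows are made-up data.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class TaskLog:
    """Hypothetical record of one task (not a Traceloop SDK type)."""
    task_input: Row                     # data taken from the dataset row
    task_outputs: dict[str, Any]        # results produced by running the task
    evaluator_results: dict[str, Any]   # evaluator's assessment of the outputs


def run_experiment(
    rows: list[Row],
    task: Callable[[Row], dict[str, Any]],
    evaluator: Callable[[dict[str, Any]], dict[str, Any]],
) -> list[TaskLog]:
    """Apply the task to each row, then evaluate the task outputs."""
    logs = []
    for row in rows:
        outputs = task(row)            # task input -> task outputs
        results = evaluator(outputs)   # task outputs -> evaluator results
        logs.append(TaskLog(row, outputs, results))
    return logs


def word_count_task(row: Row) -> dict[str, Any]:
    """Toy task: count the words in the row's text."""
    return {"word_count": len(row["text"].split())}


def short_answer_evaluator(outputs: dict[str, Any]) -> dict[str, Any]:
    """Toy evaluator: pass when the text has at most three words."""
    return {"passed": outputs["word_count"] <= 3}


rows = [{"text": "hello world"}, {"text": "this answer is too long"}]
logs = run_experiment(rows, word_count_task, short_answer_evaluator)
for log in logs:
    print(log)
```

Each `TaskLog` here corresponds to one dataset row, mirroring the one-task-per-row structure described above.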
