Playgrounds are interactive spreadsheets where you can organize your data, experiment with LLMs, evaluate outputs, and analyze results. Think of them as workbenches for AI development that combine the flexibility of a spreadsheet with LLM execution and evaluation. They are designed for everyone, from product managers and analysts to QA, data engineers, and software developers.
Playgrounds can be used to build datasets for experiments and evaluation. Once you’ve structured your data in a playground, you can export it to a dataset and publish a version for reproducible testing. Learn more about datasets.

Playground Structure

A playground is organized as a table-like structure with three fundamental components: rows, columns, and cells. Understanding how these work together is essential for effective playground usage.

Rows

Rows represent individual data points or test cases in your playground. Each row is a complete record that spans all columns. Rows are independent: each one can be executed on its own, and their order can be rearranged as needed.

Row Operations

  • Add Row: Create new rows manually or through bulk operations
  • Generate Rows: Use the AI row generator to create new rows based on the existing data in your Playground
  • Delete Row: Remove unwanted rows individually or in bulk
  • Execute Row: Execute all cells in a specific row

Columns

Columns are the building blocks of playgrounds, defining what kind of data you can store, process, and analyze. They come in different types to handle various data formats and use cases:

  • Data Input Columns: Store static data such as text, JSON, numbers, and tags
  • Prompt Columns: Execute LLM prompts directly on your data with full model configuration, so you can test different prompts and compare outputs side by side
  • Evaluation Columns: Assess AI outputs and data quality using pre-built evaluators or custom evaluators tailored to your specific needs. Learn more about evaluators.

You can manage columns by reordering, hiding, editing, duplicating, or deleting them as your analysis evolves. Learn more about column types and column management.
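To make the relationship between rows, column types, and cells concrete, here is a minimal in-memory sketch. It is an illustration only, not the Traceloop API: the `Column` class, the `execute_row` helper, the `fake_llm` stand-in, and the column names are all hypothetical. It also assumes columns are evaluated left to right so that an evaluation column can read a prompt column's output; the actual execution order in the product may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical model for illustration; none of these names come from Traceloop.
Row = Dict[str, object]  # one row: column name -> cell value


@dataclass
class Column:
    name: str
    kind: str  # "input", "prompt", or "evaluation"
    # Prompt and evaluation columns compute their cell from the rest of the row.
    compute: Optional[Callable[[Row], object]] = None


def fake_llm(row: Row) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    return f"Answer to: {row['question']}"


def non_empty_answer(row: Row) -> bool:
    # A trivial custom evaluator: passes when the prompt output is not blank.
    return bool(str(row["answer"]).strip())


def execute_row(row: Row, columns: List[Column]) -> Row:
    """Fill every computed cell of one row without touching other rows."""
    result = dict(row)  # copy, so the original row stays unchanged
    for col in columns:  # assumption: left-to-right execution order
        if col.compute is not None:
            result[col.name] = col.compute(result)
    return result


columns = [
    Column("question", "input"),
    Column("answer", "prompt", compute=fake_llm),
    Column("answer_present", "evaluation", compute=non_empty_answer),
]

rows: List[Row] = [
    {"question": "What is a playground?"},
    {"question": "What does a prompt column do?"},
]

# Executing one row, or every row, uses the same per-row function.
first_only = execute_row(rows[0], columns)
executed = [execute_row(r, columns) for r in rows]

for r in executed:
    print(r)
```

Because each row is copied and processed on its own, running a single row and running the whole table differ only in how many rows are passed through `execute_row`.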

Create a Playground

Data can be imported from different sources:
  1. CSV files
  2. JSON files
  3. An existing dataset
  4. Production spans
You can also create a Playground from scratch and import data later. Simply set a name for the Playground and start adding columns, rows, and data.
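If your test cases live in code, one way to prepare them for the CSV import option is to serialize them with Python's standard `csv` module. This is a sketch under assumptions: the column names `question` and `expected_answer` are hypothetical, and it assumes the importer reads the first line as column headers. Check the import dialog for the exact format it expects.

```python
import csv
import io
from typing import Dict, List

# Hypothetical column names; use whatever columns your playground needs.
FIELDNAMES = ["question", "expected_answer"]


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize test cases to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)  # values containing commas are quoted automatically
    return buffer.getvalue()


test_cases = [
    {"question": "What is a row, exactly?", "expected_answer": "A single test case"},
    {"question": "What is a column?", "expected_answer": "A typed field shared by all rows"},
]

csv_text = rows_to_csv(test_cases)
print(csv_text)
```

To produce a file for upload, write `csv_text` to disk with `open(path, "w", newline="", encoding="utf-8")` so the line endings are preserved as written.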

Running a Playground

Execute all cells in your playground by clicking the play button in the top right corner. This runs all prompt columns and evaluation columns across every row, allowing you to process your entire dataset at once.

You can also run individual cells, rows, or columns by clicking on their respective play buttons to test specific configurations. For example, you might run a single agent execution, test one user input, or evaluate a specific chat conversation.

Ready to build more sophisticated playgrounds? Dive into the complete documentation or explore specific column types to unlock the full power of Traceloop Playgrounds!