Evaluations are the offline testing layer for your agents and LLM pipelines. Define inputs, run them through the version you want to test, score the outputs, and compare across runs to see if anything regressed.
Introduction
What evaluations are and when to use them.
Quickstart
Write and run your first evaluation in five minutes.
Concepts
Datapoints, executors, evaluators, and how they map to traces.
Compare runs
Groups, the progression chart, and side-by-side comparison.
Datasets
Back your evaluation with a Laminar dataset instead of hardcoded data.
Manual API
Lower-level control for pipelines where evaluate() is too opinionated.
Self-hosted
Point the SDK at a self-hosted Laminar instance.
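
As a rough illustration of the workflow this section covers (define inputs, run them through a version, score the outputs, compare across runs), here is a minimal plain-Python sketch. It does not use the Laminar SDK: `run_evaluation`, `regressions`, `exact_match`, and both executors are hypothetical names made up for this example, and the `data`/`target` datapoint shape is an assumption chosen to echo the vocabulary on this page.

```python
from statistics import mean

# Hypothetical datapoints: an input ("data") and the expected output ("target").
DATAPOINTS = [
    {"data": {"text": "hello"}, "target": "HELLO"},
    {"data": {"text": " world "}, "target": "WORLD"},
]


def executor_v1(data):
    """Baseline version under test: trims whitespace, then uppercases."""
    return data["text"].strip().upper()


def executor_v2(data):
    """Candidate version under test: uppercases but forgets to trim."""
    return data["text"].upper()


def exact_match(output, target):
    """Evaluator: 1.0 when the output equals the target, else 0.0."""
    return 1.0 if output == target else 0.0


def run_evaluation(data, executor, evaluators):
    """Run each datapoint through the executor and average every evaluator's score."""
    scores = {name: [] for name in evaluators}
    for point in data:
        output = executor(point["data"])
        for name, evaluator in evaluators.items():
            scores[name].append(float(evaluator(output, point["target"])))
    return {name: mean(values) for name, values in scores.items()}


def regressions(baseline, candidate, tolerance=0.0):
    """Return the score delta for every evaluator whose average dropped."""
    return {
        name: candidate[name] - baseline[name]
        for name in baseline
        if candidate[name] < baseline[name] - tolerance
    }


evaluators = {"exact_match": exact_match}
baseline = run_evaluation(DATAPOINTS, executor_v1, evaluators)
candidate = run_evaluation(DATAPOINTS, executor_v2, evaluators)

print(regressions(baseline, candidate))  # {'exact_match': -0.5}
```

The two runs share the same datapoints and evaluators, so any difference in the averaged scores comes from the executor alone. That is the property that makes a run-to-run comparison meaningful.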
