Evaluation jobs run a rubric's judge against agent runs in a collection. The evaluation runs server-side: you start the job and monitor progress. See Rubrics and Judges for evaluation concepts.
Start an Evaluation Job
Parameters
- ID of the collection.
- ID of the rubric to evaluate with.
- Maximum number of agent runs to evaluate. If None, evaluates all runs in the collection.
- Number of independent judge rollouts per agent run. More rollouts improve reliability at the cost of more LLM calls.
- Backend concurrency limit for the evaluation job. If None, uses the server default.
- Whether the judge prompt should include agent run metadata.
Returns
ID of the created (or reused) evaluation job. If an identical job is already running, its ID is returned instead of creating a duplicate.
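A minimal sketch of starting a job is shown below. It assumes a client object that exposes start_rubric_eval_job() as described above; the keyword-argument names (collection_id, rubric_id, max_runs, num_rollouts, max_concurrency, include_metadata) are illustrative rather than a confirmed signature.

```python
def start_evaluation(client, collection_id: str, rubric_id: str) -> str:
    """Start a server-side rubric evaluation and return the job ID.

    Sketch only: `client` is assumed to expose start_rubric_eval_job(), and the
    keyword-argument names below are illustrative, not a confirmed signature.
    """
    job_id = client.start_rubric_eval_job(
        collection_id=collection_id,   # ID of the collection
        rubric_id=rubric_id,           # ID of the rubric to evaluate with
        max_runs=None,                 # None: evaluate all runs in the collection
        num_rollouts=3,                # independent judge rollouts per agent run
        max_concurrency=None,          # None: use the server default
        include_metadata=True,         # include agent run metadata in the judge prompt
    )
    return job_id  # ID of the created (or reused) evaluation job
```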
Get Evaluation Results
Retrieve the current state of a rubric evaluation, including results and progress.
Parameters
- ID of the collection.
- ID of the rubric.
- Rubric version. If None, uses the latest version.
- Optional filter to apply to results.
- Whether to include failed judge results in the response.
Returns
Evaluation state.
get_rubric_run_state() does not start an evaluation. Use start_rubric_eval_job() first, then poll get_rubric_run_state() to check progress.
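A polling loop might look like the sketch below. It assumes the same client object as above; the keyword-argument names and the completion flag on the returned state are hypothetical, so check the actual returned object for its fields.

```python
import time


def wait_for_results(client, collection_id: str, rubric_id: str,
                     poll_interval: float = 5.0, timeout: float = 600.0):
    """Poll get_rubric_run_state() until the evaluation finishes.

    Sketch only: `client`, the keyword-argument names, and the completion flag
    checked below are assumptions for illustration.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        state = client.get_rubric_run_state(
            collection_id=collection_id,   # illustrative keyword names
            rubric_id=rubric_id,
        )
        if getattr(state, "is_complete", False):  # hypothetical completion flag
            return state                          # evaluation state with results
        time.sleep(poll_interval)                 # wait before polling again
    raise TimeoutError("Evaluation did not complete within the timeout")
```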
