# Manage Rubrics

> Create, retrieve, and list rubrics and judges

<Note>
  We no longer recommend authoring rubrics by hand. The [Docent plugin](/installation) generates [Reading steps](/analysis/reading-steps) inside an [Analysis Plan](/analysis/analysis-plans) for you. This SDK reference is kept for users with existing rubrics.
</Note>

Rubrics define evaluation criteria for agent runs. A judge is an LLM configured to evaluate
runs against a rubric. See [Rubrics and Judges](/legacy/rubrics) for concepts.

## Create a Rubric

```python
from docent import Docent
from docent.judges.types import Rubric

client = Docent()

rubric = Rubric(
    rubric_text="""
    Evaluate whether the agent successfully completed the user's request.

    Decision procedure:
    1. Identify what the user asked for
    2. Check if the agent's final response addresses the request
    3. Verify the response is factually correct
    """,
    output_schema={
        "type": "object",
        "properties": {
            "label": {"type": "string", "enum": ["pass", "fail"]},
            "explanation": {"type": "string", "citations": True},
        },
        "required": ["label", "explanation"],
    },
)

rubric_id = client.create_rubric("my-collection-id", rubric)
print(rubric_id)
```
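
To make the `output_schema` above concrete, the sketch below checks a judge output dictionary against the two constraints that schema expresses: required keys and the `label` enum. It uses only the standard library and is not how Docent validates outputs. `conforms` is a hypothetical helper, and the non-standard `citations` keyword is simply ignored here.

```python
from typing import Any

# Same schema as in the example above.
OUTPUT_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["pass", "fail"]},
        "explanation": {"type": "string", "citations": True},
    },
    "required": ["label", "explanation"],
}


def conforms(output: dict[str, Any], schema: dict[str, Any]) -> bool:
    """Hypothetical helper: checks required keys, string types, and enums only."""
    # Every required key must be present.
    if any(key not in output for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key not in output:
            continue  # optional key that was not supplied
        value = output[key]
        if spec.get("type") == "string" and not isinstance(value, str):
            return False
        if "enum" in spec and value not in spec["enum"]:
            return False
    return True


print(conforms({"label": "pass", "explanation": "Answered correctly."}, OUTPUT_SCHEMA))  # True
print(conforms({"label": "maybe", "explanation": "Unclear."}, OUTPUT_SCHEMA))  # False
print(conforms({"label": "fail"}, OUTPUT_SCHEMA))  # False
```

A full JSON Schema validator covers far more than this; the point is only to show which outputs the example schema would accept.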

### Parameters

<ParamField body="collection_id" type="str" required>
  ID of the collection.
</ParamField>

<ParamField body="rubric" type="Rubric" required>
  The rubric configuration. Must have `version=1` for new rubrics.

  <Expandable title="Rubric fields">
    <ResponseField name="rubric_text" type="str" required>
      The evaluation criteria and decision procedure. This is the core content
      the judge uses to evaluate agent runs.
    </ResponseField>

    <ResponseField name="output_schema" type="dict">
      JSON schema for the judge's output. The default schema has a `label` field
      (enum: match/no match) and an `explanation` field (string with citations).
    </ResponseField>

    <ResponseField name="judge_model" type="ModelOption">
      LLM model to use for judging. Uses the platform default if not specified.
    </ResponseField>

    <ResponseField name="n_rollouts_per_input" type="int" default="1">
      Number of independent judge evaluations per agent run. Used with majority
      voting or multi-reflection judge variants.
    </ResponseField>

    <ResponseField name="judge_variant" type="str" default="majority">
      Judge strategy: `"majority"` for majority voting, `"multi-reflect"` for
      multi-stage reflection.
    </ResponseField>

    <ResponseField name="prompt_templates" type="list[PromptTemplateMessage]">
      Custom prompt templates. Each has a `role` (`"system"`, `"user"`, `"assistant"`)
      and `content` string. The content can use `{rubric}`, `{agent_run}`, and
      `{output_schema}` template variables.
    </ResponseField>

    <ResponseField name="output_parsing_mode" type="str" default="xml_key">
      How to parse judge output: `"xml_key"` extracts the result from an XML tag
      (named by `response_xml_key`); `"constrained_decoding"` parses the entire
      output as JSON.
    </ResponseField>

    <ResponseField name="response_xml_key" type="str" default="response">
      XML tag name for extracting output (when using `xml_key` parsing mode).
    </ResponseField>

    <ResponseField name="output_format" type="Literal[&#x22;json&#x22;, &#x22;yaml&#x22;]" default="yaml">
      Format the judge is instructed to emit and that the SDK parses. `"yaml"`
      is the default for new rubrics; `"json"` is preserved for rubrics created
      before this field existed.
    </ResponseField>
  </Expandable>
</ParamField>
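
The `xml_key` parsing mode and the `majority` judge variant described above can be pictured with a small standalone sketch. It is an illustration under stated assumptions, not the SDK's parser: `extract_xml_key` and `majority_label` are hypothetical helpers, the payload is JSON (as with `output_format="json"`) because YAML parsing needs a third-party library, and the sample rollouts are made up.

```python
import json
import re
from collections import Counter
from typing import Any, Optional


def extract_xml_key(raw: str, key: str = "response") -> Optional[dict[str, Any]]:
    """Return the JSON object inside <key>...</key>, or None if absent or invalid."""
    pattern = rf"<{re.escape(key)}>(.*?)</{re.escape(key)}>"
    match = re.search(pattern, raw, re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(1).strip())
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def majority_label(outputs: list[dict[str, Any]]) -> Optional[str]:
    """Return the most common `label`; None when no output carries a label."""
    counts = Counter(o["label"] for o in outputs if "label" in o)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Three hypothetical rollouts for one agent run (as with n_rollouts_per_input=3).
rollouts = [
    'Reasoning... <response>{"label": "pass", "explanation": "Correct."}</response>',
    'Reasoning... <response>{"label": "fail", "explanation": "Wrong file."}</response>',
    'Reasoning... <response>{"label": "pass", "explanation": "Matches request."}</response>',
]
parsed = [p for p in (extract_xml_key(r) for r in rollouts) if p is not None]
print(majority_label(parsed))  # pass
```

Ties in this sketch resolve to whichever label was seen first; how Docent breaks ties is not stated in this reference.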

### Returns

<ResponseField name="rubric_id" type="str">
  The ID of the created rubric.
</ResponseField>

***

## Get a Rubric

```python
rubric = client.get_rubric("my-collection-id", rubric_id)
print(rubric.rubric_text)
print(rubric.output_schema)
```

### Parameters

<ParamField body="collection_id" type="str" required>
  ID of the collection.
</ParamField>

<ParamField body="rubric_id" type="str" required>
  ID of the rubric to retrieve.
</ParamField>

<ParamField body="version" type="int | None">
  Specific version number. If `None`, returns the latest version.
</ParamField>

### Returns

<ResponseField name="rubric" type="Rubric">
  The rubric configuration object.
</ResponseField>

***

## List Rubrics

```python
rubrics = client.list_rubrics("my-collection-id")
for r in rubrics:
    print(f"{r['id']}: {r.get('rubric_text', '')[:80]}")
```

### Parameters

<ParamField body="collection_id" type="str" required>
  ID of the collection.
</ParamField>

### Returns

<ResponseField name="rubrics" type="list[dict]">
  List of rubric information dictionaries.
</ResponseField>
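
Because the return value is a list of plain dictionaries, it can be post-processed with ordinary Python. The sketch below formats one summary line per rubric and tolerates a missing or `None` `rubric_text`. It assumes only the `id` and `rubric_text` keys used in the example above; `summarize` and the sample data are hypothetical.

```python
from typing import Any


def summarize(rubrics: list[dict[str, Any]], width: int = 80) -> list[str]:
    """Format each rubric dict as 'id: first line of text', truncated to `width`."""
    lines = []
    for r in rubrics:
        # `rubric_text` may be missing or None; fall back to an empty string.
        text = (r.get("rubric_text") or "").strip()
        first_line = text.splitlines()[0] if text else ""
        lines.append(f"{r.get('id', '<no id>')}: {first_line[:width]}")
    return lines


# Hypothetical data shaped like the dictionaries in the example above.
sample = [
    {"id": "rubric-1", "rubric_text": "\n    Evaluate task completion.\n    Step 1..."},
    {"id": "rubric-2", "rubric_text": None},
]
for line in summarize(sample):
    print(line)
```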

***

## Get a Judge

Download a rubric configuration and create a callable judge instance. The judge reads
LLM provider API keys from environment variables (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.).

```python
import asyncio

judge = client.get_judge("my-collection-id", rubric_id)

# Inspect the configuration
print(judge.cfg.rubric_text)
print(judge.cfg.judge_model)

# Calling the judge is async, so run it inside a coroutine
async def evaluate():
    run = client.get_agent_run("my-collection-id", "run-id-123")
    result = await judge(run)
    print(result.output)  # e.g. {"label": "pass", "explanation": "..."}
    print(result.result_type)  # e.g. ResultType.DIRECT_RESULT

asyncio.run(evaluate())
```

### Parameters

<ParamField body="collection_id" type="str" required>
  ID of the collection.
</ParamField>

<ParamField body="rubric_id" type="str" required>
  ID of the rubric/judge to retrieve.
</ParamField>

<ParamField body="version" type="int | None">
  Specific version number. If `None`, returns the latest version.
</ParamField>

### Returns

<ResponseField name="judge" type="BaseJudge">
  A callable judge instance. Use `await judge(agent_run)` to evaluate a run.

  <Expandable title="BaseJudge interface">
    <ResponseField name="cfg" type="Rubric">
      The underlying rubric configuration.
    </ResponseField>

    <ResponseField name="__call__(agent_run, *, temperature=1.0, max_new_tokens=16384, timeout=180.0)" type="async -> JudgeResult">
      Evaluate an agent run. Returns a `JudgeResult` with `output`, `result_type`,
      and `result_metadata` fields.
    </ResponseField>
  </Expandable>
</ResponseField>
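
Since the judge is an awaitable callable, several runs can be evaluated concurrently with standard `asyncio` tools. The following is a minimal sketch, not an SDK feature: `evaluate_many` is a hypothetical helper, and `fake_judge` stands in for a real judge so the example runs without API keys. With a real judge you would pass the object returned by `get_judge` and a list of agent runs.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def evaluate_many(
    judge: Callable[[Any], Awaitable[Any]],
    runs: list[Any],
    max_concurrency: int = 4,
) -> list[Any]:
    """Await `judge(run)` for every run, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(run: Any) -> Any:
        async with semaphore:
            return await judge(run)

    # gather preserves input order; exceptions are returned instead of raised,
    # so one failed evaluation does not discard the others.
    return await asyncio.gather(*(one(r) for r in runs), return_exceptions=True)


# Stand-in for a real judge (hypothetical); it always answers "pass".
async def fake_judge(run: str) -> dict[str, str]:
    await asyncio.sleep(0)
    return {"label": "pass", "explanation": f"checked {run}"}


results = asyncio.run(evaluate_many(fake_judge, ["run-1", "run-2", "run-3"]))
print([r["label"] for r in results])  # ['pass', 'pass', 'pass']
```

Capping concurrency with a semaphore is a design choice to stay within provider rate limits; pick a limit that suits your account.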

<Note>
  Running a judge locally requires the appropriate LLM provider API key set in your
  environment (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`). The required provider
  depends on the rubric's `judge_model` configuration.
</Note>
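
A quick pre-flight check can surface a missing key before any judge call is made. The sketch below is one way to do that with the standard library. `missing_api_keys` is a hypothetical helper, and which variable a given `judge_model` needs is something you supply, since this reference does not list the mapping.

```python
import os
from typing import Mapping, Sequence


def missing_api_keys(
    required: Sequence[str],
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
missing = missing_api_keys(["ANTHROPIC_API_KEY"], env={"OPENAI_API_KEY": "sk-example"})
print(missing)  # ['ANTHROPIC_API_KEY']
```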
