We no longer recommend authoring rubrics by hand. The Docent plugin generates Reading steps inside an Analysis Plan for you. This SDK reference is kept for users with existing rubrics.
Rubrics define evaluation criteria for agent runs. A judge is an LLM configured to evaluate runs against a rubric. See Rubrics and Judges for concepts.

Create a Rubric

from docent import Docent
from docent.judges.types import Rubric

client = Docent()

rubric = Rubric(
    rubric_text="""
    Evaluate whether the agent successfully completed the user's request.

    Decision procedure:
    1. Identify what the user asked for
    2. Check if the agent's final response addresses the request
    3. Verify the response is factually correct
    """,
    output_schema={
        "type": "object",
        "properties": {
            "label": {"type": "string", "enum": ["pass", "fail"]},
            "explanation": {"type": "string", "citations": True},
        },
        "required": ["label", "explanation"],
    },
)

rubric_id = client.create_rubric("my-collection-id", rubric)
print(rubric_id)

Parameters

collection_id
str
required
ID of the collection.
rubric
Rubric
required
The rubric configuration. Must have version=1 for new rubrics.

Returns

rubric_id
str
The ID of the created rubric.
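
The output_schema in the example above resembles JSON Schema, so a judge's output can be sanity-checked against it before it is used downstream. The following is a minimal sketch, not Docent's own validation: `schema_violations` is a hypothetical helper that only understands required keys, string types, and enums, and it ignores the `citations` key.

```python
# Same schema as the rubric example above.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["pass", "fail"]},
        "explanation": {"type": "string", "citations": True},
    },
    "required": ["label", "explanation"],
}


def schema_violations(output: dict, schema: dict) -> list[str]:
    """Return human-readable problems; an empty list means the output fits.

    Hypothetical helper: covers required keys, string types, and enums only.
    """
    problems = []
    # Every key listed under "required" must be present.
    for key in schema.get("required", []):
        if key not in output:
            problems.append(f"missing required field: {key}")
    # Check the type and allowed values of each field that is present.
    for key, spec in schema.get("properties", {}).items():
        if key not in output:
            continue
        value = output[key]
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{key} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{key} must be one of {spec['enum']}")
    return problems


good = {"label": "pass", "explanation": "The final answer matches the request."}
bad = {"label": "maybe"}
print(schema_violations(good, OUTPUT_SCHEMA))  # []
print(schema_violations(bad, OUTPUT_SCHEMA))   # two problems: missing field, bad label
```

For full JSON Schema coverage, a dedicated validator library is likely a better fit than a hand-rolled check like this one.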

Get a Rubric

rubric = client.get_rubric("my-collection-id", rubric_id)
print(rubric.rubric_text)
print(rubric.output_schema)

Parameters

collection_id
str
required
ID of the collection.
rubric_id
str
required
ID of the rubric to retrieve.
version
int | None
Specific version number. If None, returns the latest version.

Returns

rubric
Rubric
The rubric configuration object.
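
Because get_rubric accepts a version, two versions of the same rubric can be fetched and compared. The sketch below assumes version can be passed as a keyword argument and that each returned object exposes rubric_text as a string, as in the example above. `diff_rubric_versions` is a hypothetical helper, not part of the SDK.

```python
import difflib


def diff_rubric_versions(client, collection_id: str, rubric_id: str,
                         old: int, new: int) -> str:
    """Return a unified diff of rubric_text between two versions of one rubric.

    Hypothetical helper; `client` is expected to behave like the Docent
    client shown above (get_rubric with an optional version argument).
    """
    before = client.get_rubric(collection_id, rubric_id, version=old)
    after = client.get_rubric(collection_id, rubric_id, version=new)
    diff = difflib.unified_diff(
        before.rubric_text.splitlines(),
        after.rubric_text.splitlines(),
        fromfile=f"version {old}",
        tofile=f"version {new}",
        lineterm="",
    )
    # An empty string means the two versions have identical text.
    return "\n".join(diff)
```

A call such as `print(diff_rubric_versions(client, "my-collection-id", rubric_id, 1, 2))` would then show what changed between version 1 and version 2, provided both versions exist.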

List Rubrics

rubrics = client.list_rubrics("my-collection-id")
for r in rubrics:
    # Fall back to an empty string if rubric_text is missing or None
    print(f"{r['id']}: {(r.get('rubric_text') or '')[:80]}")

Parameters

collection_id
str
required
ID of the collection.

Returns

rubrics
list[dict]
List of rubric information dictionaries.
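
Since list_rubrics returns plain dictionaries, ordinary Python is enough to search them. The sketch below assumes each dictionary carries the `id` and `rubric_text` keys used in the example above. `find_rubrics` is a hypothetical helper, and the sample data is hand-written rather than real API output.

```python
def find_rubrics(rubrics: list[dict], keyword: str) -> list[str]:
    """Return IDs of rubrics whose rubric_text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        r["id"]
        for r in rubrics
        # Treat a missing or None rubric_text as empty text.
        if needle in (r.get("rubric_text") or "").lower()
    ]


# Hand-written sample standing in for client.list_rubrics("my-collection-id").
sample = [
    {"id": "rubric-a", "rubric_text": "Evaluate whether the agent completed the request."},
    {"id": "rubric-b", "rubric_text": "Check for unsafe tool calls."},
    {"id": "rubric-c"},
]
print(find_rubrics(sample, "agent"))  # ['rubric-a']
```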

Get a Judge

Download a rubric configuration and create a callable judge instance. The judge reads LLM provider API keys from environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).

judge = client.get_judge("my-collection-id", rubric_id)

# Inspect the configuration
print(judge.cfg.rubric_text)
print(judge.cfg.judge_model)

# Run locally (async)
import asyncio

async def evaluate():
    run = client.get_agent_run("my-collection-id", "run-id-123")
    result = await judge(run)
    print(result.output)  # {"label": "pass", "explanation": "..."}
    print(result.result_type)  # ResultType.DIRECT_RESULT

asyncio.run(evaluate())
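
Because the judge is awaited, several runs can be evaluated concurrently. The sketch below uses only the standard library and assumes nothing about the judge beyond `await judge(run)`. `judge_many` and `max_concurrency` are hypothetical names, and the default of 4 is an arbitrary starting point, not a recommended setting.

```python
import asyncio


async def judge_many(judge, runs, max_concurrency: int = 4):
    """Evaluate runs concurrently, with at most max_concurrency in flight.

    Results are returned in the same order as runs.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def judge_one(run):
        # The semaphore caps how many judge calls execute at once.
        async with semaphore:
            return await judge(run)

    return await asyncio.gather(*(judge_one(run) for run in runs))
```

With a judge and a list of agent runs already loaded, `results = asyncio.run(judge_many(judge, runs))` would evaluate them all. Keeping the concurrency limit low may help stay within the LLM provider's rate limits.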

Parameters

collection_id
str
required
ID of the collection.
rubric_id
str
required
ID of the rubric/judge to retrieve.
version
int | None
Specific version number. If None, returns the latest version.

Returns

judge
BaseJudge
A callable judge instance. Use await judge(agent_run) to evaluate a run.

Running a judge locally requires the appropriate LLM provider API key set in your environment (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY). The required provider depends on the rubric’s judge_model configuration.
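
A quick check before running a judge can surface a missing key early. This is a minimal sketch: `missing_env_vars` is a hypothetical helper, and which variable is needed depends on the provider behind judge_model, so the list passed in should be adjusted accordingly.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in required that are unset or empty.

    environ defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Only the key for the provider behind judge_model is needed; both variable
# names from the note above are listed here purely for illustration.
missing = missing_env_vars(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
if missing:
    print("Not set:", ", ".join(missing))
```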