# Metadata

Docent supports metadata at multiple levels:

* Collection metadata attached to the collection itself
* Agent run metadata attached to an [AgentRun](/concepts/agent-run)
* Transcript group metadata attached to a `TranscriptGroup`
* Transcript metadata attached to a [Transcript](/concepts/transcript)

All metadata should be JSON-serializable. When metadata is stored or rendered, Docent converts it to JSON-compatible values using Pydantic's serializer, which supports common Python collections and nested Pydantic models.
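
As a pre-flight check, it can help to confirm that a metadata dict encodes to JSON before you upload it. The sketch below uses only the standard library `json` module. `is_json_serializable` is a hypothetical helper, not part of Docent, and it is stricter than Pydantic's serializer, so it may reject values (such as sets or nested Pydantic models) that Docent would accept.

```python
import json
from typing import Any


def is_json_serializable(metadata: dict[str, Any]) -> bool:
    """Return True if `metadata` can be encoded by the standard json module.

    Hypothetical helper for local checks; Docent itself relies on Pydantic.
    """
    try:
        json.dumps(metadata)
    except (TypeError, ValueError):
        return False
    return True


ok = is_json_serializable({"dataset": "helpdesk_jan_2026", "config": {"model": "gpt-5"}})
bad = is_json_serializable({"tags": {"a", "b"}})  # json.dumps cannot encode a set
```

A failed check does not necessarily mean Docent will reject the value; it only flags fields worth converting to plain lists, dicts, strings, and numbers if you want predictable output.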

## Choosing a metadata level

* Use collection metadata for information shared by the entire collection, such as dataset provenance, eval configuration, environment, or model family.
* Use agent run metadata for values that vary run to run, especially scores or other fields you want to analyze across a collection.
* Use transcript group or transcript metadata for finer-grained context within a single run.

## Collection metadata

Collection metadata lives on the collection rather than on individual runs. It is a good fit for collection-wide configuration and provenance.

### From the Python SDK

```python theme={null}
from docent import Docent

client = Docent()
collection_id = "..."

# Merge a patch into the existing collection metadata.
client.update_collection_metadata(
    collection_id,
    {
        "dataset": "helpdesk_jan_2026",
        "config": {
            "model": "gpt-5",
            "prompt_version": "v3",
        },
    },
)

# Read back the current collection metadata.
metadata = client.get_collection_metadata(collection_id)

# Delete a nested key using a dot path.
metadata_after_delete, not_found = client.delete_collection_metadata_keys(
    collection_id,
    ["config.prompt_version"],
)
```

Updates are deep-merged into the existing collection metadata, so patching `config.model` does not remove unrelated keys under `config`. Deletions support dot paths for nested keys.
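
To make the merge and delete semantics concrete, here is a minimal local sketch in plain Python. It illustrates the behavior described above and is not Docent's implementation: `deep_merge` and `delete_dot_paths` are hypothetical helpers, the `"gpt-5-mini"` value is a placeholder, and how Docent treats edge cases such as lists or non-dict intermediate values is not specified here.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `base` with `patch` merged in recursively."""
    merged = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge instead of overwriting.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def delete_dot_paths(
    metadata: dict[str, Any], paths: list[str]
) -> tuple[dict[str, Any], list[str]]:
    """Return a copy of `metadata` without `paths`, plus the paths not found."""
    result = deepcopy(metadata)
    not_found = []
    for path in paths:
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            # Walk down one level; stop at None if the path does not exist.
            node = node.get(part) if isinstance(node, dict) else None
        if isinstance(node, dict) and leaf in node:
            del node[leaf]
        else:
            not_found.append(path)
    return result, not_found


existing = {
    "dataset": "helpdesk_jan_2026",
    "config": {"model": "gpt-5", "prompt_version": "v3"},
}
# Patching config.model leaves config.prompt_version in place.
patched = deep_merge(existing, {"config": {"model": "gpt-5-mini"}})
# "config.seed" does not exist, so it is reported back instead of raising.
trimmed, missing = delete_dot_paths(patched, ["config.prompt_version", "config.seed"])
```

Returning the missing paths rather than raising mirrors the shape of the `(metadata_after_delete, not_found)` pair in the SDK example, though the exact contents of `not_found` in Docent are an assumption here.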

### From tracing

```python theme={null}
from docent.trace import collection_metadata, initialize_tracing

initialize_tracing("customer-support-evals")

collection_metadata(
    {
        "dataset": "helpdesk_jan_2026",
        "environment": "staging",
        "config": {
            "model": "gpt-5",
            "prompt_version": "v3",
        },
    }
)
```

You can call `collection_metadata()` any time after `initialize_tracing()`. Unlike `agent_run_metadata()`, it does not require an active agent run or transcript context.

## Agent run, transcript group, and transcript metadata

We recommend including metrics and scores in metadata, along with other information about the agent or task setup.

Scoring fields are useful for tracking metrics, like task completion or reward, but they are a convention rather than a required schema. Neither `AgentRun` nor `Transcript` enforces required metadata keys.

Here's an example of what a typical agent run metadata dict might look like:

```python theme={null}
metadata = {
    # Optional conventional fields
    "scores": {"reward_1": 0.1, "reward_2": 0.5, "reward_3": 0.8},
    # Custom fields
    "episode": 42,
    "policy_version": "v1.2.3",
    "training_step": 12500,
}
```
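
Because scores live in plain dicts, they are straightforward to aggregate once you have the metadata for several runs. A minimal sketch, assuming each run follows the `scores` convention above: `mean_scores` is a hypothetical helper, and the `runs` list stands in for metadata you have already collected, not for any Docent API.

```python
from collections import defaultdict
from statistics import mean
from typing import Any


def mean_scores(run_metadata: list[dict[str, Any]]) -> dict[str, float]:
    """Average each named score over the runs that report it."""
    values = defaultdict(list)
    for metadata in run_metadata:
        # "scores" is a convention, not a required key, so tolerate its absence.
        for name, score in metadata.get("scores", {}).items():
            values[name].append(score)
    return {name: mean(scores) for name, scores in values.items()}


runs = [
    {"scores": {"reward_1": 0.0, "reward_2": 0.5}, "episode": 41},
    {"scores": {"reward_1": 1.0}, "episode": 42},
    {"episode": 43},  # no scores recorded for this run
]
averages = mean_scores(runs)
```

Each score is averaged only over the runs that include it, so a run without `scores` neither raises an error nor drags the mean toward zero.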

If you're using Inspect, the `docent.loaders.load_inspect` module provides a `load_inspect_log` function that reads the standard scoring and metadata information from Inspect logs and copies it into Docent metadata.
