This page explains how the /docent plugin handles ingestion under the hood. To ingest your data, point your coding agent at a directory containing your trajectories. For best results, sort your trajectories by format before invoking the plugin.
/docent Ingest the trajectories at <PATH_TO_DATA>

How the Docent plugin handles ingestion

Using the /docent plugin, your coding agent uploads your agent logs into Docent by writing a Python script that converts them into AgentRun format. The agent inspects your file structure, examines your trajectory format, and maps each field in your schema to a Docent object. It produces two files:
  1. ingestion-plan.md: A mapping of your trajectory fields to Docent’s data model, including any fields that will be intentionally omitted. Review this file to verify the intended organization and display of your data.
  2. ingest.py: A Python script that reads your logs and uploads them via the SDK. You can modify and rerun this as needed.
Your coding agent asks clarifying questions if your data format is ambiguous or it needs more context about how you want the data structured. If your coding agent can’t infer your format or you need fine-grained control, write the ingest script directly. See SDK ingestion.
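
To make the field-mapping step concrete, here is a minimal sketch of the conversion logic an ingest script might contain. It is not the script the plugin generates and it does not call the Docent SDK: the input keys (steps, role, content, reward, model, task_id) and the output shape (a dict with messages and metadata) are assumptions chosen for illustration. Replace them with your own schema and with the AgentRun objects described in SDK ingestion.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical metadata fields; use whichever fields matter for your analysis.
METADATA_KEYS = ("reward", "model", "task_id")


def convert_trajectory(raw: dict[str, Any]) -> dict[str, Any]:
    """Map one raw trajectory record onto a run-shaped dict.

    The input layout (a "steps" list of {"role", "content"} items) is an
    assumed example format, not a format Docent requires.
    """
    # Keep only the fields needed for display; other per-step keys are dropped.
    messages = [
        {"role": step["role"], "content": step["content"]}
        for step in raw.get("steps", [])
    ]
    # Carry over analysis-relevant fields, skipping any that are absent.
    metadata = {key: raw[key] for key in METADATA_KEYS if key in raw}
    return {"messages": messages, "metadata": metadata}


def load_runs(data_dir: Path) -> list[dict[str, Any]]:
    """Convert every *.json file under data_dir, in sorted path order."""
    return [
        convert_trajectory(json.loads(path.read_text(encoding="utf-8")))
        for path in sorted(data_dir.rglob("*.json"))
    ]
```

Fields that are not listed in METADATA_KEYS are silently omitted here, which mirrors the "intentionally omitted" fields that ingestion-plan.md asks you to review.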

Best practices

  • Ingest one trajectory format at a time. If you have multiple formats, sort your trajectories by the scaffold that generated them (e.g., openhands/, mini-swe-agent/, custom/) before invoking /docent.
  • Include metadata relevant to your analysis. Fields like reward, model name, and task ID enable downstream filtering, DQL queries, and rubric evaluation.
  • Verify your uploaded data in the web interface. After upload, check that transcripts display correctly and metadata appears where you expect. It’s normal to iterate a few times on different organization structures.
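
If you are unsure how many trajectory formats a directory contains, a quick pre-check can help before sorting. The sketch below groups JSON files by their sorted top-level keys, on the assumption that files produced by the same scaffold share the same top-level structure. Both that heuristic and the function names are illustrative and not part of the plugin.

```python
import json
from collections import defaultdict
from pathlib import Path


def format_signature(path: Path) -> tuple[str, ...]:
    """Return the sorted top-level keys of a JSON file as a format signature."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Non-object files (for example, a bare list) get a type-based signature.
        return (f"<{type(data).__name__}>",)
    return tuple(sorted(data))


def group_by_format(data_dir: Path) -> dict[tuple[str, ...], list[Path]]:
    """Group every *.json file under data_dir by its format signature."""
    groups: dict[tuple[str, ...], list[Path]] = defaultdict(list)
    for path in sorted(data_dir.rglob("*.json")):
        groups[format_signature(path)].append(path)
    return dict(groups)
```

More than one group in the result suggests the directory mixes formats and is worth splitting into subdirectories before invoking /docent.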

What’s next