Welcome to Docent

Docent is a behavior analysis platform for agents. After you run an evaluation, Docent analyzes your traces and explains which failure modes or environment issues are driving your team’s evaluation results. Teams use Docent to:

Iterate on scaffolds. Docent returns actionable insights to inform prompt tuning, tool instructions, or orchestration logic.
Post-train models. Compare behavior across checkpoints or training steps to identify what’s driving shifts in eval results.
Build better benchmarks. Catch reward hacking, evaluation awareness, broken environments, and ambiguous task specifications.

Read about how Docent helped align Claude 4 and debug a regression between two Codex checkpoints on Terminal-Bench.

Get started

Installation

Install the Docent plugin and add your API key.

Get in touch

Join our Slack community to ask questions and chat with our team.
