Docent Query Language (DQL) is a read-only SQL subset for ad-hoc queries over a Docent collection. By design, a query runs over a single collection (if you need multi-collection support, please reach out to us!). The Docent Agent is the most ergonomic way to write DQL. To write DQL by hand, see the DQL schema reference for each table's column schema, and the execute_dql reference for Python SDK methods, operators, and common patterns. For filtering agent runs (select_agent_run_ids), see Query Agent Runs.

When to use DQL

DQL is great for structured queries like getting the average reward by model, or identifying tasks where one model regressed compared to another. Ask the agent:
/docent What's the average reward by model across this collection?
/docent Which tasks had one model scoring lower than another when averaging across rollouts?

A few illustrative queries

A simple SELECT

Fetches the ten most recent completed runs, pulling the model name out of the JSON metadata.
SELECT
  id,
  name,
  metadata_json->'model'->>'name' AS model_name,
  created_at
FROM agent_runs
WHERE metadata_json->>'status' = 'completed'
ORDER BY created_at DESC
LIMIT 10;
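The `->` operator selects a JSON field as JSON, while `->>` returns it as text, which is why the chain above ends in `->>` to get a plain string. To experiment with the same traversal locally, SQLite's `json_extract` behaves like `->>` for string leaves; this sketch uses an in-memory SQLite table with made-up data, not the Docent backend:

```python
import json
import sqlite3

# In-memory stand-in for agent_runs; metadata_json stored as a JSON string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent_runs (id TEXT, metadata_json TEXT)")
conn.execute(
    "INSERT INTO agent_runs VALUES (?, ?)",
    ("run-1", json.dumps({"model": {"name": "gpt-4o"}, "status": "completed"})),
)

# json_extract(metadata_json, '$.model.name') plays the role of
# metadata_json->'model'->>'name' in the DQL example above.
row = conn.execute(
    "SELECT id, json_extract(metadata_json, '$.model.name') "
    "FROM agent_runs WHERE json_extract(metadata_json, '$.status') = 'completed'"
).fetchone()
print(row)  # ('run-1', 'gpt-4o')
```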

Aggregating with CTEs

Per-environment success rates, normalized via a CTE.
WITH normalized_runs AS (
  SELECT
    metadata_json->>'environment' AS environment,
    metadata_json->>'status' AS status
  FROM agent_runs
  WHERE metadata_json ? 'environment'
)
SELECT
  environment,
  COUNT(*) AS total_runs,
  SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed_runs,
  CAST(SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS DOUBLE PRECISION)
    / NULLIF(COUNT(*), 0) AS completion_rate
FROM normalized_runs
GROUP BY environment
ORDER BY total_runs DESC;

Joining runs with judge results

Pulls the most recent rubric result per run, then joins to surface the model responsible for each score.
WITH latest_scores AS (
  SELECT
    agent_run_id,
    MAX(rubric_version) AS rubric_version
  FROM judge_results
  WHERE rubric_id = 'helpful_response_v1'
  GROUP BY agent_run_id
)
SELECT
  ar.id,
  ar.metadata_json->'model'->>'name' AS model_name,
  jr.output->>'score' AS score,
  jr.result_metadata->>'label' AS label
FROM latest_scores ls
JOIN judge_results jr
  ON jr.agent_run_id = ls.agent_run_id
  AND jr.rubric_version = ls.rubric_version
  AND jr.rubric_id = 'helpful_response_v1'
JOIN agent_runs ar ON ar.id = jr.agent_run_id
WHERE ar.metadata_json->>'environment' = 'prod'
ORDER BY CAST(jr.output->>'score' AS DOUBLE PRECISION) DESC
LIMIT 15;
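The shape of this query, taking MAX(rubric_version) per run and then joining back to recover the full row, is the standard "latest per group" pattern. The same two steps in plain Python over in-memory rows (illustrative data only):

```python
# Hypothetical judge_results rows: (agent_run_id, rubric_version, score).
judge_results = [
    ("run-1", 1, 0.4),
    ("run-1", 2, 0.9),   # latest version for run-1
    ("run-2", 1, 0.7),   # latest (and only) version for run-2
]

# Step 1: MAX(rubric_version) per agent_run_id (the latest_scores CTE).
latest = {}
for run_id, version, _ in judge_results:
    latest[run_id] = max(latest.get(run_id, 0), version)

# Step 2: join back, keeping only each run's latest-version row.
latest_rows = [
    (run_id, version, score)
    for run_id, version, score in judge_results
    if latest[run_id] == version
]
print(latest_rows)  # [('run-1', 2, 0.9), ('run-2', 1, 0.7)]
```

The join back is necessary because the grouping step alone discards every column except the key and the max; the second pass restores the rest of the row.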

What DQL does and doesn’t do

  • Read-only. Only SELECT-style queries are permitted.
  • Single statement. Batches and multiple statements are rejected.
  • Explicit projection. Wildcard * is disallowed; list columns explicitly so downstream tooling stays predictable.
  • Single-collection scope. A query runs against one collection at a time.
  • 10,000 row cap. Every query is capped by the server. Use LIMIT/OFFSET for pagination, or export offline for larger sets.
  • JSON performance. Metadata is stored as JSON; heavy traversal across large collections is slower than filtering on indexed scalar columns.
  • Type awareness. JSON paths expose a generic json type. Cast explicitly (e.g., CAST(metadata_json->>'duration_ms' AS BIGINT)) when precision matters.
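When a result set would exceed the row cap, page through it from the client with LIMIT/OFFSET. A minimal sketch of that loop, with the query executor stubbed over an in-memory list (in practice the call would go through the SDK, e.g. execute_dql; the run_query helper here is hypothetical):

```python
# Stand-in dataset: 25 rows, more than one "page" at page_size=10.
ALL_ROWS = [{"id": f"run-{i}"} for i in range(25)]

def run_query(limit: int, offset: int) -> list[dict]:
    # Stub: pretend this executes "... ORDER BY id LIMIT {limit} OFFSET {offset}".
    return ALL_ROWS[offset:offset + limit]

def fetch_all(page_size: int = 10) -> list[dict]:
    rows, offset = [], 0
    while True:
        page = run_query(limit=page_size, offset=offset)
        rows.extend(page)
        if len(page) < page_size:  # a short page means we've hit the end
            return rows
        offset += page_size

print(len(fetch_all()))  # 25
```

LIMIT/OFFSET paging is only deterministic with a stable sort, so always include an ORDER BY on a unique column (such as id) in the paged query.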

Writing DQL by hand

See the full schema reference for the column schema of each table.