> ## Documentation Index
> Fetch the complete documentation index at: https://docs.transluce.org/llms.txt
> Use this file to discover all available pages before exploring further.

# Structured Queries (DQL)

> A read-only SQL subset for structured, quantitative questions about your data

<Note>
  You usually don't write DQL by hand anymore. The [Docent plugin](/installation) produces [DQL steps](/analysis/dql-steps) inside an [Analysis Plan](/analysis/analysis-plans); this page is the language reference if you want to inspect or edit what your coding agent generated.
</Note>

Docent Query Language (DQL) is a read-only SQL subset for ad-hoc queries over a Docent collection. Queries can only run over a single collection by design (if you need multi-collection support, please reach out to us!)

Your coding agent (using the Docent plugin) is the easiest way to get DQL written for your collection. To write DQL by hand, see the [DQL schema reference](/sdk/dql/schema) for the column schema of each table and the [execute\_dql reference](/sdk/dql/execute) for Python SDK methods, operators, and common patterns. For filtering agent runs (`select_agent_run_ids`), see [Query Agent Runs](/sdk/agent-runs/query).

## When to use DQL

DQL is great for structured queries like getting the average reward by model, or identifying tasks where one model regressed compared to another. Ask the agent:

```text wrap theme={null}
/docent What's the average reward by model across this collection?
```

```text wrap theme={null}
/docent Which tasks had one model scoring lower than another when averaging across rollouts?
```

## A few illustrative queries

### A simple `SELECT`

```sql theme={null}
SELECT
  id,
  name,
  metadata_json->'model'->>'name' AS model_name,
  created_at
FROM agent_runs
WHERE metadata_json->>'status' = 'completed'
ORDER BY created_at DESC
LIMIT 10;
```

### Aggregating with CTEs

Per-environment success rates, normalized via a CTE.

```sql theme={null}
WITH normalized_runs AS (
  SELECT
    metadata_json->>'environment' AS environment,
    metadata_json->>'status' AS status
  FROM agent_runs
  WHERE metadata_json ? 'environment'
)
SELECT
  environment,
  COUNT() AS total_runs,
  SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed_runs,
  CAST(SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS DOUBLE PRECISION)
    / NULLIF(COUNT(), 0) AS completion_rate
FROM normalized_runs
GROUP BY environment
ORDER BY total_runs DESC;
```

### Joining runs with judge results

Pulls the most recent rubric result per run, then joins to surface the model responsible for each score.

```sql theme={null}
WITH latest_scores AS (
  SELECT
    agent_run_id,
    MAX(rubric_version) AS rubric_version
  FROM judge_results
  WHERE rubric_id = 'helpful_response_v1'
  GROUP BY agent_run_id
)
SELECT
  ar.id,
  ar.metadata_json->'model'->>'name' AS model_name,
  jr.output->>'score' AS score,
  jr.result_metadata->>'label' AS label
FROM latest_scores ls
JOIN judge_results jr
  ON jr.agent_run_id = ls.agent_run_id
  AND jr.rubric_version = ls.rubric_version
  AND jr.rubric_id = 'helpful_response_v1'
JOIN agent_runs ar ON ar.id = jr.agent_run_id
WHERE ar.metadata_json->>'environment' = 'prod'
ORDER BY CAST(jr.output->>'score' AS DOUBLE PRECISION) DESC
LIMIT 15;
```

## What DQL does and doesn't do

* **Read-only.** Only `SELECT`-style queries are permitted.
* **Single statement.** Batches and multiple statements are rejected.
* **Explicit projection.** Wildcard `*` is disallowed; list columns explicitly so downstream tooling stays predictable.
* **Single-collection scope.** A query runs against one collection at a time.
* **10,000 row cap.** Every query is capped by the server. Use `LIMIT`/`OFFSET` for pagination, or export offline for larger sets.
* **JSON performance.** Metadata is stored as JSON; heavy traversal across large collections is slower than filtering on indexed scalar columns.
* **Type awareness.** JSON paths expose a generic `json` type. Cast explicitly (e.g., `CAST(metadata_json->>'duration_ms' AS BIGINT)`) when precision matters.

## Writing DQL by hand

See the [full schema reference](/sdk/dql/schema) for the column schema of each table.