> ## Documentation Index
> Fetch the complete documentation index at: https://docs.transluce.org/llms.txt
> Use this file to discover all available pages before exploring further.

# Execute DQL Queries

> Run queries against your data with the Docent Query Language

DQL (Docent Query Language) lets you query agent runs, metadata, and evaluation results using a SQL-like syntax.

<Note>
  For the conceptual overview of DQL, see [Structured queries (DQL)](/analysis/dql). For the column schema of each table, see [DQL schema reference](/sdk/dql/schema).
</Note>

## Execute a Query

```python theme={null}
from docent import Docent

client = Docent()

result = client.execute_dql(
    "my-collection-id",
    "SELECT id, metadata_json->>'model' AS model FROM agent_runs LIMIT 10",
)
```

### Parameters

<ParamField body="collection_id" type="str" required>
  ID of the collection to query.
</ParamField>

<ParamField body="dql" type="str" required>
  The DQL query string.
</ParamField>

<ParamField body="reading_plan_id" type="str | None" default="None">
  Optional reading plan ID for `$alias` substitution in queries that reference reading step aliases.
</ParamField>

<ParamField body="source" type="Literal[&#x22;endpoint&#x22;, &#x22;mcp&#x22;]" default="&#x22;endpoint&#x22;">
  Analytics source for the execution path. Defaults to `"endpoint"`; the MCP server sets `"mcp"` automatically. You typically do not need to set this.
</ParamField>

<ParamField body="ensure_latest_results" type="bool" default="False">
  When `True`, the server refuses the optimized parquet path unless the parquet replica reflects every committed mutation, and runs the query against Postgres otherwise. Use this when you cannot tolerate parquet lag — for example, immediately after adding or deleting agent runs, or when verifying a write.
</ParamField>

### Returns

<ResponseField name="result" type="dict">
  Query result.

  <Expandable title="Fields">
    <ResponseField name="columns" type="list[str]">Column names.</ResponseField>
    <ResponseField name="rows" type="list[list]">Row data, where each row is a list of values matching the column order.</ResponseField>
    <ResponseField name="truncated" type="bool">Whether the result was truncated by the server limit.</ResponseField>
    <ResponseField name="applied_limit" type="int | None">The limit that was applied.</ResponseField>
  </Expandable>
</ResponseField>

### Errors

* **`ValueError`** — `dql` is empty
* **`HTTPError`** — Invalid DQL syntax or collection not found

***

## Convert Results to Dicts

```python theme={null}
result = client.execute_dql(
    "my-collection-id",
    "SELECT id AS run_id, metadata_json->>'model' AS model FROM agent_runs LIMIT 5",
)

rows = client.dql_result_to_dicts(result)
for row in rows:
    print(row)  # {"run_id": "abc-123", "model": "gpt-4"}
```

### Parameters

<ParamField body="dql_result" type="dict" required>
  A result dict returned by `execute_dql`.
</ParamField>

### Returns

<ResponseField name="rows" type="list[dict]">
  List of dictionaries, one per row, with column names as keys.
</ResponseField>

***

## Common Query Patterns

### Filter by metadata

```python theme={null}
result = client.execute_dql(
    collection_id,
    """
    SELECT id, metadata_json->>'score' AS score
    FROM agent_runs
    WHERE metadata_json->>'model' = 'gpt-4'
      AND CAST(metadata_json->>'score' AS DOUBLE PRECISION) > 0.8
    """
)
```

### Join with evaluation results

```python theme={null}
result = client.execute_dql(
    collection_id,
    """
    SELECT
        ar.id,
        jr.output->>'label' AS label,
        jr.output->>'explanation' AS explanation
    FROM agent_runs ar
    JOIN judge_results jr ON ar.id = jr.agent_run_id
    WHERE jr.rubric_id = 'rubric-123'
    """
)
```

### Count and aggregate

```python theme={null}
result = client.execute_dql(
    collection_id,
    """
    SELECT
        jr.output->>'label' AS label,
        COUNT() AS count
    FROM judge_results jr
    WHERE jr.rubric_id = 'rubric-123'
    GROUP BY jr.output->>'label'
    """
)
```

***

## Allowed syntax

DQL supported keywords:

| Feature                                                                                                                                                                                                       |                                                                                                                   |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| `SELECT`, `DISTINCT`, `FROM`, `WHERE`, subqueries                                                                                                                                                             |                                                                                                                   |
| `JOIN`, `LEFT JOIN`, `RIGHT JOIN`, `FULL JOIN`, `CROSS JOIN`                                                                                                                                                  |                                                                                                                   |
| `WITH` (CTEs)                                                                                                                                                                                                 |                                                                                                                   |
| `UNION [ALL]`, `INTERSECT`, `EXCEPT`                                                                                                                                                                          |                                                                                                                   |
| `GROUP BY`, `HAVING`                                                                                                                                                                                          |                                                                                                                   |
| Aggregations (`COUNT`, `AVG`, `MIN`, `MAX`, `SUM`, `STDDEV_POP`, `STDDEV_SAMP`, `ARRAY_AGG`, `STRING_AGG`, `JSON_AGG`, `JSONB_AGG`, `JSON_OBJECT_AGG`, `PERCENTILE_CONT`, `PERCENTILE_DISC` (`WITHIN GROUP`)) |                                                                                                                   |
| Window functions (`ROW_NUMBER`, `RANK`, `DENSE_RANK`, `NTILE`, `LAG`, `LEAD`, `FIRST_VALUE`, `LAST_VALUE`, `NTH_VALUE`, `PERCENT_RANK`, `CUME_DIST`)                                                          |                                                                                                                   |
| `ORDER BY`, `LIMIT`, `OFFSET`                                                                                                                                                                                 |                                                                                                                   |
| Conditional & null helpers (`CASE`, `COALESCE`, `NULLIF`)                                                                                                                                                     |                                                                                                                   |
| Boolean logic (`AND`, `OR`, `NOT`)                                                                                                                                                                            |                                                                                                                   |
| Comparison operators (`=`, `!=`, `<`, `<=`, `>`, `>=`, `IS`, `IS NOT`, `IS DISTINCT FROM`, `IN`, `BETWEEN`, `LIKE`, `ILIKE`, `EXISTS`, `SIMILAR TO`, `~`, `~*`, `!~`, `!~*`)                                  |                                                                                                                   |
| Arithmetic & math (`+`, `-`, `*`, `/`, `%`, `POWER`, `ABS`, `SIGN`, `SQRT`, `LN`, `LOG`, `EXP`, `GREATEST`, `LEAST`, `FLOOR`, `CEIL`, `ROUND`, `RANDOM`)                                                      |                                                                                                                   |
| String helpers (`SUBSTRING`, `LEFT`, `RIGHT`, `LENGTH`, `UPPER`, `LOWER`, `INITCAP`, `TRIM`, `REPLACE`, `SPLIT_PART`, `POSITION`, `CONCAT`, `CONCAT_WS`, `STRING_AGG`)                                        |                                                                                                                   |
| JSON operators & functions (`->`, `->>`, `#>`, `#>>`, `@>`, `?`, \`?                                                                                                                                          | `, `?&`, `jsonb\_array\_length`, `json\_agg`, `jsonb\_agg`, `json\_object\_agg`, `convert\_from`, `convert\_to\`) |
| Date/time basics (`CURRENT_DATE`, `CURRENT_TIME`, `CURRENT_TIMESTAMP`, `NOW()`, `EXTRACT`, `DATE_TRUNC`, `AT TIME ZONE`)                                                                                      |                                                                                                                   |
| Interval arithmetic (`timestamp +/- INTERVAL`, `INTERVAL` literals)                                                                                                                                           |                                                                                                                   |
| Construction & conversion (`MAKE_DATE`, `MAKE_TIME`, `MAKE_TIMESTAMP`, `TO_CHAR`)                                                                                                                             |                                                                                                                   |
| Array helpers (`array_cat`, `unnest`)                                                                                                                                                                         |                                                                                                                   |
| Type helpers (`CAST`, `::`)                                                                                                                                                                                   |                                                                                                                   |

Unsupported constructs include wildcard `*` in SELECT clauses (e.g., `SELECT *`, `COUNT(*)`), user-defined functions, and any DDL or DML commands.

***

## SQL patterns

Bare DQL snippets for common tasks.

### Filter by metadata field

```sql theme={null}
SELECT id, name FROM agent_runs
WHERE metadata_json->>'environment' = 'prod'
```

### Filter by date range

```sql theme={null}
SELECT id, name, created_at FROM agent_runs
WHERE created_at >= NOW() - INTERVAL '7 days'
ORDER BY created_at DESC
```

### Count by category

```sql theme={null}
SELECT
  metadata_json->>'model' AS model,
  COUNT() AS run_count
FROM agent_runs
GROUP BY metadata_json->>'model'
ORDER BY run_count DESC
```

### Check if a metadata field exists

```sql theme={null}
SELECT id, name FROM agent_runs
WHERE metadata_json ? 'custom_field'
```

### Numeric comparison on a JSON field

```sql theme={null}
SELECT id, name FROM agent_runs
WHERE CAST(metadata_json->>'score' AS DOUBLE PRECISION) > 0.8
```

### Cast JSON for aggregation

```sql theme={null}
SELECT
  AVG(CAST(metadata_json->>'latency_ms' AS DOUBLE PRECISION)) AS avg_latency_ms
FROM agent_runs
WHERE metadata_json ? 'latency_ms';
```

### Counting transcript messages

`transcripts.messages` is stored as `bytea` (UTF-8 JSON), not `jsonb`. Operators like `messages -> 0` or `jsonb_array_length(messages)` raise `operator does not exist: bytea -> integer`. Decode to `jsonb` first, then count array elements:

```sql theme={null}
jsonb_array_length(convert_from(messages, 'UTF8')::jsonb)
```

The same applies to `transcripts.metadata_json` — decode with `convert_from(metadata_json, 'UTF8')::jsonb` before using JSON operators.

**Agent runs with at least N messages in any transcript:**

```sql theme={null}
SELECT DISTINCT ar.id AS agent_run_id
FROM agent_runs ar
JOIN transcripts t ON t.agent_run_id = ar.id
WHERE jsonb_array_length(convert_from(t.messages, 'UTF8')::jsonb) >= 10;
```

**Per-transcript message counts** (compute once in a subquery, then filter):

```sql theme={null}
SELECT
  transcript_id,
  agent_run_id,
  message_count
FROM (
  SELECT
    t.id AS transcript_id,
    t.agent_run_id,
    jsonb_array_length(convert_from(t.messages, 'UTF8')::jsonb) AS message_count
  FROM transcripts t
) AS counted
WHERE message_count >= 10
ORDER BY message_count DESC;
```

**Reading nested `transcripts.metadata_json` fields:**

```sql theme={null}
SELECT
  id,
  meta->'conversation'->>'speaker' AS speaker,
  meta->'conversation'->>'topic' AS topic
FROM (
  SELECT
    id,
    convert_from(metadata_json, 'UTF8')::jsonb AS meta
  FROM transcripts
) AS t
WHERE meta->>'status' = 'flagged';
```

<Tip>
  Express filters like "≥10 messages" with the pattern above. Don't materialize matching IDs elsewhere and paste them into a giant `WHERE id IN ('…', '…', …)` clause — that blows past query size limits and is hard to maintain.
</Tip>

### Join transcripts with agent runs

```sql theme={null}
SELECT
  ar.name AS run_name,
  t.name AS transcript_name
FROM agent_runs ar
JOIN transcripts t ON t.agent_run_id = ar.id
```

### Transcript counts per group

```sql theme={null}
SELECT
  tg.id AS group_id,
  tg.name AS group_name,
  COUNT(t.id) AS transcript_count
FROM transcript_groups tg
JOIN transcripts t ON t.transcript_group_id = tg.id
GROUP BY tg.id, tg.name
HAVING COUNT(t.id) > 1
ORDER BY transcript_count DESC;
```

### Transcript coverage audit

Finds transcript groups that are marked as `must_have` but have no associated transcripts.

```sql theme={null}
SELECT
  tg.id AS group_id,
  tg.name AS group_name,
  COUNT(t.id) AS transcript_count
FROM transcript_groups tg
LEFT JOIN transcripts t
  ON t.transcript_group_id = tg.id
  AND t.collection_id = tg.collection_id
WHERE tg.metadata_json->>'priority' = 'must_have'
GROUP BY tg.id, tg.name
HAVING COUNT(t.id) = 0
ORDER BY group_name;
```

### Flagged judge results

```sql theme={null}
SELECT
  jr.agent_run_id,
  jr.rubric_id,
  jr.result_metadata->>'label' AS label,
  jr.output->>'score' AS score
FROM judge_results jr
WHERE jr.result_metadata->>'severity' = 'high'
  AND EXISTS (
    SELECT 1
    FROM agent_runs ar
    WHERE ar.id = jr.agent_run_id
      AND ar.metadata_json->>'environment' = 'prod'
  )
ORDER BY score DESC
LIMIT 25;
```

***

## Common gotchas

### "column X does not exist"

* DQL requires explicit column selection. Wildcards (`*`) are not supported.
* Check the schema using `client.get_dql_schema(collection_id)` to see available columns.

### Numeric comparisons not working as expected

JSON fields are strings by default. Cast them for numeric operations:

```sql theme={null}
-- Wrong: string comparison
WHERE metadata_json->>'score' > '0.5'

-- Correct: numeric comparison
WHERE CAST(metadata_json->>'score' AS DOUBLE PRECISION) > 0.5
```

### Query returns no results but data exists

* Check that you're querying the correct collection
* Verify metadata field names are exact matches (case-sensitive)
* Use `?` operator to check if a field exists before filtering on it

### Results truncated unexpectedly

DQL caps results at 10,000 rows. Use `LIMIT` and `OFFSET` for pagination:

```sql theme={null}
-- First page
SELECT id, name FROM agent_runs LIMIT 1000 OFFSET 0
-- Second page
SELECT id, name FROM agent_runs LIMIT 1000 OFFSET 1000
```

### "syntax error" on valid-looking SQL

Some SQL features aren't supported in DQL:

* No `*` wildcard in SELECT
* No `INSERT`, `UPDATE`, `DELETE`
* No user-defined functions