Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.transluce.org/llms.txt

Use this file to discover all available pages before exploring further.

DQL (Docent Query Language) lets you query agent runs, metadata, and evaluation results using a SQL-like syntax.
For the conceptual overview of DQL, see Structured queries (DQL). For the column schema of each table, see DQL schema reference.

Execute a Query

from docent import Docent

client = Docent()

result = client.execute_dql(
    "my-collection-id",
    "SELECT id, metadata_json->>'model' AS model FROM agent_runs LIMIT 10",
)

Parameters

collection_id
str
required
ID of the collection to query.
dql
str
required
The DQL query string.

Returns

result
dict
Query result.

Errors

  • ValueErrordql is empty
  • HTTPError — Invalid DQL syntax or collection not found

Convert Results to Dicts

result = client.execute_dql(
    "my-collection-id",
    "SELECT id AS run_id, metadata_json->>'model' AS model FROM agent_runs LIMIT 5",
)

rows = client.dql_result_to_dicts(result)
for row in rows:
    print(row)  # {"run_id": "abc-123", "model": "gpt-4"}

Parameters

dql_result
dict
required
A result dict returned by execute_dql.

Returns

rows
list[dict]
List of dictionaries, one per row, with column names as keys.

Common Query Patterns

Filter by metadata

result = client.execute_dql(
    collection_id,
    """
    SELECT id, metadata_json->>'score' AS score
    FROM agent_runs
    WHERE metadata_json->>'model' = 'gpt-4'
      AND CAST(metadata_json->>'score' AS DOUBLE PRECISION) > 0.8
    """
)

Join with evaluation results

result = client.execute_dql(
    collection_id,
    """
    SELECT
        ar.id,
        jr.output->>'label' AS label,
        jr.output->>'explanation' AS explanation
    FROM agent_runs ar
    JOIN judge_results jr ON ar.id = jr.agent_run_id
    WHERE jr.rubric_id = 'rubric-123'
    """
)

Count and aggregate

result = client.execute_dql(
    collection_id,
    """
    SELECT
        jr.output->>'label' AS label,
        COUNT() AS count
    FROM judge_results jr
    WHERE jr.rubric_id = 'rubric-123'
    GROUP BY jr.output->>'label'
    """
)

Allowed syntax

DQL supported keywords:
Feature
SELECT, DISTINCT, FROM, WHERE, subqueries
JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, CROSS JOIN
WITH (CTEs)
UNION [ALL], INTERSECT, EXCEPT
GROUP BY, HAVING
Aggregations (COUNT, AVG, MIN, MAX, SUM, STDDEV_POP, STDDEV_SAMP, ARRAY_AGG, STRING_AGG, JSON_AGG, JSONB_AGG, JSON_OBJECT_AGG, PERCENTILE_CONT, PERCENTILE_DISC (WITHIN GROUP))
Window functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE, PERCENT_RANK, CUME_DIST)
ORDER BY, LIMIT, OFFSET
Conditional & null helpers (CASE, COALESCE, NULLIF)
Boolean logic (AND, OR, NOT)
Comparison operators (=, !=, <, <=, >, >=, IS, IS NOT, IS DISTINCT FROM, IN, BETWEEN, LIKE, ILIKE, EXISTS, SIMILAR TO, ~, ~*, !~, !~*)
Arithmetic & math (+, -, *, /, %, POWER, ABS, SIGN, SQRT, LN, LOG, EXP, GREATEST, LEAST, FLOOR, CEIL, ROUND, RANDOM)
String helpers (SUBSTRING, LEFT, RIGHT, LENGTH, UPPER, LOWER, INITCAP, TRIM, REPLACE, SPLIT_PART, POSITION, CONCAT, CONCAT_WS, STRING_AGG)
JSON operators & functions (->, ->>, #>, #>>, @>, ?, `?, ?&, json_agg, jsonb_agg, json_object_agg`)
Date/time basics (CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, NOW(), EXTRACT, DATE_TRUNC, AT TIME ZONE)
Interval arithmetic (timestamp +/- INTERVAL, INTERVAL literals)
Construction & conversion (MAKE_DATE, MAKE_TIME, MAKE_TIMESTAMP, TO_CHAR)
Array helpers (array_cat, unnest)
Type helpers (CAST, ::)
Unsupported constructs include wildcard * in SELECT clauses (e.g., SELECT *, COUNT(*)), user-defined functions, and any DDL or DML commands.

SQL patterns

Bare DQL snippets for common tasks.

Filter by metadata field

SELECT id, name FROM agent_runs
WHERE metadata_json->>'environment' = 'prod'

Filter by date range

SELECT id, name, created_at FROM agent_runs
WHERE created_at >= NOW() - INTERVAL '7 days'
ORDER BY created_at DESC

Count by category

SELECT
  metadata_json->>'model' AS model,
  COUNT() AS run_count
FROM agent_runs
GROUP BY metadata_json->>'model'
ORDER BY run_count DESC

Check if a metadata field exists

SELECT id, name FROM agent_runs
WHERE metadata_json ? 'custom_field'

Numeric comparison on a JSON field

SELECT id, name FROM agent_runs
WHERE CAST(metadata_json->>'score' AS DOUBLE PRECISION) > 0.8

Cast JSON for aggregation

SELECT
  AVG(CAST(metadata_json->>'latency_ms' AS DOUBLE PRECISION)) AS avg_latency_ms
FROM agent_runs
WHERE metadata_json ? 'latency_ms';

Join transcripts with agent runs

SELECT
  ar.name AS run_name,
  t.name AS transcript_name
FROM agent_runs ar
JOIN transcripts t ON t.agent_run_id = ar.id

Transcript counts per group

SELECT
  tg.id AS group_id,
  tg.name AS group_name,
  COUNT(t.id) AS transcript_count
FROM transcript_groups tg
JOIN transcripts t ON t.transcript_group_id = tg.id
GROUP BY tg.id, tg.name
HAVING COUNT(t.id) > 1
ORDER BY transcript_count DESC;

Transcript coverage audit

Finds transcript groups that are marked as must_have but have no associated transcripts.
SELECT
  tg.id AS group_id,
  tg.name AS group_name,
  COUNT(t.id) AS transcript_count
FROM transcript_groups tg
LEFT JOIN transcripts t
  ON t.transcript_group_id = tg.id
  AND t.collection_id = tg.collection_id
WHERE tg.metadata_json->>'priority' = 'must_have'
GROUP BY tg.id, tg.name
HAVING COUNT(t.id) = 0
ORDER BY group_name;

Flagged judge results

SELECT
  jr.agent_run_id,
  jr.rubric_id,
  jr.result_metadata->>'label' AS label,
  jr.output->>'score' AS score
FROM judge_results jr
WHERE jr.result_metadata->>'severity' = 'high'
  AND EXISTS (
    SELECT 1
    FROM agent_runs ar
    WHERE ar.id = jr.agent_run_id
      AND ar.metadata_json->>'environment' = 'prod'
  )
ORDER BY score DESC
LIMIT 25;

Common gotchas

”column X does not exist”

  • DQL requires explicit column selection. Wildcards (*) are not supported.
  • Check the schema using client.get_dql_schema(collection_id) to see available columns.

Numeric comparisons not working as expected

JSON fields are strings by default. Cast them for numeric operations:
-- Wrong: string comparison
WHERE metadata_json->>'score' > '0.5'

-- Correct: numeric comparison
WHERE CAST(metadata_json->>'score' AS DOUBLE PRECISION) > 0.5

Query returns no results but data exists

  • Check that you’re querying the correct collection
  • Verify metadata field names are exact matches (case-sensitive)
  • Use ? operator to check if a field exists before filtering on it

Results truncated unexpectedly

DQL caps results at 10,000 rows. Use LIMIT and OFFSET for pagination:
-- First page
SELECT id, name FROM agent_runs LIMIT 1000 OFFSET 0
-- Second page
SELECT id, name FROM agent_runs LIMIT 1000 OFFSET 1000

“syntax error” on valid-looking SQL

Some SQL features aren’t supported in DQL:
  • No * wildcard in SELECT
  • No INSERT, UPDATE, DELETE
  • No user-defined functions