NeMo Gym

NeMo Gym (docs, GitHub) is a library for building reinforcement learning environments for large language models. We offer helpers that import NeMo Gym rollout exports into Docent.

Use this when

Use NeMo Gym ingestion when your source data is either:

one parsed rollout dictionary already loaded in Python
a JSONL export where each line is one NeMo Gym rollout object

Main helpers

convert_nemogym_rollout_to_agent_run(rollout) converts one parsed rollout object into one AgentRun.
convert_nemogym_jsonl_file_to_agent_runs(file_path) reads a JSONL file and converts each line into one AgentRun.

from docent.sdk.integrations import (
    convert_nemogym_jsonl_file_to_agent_runs,
    convert_nemogym_rollout_to_agent_run,
)

Example

To convert a JSONL export:

from docent.sdk.integrations import convert_nemogym_jsonl_file_to_agent_runs

agent_runs = convert_nemogym_jsonl_file_to_agent_runs("rollouts.jsonl")

To convert one rollout dictionary already loaded in memory:

from docent.sdk.integrations import convert_nemogym_rollout_to_agent_run

agent_run = convert_nemogym_rollout_to_agent_run(rollout)

After conversion, upload normally:

from docent import Docent
from docent.sdk.integrations import convert_nemogym_jsonl_file_to_agent_runs

client = Docent()
agent_runs = convert_nemogym_jsonl_file_to_agent_runs("rollouts.jsonl")
client.add_agent_runs(collection_id, agent_runs)

More on the conversion process

Each NeMo Gym rollout becomes one Docent AgentRun. At a high level, the converter:

turns the rollout input and output into a single Docent transcript
maps developer messages to Docent system messages
converts NeMo Gym function calls into Docent tool calls and tool messages
stores the rollout reward in agent_run.metadata["scores"]["reward"]
preserves extra request and response data under agent_run.metadata["source"]

The converter is strict. It expects the rollout to include responses_create_params.input, response.output, agent_ref.name, _ng_task_index, and _ng_rollout_index. responses_create_params.input can be either:

a string, which Docent converts into a single user message
an array of structured input items

If the rollout uses unsupported message shapes, unsupported content-part types, invalid tool-call wiring, or missing required fields, the converter raises ConversionError instead of trying to approximate the data.

Get Started

Ingestion

Docent Agent

Tutorials

Core Concepts

Self-Hosting

Use this when

Main helpers

Example

More on the conversion process

Get Started

Ingestion

Docent Agent

Tutorials

Core Concepts

Self-Hosting

​Use this when

​Main helpers

​Example

​More on the conversion process

Use this when

Main helpers

Example

More on the conversion process