
LLM providers and calls

Docent uses a unified interface to call and aggregate results from different LLM providers.

Provider registry

Each LLM provider is specified through a [ProviderConfig][docent_core._llm_util.providers.registry.ProviderConfig] object, which requires three functions:

  • async_client_getter: Returns an async client for the provider
  • single_output_getter: Gets a single completion from the provider, compatible with the [SingleOutputGetter][docent_core._llm_util.providers.registry.SingleOutputGetter] protocol
  • single_streaming_output_getter: Gets a streaming completion from the provider, compatible with the [SingleStreamingOutputGetter][docent_core._llm_util.providers.registry.SingleStreamingOutputGetter] protocol

We currently support anthropic, google, openai, azure_openai, and openrouter.
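
For example, the anthropic entry of the [PROVIDERS][docent_core._llm_util.providers.registry.PROVIDERS] registry (reproduced in full later on this page) bundles these three functions:

"anthropic": ProviderConfig(
    async_client_getter=get_anthropic_client_async,
    single_output_getter=get_anthropic_chat_completion_async,
    single_streaming_output_getter=get_anthropic_chat_completion_streaming_async,
),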

Adding a new provider

  1. Create a new module in docent_core/_llm_util/providers/ (e.g., my_provider.py)
  2. Implement the functions required by ProviderConfig
  3. Add the provider to the [PROVIDERS][docent_core._llm_util.providers.registry.PROVIDERS] dictionary in registry.py
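
A rough sketch of such a module is below. The SDK client, request/response handling, and the quoted type annotations (whose import paths are omitted) are assumptions; only the signatures follow the SingleOutputGetter protocol documented later on this page.

# my_provider.py — hypothetical sketch; client construction and imports are assumptions
from typing import Any, Literal

def get_my_provider_client_async(api_key: str | None) -> Any:
    # Matches the async_client_getter shape: Callable[[str | None], Any]
    return MyProviderAsyncClient(api_key=api_key)  # placeholder SDK client

async def get_my_provider_chat_completion_async(
    client: Any,
    messages: "list[ChatMessage]",
    model_name: str,
    *,
    tools: "list[ToolInfo] | None",
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> "LLMOutput":
    # Convert Docent's ChatMessages to the provider's request format, call the API
    # with the given limits, and normalize the raw response into an LLMOutput.
    ...

# A streaming variant with the SingleStreamingOutputGetter signature goes alongside it;
# the provider is then registered in the PROVIDERS dictionary:
#   "my_provider": ProviderConfig(
#       async_client_getter=get_my_provider_client_async,
#       single_output_getter=get_my_provider_chat_completion_async,
#       single_streaming_output_getter=get_my_provider_chat_completion_streaming_async,
#   ),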

Selecting models for Docent functions

Docent uses a preference system to determine which LLM models to use for each function. [ProviderPreferences][docent_core._llm_util.providers.preferences.ProviderPreferences] maps each Docent function to an ordered list of [ModelOption][docent_core._llm_util.providers.preferences.ModelOption] objects:

@cached_property
def function_name(self) -> list[ModelOption]:
    """Get model options for the function_name function.

    Returns:
        List of configured model options for this function.
    """
    return [
        ModelOption(
            provider="anthropic",
            model_name="claude-sonnet-4-20250514",
            reasoning_effort="medium"  # only for reasoning models
        ),
        ModelOption(
            provider="openai",
            model_name="o1",
            reasoning_effort="medium"
        ),
    ]

Any function that calls an LLM API must have a corresponding cached property in ProviderPreferences that returns its ModelOption preferences. LLMManager tries the first ModelOption and falls back to the subsequent ones if it fails.

Usage

To customize which models are used for a specific function:

  1. Locate docent_core/_llm_util/providers/preferences.py
  2. Find (or add) the cached property for the function you want to customize
  3. Specify the [ModelOption][docent_core._llm_util.providers.preferences.ModelOption] objects in the returned list
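
For instance, a hedged sketch of such a property, where the function name summarize_run and the chosen models are purely illustrative:

@cached_property
def summarize_run(self) -> list[ModelOption]:
    """Model preferences for a hypothetical summarize_run function."""
    return [
        # Tried first by LLMManager
        ModelOption(
            provider="anthropic",
            model_name="claude-sonnet-4-20250514",
            reasoning_effort="medium",
        ),
        # Fallback if the first option fails
        ModelOption(
            provider="openai",
            model_name="gpt-5-mini",
            reasoning_effort="low",
        ),
    ]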

docent._llm_util.providers.provider_registry

Registry for LLM providers with their configurations.

PROVIDERS module-attribute

PROVIDERS: dict[str, ProviderConfig] = {
    'anthropic': ProviderConfig(
        async_client_getter=get_anthropic_client_async,
        single_output_getter=get_anthropic_chat_completion_async,
        single_streaming_output_getter=get_anthropic_chat_completion_streaming_async,
    ),
    'google': ProviderConfig(
        async_client_getter=get_google_client_async,
        single_output_getter=get_google_chat_completion_async,
        single_streaming_output_getter=get_google_chat_completion_streaming_async,
    ),
    'openai': ProviderConfig(
        async_client_getter=get_openai_client_async,
        single_output_getter=get_openai_chat_completion_async,
        single_streaming_output_getter=get_openai_chat_completion_streaming_async,
    ),
    'azure_openai': ProviderConfig(
        async_client_getter=get_azure_openai_client_async,
        single_output_getter=get_openai_chat_completion_async,
        single_streaming_output_getter=get_openai_chat_completion_streaming_async,
    ),
    'openrouter': ProviderConfig(
        async_client_getter=get_openrouter_client_async,
        single_output_getter=get_openrouter_chat_completion_async,
        single_streaming_output_getter=get_openrouter_chat_completion_streaming_async,
    ),
}

Registry of supported LLM providers with their respective configurations.
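
A hedged usage sketch of the registry: messages is assumed to be an already-built list[ChatMessage], and passing None to the client getter is assumed to fall back to a default API key. In practice, Docent's LLM manager drives these calls.

config = PROVIDERS["openai"]
client = config["async_client_getter"](None)  # None assumed to mean "use the default API key"
output = await config["single_output_getter"](
    client,
    messages,           # assumed: a prepared list[ChatMessage]
    "gpt-5-mini",       # illustrative model name
    tools=None,
    tool_choice=None,
    max_new_tokens=1024,
    temperature=0.0,
    reasoning_effort=None,
    logprobs=False,
    top_logprobs=None,
    timeout=60.0,
)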

SingleOutputGetter

Bases: Protocol

Protocol for getting non-streaming output from an LLM.

Defines the interface for async functions that retrieve a single non-streaming response from an LLM provider.

Source code in docent/_llm_util/providers/provider_registry.py
class SingleOutputGetter(Protocol):
    """Protocol for getting non-streaming output from an LLM.

    Defines the interface for async functions that retrieve a single
    non-streaming response from an LLM provider.
    """

    async def __call__(
        self,
        client: Any,
        messages: list[ChatMessage],
        model_name: str,
        *,
        tools: list[ToolInfo] | None,
        tool_choice: Literal["auto", "required"] | None,
        max_new_tokens: int,
        temperature: float,
        reasoning_effort: Literal["low", "medium", "high"] | None,
        logprobs: bool,
        top_logprobs: int | None,
        timeout: float,
    ) -> LLMOutput:
        """Get a single completion from an LLM.

        Args:
            client: The provider-specific client instance.
            messages: The list of messages in the conversation.
            model_name: The name of the model to use.
            tools: Optional list of tools available to the model.
            tool_choice: Optional specification for tool usage.
            max_new_tokens: Maximum number of tokens to generate.
            temperature: Controls randomness in output generation.
            reasoning_effort: Optional control for model reasoning depth.
            logprobs: Whether to return log probabilities.
            top_logprobs: Number of most likely tokens to return probabilities for.
            timeout: Maximum time to wait for a response in seconds.

        Returns:
            LLMOutput: The model's response.
        """
        ...

__call__ async

__call__(client: Any, messages: list[ChatMessage], model_name: str, *, tools: list[ToolInfo] | None, tool_choice: Literal['auto', 'required'] | None, max_new_tokens: int, temperature: float, reasoning_effort: Literal['low', 'medium', 'high'] | None, logprobs: bool, top_logprobs: int | None, timeout: float) -> LLMOutput

Get a single completion from an LLM.

Parameters (all required):

  • client (Any): The provider-specific client instance.
  • messages (list[ChatMessage]): The list of messages in the conversation.
  • model_name (str): The name of the model to use.
  • tools (list[ToolInfo] | None): Optional list of tools available to the model.
  • tool_choice (Literal['auto', 'required'] | None): Optional specification for tool usage.
  • max_new_tokens (int): Maximum number of tokens to generate.
  • temperature (float): Controls randomness in output generation.
  • reasoning_effort (Literal['low', 'medium', 'high'] | None): Optional control for model reasoning depth.
  • logprobs (bool): Whether to return log probabilities.
  • top_logprobs (int | None): Number of most likely tokens to return probabilities for.
  • timeout (float): Maximum time to wait for a response in seconds.

Returns:

  • LLMOutput: The model's response.

Source code in docent/_llm_util/providers/provider_registry.py
async def __call__(
    self,
    client: Any,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Get a single completion from an LLM.

    Args:
        client: The provider-specific client instance.
        messages: The list of messages in the conversation.
        model_name: The name of the model to use.
        tools: Optional list of tools available to the model.
        tool_choice: Optional specification for tool usage.
        max_new_tokens: Maximum number of tokens to generate.
        temperature: Controls randomness in output generation.
        reasoning_effort: Optional control for model reasoning depth.
        logprobs: Whether to return log probabilities.
        top_logprobs: Number of most likely tokens to return probabilities for.
        timeout: Maximum time to wait for a response in seconds.

    Returns:
        LLMOutput: The model's response.
    """
    ...

SingleStreamingOutputGetter

Bases: Protocol

Protocol for getting streaming output from an LLM.

Defines the interface for async functions that retrieve streaming responses from an LLM provider.

Source code in docent/_llm_util/providers/provider_registry.py
class SingleStreamingOutputGetter(Protocol):
    """Protocol for getting streaming output from an LLM.

    Defines the interface for async functions that retrieve streaming
    responses from an LLM provider.
    """

    async def __call__(
        self,
        client: Any,
        streaming_callback: AsyncSingleLLMOutputStreamingCallback | None,
        messages: list[ChatMessage],
        model_name: str,
        *,
        tools: list[ToolInfo] | None,
        tool_choice: Literal["auto", "required"] | None,
        max_new_tokens: int,
        temperature: float,
        reasoning_effort: Literal["low", "medium", "high"] | None,
        logprobs: bool,
        top_logprobs: int | None,
        timeout: float,
    ) -> LLMOutput:
        """Get a streaming completion from an LLM.

        Args:
            client: The provider-specific client instance.
            streaming_callback: Optional callback for processing streaming chunks.
            messages: The list of messages in the conversation.
            model_name: The name of the model to use.
            tools: Optional list of tools available to the model.
            tool_choice: Optional specification for tool usage.
            max_new_tokens: Maximum number of tokens to generate.
            temperature: Controls randomness in output generation.
            reasoning_effort: Optional control for model reasoning depth.
            logprobs: Whether to return log probabilities.
            top_logprobs: Number of most likely tokens to return probabilities for.
            timeout: Maximum time to wait for a response in seconds.

        Returns:
            LLMOutput: The complete model response after streaming finishes.
        """
        ...

__call__ async

__call__(client: Any, streaming_callback: AsyncSingleLLMOutputStreamingCallback | None, messages: list[ChatMessage], model_name: str, *, tools: list[ToolInfo] | None, tool_choice: Literal['auto', 'required'] | None, max_new_tokens: int, temperature: float, reasoning_effort: Literal['low', 'medium', 'high'] | None, logprobs: bool, top_logprobs: int | None, timeout: float) -> LLMOutput

Get a streaming completion from an LLM.

Parameters (all required):

  • client (Any): The provider-specific client instance.
  • streaming_callback (AsyncSingleLLMOutputStreamingCallback | None): Optional callback for processing streaming chunks.
  • messages (list[ChatMessage]): The list of messages in the conversation.
  • model_name (str): The name of the model to use.
  • tools (list[ToolInfo] | None): Optional list of tools available to the model.
  • tool_choice (Literal['auto', 'required'] | None): Optional specification for tool usage.
  • max_new_tokens (int): Maximum number of tokens to generate.
  • temperature (float): Controls randomness in output generation.
  • reasoning_effort (Literal['low', 'medium', 'high'] | None): Optional control for model reasoning depth.
  • logprobs (bool): Whether to return log probabilities.
  • top_logprobs (int | None): Number of most likely tokens to return probabilities for.
  • timeout (float): Maximum time to wait for a response in seconds.

Returns:

  • LLMOutput: The complete model response after streaming finishes.

Source code in docent/_llm_util/providers/provider_registry.py
async def __call__(
    self,
    client: Any,
    streaming_callback: AsyncSingleLLMOutputStreamingCallback | None,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Get a streaming completion from an LLM.

    Args:
        client: The provider-specific client instance.
        streaming_callback: Optional callback for processing streaming chunks.
        messages: The list of messages in the conversation.
        model_name: The name of the model to use.
        tools: Optional list of tools available to the model.
        tool_choice: Optional specification for tool usage.
        max_new_tokens: Maximum number of tokens to generate.
        temperature: Controls randomness in output generation.
        reasoning_effort: Optional control for model reasoning depth.
        logprobs: Whether to return log probabilities.
        top_logprobs: Number of most likely tokens to return probabilities for.
        timeout: Maximum time to wait for a response in seconds.

    Returns:
        LLMOutput: The complete model response after streaming finishes.
    """
    ...

ProviderConfig

Bases: TypedDict

Configuration for an LLM provider.

Contains the necessary functions to create clients and interact with a specific LLM provider.

Attributes:

  • async_client_getter (Callable[[str | None], Any]): Function to get an async client for the provider.
  • single_output_getter (SingleOutputGetter): Function to get a non-streaming completion.
  • single_streaming_output_getter (SingleStreamingOutputGetter): Function to get a streaming completion.

Source code in docent/_llm_util/providers/provider_registry.py
class ProviderConfig(TypedDict):
    """Configuration for an LLM provider.

    Contains the necessary functions to create clients and interact with
    a specific LLM provider.

    Attributes:
        async_client_getter: Function to get an async client for the provider.
        single_output_getter: Function to get a non-streaming completion.
        single_streaming_output_getter: Function to get a streaming completion.
    """

    async_client_getter: Callable[[str | None], Any]
    single_output_getter: SingleOutputGetter
    single_streaming_output_getter: SingleStreamingOutputGetter

docent._llm_util.providers.preference_types

Provides preferences of which LLM models to use for different Docent functions.

ModelOption

Bases: BaseModel

Configuration for a specific model from a provider. Not to be confused with ModelInfo.

Attributes:

  • provider (str): The name of the LLM provider (e.g., "openai", "anthropic").
  • model_name (str): The specific model to use from the provider.
  • reasoning_effort (Literal['minimal', 'low', 'medium', 'high'] | None): Optional indication of computational effort to use.

Source code in docent/_llm_util/providers/preference_types.py
class ModelOption(BaseModel):
    """Configuration for a specific model from a provider. Not to be confused with ModelInfo.

    Attributes:
        provider: The name of the LLM provider (e.g., "openai", "anthropic").
        model_name: The specific model to use from the provider.
        reasoning_effort: Optional indication of computational effort to use.
    """

    provider: str
    model_name: str
    reasoning_effort: Literal["minimal", "low", "medium", "high"] | None = None

ModelOptionWithContext

Bases: BaseModel

Enhanced model option that includes context window information for frontend use. Not to be confused with ModelInfo or ModelOption.

Attributes:

  • provider (str): The name of the LLM provider (e.g., "openai", "anthropic").
  • model_name (str): The specific model to use from the provider.
  • reasoning_effort (Literal['minimal', 'low', 'medium', 'high'] | None): Optional indication of computational effort to use.
  • context_window (int): The context window size in tokens.
  • uses_byok (bool): Whether this model would use the user's own API key.

Source code in docent/_llm_util/providers/preference_types.py
class ModelOptionWithContext(BaseModel):
    """Enhanced model option that includes context window information for frontend use.
    Not to be confused with ModelInfo or ModelOption.

    Attributes:
        provider: The name of the LLM provider (e.g., "openai", "anthropic").
        model_name: The specific model to use from the provider.
        reasoning_effort: Optional indication of computational effort to use.
        context_window: The context window size in tokens.
        uses_byok: Whether this model would use the user's own API key.
    """

    provider: str
    model_name: str
    reasoning_effort: Literal["minimal", "low", "medium", "high"] | None = None
    context_window: int
    uses_byok: bool

    @classmethod
    def from_model_option(
        cls, model_option: ModelOption, uses_byok: bool = False
    ) -> "ModelOptionWithContext":
        """Create a ModelOptionWithContext from a ModelOption.

        Args:
            model_option: The base model option
            uses_byok: Whether this model requires bring-your-own-key

        Returns:
            ModelOptionWithContext with context window looked up from global mapping
        """
        context_window = get_context_window(model_option.model_name)

        return cls(
            provider=model_option.provider,
            model_name=model_option.model_name,
            reasoning_effort=model_option.reasoning_effort,
            context_window=context_window,
            uses_byok=uses_byok,
        )

from_model_option classmethod

from_model_option(model_option: ModelOption, uses_byok: bool = False) -> ModelOptionWithContext

Create a ModelOptionWithContext from a ModelOption.

Parameters:

  • model_option (ModelOption): The base model option. Required.
  • uses_byok (bool): Whether this model requires bring-your-own-key. Defaults to False.

Returns:

  • ModelOptionWithContext with the context window looked up from the global mapping.

Source code in docent/_llm_util/providers/preference_types.py
@classmethod
def from_model_option(
    cls, model_option: ModelOption, uses_byok: bool = False
) -> "ModelOptionWithContext":
    """Create a ModelOptionWithContext from a ModelOption.

    Args:
        model_option: The base model option
        uses_byok: Whether this model requires bring-your-own-key

    Returns:
        ModelOptionWithContext with context window looked up from global mapping
    """
    context_window = get_context_window(model_option.model_name)

    return cls(
        provider=model_option.provider,
        model_name=model_option.model_name,
        reasoning_effort=model_option.reasoning_effort,
        context_window=context_window,
        uses_byok=uses_byok,
    )
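
A short usage sketch; the model name is illustrative, and context_window is filled in by the get_context_window lookup shown above:

option = ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="low")
option_with_ctx = ModelOptionWithContext.from_model_option(option, uses_byok=False)
# option_with_ctx.context_window now holds the token limit looked up for "gpt-5-mini"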

PublicProviderPreferences

Bases: BaseModel

Source code in docent/_llm_util/providers/preference_types.py
class PublicProviderPreferences(BaseModel):
    @cached_property
    def default_judge_models(self) -> list[ModelOption]:
        """Judge models that any user can access without providing their own API key"""

        return [
            ModelOption(provider="openai", model_name="gpt-5", reasoning_effort="medium"),
            ModelOption(provider="openai", model_name="gpt-5", reasoning_effort="low"),
            ModelOption(provider="openai", model_name="gpt-5", reasoning_effort="high"),
            ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="low"),
            ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="medium"),
            ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="high"),
            ModelOption(
                provider="anthropic",
                model_name="claude-sonnet-4-20250514",
                reasoning_effort="medium",
            ),
        ]

default_judge_models cached property

default_judge_models: list[ModelOption]

Judge models that any user can access without providing their own API key
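
A minimal sketch of reading these defaults, assuming PublicProviderPreferences can be instantiated with no arguments (it declares no required fields above):

prefs = PublicProviderPreferences()
for option in prefs.default_judge_models:  # cached after the first access
    print(option.provider, option.model_name, option.reasoning_effort)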