
LLM providers and calls

Docent uses a unified interface to call and aggregate results from different LLM providers.

Provider registry

Each LLM provider is specified through a [ProviderConfig][docent_core._llm_util.providers.registry.ProviderConfig] object, which requires three functions:

  • async_client_getter: Returns an async client for the provider
  • single_output_getter: Gets a single completion from the provider, compatible with the [SingleOutputGetter][docent_core._llm_util.providers.registry.SingleOutputGetter] protocol
  • single_streaming_output_getter: Gets a streaming completion from the provider, compatible with the [SingleStreamingOutputGetter][docent_core._llm_util.providers.registry.SingleStreamingOutputGetter] protocol

We currently support anthropic, google, openai, azure_openai, and openrouter.
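
For example, the anthropic entry of the [PROVIDERS][docent_core._llm_util.providers.registry.PROVIDERS] registry (reproduced in full later on this page) bundles these three functions:

"anthropic": ProviderConfig(
    async_client_getter=get_anthropic_client_async,
    single_output_getter=get_anthropic_chat_completion_async,
    single_streaming_output_getter=get_anthropic_chat_completion_streaming_async,
),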

Adding a new provider

  1. Create a new module in docent_core/_llm_util/providers/ (e.g., my_provider.py)
  2. Implement the functions required by ProviderConfig
  3. Add the provider to the [PROVIDERS][docent_core._llm_util.providers.registry.PROVIDERS] dictionary in registry.py
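
A rough sketch of such a module is below. The SDK client, request/response handling, and the quoted type annotations (whose import paths are omitted) are assumptions; only the signatures follow the SingleOutputGetter protocol documented later on this page.

# my_provider.py — hypothetical sketch; client construction and imports are assumptions
from typing import Any, Literal

def get_my_provider_client_async(api_key: str | None) -> Any:
    # Matches the async_client_getter shape: Callable[[str | None], Any]
    return MyProviderAsyncClient(api_key=api_key)  # placeholder SDK client

async def get_my_provider_chat_completion_async(
    client: Any,
    messages: "list[ChatMessage]",
    model_name: str,
    *,
    tools: "list[ToolInfo] | None",
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> "LLMOutput":
    # Convert Docent's ChatMessages to the provider's request format, call the API
    # with the given limits, and normalize the raw response into an LLMOutput.
    ...

# A streaming variant with the SingleStreamingOutputGetter signature goes alongside it;
# the provider is then registered in the PROVIDERS dictionary:
#   "my_provider": ProviderConfig(
#       async_client_getter=get_my_provider_client_async,
#       single_output_getter=get_my_provider_chat_completion_async,
#       single_streaming_output_getter=get_my_provider_chat_completion_streaming_async,
#   ),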

Selecting models for Docent functions

Docent uses a preference system to determine which LLM models to use for each function. [ProviderPreferences][docent_core._llm_util.providers.preferences.ProviderPreferences] maps each Docent function to an ordered list of [ModelOption][docent_core._llm_util.providers.preferences.ModelOption] objects:

@cached_property
def function_name(self) -> list[ModelOption]:
    """Get model options for the function_name function.

    Returns:
        List of configured model options for this function.
    """
    return [
        ModelOption(
            provider="anthropic",
            model_name="claude-sonnet-4-20250514",
            reasoning_effort="medium"  # only for reasoning models
        ),
        ModelOption(
            provider="openai",
            model_name="o1",
            reasoning_effort="medium"
        ),
    ]

Any function that calls an LLM API must have a corresponding cached property in ProviderPreferences that returns its ModelOption preferences. LLMManager tries the first ModelOption and falls back to the subsequent ones if it fails.

Usage

To customize which models are used for a specific function:

  1. Locate docent_core/_llm_util/providers/preferences.py
  2. Find (or add) the cached property for the function you want to customize
  3. Specify the [ModelOption][docent_core._llm_util.providers.preferences.ModelOption] objects in the returned list
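
For instance, a hedged sketch of such a property, where the function name summarize_run and the chosen models are purely illustrative:

@cached_property
def summarize_run(self) -> list[ModelOption]:
    """Model preferences for a hypothetical summarize_run function."""
    return [
        # Tried first by LLMManager
        ModelOption(
            provider="anthropic",
            model_name="claude-sonnet-4-20250514",
            reasoning_effort="medium",
        ),
        # Fallback if the first option fails
        ModelOption(
            provider="openai",
            model_name="gpt-5-mini",
            reasoning_effort="low",
        ),
    ]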

docent._llm_util.providers.provider_registry

Registry for LLM providers with their configurations.

PROVIDERS module-attribute

PROVIDERS: dict[str, ProviderConfig] = {
    'anthropic': ProviderConfig(
        async_client_getter=get_anthropic_client_async,
        single_output_getter=get_anthropic_chat_completion_async,
        single_streaming_output_getter=get_anthropic_chat_completion_streaming_async,
    ),
    'google': ProviderConfig(
        async_client_getter=get_google_client_async,
        single_output_getter=get_google_chat_completion_async,
        single_streaming_output_getter=get_google_chat_completion_streaming_async,
    ),
    'openai': ProviderConfig(
        async_client_getter=get_openai_client_async,
        single_output_getter=get_openai_chat_completion_async,
        single_streaming_output_getter=get_openai_chat_completion_streaming_async,
    ),
    'azure_openai': ProviderConfig(
        async_client_getter=get_azure_openai_client_async,
        single_output_getter=get_openai_chat_completion_async,
        single_streaming_output_getter=get_openai_chat_completion_streaming_async,
    ),
    'openrouter': ProviderConfig(
        async_client_getter=get_openrouter_client_async,
        single_output_getter=get_openrouter_chat_completion_async,
        single_streaming_output_getter=get_openrouter_chat_completion_streaming_async,
    ),
}

Registry of supported LLM providers with their respective configurations.
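
A hedged usage sketch of the registry: messages is assumed to be an already-built list[ChatMessage], and passing None to the client getter is assumed to fall back to a default API key. In practice, Docent's LLM manager drives these calls.

config = PROVIDERS["openai"]
client = config["async_client_getter"](None)  # None assumed to mean "use the default API key"
output = await config["single_output_getter"](
    client,
    messages,           # assumed: a prepared list[ChatMessage]
    "gpt-5-mini",       # illustrative model name
    tools=None,
    tool_choice=None,
    max_new_tokens=1024,
    temperature=0.0,
    reasoning_effort=None,
    logprobs=False,
    top_logprobs=None,
    timeout=60.0,
)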

SingleOutputGetter

Bases: Protocol

Protocol for getting non-streaming output from an LLM.

Defines the interface for async functions that retrieve a single non-streaming response from an LLM provider.

Source code in docent/_llm_util/providers/provider_registry.py
class SingleOutputGetter(Protocol):
    """Protocol for getting non-streaming output from an LLM.

    Defines the interface for async functions that retrieve a single
    non-streaming response from an LLM provider.
    """

    async def __call__(
        self,
        client: Any,
        messages: list[ChatMessage],
        model_name: str,
        *,
        tools: list[ToolInfo] | None,
        tool_choice: Literal["auto", "required"] | None,
        max_new_tokens: int,
        temperature: float,
        reasoning_effort: Literal["low", "medium", "high"] | None,
        logprobs: bool,
        top_logprobs: int | None,
        timeout: float,
    ) -> LLMOutput:
        """Get a single completion from an LLM.

        Args:
            client: The provider-specific client instance.
            messages: The list of messages in the conversation.
            model_name: The name of the model to use.
            tools: Optional list of tools available to the model.
            tool_choice: Optional specification for tool usage.
            max_new_tokens: Maximum number of tokens to generate.
            temperature: Controls randomness in output generation.
            reasoning_effort: Optional control for model reasoning depth.
            logprobs: Whether to return log probabilities.
            top_logprobs: Number of most likely tokens to return probabilities for.
            timeout: Maximum time to wait for a response in seconds.

        Returns:
            LLMOutput: The model's response.
        """
        ...

__call__ async

__call__(client: Any, messages: list[ChatMessage], model_name: str, *, tools: list[ToolInfo] | None, tool_choice: Literal['auto', 'required'] | None, max_new_tokens: int, temperature: float, reasoning_effort: Literal['low', 'medium', 'high'] | None, logprobs: bool, top_logprobs: int | None, timeout: float) -> LLMOutput

Get a single completion from an LLM.

Parameters (all required):

  • client (Any): The provider-specific client instance.
  • messages (list[ChatMessage]): The list of messages in the conversation.
  • model_name (str): The name of the model to use.
  • tools (list[ToolInfo] | None): Optional list of tools available to the model.
  • tool_choice (Literal['auto', 'required'] | None): Optional specification for tool usage.
  • max_new_tokens (int): Maximum number of tokens to generate.
  • temperature (float): Controls randomness in output generation.
  • reasoning_effort (Literal['low', 'medium', 'high'] | None): Optional control for model reasoning depth.
  • logprobs (bool): Whether to return log probabilities.
  • top_logprobs (int | None): Number of most likely tokens to return probabilities for.
  • timeout (float): Maximum time to wait for a response in seconds.

Returns:

  • LLMOutput: The model's response.

Source code in docent/_llm_util/providers/provider_registry.py
async def __call__(
    self,
    client: Any,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Get a single completion from an LLM.

    Args:
        client: The provider-specific client instance.
        messages: The list of messages in the conversation.
        model_name: The name of the model to use.
        tools: Optional list of tools available to the model.
        tool_choice: Optional specification for tool usage.
        max_new_tokens: Maximum number of tokens to generate.
        temperature: Controls randomness in output generation.
        reasoning_effort: Optional control for model reasoning depth.
        logprobs: Whether to return log probabilities.
        top_logprobs: Number of most likely tokens to return probabilities for.
        timeout: Maximum time to wait for a response in seconds.

    Returns:
        LLMOutput: The model's response.
    """
    ...

SingleStreamingOutputGetter

Bases: Protocol

Protocol for getting streaming output from an LLM.

Defines the interface for async functions that retrieve streaming responses from an LLM provider.

Source code in docent/_llm_util/providers/provider_registry.py
class SingleStreamingOutputGetter(Protocol):
    """Protocol for getting streaming output from an LLM.

    Defines the interface for async functions that retrieve streaming
    responses from an LLM provider.
    """

    async def __call__(
        self,
        client: Any,
        streaming_callback: AsyncSingleLLMOutputStreamingCallback | None,
        messages: list[ChatMessage],
        model_name: str,
        *,
        tools: list[ToolInfo] | None,
        tool_choice: Literal["auto", "required"] | None,
        max_new_tokens: int,
        temperature: float,
        reasoning_effort: Literal["low", "medium", "high"] | None,
        logprobs: bool,
        top_logprobs: int | None,
        timeout: float,
    ) -> LLMOutput:
        """Get a streaming completion from an LLM.

        Args:
            client: The provider-specific client instance.
            streaming_callback: Optional callback for processing streaming chunks.
            messages: The list of messages in the conversation.
            model_name: The name of the model to use.
            tools: Optional list of tools available to the model.
            tool_choice: Optional specification for tool usage.
            max_new_tokens: Maximum number of tokens to generate.
            temperature: Controls randomness in output generation.
            reasoning_effort: Optional control for model reasoning depth.
            logprobs: Whether to return log probabilities.
            top_logprobs: Number of most likely tokens to return probabilities for.
            timeout: Maximum time to wait for a response in seconds.

        Returns:
            LLMOutput: The complete model response after streaming finishes.
        """
        ...

__call__ async

__call__(client: Any, streaming_callback: AsyncSingleLLMOutputStreamingCallback | None, messages: list[ChatMessage], model_name: str, *, tools: list[ToolInfo] | None, tool_choice: Literal['auto', 'required'] | None, max_new_tokens: int, temperature: float, reasoning_effort: Literal['low', 'medium', 'high'] | None, logprobs: bool, top_logprobs: int | None, timeout: float) -> LLMOutput

Get a streaming completion from an LLM.

Parameters (all required):

  • client (Any): The provider-specific client instance.
  • streaming_callback (AsyncSingleLLMOutputStreamingCallback | None): Optional callback for processing streaming chunks.
  • messages (list[ChatMessage]): The list of messages in the conversation.
  • model_name (str): The name of the model to use.
  • tools (list[ToolInfo] | None): Optional list of tools available to the model.
  • tool_choice (Literal['auto', 'required'] | None): Optional specification for tool usage.
  • max_new_tokens (int): Maximum number of tokens to generate.
  • temperature (float): Controls randomness in output generation.
  • reasoning_effort (Literal['low', 'medium', 'high'] | None): Optional control for model reasoning depth.
  • logprobs (bool): Whether to return log probabilities.
  • top_logprobs (int | None): Number of most likely tokens to return probabilities for.
  • timeout (float): Maximum time to wait for a response in seconds.

Returns:

  • LLMOutput: The complete model response after streaming finishes.

Source code in docent/_llm_util/providers/provider_registry.py
async def __call__(
    self,
    client: Any,
    streaming_callback: AsyncSingleLLMOutputStreamingCallback | None,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Get a streaming completion from an LLM.

    Args:
        client: The provider-specific client instance.
        streaming_callback: Optional callback for processing streaming chunks.
        messages: The list of messages in the conversation.
        model_name: The name of the model to use.
        tools: Optional list of tools available to the model.
        tool_choice: Optional specification for tool usage.
        max_new_tokens: Maximum number of tokens to generate.
        temperature: Controls randomness in output generation.
        reasoning_effort: Optional control for model reasoning depth.
        logprobs: Whether to return log probabilities.
        top_logprobs: Number of most likely tokens to return probabilities for.
        timeout: Maximum time to wait for a response in seconds.

    Returns:
        LLMOutput: The complete model response after streaming finishes.
    """
    ...

ProviderConfig

Bases: TypedDict

Configuration for an LLM provider.

Contains the necessary functions to create clients and interact with a specific LLM provider.

Attributes:

  • async_client_getter (Callable[[str | None], Any]): Function to get an async client for the provider.
  • single_output_getter (SingleOutputGetter): Function to get a non-streaming completion.
  • single_streaming_output_getter (SingleStreamingOutputGetter): Function to get a streaming completion.

Source code in docent/_llm_util/providers/provider_registry.py
class ProviderConfig(TypedDict):
    """Configuration for an LLM provider.

    Contains the necessary functions to create clients and interact with
    a specific LLM provider.

    Attributes:
        async_client_getter: Function to get an async client for the provider.
        single_output_getter: Function to get a non-streaming completion.
        single_streaming_output_getter: Function to get a streaming completion.
    """

    async_client_getter: Callable[[str | None], Any]
    single_output_getter: SingleOutputGetter
    single_streaming_output_getter: SingleStreamingOutputGetter

docent._llm_util.providers.preference_types

Provides preferences of which LLM models to use for different Docent functions.

ModelOption

Bases: BaseModel

Configuration for a specific model from a provider. Not to be confused with ModelInfo.

Attributes:

  • provider (str): The name of the LLM provider (e.g., "openai", "anthropic").
  • model_name (str): The specific model to use from the provider.
  • reasoning_effort (Literal['minimal', 'low', 'medium', 'high'] | None): Optional indication of computational effort to use.

Source code in docent/_llm_util/providers/preference_types.py
class ModelOption(BaseModel):
    """Configuration for a specific model from a provider. Not to be confused with ModelInfo.

    Attributes:
        provider: The name of the LLM provider (e.g., "openai", "anthropic").
        model_name: The specific model to use from the provider.
        reasoning_effort: Optional indication of computational effort to use.
    """

    provider: str
    model_name: str
    reasoning_effort: Literal["minimal", "low", "medium", "high"] | None = None

ModelOptionWithContext

Bases: BaseModel

Enhanced model option that includes context window information for frontend use. Not to be confused with ModelInfo or ModelOption.

Attributes:

  • provider (str): The name of the LLM provider (e.g., "openai", "anthropic").
  • model_name (str): The specific model to use from the provider.
  • reasoning_effort (Literal['minimal', 'low', 'medium', 'high'] | None): Optional indication of computational effort to use.
  • context_window (int): The context window size in tokens.
  • uses_byok (bool): Whether this model would use the user's own API key.

Source code in docent/_llm_util/providers/preference_types.py
class ModelOptionWithContext(BaseModel):
    """Enhanced model option that includes context window information for frontend use.
    Not to be confused with ModelInfo or ModelOption.

    Attributes:
        provider: The name of the LLM provider (e.g., "openai", "anthropic").
        model_name: The specific model to use from the provider.
        reasoning_effort: Optional indication of computational effort to use.
        context_window: The context window size in tokens.
        uses_byok: Whether this model would use the user's own API key.
    """

    provider: str
    model_name: str
    reasoning_effort: Literal["minimal", "low", "medium", "high"] | None = None
    context_window: int
    uses_byok: bool

    @classmethod
    def from_model_option(
        cls, model_option: ModelOption, uses_byok: bool = False
    ) -> "ModelOptionWithContext":
        """Create a ModelOptionWithContext from a ModelOption.

        Args:
            model_option: The base model option
            uses_byok: Whether this model requires bring-your-own-key

        Returns:
            ModelOptionWithContext with context window looked up from global mapping
        """
        context_window = get_context_window(model_option.model_name)

        return cls(
            provider=model_option.provider,
            model_name=model_option.model_name,
            reasoning_effort=model_option.reasoning_effort,
            context_window=context_window,
            uses_byok=uses_byok,
        )

from_model_option classmethod

from_model_option(model_option: ModelOption, uses_byok: bool = False) -> ModelOptionWithContext

Create a ModelOptionWithContext from a ModelOption.

Parameters:

  • model_option (ModelOption): The base model option. Required.
  • uses_byok (bool): Whether this model requires bring-your-own-key. Defaults to False.

Returns:

  • ModelOptionWithContext with the context window looked up from the global mapping.

Source code in docent/_llm_util/providers/preference_types.py
@classmethod
def from_model_option(
    cls, model_option: ModelOption, uses_byok: bool = False
) -> "ModelOptionWithContext":
    """Create a ModelOptionWithContext from a ModelOption.

    Args:
        model_option: The base model option
        uses_byok: Whether this model requires bring-your-own-key

    Returns:
        ModelOptionWithContext with context window looked up from global mapping
    """
    context_window = get_context_window(model_option.model_name)

    return cls(
        provider=model_option.provider,
        model_name=model_option.model_name,
        reasoning_effort=model_option.reasoning_effort,
        context_window=context_window,
        uses_byok=uses_byok,
    )
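
A short usage sketch; the model name is illustrative, and context_window is filled in by the get_context_window lookup shown above:

option = ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="low")
option_with_ctx = ModelOptionWithContext.from_model_option(option, uses_byok=False)
# option_with_ctx.context_window now holds the token limit looked up for "gpt-5-mini"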

PublicProviderPreferences

Bases: BaseModel

Source code in docent/_llm_util/providers/preference_types.py
class PublicProviderPreferences(BaseModel):
    @cached_property
    def default_judge_models(self) -> list[ModelOption]:
        """Judge models that any user can access without providing their own API key"""

        return [
            ModelOption(provider="openai", model_name="gpt-5", reasoning_effort="medium"),
            ModelOption(provider="openai", model_name="gpt-5", reasoning_effort="low"),
            ModelOption(provider="openai", model_name="gpt-5", reasoning_effort="high"),
            ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="low"),
            ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="medium"),
            ModelOption(provider="openai", model_name="gpt-5-mini", reasoning_effort="high"),
            ModelOption(
                provider="anthropic",
                model_name="claude-sonnet-4-20250514",
                reasoning_effort="medium",
            ),
        ]

default_judge_models cached property

default_judge_models: list[ModelOption]

Judge models that any user can access without providing their own API key
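
A minimal sketch of reading these defaults, assuming PublicProviderPreferences can be instantiated with no arguments (it declares no required fields above):

prefs = PublicProviderPreferences()
for option in prefs.default_judge_models:  # cached after the first access
    print(option.provider, option.model_name, option.reasoning_effort)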