tenant_llm_service.py

Overview

The tenant_llm_service.py file provides an abstraction layer for managing and interacting with tenant-specific Large Language Models (LLMs) within the InfiniFlow platform. It primarily defines services that handle the retrieval, configuration, instantiation, and usage tracking of various LLMs associated with tenants. This includes models of different types such as Chat, Embedding, Speech2Text, Image2Text, Rerank, and Text-to-Speech (TTS).

The file encapsulates logic for:

Fetching tenant-specific API keys and model configurations.
Creating instances of models based on tenant configurations.
Tracking token usage for billing or monitoring.
Resolving models and factories from composite names.
Integrating Langfuse for telemetry and traceability.

Together, these services let each tenant use its own models and credentials, and they keep per-tenant token usage on record for billing or monitoring.

Classes and Functions

1. `LLMFactoriesService(CommonService)`

Purpose:
Service class for managing LLM factories metadata.

Attributes:

model = LLMFactories: Links this service to the LLMFactories database model.

Usage:
Primarily used to fetch and manage available LLM factory information.

2. `TenantLLMService(CommonService)`

Purpose:
Core service handling tenant-specific LLMs, including querying API keys, getting model configs, instantiating model objects, and updating usage metrics.

Attributes:

model = TenantLLM: Connects the service to the TenantLLM database model.

Methods:

`get_api_key(cls, tenant_id: str, model_name: str) -> TenantLLM | None`

Description:
Fetches the LLM configuration entry for a given tenant and model name. Handles cases where the model name includes a factory suffix (e.g., model@factory), and attempts fallback queries for known factory aliases.

Parameters:

tenant_id: Tenant identifier.
model_name: Name of the LLM model, optionally with factory suffix.

Returns:

The first matching TenantLLM object or None if not found.

Example:

api_key_obj = TenantLLMService.get_api_key("tenant123", "gpt-4@OpenAI")
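
The lookup order described above can be sketched with an in-memory table standing in for the TenantLLM database model. Everything named here (`RECORDS`, `ALIAS_SUFFIXES`, `query`, `find_config`) is hypothetical, and the idea that alias fallbacks are stored under a suffixed model name is an assumption made for illustration; this document does not specify how the fallback queries are built.

```python
from typing import List, Optional

# Hypothetical rows: (tenant_id, llm_name, llm_factory) -> configuration.
RECORDS = {
    ("tenant123", "gpt-4", "OpenAI"): {"api_key": "sk-demo"},
    ("tenant123", "llama3___LocalAI", "LocalAI"): {"api_key": "local-demo"},
}

# Assumed mapping from factory alias to a stored-name suffix (illustration only).
ALIAS_SUFFIXES = {
    "LocalAI": "___LocalAI",
    "HuggingFace": "___HuggingFace",
    "OpenAI-API-Compatible": "___OpenAI-API",
}


def query(tenant_id: str, llm_name: str, llm_factory: Optional[str] = None) -> List[dict]:
    """Return every record matching the tenant and name, and the factory if one is given."""
    return [
        cfg
        for (tenant, name, factory), cfg in RECORDS.items()
        if tenant == tenant_id
        and name == llm_name
        and (llm_factory is None or factory == llm_factory)
    ]


def find_config(tenant_id: str, model_name: str) -> Optional[dict]:
    """Try an exact match first, then one alias-based retry; None if both fail."""
    name, sep, factory = model_name.rpartition("@")
    if not sep:  # no "@": the whole string is the model name
        name, factory = model_name, None

    rows = query(tenant_id, name, factory)
    if not rows and factory in ALIAS_SUFFIXES:
        # Fallback: retry under the alternate stored name for this alias.
        rows = query(tenant_id, name + ALIAS_SUFFIXES[factory], factory)
    return rows[0] if rows else None
```

With this sketch, `find_config("tenant123", "llama3@LocalAI")` misses on the exact name and succeeds on the retry, while an unknown model returns None instead of raising.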

`get_my_llms(cls, tenant_id: str) -> list[dict]`

Description:
Returns a list of LLMs configured for the tenant that have valid API keys, including factory metadata like logos and tags.

Parameters:

tenant_id: Tenant identifier.

Returns:

List of dictionaries containing tenant LLM details.

`split_model_name_and_factory(model_name: str) -> tuple[str, str | None]`

Description:
Utility method to split a composite model name into the base model name and factory suffix.

Parameters:

model_name: The full model string (e.g., "gpt-4@OpenAI").

Returns:

Tuple (model_name, factory_name) or (model_name, None) if no factory suffix is present.

Example:

model, factory = TenantLLMService.split_model_name_and_factory("gpt-4@OpenAI")
# model = "gpt-4", factory = "OpenAI"
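
One way to get this behavior is to split at the last '@' only, so that a model name that itself contains '@' stays intact. The function below is a minimal sketch under that assumption; `split_on_last_at` is a hypothetical name, and the real method may apply further checks (for example, validating the suffix against known providers) that are not shown here.

```python
from typing import Optional, Tuple


def split_on_last_at(model_name: str) -> Tuple[str, Optional[str]]:
    """Split 'name@factory' at the last '@'.

    Returns (model_name, None) when there is no '@' or when either side
    of the split would be empty.
    """
    name, sep, factory = model_name.rpartition("@")
    if not sep or not name or not factory:
        return model_name, None
    return name, factory


print(split_on_last_at("gpt-4@OpenAI"))       # ('gpt-4', 'OpenAI')
print(split_on_last_at("gpt-4"))              # ('gpt-4', None)
print(split_on_last_at("org@model@Factory"))  # ('org@model', 'Factory')
```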

`get_model_config(cls, tenant_id: str, llm_type: str, llm_name: str | None = None) -> dict`

Description:
Retrieves the LLM configuration dictionary for a given tenant and LLM type, optionally specifying a model name. Supports all major LLM types and includes fallback logic with error handling.

Parameters:

tenant_id: Tenant identifier.
llm_type: Type of LLM (embedding, chat, speech2text, etc.).
llm_name: Optional specific model name to fetch.

Returns:

Dictionary containing model configuration fields such as API key, factory name, model name, API base URL, and tool usage flag.

Raises:

LookupError if the tenant or model is not found, or if the model is not authorized for the tenant.

`model_instance(cls, tenant_id: str, llm_type: str, llm_name: str | None = None, lang: str = "Chinese", **kwargs) -> object | None`

Description:
Creates and returns an instantiated model object for the tenant's LLM, depending on the type. Supports models from various factories and passes relevant API keys and configurations.

Parameters:

tenant_id: Tenant identifier.
llm_type: The LLM type (embedding, chat, etc.).
llm_name: Optional model name.
lang: Language preference, default "Chinese".
**kwargs: Additional parameters passed to the model constructor.

Returns:

An instance of the requested LLM class or None if unsupported.

Example:

chat_model = TenantLLMService.model_instance("tenant123", LLMType.CHAT.value, "gpt-4@OpenAI")
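
Dispatching on LLM type and factory can be sketched as a two-level registry lookup. The classes `DemoChat` and `DemoEmbedding`, the `REGISTRY` table, the `build_model` function, and the configuration keys are all hypothetical stand-ins; the real service builds objects from the model classes in rag.llm, whose constructors are not described in this document.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Type


@dataclass
class DemoModel:
    """Stand-in base class holding what a provider client would need."""
    api_key: str
    model_name: str


class DemoChat(DemoModel):
    pass


class DemoEmbedding(DemoModel):
    pass


# Hypothetical registry: LLM type -> factory name -> model class.
REGISTRY: Dict[str, Dict[str, Type[DemoModel]]] = {
    "chat": {"OpenAI": DemoChat},
    "embedding": {"OpenAI": DemoEmbedding},
}


def build_model(llm_type: str, config: dict) -> Optional[DemoModel]:
    """Return an instance for the type/factory pair, or None if unsupported."""
    model_cls = REGISTRY.get(llm_type, {}).get(config["llm_factory"])
    if model_cls is None:
        return None
    return model_cls(api_key=config["api_key"], model_name=config["llm_name"])


config = {"llm_factory": "OpenAI", "api_key": "sk-demo", "llm_name": "gpt-4"}
chat = build_model("chat", config)
unsupported = build_model("tts", config)  # no "tts" entry in this demo registry
```

Returning None for an unsupported combination mirrors the documented return value and leaves it to the caller to decide whether that is an error.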

`increase_usage(cls, tenant_id: str, llm_type: str, used_tokens: int, llm_name: str | None = None) -> int`

Description:
Increments the token usage counter for a tenant's LLM by the specified number of tokens. Handles tenant lookup and updates the database accordingly.

Parameters:

tenant_id: Tenant identifier.
llm_type: Type of the LLM.
used_tokens: Number of tokens to increment.
llm_name: Optional model name override.

Returns:

Number of updated rows (typically 1 if successful, 0 otherwise).
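
The increment-and-report-rows behavior can be illustrated with the standard-library sqlite3 module. This is a sketch, not the service's actual query: the table name `tenant_llm`, its columns, and the function `increase_usage_demo` are assumptions chosen for the example.

```python
import sqlite3


def increase_usage_demo(conn: sqlite3.Connection, tenant_id: str,
                        llm_name: str, used_tokens: int) -> int:
    """Add used_tokens to the matching row and return how many rows changed."""
    cursor = conn.execute(
        "UPDATE tenant_llm SET used_tokens = used_tokens + ? "
        "WHERE tenant_id = ? AND llm_name = ?",
        (used_tokens, tenant_id, llm_name),
    )
    conn.commit()
    return cursor.rowcount


# Hypothetical schema with a single configured model for one tenant.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenant_llm (tenant_id TEXT, llm_name TEXT, used_tokens INTEGER)"
)
conn.execute("INSERT INTO tenant_llm VALUES ('tenant123', 'gpt-4', 0)")

updated = increase_usage_demo(conn, "tenant123", "gpt-4", 50)
missing = increase_usage_demo(conn, "tenant123", "unknown-model", 50)
```

Doing the addition inside the UPDATE statement, rather than reading the counter and writing it back, avoids losing increments when two requests update the same row at once.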

`get_openai_models(cls) -> list[dict]`

Description:
Fetches all tenant LLM models with the factory "OpenAI" excluding certain embedding models.

Returns:

List of dictionaries representing OpenAI models.

`llm_id2llm_type(llm_id: str) -> str | None`

Description:
Maps an LLM identifier to its model type by checking factory info and database records.

Parameters:

llm_id: The LLM identifier string.

Returns:

The LLM model type string or None if not found.

3. `LLM4Tenant`

Purpose:
High-level wrapper that encapsulates an instantiated tenant LLM model, including configuration, usage limits, and telemetry integration with Langfuse.

Constructor: `__init__(self, tenant_id: str, llm_type: str, llm_name: str | None = None, lang: str = "Chinese", **kwargs)`

Description:
Initializes the tenant-specific LLM instance, retrieves the model configuration, sets token limits, and establishes Langfuse telemetry if available.

Parameters:

tenant_id: Tenant identifier.
llm_type: LLM type.
llm_name: Optional model name.
lang: Language preference (default "Chinese").
**kwargs: Additional options (e.g., verbose_tool_use).

Raises:

AssertionError if the model instance cannot be created.

Properties Set:

self.mdl: The instantiated model object.
self.max_length: Max token length allowed.
self.is_tools: Flag indicating if the model uses external tools.
self.verbose_tool_use: Verbosity flag for tool usage.
self.langfuse: Langfuse client instance if telemetry keys are found.
self.trace_context: Dictionary with trace ID for Langfuse tracing.

Important Implementation Details

Factory and Model Name Parsing:
The system supports composite model names in the form model_name@factory_name. The split_model_name_and_factory method carefully handles parsing, including cases with multiple '@' characters.
Fallback Logic for Model Lookup:
When looking up models, the service attempts to match factory suffixes and, if not found, tries various known aliases such as "LocalAI", "HuggingFace", and "OpenAI-API-Compatible" to increase robustness.
Multi-Type LLM Support:
The service handles multiple LLM types, each possibly backed by different model classes or providers, and instantiates them accordingly.
Langfuse Integration:
LLM4Tenant integrates with Langfuse for telemetry if tenant-specific keys are available, enabling traceability of requests.
Database Context Management:
Most service methods are decorated with @DB.connection_context() to ensure proper database session management.
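
The decorator pattern behind the database context management point can be sketched as follows. `DemoDB` and `DemoService` are hypothetical; the sketch only shows how a `connection_context()` decorator can open a session before the method body runs and close it afterwards, and makes no claim about how the platform's DB object is implemented.

```python
import functools
from contextlib import contextmanager


class DemoDB:
    """Hypothetical database handle that records when a session opens and closes."""

    def __init__(self):
        self.is_open = False
        self.events = []

    @contextmanager
    def session(self):
        self.is_open = True
        self.events.append("open")
        try:
            yield self
        finally:
            # Runs even if the wrapped method raises.
            self.is_open = False
            self.events.append("close")

    def connection_context(self):
        """Return a decorator that runs the wrapped function inside a session."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                with self.session():
                    return func(*args, **kwargs)
            return wrapper
        return decorator


DB = DemoDB()


class DemoService:
    @classmethod
    @DB.connection_context()
    def ping(cls) -> bool:
        # Inside the method body the session is open.
        return DB.is_open


was_open_during_call = DemoService.ping()
```

Placing `@classmethod` outermost, as here, means the session wraps the plain function and the class method binding is applied on top of it.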

Interaction with Other System Components

Database Models:
- LLMFactories: Contains metadata about LLM providers.
- TenantLLM: Stores tenant-specific LLM configurations and usage stats.
- DB: Database connection context manager.
Other Services:
- TenantService: For tenant data retrieval.
- LLMService: For querying general LLM metadata.
- TenantLangfuseService: For fetching Langfuse telemetry keys.
LLM Model Classes:
- ChatModel, EmbeddingModel, CvModel, RerankModel, Seq2txtModel, TTSModel from rag.llm provide concrete implementations for different model types and factories.
Settings:
- Uses settings.FACTORY_LLM_INFOS to get known LLM factory providers.
Langfuse SDK:
- Used within LLM4Tenant for telemetry and tracing.

Usage Example

# Instantiate a chat model for a tenant
llm_wrapper = LLM4Tenant(tenant_id="tenant123", llm_type=LLMType.CHAT.value, llm_name="gpt-4@OpenAI")

# Use the model instance
response = llm_wrapper.mdl.chat("Hello, how can I help you?")

# Increase usage tokens after processing
TenantLLMService.increase_usage("tenant123", LLMType.CHAT.value, used_tokens=50, llm_name="gpt-4@OpenAI")

Mermaid Class Diagram

classDiagram
    class LLMFactoriesService {
        +model: LLMFactories
    }

    class TenantLLMService {
        +model: TenantLLM
        +get_api_key(tenant_id, model_name)
        +get_my_llms(tenant_id)
        +split_model_name_and_factory(model_name)
        +get_model_config(tenant_id, llm_type, llm_name=None)
        +model_instance(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
        +increase_usage(tenant_id, llm_type, used_tokens, llm_name=None)
        +get_openai_models()
        +llm_id2llm_type(llm_id)
    }

    class LLM4Tenant {
        -tenant_id: str
        -llm_type: str
        -llm_name: str
        -mdl: object
        -max_length: int
        -is_tools: bool
        -verbose_tool_use: bool
        -langfuse: Langfuse | None
        -trace_context: dict
        +__init__(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
    }

    LLMFactoriesService --|> CommonService
    TenantLLMService --|> CommonService
    LLM4Tenant --> TenantLLMService : uses
    TenantLLMService --> DB : manages connection
    TenantLLMService --> TenantService : fetch tenant info
    TenantLLMService --> LLMService : queries model metadata
    LLM4Tenant --> Langfuse : telemetry integration

Summary

The tenant_llm_service.py file lets each tenant on the InfiniFlow platform use its own LLMs across multiple model types and providers. It covers configuration retrieval, model instantiation, usage tracking, and Langfuse telemetry, and it sits between tenant data, LLM providers, and application logic. Composite model@factory names with alias fallbacks keep lookups flexible, and a LookupError is raised when a tenant or model cannot be resolved.
