tenant_llm_service.py


Overview

The tenant_llm_service.py file provides an abstraction layer for managing and interacting with tenant-specific Large Language Models (LLMs) within the InfiniFlow platform. It primarily defines services that handle the retrieval, configuration, instantiation, and usage tracking of various LLMs associated with tenants. This includes models of different types such as Chat, Embedding, Speech2Text, Image2Text, Rerank, and Text-to-Speech (TTS).

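The file encapsulates logic for looking up a tenant's API keys and model configurations, instantiating model objects by LLM type, recording token usage, and wrapping a model instance with Langfuse telemetry.

These services let each tenant use its own models and credentials, and they keep token usage attributable to the tenant that incurred it.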


Classes and Functions

1. LLMFactoriesService(CommonService)

Purpose:
Service class for managing LLM factories metadata.

Attributes:
model: LLMFactories, the database model this service operates on.

Usage:
Primarily used to fetch and manage available LLM factory information.


2. TenantLLMService(CommonService)

Purpose:
Core service handling tenant-specific LLMs, including querying API keys, getting model configs, instantiating model objects, and updating usage metrics.

Attributes:
model: TenantLLM, the database model this service operates on.


Methods:

get_api_key(cls, tenant_id: str, model_name: str) -> TenantLLM | None

Description:
Fetches the LLM configuration entry for a given tenant and model name. Handles cases where the model name includes a factory suffix (e.g., model@factory), and attempts fallback queries for known factory aliases.

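Parameters:
tenant_id (str): Identifier of the tenant.
model_name (str): Model name, optionally carrying a factory suffix (model@factory).

Returns:
The matching TenantLLM record, or None when no entry is found.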

Example:

api_key_obj = TenantLLMService.get_api_key("tenant123", "gpt-4@OpenAI")
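
The suffix handling and alias fallback described above might look roughly like the sketch below. It searches an in-memory list instead of the database, and RECORDS, FACTORY_ALIASES, find_entry, and the record fields are hypothetical names chosen for illustration; none of them come from the source file.

```python
from typing import Optional

# Hypothetical stand-in for the tenant LLM table.
RECORDS = [
    {"tenant_id": "tenant123", "llm_name": "gpt-4",
     "llm_factory": "OpenAI", "api_key": "sk-demo"},
]

# Hypothetical alias table: lowercased factory name -> names it may be stored under.
FACTORY_ALIASES = {"openai": ["OpenAI"]}


def find_entry(tenant_id: str, model_name: str) -> Optional[dict]:
    """Return the first record matching the tenant and model, or None."""
    # Split "model@factory" on the last "@"; a bare name has no factory.
    base, sep, factory = model_name.rpartition("@")
    if not sep:
        base, factory = model_name, None

    # Try the factory as given first, then any known aliases for it.
    candidates = [factory]
    if factory is not None:
        candidates += FACTORY_ALIASES.get(factory.lower(), [])

    for candidate in candidates:
        for rec in RECORDS:
            if rec["tenant_id"] != tenant_id or rec["llm_name"] != base:
                continue
            # A missing factory matches any record for that model name.
            if candidate is None or rec["llm_factory"] == candidate:
                return rec
    return None
```

Trying the name as given before the aliases keeps the common case to a single pass, and the alias table is consulted only when the first attempt finds nothing.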

get_my_llms(cls, tenant_id: str) -> list[dict]

Description:
Returns a list of LLMs configured for the tenant that have valid API keys, including factory metadata like logos and tags.

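Parameters:
tenant_id (str): Identifier of the tenant.

Returns:
A list of dicts, one per configured model, each including factory metadata such as logo and tags.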
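
As a rough illustration of the filtering and metadata join described for get_my_llms, the sketch below works on plain dicts. The function name list_configured_llms and the field names (api_key, llm_factory, logo, tags) are assumptions made for the example, not the service's actual schema.

```python
def list_configured_llms(tenant_llms: list, factories: dict) -> list:
    """Keep entries with a non-empty API key and attach factory logo and tags."""
    result = []
    for entry in tenant_llms:
        if not entry.get("api_key"):
            continue  # skip models that have no usable key
        # Unknown factories yield empty metadata rather than an error.
        meta = factories.get(entry["llm_factory"], {})
        result.append({**entry, "logo": meta.get("logo"), "tags": meta.get("tags")})
    return result
```

The real method presumably performs this as a database join; the sketch only shows the shape of the result.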


split_model_name_and_factory(model_name: str) -> tuple[str, str | None]

Description:
Utility method to split a composite model name into the base model name and factory suffix.

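Parameters:
model_name (str): Model name, optionally in the composite form model@factory.

Returns:
A tuple of the base model name and the factory suffix; the suffix is None when the name carries no factory.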

Example:

model, factory = TenantLLMService.split_model_name_and_factory("gpt-4@OpenAI")
# model = "gpt-4", factory = "OpenAI"
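
One plausible way to implement such a split is to cut on the last "@", so that a model name which itself contains "@" keeps its prefix intact. This is a sketch under that assumption, with the hypothetical name split_name; the actual method may apply additional checks.

```python
from typing import Optional, Tuple


def split_name(model_name: str) -> Tuple[str, Optional[str]]:
    """Split "model@factory" on the last "@"; return (name, None) otherwise."""
    base, sep, factory = model_name.rpartition("@")
    # No "@", or an empty side, means there is no usable factory suffix.
    if not sep or not base or not factory:
        return model_name, None
    return base, factory
```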

get_model_config(cls, tenant_id: str, llm_type: str, llm_name: str | None = None) -> dict

Description:
Retrieves the LLM configuration dictionary for a given tenant and LLM type, optionally specifying a model name. Supports all major LLM types and includes fallback logic with error handling.

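Parameters:
tenant_id (str): Identifier of the tenant.
llm_type (str): Type of the model, such as Chat, Embedding, Speech2Text, Image2Text, Rerank, or TTS.
llm_name (str | None): Optional model name; defaults to None.

Returns:
A dict holding the model configuration.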

Raises:


model_instance(cls, tenant_id: str, llm_type: str, llm_name: str | None = None, lang: str = "Chinese", **kwargs) -> object | None

Description:
Creates and returns an instantiated model object for the tenant's LLM, depending on the type. Supports models from various factories and passes relevant API keys and configurations.

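Parameters:
tenant_id (str): Identifier of the tenant.
llm_type (str): Type of the model to instantiate.
llm_name (str | None): Optional model name; defaults to None.
lang (str): Language setting; defaults to "Chinese".
**kwargs: Additional keyword arguments.

Returns:
The instantiated model object, or None when no model object is created.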

Example:

chat_model = TenantLLMService.model_instance("tenant123", LLMType.CHAT.value, "gpt-4@OpenAI")
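
Dispatching on model type and factory, as model_instance is described to do, is commonly done with a registry lookup. The sketch below shows that pattern only; EchoChat, REGISTRY, build_model, and the config keys are hypothetical and stand in for whatever classes and fields the real service uses.

```python
from typing import Any, Optional


class EchoChat:
    """Hypothetical chat model used only for this illustration."""

    def __init__(self, api_key: str, model_name: str, **kwargs: Any) -> None:
        self.api_key = api_key
        self.model_name = model_name


# Hypothetical registry: llm_type -> factory name -> model class.
REGISTRY = {
    "chat": {"OpenAI": EchoChat},
}


def build_model(config: dict, llm_type: str, **kwargs: Any) -> Optional[Any]:
    """Instantiate the class registered for (llm_type, factory), or return None."""
    cls = REGISTRY.get(llm_type, {}).get(config["llm_factory"])
    if cls is None:
        return None  # no implementation registered for this combination
    return cls(config["api_key"], config["llm_name"], **kwargs)
```

Returning None for an unregistered combination mirrors the object | None return type in the signature above.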

increase_usage(cls, tenant_id: str, llm_type: str, used_tokens: int, llm_name: str | None = None) -> int

Description:
Increments the token usage counter for a tenant's LLM by the specified number of tokens. Handles tenant lookup and updates the database accordingly.

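Parameters:
tenant_id (str): Identifier of the tenant.
llm_type (str): Type of the model whose usage is updated.
used_tokens (int): Number of tokens to add to the usage counter.
llm_name (str | None): Optional model name; defaults to None.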

Returns:
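
A counter like this is usually incremented inside the database so that concurrent updates do not overwrite each other. The sketch below shows that pattern with Python's built-in sqlite3 module; the table layout, the helper names make_db and add_tokens, and the choice to return the number of updated rows are all assumptions for illustration, not details confirmed by the source.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with one hypothetical tenant LLM row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tenant_llm ("
        "tenant_id TEXT, llm_name TEXT, used_tokens INTEGER DEFAULT 0)"
    )
    conn.execute("INSERT INTO tenant_llm VALUES ('tenant123', 'gpt-4', 0)")
    conn.commit()
    return conn


def add_tokens(conn: sqlite3.Connection, tenant_id: str,
               llm_name: str, used_tokens: int) -> int:
    """Add used_tokens to the counter in SQL; return how many rows changed."""
    cur = conn.execute(
        "UPDATE tenant_llm SET used_tokens = used_tokens + ? "
        "WHERE tenant_id = ? AND llm_name = ?",
        (used_tokens, tenant_id, llm_name),
    )
    conn.commit()
    return cur.rowcount
```

Doing the addition in the UPDATE statement, rather than reading the value, adding in Python, and writing it back, avoids lost updates when two requests finish at the same time.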


get_openai_models(cls) -> list[dict]

Description:
Fetches all tenant LLM models with the factory "OpenAI" excluding certain embedding models.

Returns:
A list of dicts, one per matching tenant LLM record.


llm_id2llm_type(llm_id: str) -> str | None

Description:
Maps an LLM identifier to its model type by checking factory info and database records.

Parameters:
llm_id (str): LLM identifier.

Returns:
The model type as a str, or None when no match is found.
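
The factory-info half of this lookup could be sketched as a scan over a list of factory descriptions. FACTORY_INFOS, its layout, and lookup_type are hypothetical names for the example; the database fallback mentioned in the description is left out.

```python
from typing import Optional

# Hypothetical factory metadata; the real structure is not given in the source.
FACTORY_INFOS = [
    {
        "name": "OpenAI",
        "llm": [
            {"llm_name": "gpt-4", "model_type": "chat"},
            {"llm_name": "text-embedding-3-small", "model_type": "embedding"},
        ],
    },
]


def lookup_type(llm_id: str) -> Optional[str]:
    """Return the model type for llm_id, ignoring any "@factory" suffix."""
    base = llm_id.rsplit("@", 1)[0]
    for factory in FACTORY_INFOS:
        for llm in factory["llm"]:
            if llm["llm_name"] == base:
                return llm["model_type"]
    return None
```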


3. LLM4Tenant

Purpose:
High-level wrapper that encapsulates an instantiated tenant LLM model, including configuration, usage limits, and telemetry integration with Langfuse.


Constructor: __init__(self, tenant_id: str, llm_type: str, llm_name: str | None = None, lang: str = "Chinese", **kwargs)

Description:
Initializes the tenant-specific LLM instance, retrieves the model configuration, sets token limits, and establishes Langfuse telemetry if available.

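Parameters:
tenant_id (str): Identifier of the tenant.
llm_type (str): Type of the model to wrap.
llm_name (str | None): Optional model name; defaults to None.
lang (str): Language setting; defaults to "Chinese".
**kwargs: Additional keyword arguments.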

Raises:

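Properties Set:
tenant_id (str)
llm_type (str)
llm_name (str)
mdl (object): The instantiated model.
max_length (int): The token limit.
is_tools (bool)
verbose_tool_use (bool)
langfuse (Langfuse | None): The telemetry client, when available.
trace_context (dict)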
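
To make the constructor's responsibilities concrete, the sketch below shows a wrapper that stores the model, derives a token limit from the configuration, and keeps telemetry optional. TenantModelWrapper, the max_tokens config key, and the 8192 default are assumptions for illustration, not values confirmed by the source.

```python
from typing import Any, Optional


class TenantModelWrapper:
    """Hypothetical wrapper mirroring a few attributes from the class diagram."""

    def __init__(self, config: dict, model: Any,
                 telemetry: Optional[Any] = None) -> None:
        self.llm_name = config["llm_name"]
        self.mdl = model
        # Assumed default limit, used when the config carries no max_tokens.
        self.max_length = config.get("max_tokens", 8192)
        # Telemetry stays None when no client is configured.
        self.langfuse = telemetry
```

Keeping the telemetry client as an optional attribute lets calling code check for None before emitting traces, so the wrapper works the same with or without Langfuse configured.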


Important Implementation Details


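Interaction with Other System Components

CommonService: Base class of LLMFactoriesService and TenantLLMService.
DB: TenantLLMService manages the database connection.
TenantService: Used by TenantLLMService to fetch tenant information.
LLMService: Queried by TenantLLMService for model metadata.
Langfuse: Telemetry integration used by LLM4Tenant.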


Usage Example

# Instantiate a chat model for a tenant
llm_wrapper = LLM4Tenant(tenant_id="tenant123", llm_type=LLMType.CHAT.value, llm_name="gpt-4@OpenAI")

# Use the model instance (the exact chat() arguments depend on the wrapped model class)
response = llm_wrapper.mdl.chat("Hello, how can I help you?")

# Increase usage tokens after processing (50 is a placeholder for the tokens actually used)
TenantLLMService.increase_usage("tenant123", LLMType.CHAT.value, used_tokens=50, llm_name="gpt-4@OpenAI")

Mermaid Class Diagram

classDiagram
    class LLMFactoriesService {
        +model: LLMFactories
    }

    class TenantLLMService {
        +model: TenantLLM
        +get_api_key(tenant_id, model_name)
        +get_my_llms(tenant_id)
        +split_model_name_and_factory(model_name)
        +get_model_config(tenant_id, llm_type, llm_name=None)
        +model_instance(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
        +increase_usage(tenant_id, llm_type, used_tokens, llm_name=None)
        +get_openai_models()
        +llm_id2llm_type(llm_id)
    }

    class LLM4Tenant {
        -tenant_id: str
        -llm_type: str
        -llm_name: str
        -mdl: object
        -max_length: int
        -is_tools: bool
        -verbose_tool_use: bool
        -langfuse: Langfuse | None
        -trace_context: dict
        +__init__(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
    }

    LLMFactoriesService --|> CommonService
    TenantLLMService --|> CommonService
    LLM4Tenant --> TenantLLMService : uses
    TenantLLMService --> DB : manages connection
    TenantLLMService --> TenantService : fetch tenant info
    TenantLLMService --> LLMService : queries model metadata
    LLM4Tenant --> Langfuse : telemetry integration

Summary

The tenant_llm_service.py file manages tenant-specific LLMs across multiple model types and providers in the InfiniFlow platform. It covers configuration retrieval, model instantiation, usage tracking, and telemetry integration, connecting tenant data, LLM providers, and application logic. Composite model names of the form model@factory identify which provider a tenant's model belongs to, and lookups fall back to known factory aliases when the name as given has no match.