tenant_llm_service.py
Overview
The tenant_llm_service.py file provides an abstraction layer for managing and interacting with tenant-specific Large Language Models (LLMs) within the InfiniFlow platform. It primarily defines services that handle the retrieval, configuration, instantiation, and usage tracking of various LLMs associated with tenants. This includes models of different types such as Chat, Embedding, Speech2Text, Image2Text, Rerank, and Text-to-Speech (TTS).
The file encapsulates logic for:
Fetching tenant-specific API keys and model configurations.
Creating instances of models based on tenant configurations.
Tracking token usage for billing or monitoring.
Resolving models and factories from composite names.
Integrating Langfuse for telemetry and traceability.
Together, these pieces let each tenant run against its own models and credentials, and let token usage be attributed to the tenant that incurred it.
Classes and Functions
1. LLMFactoriesService(CommonService)
Purpose:
Service class for managing LLM factories metadata.
Attributes:
model = LLMFactories: Links this service to the LLMFactories database model.
Usage:
Primarily used to fetch and manage available LLM factory information.
2. TenantLLMService(CommonService)
Purpose:
Core service handling tenant-specific LLMs, including querying API keys, getting model configs, instantiating model objects, and updating usage metrics.
Attributes:
model = TenantLLM: Connects the service to the TenantLLM database model.
Methods:
get_api_key(cls, tenant_id: str, model_name: str) -> TenantLLM | None
Description:
Fetches the LLM configuration entry for a given tenant and model name. Handles cases where the model name includes a factory suffix (e.g., model@factory), and attempts fallback queries for known factory aliases.
Parameters:
tenant_id: Tenant identifier.
model_name: Name of the LLM model, optionally with factory suffix.
Returns:
The first matching TenantLLM object, or None if not found.
Example:
api_key_obj = TenantLLMService.get_api_key("tenant123", "gpt-4@OpenAI")
get_my_llms(cls, tenant_id: str) -> list[dict]
Description:
Returns a list of LLMs configured for the tenant that have valid API keys, including factory metadata like logos and tags.
Parameters:
tenant_id: Tenant identifier.
Returns:
List of dictionaries containing tenant LLM details.
split_model_name_and_factory(model_name: str) -> tuple[str, str | None]
Description:
Utility method to split a composite model name into the base model name and factory suffix.
Parameters:
model_name: The full model string (e.g., "gpt-4@OpenAI").
Returns:
Tuple (model_name, factory_name), or (model_name, None) if no factory suffix is present.
Example:
model, factory = TenantLLMService.split_model_name_and_factory("gpt-4@OpenAI")
# model = "gpt-4", factory = "OpenAI"
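The splitting behavior can be illustrated with a small standalone function. This is a sketch only: `split_composite_name` is a hypothetical helper, and splitting at the last '@' is an assumption about how names containing several '@' characters are treated, since this document does not spell out that rule.

```python
from typing import Optional, Tuple


def split_composite_name(model_name: str) -> Tuple[str, Optional[str]]:
    """Split "model@factory" into (model, factory).

    Assumption for this sketch: the factory is whatever follows the LAST "@",
    so any earlier "@" characters remain part of the model name.
    """
    name, sep, factory = model_name.rpartition("@")
    if not sep:
        # No "@" at all: the whole string is the model name.
        return model_name, None
    return name, factory
```

With this rule, "gpt-4@OpenAI" yields ("gpt-4", "OpenAI"), a bare "gpt-4" yields ("gpt-4", None), and "org@model@LocalAI" keeps "org@model" as the model name.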
get_model_config(cls, tenant_id: str, llm_type: str, llm_name: str | None = None) -> dict
Description:
Retrieves the LLM configuration dictionary for a given tenant and LLM type, optionally specifying a model name. Supports all major LLM types and includes fallback logic with error handling.
Parameters:
tenant_id: Tenant identifier.
llm_type: Type of LLM (embedding, chat, speech2text, etc.).
llm_name: Optional specific model name to fetch.
Returns:
Dictionary containing model configuration fields such as API key, factory name, model name, API base URL, and tool usage flag.
Raises:
LookupError if the tenant or model is not found or is not authorized.
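The lookup-or-raise contract can be sketched without a database. Everything in this example is illustrative: the dictionaries stand in for database rows, the key names (`api_key`, `llm_factory`, `llm_name`, `api_base`, `is_tools`) are assumed spellings of the fields listed above, and falling back to a per-tenant default when `llm_name` is omitted is an assumption rather than documented behavior.

```python
from typing import Optional

# Hypothetical stand-ins for database rows; every key name here is an assumption.
TENANT_DEFAULTS = {("tenant123", "chat"): "gpt-4@OpenAI"}
TENANT_MODELS = {
    ("tenant123", "chat", "gpt-4@OpenAI"): {
        "api_key": "sk-demo",
        "llm_factory": "OpenAI",
        "llm_name": "gpt-4",
        "api_base": "",
        "is_tools": True,
    },
}


def resolve_model_config(tenant_id: str, llm_type: str, llm_name: Optional[str] = None) -> dict:
    """Return the config dict for a tenant's model, or raise LookupError."""
    if llm_name is None:
        # Assumed behavior: use the tenant's default model for this type.
        llm_name = TENANT_DEFAULTS.get((tenant_id, llm_type))
        if llm_name is None:
            raise LookupError(f"No default {llm_type} model for tenant {tenant_id}")
    config = TENANT_MODELS.get((tenant_id, llm_type, llm_name))
    if config is None:
        raise LookupError(f"Model {llm_name!r} is not configured for tenant {tenant_id}")
    return config


def try_resolve(tenant_id: str, llm_type: str, llm_name: Optional[str] = None) -> Optional[dict]:
    """Caller-side pattern: turn the LookupError into a None result."""
    try:
        return resolve_model_config(tenant_id, llm_type, llm_name)
    except LookupError:
        return None
```

Raising instead of returning None keeps a missing configuration from silently propagating; callers that prefer a soft failure can wrap the call as `try_resolve` does.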
model_instance(cls, tenant_id: str, llm_type: str, llm_name: str | None = None, lang: str = "Chinese", **kwargs) -> object | None
Description:
Creates and returns an instantiated model object for the tenant's LLM, depending on the type. Supports models from various factories and passes relevant API keys and configurations.
Parameters:
tenant_id: Tenant identifier.
llm_type: The LLM type (embedding, chat, etc.).
llm_name: Optional model name.
lang: Language preference, default "Chinese".
**kwargs: Additional parameters passed to the model constructor.
Returns:
An instance of the requested LLM class, or None if unsupported.
Example:
chat_model = TenantLLMService.model_instance("tenant123", LLMType.CHAT.value, "gpt-4@OpenAI")
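Type-based instantiation of this kind is commonly built on a registry keyed by model type and factory. The sketch below shows that pattern in isolation; `MODEL_REGISTRY`, `build_model`, the two demo classes, and the config key names are all hypothetical and do not reproduce the classes in rag.llm.

```python
from typing import Any, Optional


class DemoChatModel:
    """Hypothetical chat model class used only for this sketch."""

    def __init__(self, api_key: str, model_name: str, base_url: Optional[str] = None, **kwargs: Any):
        self.api_key = api_key
        self.model_name = model_name
        self.base_url = base_url
        self.options = kwargs


class DemoEmbeddingModel(DemoChatModel):
    """Hypothetical embedding model class with the same constructor."""


# Registry: model type -> factory name -> class to instantiate.
MODEL_REGISTRY = {
    "chat": {"OpenAI": DemoChatModel},
    "embedding": {"OpenAI": DemoEmbeddingModel},
}


def build_model(config: dict, llm_type: str, **kwargs: Any) -> Optional[object]:
    """Instantiate the class registered for (llm_type, factory), or return None."""
    model_cls = MODEL_REGISTRY.get(llm_type, {}).get(config["llm_factory"])
    if model_cls is None:
        # Unsupported type or factory: signal this with None rather than raising.
        return None
    return model_cls(
        config["api_key"],
        config["llm_name"],
        base_url=config.get("api_base"),
        **kwargs,
    )
```

Returning None for an unknown combination mirrors the documented return value; the caller is then responsible for checking the result before use.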
increase_usage(cls, tenant_id: str, llm_type: str, used_tokens: int, llm_name: str | None = None) -> int
Description:
Increments the token usage counter for a tenant's LLM by the specified number of tokens. Handles tenant lookup and updates the database accordingly.
Parameters:
tenant_id: Tenant identifier.
llm_type: Type of the LLM.
used_tokens: Number of tokens to increment.
llm_name: Optional model name override.
Returns:
Number of updated rows (typically 1 if successful, 0 otherwise).
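The "number of updated rows" return value follows naturally from an in-place SQL update. The following sketch uses the standard-library sqlite3 module rather than the service's own ORM; the `tenant_llm` table, its columns, and the `add_tokens` helper are assumptions made for the example.

```python
import sqlite3

# In-memory database with a hypothetical, simplified usage table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenant_llm ("
    "tenant_id TEXT, llm_name TEXT, used_tokens INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO tenant_llm VALUES ('tenant123', 'gpt-4', 0)")
conn.commit()


def add_tokens(db: sqlite3.Connection, tenant_id: str, llm_name: str, used_tokens: int) -> int:
    """Atomically add used_tokens to the counter and return the rows updated."""
    cursor = db.execute(
        "UPDATE tenant_llm SET used_tokens = used_tokens + ? "
        "WHERE tenant_id = ? AND llm_name = ?",
        (used_tokens, tenant_id, llm_name),
    )
    db.commit()
    # rowcount is 1 when the row exists and 0 when nothing matched.
    return cursor.rowcount
```

Doing the addition inside the UPDATE statement, instead of reading the counter and writing it back, avoids lost updates when two requests report usage at the same time.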
get_openai_models(cls) -> list[dict]
Description:
Fetches all tenant LLM models with the factory "OpenAI" excluding certain embedding models.
Returns:
List of dictionaries representing OpenAI models.
llm_id2llm_type(llm_id: str) -> str | None
Description:
Maps an LLM identifier to its model type by checking factory info and database records.
Parameters:
llm_id: The LLM identifier string.
Returns:
The LLM model type string, or None if not found.
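A two-stage lookup of this kind can be sketched as a static catalog check followed by a fallback to stored records. The catalog layout, the `DB_MODEL_TYPES` mapping, and the `lookup_model_type` helper are hypothetical; the real structure of the factory info and the exact order of the checks are not described in this document.

```python
from typing import Optional

# Hypothetical static catalog, loosely shaped like a list of factory descriptions.
FACTORY_CATALOG = [
    {"name": "OpenAI", "llm": [{"llm_name": "gpt-4", "model_type": "chat"}]},
]
# Hypothetical stand-in for model types stored in the database.
DB_MODEL_TYPES = {"custom-embed": "embedding"}


def lookup_model_type(llm_id: str) -> Optional[str]:
    """Resolve a model type: catalog first, then stored records, else None."""
    # Drop an optional "@factory" suffix before matching on the model name.
    name, sep, _factory = llm_id.rpartition("@")
    if not sep:
        name = llm_id
    for factory in FACTORY_CATALOG:
        for llm in factory["llm"]:
            if llm["llm_name"] == name:
                return llm["model_type"]
    return DB_MODEL_TYPES.get(name)
```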
3. LLM4Tenant
Purpose:
High-level wrapper that encapsulates an instantiated tenant LLM model, including configuration, usage limits, and telemetry integration with Langfuse.
Constructor: __init__(self, tenant_id: str, llm_type: str, llm_name: str | None = None, lang: str = "Chinese", **kwargs)
Description:
Initializes the tenant-specific LLM instance, retrieves the model configuration, sets token limits, and establishes Langfuse telemetry if available.
Parameters:
tenant_id: Tenant identifier.
llm_type: LLM type.
llm_name: Optional model name.
lang: Language code (default "Chinese").
**kwargs: Additional options (e.g., verbose_tool_use).
Raises:
AssertionError if the model instance cannot be created.
Properties Set:
self.mdl: The instantiated model object.
self.max_length: Max token length allowed.
self.is_tools: Flag indicating if the model uses external tools.
self.verbose_tool_use: Verbosity flag for tool usage.
self.langfuse: Langfuse client instance if telemetry keys are found.
self.trace_context: Dictionary with trace ID for Langfuse tracing.
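The constructor's responsibilities (fail fast when no model exists, record limits, and attach telemetry only when it is configured) can be sketched with a plain class. `TenantModelWrapper` is a hypothetical stand-in, not the real LLM4Tenant: the default of 8192 for `max_length`, the `telemetry_client` parameter, and the `trace_id` key are assumptions, and no Langfuse SDK call is made.

```python
import uuid
from typing import Any, Optional


class TenantModelWrapper:
    """Hypothetical wrapper illustrating the attributes listed above."""

    def __init__(
        self,
        model: Optional[object],
        max_length: int = 8192,          # assumed default, for illustration only
        is_tools: bool = False,
        telemetry_client: Optional[object] = None,
        **kwargs: Any,
    ):
        # Fail fast: a wrapper without a model is unusable.
        assert model is not None, "Can't find model for the given tenant and type"
        self.mdl = model
        self.max_length = max_length
        self.is_tools = is_tools
        self.verbose_tool_use = kwargs.get("verbose_tool_use")
        # Telemetry is optional: only build a trace context when a client exists.
        self.langfuse = telemetry_client
        self.trace_context = {"trace_id": uuid.uuid4().hex} if telemetry_client else {}
```

Keeping telemetry optional means tenants without telemetry keys pay no extra cost, and downstream code can test `self.langfuse` before emitting traces.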
Important Implementation Details
Factory and Model Name Parsing:
The system supports composite model names in the form model_name@factory_name. The split_model_name_and_factory method carefully handles parsing, including cases with multiple '@' characters.
Fallback Logic for Model Lookup:
When looking up models, the service attempts to match factory suffixes and, if not found, tries various known aliases such as "LocalAI", "HuggingFace", and "OpenAI-API-Compatible" to increase robustness.
Multi-Type LLM Support:
The service handles multiple LLM types, each possibly backed by different model classes or providers, and instantiates them accordingly.
Langfuse Integration:
LLM4Tenant integrates with Langfuse for telemetry if tenant-specific keys are available, enabling traceability of requests.
Database Context Management:
Most service methods are decorated with @DB.connection_context() to ensure proper database session management.
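A connection-context decorator opens a connection before the wrapped method runs and closes it afterwards, even when the method raises. The sketch below reproduces that pattern with the standard-library `contextlib.ContextDecorator`; `DemoDB` and this `connection_context` class are hypothetical and only imitate the shape of the real `DB.connection_context()`.

```python
from contextlib import ContextDecorator


class DemoDB:
    """Hypothetical database handle that records open/close events."""

    def __init__(self) -> None:
        self.is_open = False
        self.log = []

    def open(self) -> None:
        self.is_open = True
        self.log.append("open")

    def close(self) -> None:
        self.is_open = False
        self.log.append("close")


class connection_context(ContextDecorator):
    """Usable both as `with connection_context(db):` and as a decorator."""

    def __init__(self, db: DemoDB) -> None:
        self.db = db

    def __enter__(self) -> "connection_context":
        self.db.open()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.db.close()
        # Returning False lets any exception from the wrapped call propagate.
        return False


DB = DemoDB()


@connection_context(DB)
def count_rows() -> int:
    # The connection is guaranteed to be open while the body runs.
    assert DB.is_open
    return 3
```

Putting the open/close logic in a decorator keeps each service method free of connection boilerplate and guarantees cleanup on every exit path.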
Interaction with Other System Components
Database Models:
LLMFactories: Contains metadata about LLM providers.
TenantLLM: Stores tenant-specific LLM configurations and usage stats.
DB: Database connection context manager.
Other Services:
TenantService: For tenant data retrieval.
LLMService: For querying general LLM metadata.
TenantLangfuseService: For fetching Langfuse telemetry keys.
LLM Model Classes:
ChatModel, EmbeddingModel, CvModel, RerankModel, Seq2txtModel, and TTSModel from rag.llm provide concrete implementations for different model types and factories.
Settings:
Uses settings.FACTORY_LLM_INFOS to get known LLM factory providers.
Langfuse SDK:
Used within LLM4Tenant for telemetry and tracing.
Usage Example
# Instantiate a chat model for a tenant
llm_wrapper = LLM4Tenant(tenant_id="tenant123", llm_type=LLMType.CHAT.value, llm_name="gpt-4@OpenAI")
# Use the model instance (the arguments chat() expects depend on the model class in rag.llm)
response = llm_wrapper.mdl.chat("Hello, how can I help you?")
# Increase usage tokens after processing
TenantLLMService.increase_usage("tenant123", LLMType.CHAT.value, used_tokens=50, llm_name="gpt-4@OpenAI")
Mermaid Class Diagram
classDiagram
class LLMFactoriesService {
+model: LLMFactories
}
class TenantLLMService {
+model: TenantLLM
+get_api_key(tenant_id, model_name)
+get_my_llms(tenant_id)
+split_model_name_and_factory(model_name)
+get_model_config(tenant_id, llm_type, llm_name=None)
+model_instance(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
+increase_usage(tenant_id, llm_type, used_tokens, llm_name=None)
+get_openai_models()
+llm_id2llm_type(llm_id)
}
class LLM4Tenant {
-tenant_id: str
-llm_type: str
-llm_name: str
-mdl: object
-max_length: int
-is_tools: bool
-verbose_tool_use: bool
-langfuse: Langfuse | None
-trace_context: dict
+__init__(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
}
LLMFactoriesService --|> CommonService
TenantLLMService --|> CommonService
LLM4Tenant --> TenantLLMService : uses
TenantLLMService --> DB : manages connection
TenantLLMService --> TenantService : fetch tenant info
TenantLLMService --> LLMService : queries model metadata
LLM4Tenant --> Langfuse : telemetry integration
Summary
The tenant_llm_service.py file enables tenant-specific management of LLMs across multiple model types and providers on the InfiniFlow platform. It abstracts configuration retrieval, model instantiation, usage tracking, and telemetry integration, and sits between tenant data, LLM providers, and application logic. Composite model names, fallback lookups, and explicit errors for missing or unauthorized models keep the service extensible while preserving per-tenant isolation.