llm_app.py
Overview
The llm_app.py file is a Flask-based REST API module responsible for managing Large Language Model (LLM) factories and tenant-specific LLM configurations within the InfiniFlow platform. It provides endpoints to retrieve available LLM factories, add or delete tenant-specific LLM configurations, set API keys for LLM access, and list the LLMs associated with a tenant.
The core functionality revolves around tenant-specific LLM management, including validating API keys by testing connectivity to various LLM models (embedding, chat, rerank, image-to-text, text-to-speech), handling special authentication schemes for different LLM providers, and querying/updating LLM-related data persisted in the database via service layers.
This module interacts with services that abstract database models for tenants’ LLMs (TenantLLMService), general LLM metadata (LLMService), and LLM factories (LLMFactoriesService). It also integrates with the RAG (Retrieval-Augmented Generation) library to instantiate and test different LLM model types.
Detailed Explanations of Endpoints and Functions
1. factories()
Route: /factories
Method: GET
Authentication: Required (@login_required)
Purpose
Retrieve and return a list of all available LLM factories excluding specific ones (Youdao, FastEmbed, BAAI), along with the types of models supported by each factory.
Functionality
Calls LLMFactoriesService.get_all() to get all LLM factories.
Filters out unwanted factories by name.
Retrieves all LLM models via LLMService.get_all().
Maps each factory to the set of model types it supports, only including models with status VALID.
If no specific model types are found for a factory, it defaults to a broad set of known LLM types.
Returns JSON with factories and their supported model types.
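The mapping and fallback steps above can be sketched with plain dictionaries. This is an illustration only, not the module's code: the record fields (fid, model_type, status), the status encoding "1", and the contents of DEFAULT_MODEL_TYPES are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical stand-ins; the real values come from the service layer.
EXCLUDED = {"Youdao", "FastEmbed", "BAAI"}
VALID = "1"  # assumed encoding of the VALID status
DEFAULT_MODEL_TYPES = ["chat", "embedding", "rerank", "image2text", "speech2text", "tts"]


def factories_with_model_types(factories, llms):
    """Attach the supported model types to each non-excluded factory."""
    types_by_factory = defaultdict(set)
    for llm in llms:
        if llm["status"] == VALID:  # skip models that are not marked valid
            types_by_factory[llm["fid"]].add(llm["model_type"])

    result = []
    for factory in factories:
        if factory["name"] in EXCLUDED:
            continue
        found = types_by_factory.get(factory["name"])
        # Fall back to the broad default list when nothing was found.
        model_types = sorted(found) if found else list(DEFAULT_MODEL_TYPES)
        result.append({**factory, "model_types": model_types})
    return result


example = factories_with_model_types(
    factories=[{"name": "OpenAI"}, {"name": "Youdao"}, {"name": "Ollama"}],
    llms=[
        {"fid": "OpenAI", "model_type": "chat", "status": "1"},
        {"fid": "OpenAI", "model_type": "embedding", "status": "1"},
        {"fid": "OpenAI", "model_type": "tts", "status": "0"},
    ],
)
```

In this sample, Youdao is dropped, OpenAI keeps only its two valid model types, and Ollama receives the default list because no models are registered for it.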
Returns
JSON object with a list of factories, each containing details and supported model types.
Example Usage
GET /factories
Authorization: Bearer <token>
2. set_api_key()
Route: /set_api_key
Method: POST
Authentication: Required
Request Validation: Requires JSON fields llm_factory and api_key
Purpose
Set or update the API key for a tenant's LLM factory and validate the key by testing it against supported model types (embedding, chat, rerank).
Parameters
JSON body with fields:
llm_factory (str): The provider/factory name.
api_key (str): API key to authenticate with the factory.
Optional: base_url, model_type, llm_name.
Functionality
Iterates over all LLM models for the specified factory.
For each model type (embedding, chat, rerank), it attempts to instantiate the corresponding RAG model class with the API key and make a test call.
Captures any errors during test calls and aggregates error messages.
If any test passes, clears error messages and proceeds.
If all fail, returns an error result with detailed messages.
Updates or inserts tenant-specific LLM configuration with API key and other details.
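The "any test passes" rule and the error aggregation described above might look like the sketch below. The shapes of models and probes are assumptions for illustration, and returning on the first passing probe is a simplification of "clear the messages and proceed"; the real module instantiates RAG model classes rather than calling plain functions.

```python
def validate_api_key(models, probes):
    """Return "" if any probe passes, else the aggregated error text.

    `models` is a list of dicts with "llm_name" and "model_type";
    `probes` maps a model type to a callable that raises on failure.
    Both shapes are assumptions made for this sketch.
    """
    errors = []
    for model in models:
        probe = probes.get(model["model_type"])
        if probe is None:
            continue  # model type not covered by the key check
        try:
            probe(model["llm_name"])
        except Exception as exc:
            errors.append(
                f"Fail to access {model['model_type']} model "
                f"({model['llm_name']}): {exc}"
            )
        else:
            return ""  # one passing test is enough; earlier errors are discarded
    return "\n".join(errors)


def failing_probe(llm_name):
    raise RuntimeError("invalid key")


def passing_probe(llm_name):
    return None


models = [
    {"llm_name": "embed-a", "model_type": "embedding"},
    {"llm_name": "chat-a", "model_type": "chat"},
]
all_fail = validate_api_key(models, {"embedding": failing_probe, "chat": failing_probe})
one_pass = validate_api_key(models, {"embedding": failing_probe, "chat": passing_probe})
```

Here all_fail carries one line per failed model, while one_pass is empty because the chat probe succeeded after the embedding probe failed.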
Returns
JSON result indicating success (true) or failure with error messages.
Example Usage
POST /set_api_key
Content-Type: application/json
Authorization: Bearer <token>
{
"llm_factory": "OpenAI",
"api_key": "sk-xxxxxx",
"base_url": "https://api.openai.com"
}
3. add_llm()
Route: /add_llm
Method: POST
Authentication: Required
Request Validation: Requires JSON field llm_factory
Purpose
Add a new tenant-specific LLM configuration, with support for a variety of factories and their special authentication schemes.
Parameters
JSON body with fields:
llm_factory (str)
model_type (str)
Optional fields like api_key, llm_name, api_base, max_tokens, and various provider-specific credentials.
Functionality
Handles special API key assembly for specific factories (e.g., VolcEngine, Tencent Hunyuan, Bedrock, LocalAI, HuggingFace, etc.).
Constructs a tenant LLM dictionary with all needed info.
Validates the API key by creating an instance of the appropriate model type from the RAG library and making a test call.
Aggregates error messages if any test fails.
If validation passes, saves or updates the tenant LLM configuration in the database.
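One way to picture the special API key assembly is a helper that packs several credential fields into a single JSON string for selected factories and passes a plain key through for the rest. The factory name "ExampleCloud" and its field names are hypothetical; the actual per-provider fields and formats are not specified here.

```python
import json

# Hypothetical per-factory field lists, for illustration only.
COMPOSITE_KEY_FIELDS = {
    "ExampleCloud": ["access_key", "secret_key", "region"],
}


def assemble_api_key(factory, payload):
    """Return the api_key string to store for this factory."""
    fields = COMPOSITE_KEY_FIELDS.get(factory)
    if fields is None:
        # Ordinary providers: use the api_key field as given.
        return payload.get("api_key", "")
    missing = [name for name in fields if name not in payload]
    if missing:
        raise ValueError(f"missing credential fields: {', '.join(missing)}")
    # Composite providers: JSON-encode the required fields into one string.
    return json.dumps({name: payload[name] for name in fields})


composite = assemble_api_key(
    "ExampleCloud",
    {"access_key": "ak", "secret_key": "sk", "region": "r1"},
)
plain = assemble_api_key("OpenAI", {"api_key": "sk-xxxx"})
```

Storing the combined value in a single field lets the same tenant LLM record shape serve providers with one credential and providers with several.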
Returns
JSON result indicating success (true) or failure with detailed error messages.
Example Usage
POST /add_llm
Content-Type: application/json
Authorization: Bearer <token>
{
"llm_factory": "OpenAI",
"model_type": "chat",
"api_key": "sk-xxxx",
"llm_name": "gpt-4",
"max_tokens": 2048
}
4. delete_llm()
Route: /delete_llm
Method: POST
Authentication: Required
Request Validation: Requires JSON fields llm_factory and llm_name
Purpose
Delete a tenant-specific LLM configuration by factory and model name.
Parameters
JSON body fields:
llm_factory (str)
llm_name (str)
Functionality
Calls TenantLLMService.filter_delete to remove matching LLM configurations belonging to the current user.
Returns
JSON result indicating success (true).
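A client call to this endpoint could be built as in the sketch below, which uses only the standard library and constructs the request without sending it. BASE_URL is a hypothetical host; the real host and any route prefix depend on the deployment. The Bearer header follows the other usage examples in this document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical host; adjust for the deployment


def build_delete_llm_request(token, llm_factory, llm_name):
    """Build (but do not send) a POST request for /delete_llm."""
    body = json.dumps({"llm_factory": llm_factory, "llm_name": llm_name})
    return urllib.request.Request(
        url=f"{BASE_URL}/delete_llm",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_delete_llm_request("<token>", "OpenAI", "gpt-4")
# urllib.request.urlopen(request) would send it to a running server.
```

The same helper shape applies to /delete_factory by dropping llm_name from the body.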
5. delete_factory()
Route: /delete_factory
Method: POST
Authentication: Required
Request Validation: Requires JSON field llm_factory
Purpose
Delete all tenant-specific LLMs belonging to a particular LLM factory.
Parameters
JSON body field:
llm_factory (str)
Functionality
Calls TenantLLMService.filter_delete with tenant ID and factory name.
Returns
JSON result indicating success (true).
6. my_llms()
Route: /my_llms
Method: GET
Authentication: Required
Purpose
Retrieve all LLMs configured for the current tenant, optionally including detailed information.
Parameters
Query parameter:
include_details (bool, optional) - If true, includes details about each LLM and factory tags.
Functionality
If include_details is true:
Queries tenant LLM configurations and all valid factories.
Groups LLMs by factory with tags and detailed fields (type, name, used tokens, API base, max tokens).
Otherwise:
Returns a simpler summary grouped by factory with minimal fields.
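The grouping in both modes can be sketched as a single function over plain row dictionaries. The row keys and the output key names (tags, llm, used_token) are assumptions for the example; only the listed fields (type, name, used tokens, API base, max tokens) come from the description above.

```python
def group_my_llms(rows, include_details=False):
    """Group tenant LLM rows by factory, with extra fields on request."""
    grouped = {}
    for row in rows:
        entry = grouped.setdefault(
            row["llm_factory"], {"tags": row.get("tags", ""), "llm": []}
        )
        item = {
            "type": row["model_type"],
            "name": row["llm_name"],
            "used_token": row.get("used_tokens", 0),
        }
        if include_details:
            # Detailed mode adds connection and limit information.
            item["api_base"] = row.get("api_base", "")
            item["max_tokens"] = row.get("max_tokens")
        entry["llm"].append(item)
    return grouped


rows = [
    {"llm_factory": "OpenAI", "model_type": "chat", "llm_name": "gpt-4",
     "used_tokens": 10, "api_base": "", "max_tokens": 2048},
    {"llm_factory": "OpenAI", "model_type": "embedding", "llm_name": "embed-x",
     "used_tokens": 0},
]
summary = group_my_llms(rows)
detailed = group_my_llms(rows, include_details=True)
```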
Returns
JSON object grouping LLMs by factory with relevant details.
7. list_app()
Route: /list
Method: GET
Authentication: Required
Purpose
List all LLMs available to the current tenant, marking their availability and filtering by optional model type.
Parameters
Query parameter:
model_type (str, optional) - Filters LLMs by their model type.
Functionality
Defines sets of self-deployed factories and weighted factories depending on global settings.
Queries tenant LLMs and all valid LLM models.
Marks an LLM as available if the tenant has an API key for its factory, if it belongs to a self-deployed factory, or if it is one of a few specific known models.
Merges tenant LLMs not included in the general LLM list to ensure completeness.
Groups LLMs by factory and applies model type filters if provided.
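A simplified sketch of the availability marking, merge, and filter steps follows. The dictionary keys, the sample model names, and the choice to treat every merged tenant model as available are assumptions; the "specific known models" rule is omitted because its details are not given here.

```python
def list_available(catalog, tenant_llms, self_deployed, model_type=None):
    """Group catalog and tenant models by factory with an availability flag."""
    factories_with_key = {t["llm_factory"] for t in tenant_llms if t.get("api_key")}

    entries, seen = [], set()
    for model in catalog:
        available = model["fid"] in factories_with_key or model["fid"] in self_deployed
        entries.append({**model, "available": available})
        seen.add((model["fid"], model["llm_name"]))

    # Merge tenant models that the general catalog does not list.
    for t in tenant_llms:
        key = (t["llm_factory"], t["llm_name"])
        if key not in seen:
            seen.add(key)
            entries.append({"fid": t["llm_factory"], "llm_name": t["llm_name"],
                            "model_type": t["model_type"], "available": True})

    grouped = {}
    for entry in entries:
        if model_type and entry["model_type"] != model_type:
            continue  # apply the optional model type filter
        grouped.setdefault(entry["fid"], []).append(entry)
    return grouped


catalog = [
    {"fid": "OpenAI", "llm_name": "gpt-4", "model_type": "chat"},
    {"fid": "Ollama", "llm_name": "local-chat", "model_type": "chat"},
    {"fid": "Cohere", "llm_name": "rerank-x", "model_type": "rerank"},
]
tenant_llms = [
    {"llm_factory": "OpenAI", "llm_name": "gpt-4", "model_type": "chat", "api_key": "sk-x"},
    {"llm_factory": "OpenAI", "llm_name": "my-finetune", "model_type": "chat", "api_key": "sk-x"},
]
chat_only = list_available(catalog, tenant_llms, {"Ollama"}, model_type="chat")
everything = list_available(catalog, tenant_llms, {"Ollama"})
```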
Returns
JSON object mapping factories to their list of LLM models with availability info.
Important Implementation Details & Algorithms
Model Validation: For API key validation, the module dynamically selects the appropriate model class from rag.llm submodules (EmbeddingModel, ChatModel, RerankModel, CvModel, TTSModel) based on the factory and model type. It then performs lightweight test calls (e.g., encoding a test string, sending a chat message, similarity scoring) to verify API key validity.
Special Authentication Handling: Some providers require assembling API keys from multiple fields or special formats (e.g., VolcEngine, Tencent Hunyuan, Bedrock). This is done by concatenating or JSON-encoding specific fields before passing to the model.
Tenant Isolation: All tenant-specific data manipulations are filtered by the current logged-in user (current_user.id) to ensure data isolation and security.
Error Aggregation: When multiple models are tested for API key validity, errors are aggregated into a message string returned to the client to aid troubleshooting.
Model Type Enumeration: Uses the LLMType enum to differentiate between model types (CHAT, EMBEDDING, RERANK, IMAGE2TEXT, SPEECH2TEXT, TTS) to handle them appropriately.
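The two-level lookup by model type and factory can be sketched as follows. The enum member names match the list above, but their string values, the registry contents, and the Fake* classes are hypothetical stand-ins for the RAG library's factory-keyed dictionaries.

```python
from enum import Enum


class LLMType(str, Enum):
    # Member names follow the document; the string values are assumed.
    CHAT = "chat"
    EMBEDDING = "embedding"
    RERANK = "rerank"
    IMAGE2TEXT = "image2text"
    SPEECH2TEXT = "speech2text"
    TTS = "tts"


class FakeEmbedding:
    """Stand-in for a provider embedding class."""

    def __init__(self, api_key, model_name):
        self.api_key, self.model_name = api_key, model_name

    def encode(self, texts):
        return [[0.0, 0.0, 0.0] for _ in texts]


class FakeChat:
    """Stand-in for a provider chat class."""

    def __init__(self, api_key, model_name):
        self.api_key, self.model_name = api_key, model_name


# Each registry maps a factory name to a model class.
EmbeddingModel = {"ExampleFactory": FakeEmbedding}
ChatModel = {"ExampleFactory": FakeChat}
REGISTRIES = {LLMType.EMBEDDING: EmbeddingModel, LLMType.CHAT: ChatModel}


def resolve_model_class(factory, model_type):
    """Pick the class for a (factory, model type) pair or raise LookupError."""
    registry = REGISTRIES.get(LLMType(model_type), {})
    if factory not in registry:
        raise LookupError(f"{factory} has no {model_type} model registered")
    return registry[factory]


model_cls = resolve_model_class("ExampleFactory", "embedding")
vectors = model_cls("sk-xxxx", "embed-x").encode(["Test if the api key is available"])
```

A lightweight test call such as the encode above is what the validation step relies on to confirm that the key works.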
Interaction with Other System Components
Database Services: Uses service layers (LLMFactoriesService, TenantLLMService, LLMService) to abstract database operations on tenant LLMs, LLM factories, and general LLM models.
RAG Library: Integrates with the RAG library's model implementations to instantiate and validate LLMs from different providers.
Authentication: Relies on flask_login for user authentication and context (current_user).
Utility Modules: Uses utility functions for API response formatting (get_json_result, server_error_response, get_data_error_result) and request validation (validate_request).
Settings: Reads global application settings (e.g., settings.LIGHTEN) to adjust filtering of weighted LLM factories.
Visual Diagram
classDiagram
class LLMApp {
<<module>>
+factories()
+set_api_key()
+add_llm()
+delete_llm()
+delete_factory()
+my_llms()
+list_app()
}
class LLMFactoriesService {
+get_all()
+query()
}
class TenantLLMService {
+filter_update()
+save()
+filter_delete()
+query()
+get_my_llms()
}
class LLMService {
+get_all()
+query()
}
class EmbeddingModel {
<<dict>>
}
class ChatModel {
<<dict>>
}
class RerankModel {
<<dict>>
}
class CvModel {
<<dict>>
}
class TTSModel {
<<dict>>
}
LLMApp --> LLMFactoriesService : uses
LLMApp --> TenantLLMService : uses
LLMApp --> LLMService : uses
LLMApp --> EmbeddingModel : instantiates
LLMApp --> ChatModel : instantiates
LLMApp --> RerankModel : instantiates
LLMApp --> CvModel : instantiates
LLMApp --> TTSModel : instantiates
Summary
llm_app.py is a critical API component managing tenant-specific LLM configurations and factory metadata in InfiniFlow. It provides endpoints for CRUD operations on LLM configurations, API key validation through test calls, and retrieval of available LLMs and factories. The module ensures multi-tenant security and supports a variety of LLM providers with custom authentication schemes. It tightly integrates with database services and the RAG library for LLM interactions.
This file forms the backend management interface for tenants to configure and use LLMs safely and effectively within the InfiniFlow platform.