llm_app.py

Overview

llm_app.py is a Flask-based REST API module that manages Large Language Model (LLM) factories and tenant-specific LLM configurations within the InfiniFlow platform. Its endpoints retrieve the available LLM factories, add or delete tenant-specific LLM configurations, set API keys for LLM access, and list the LLMs associated with a tenant.

The core functionality is tenant-specific LLM management. The module validates API keys by making test calls to models of several types (embedding, chat, rerank, image-to-text, text-to-speech), handles the special authentication schemes that some LLM providers require, and reads and updates LLM-related data in the database through service layers.

This module interacts with services that abstract database models for tenants’ LLMs (TenantLLMService), general LLM metadata (LLMService), and LLM factories (LLMFactoriesService). It also integrates with the RAG (Retrieval-Augmented Generation) library to instantiate and test different LLM model types.

Detailed Explanations of Endpoints and Functions

1. `factories()`

Route: /factories
Method: GET
Authentication: Required (@login_required)

Purpose

Retrieve and return a list of all available LLM factories excluding specific ones (Youdao, FastEmbed, BAAI), along with the types of models supported by each factory.

Functionality

Calls LLMFactoriesService.get_all() to get all LLM factories.
Filters out unwanted factories by name.
Retrieves all LLM models via LLMService.get_all().
Maps each factory to the set of model types it supports, only including models with status VALID.
If no specific model types are found for a factory, it defaults to a broad set of known LLM types.
Returns JSON with factories and their supported model types.
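
The filtering and mapping steps above can be sketched in plain Python. This is a minimal illustration rather than the module's actual code: the row fields (`name`, `fid`, `model_type`, `status`), the `VALID` marker, and the contents of `DEFAULT_TYPES` are assumptions made for the example.

```python
from collections import defaultdict

# Assumed values for illustration only.
EXCLUDED = {"Youdao", "FastEmbed", "BAAI"}
VALID = "valid"
DEFAULT_TYPES = ["chat", "embedding", "rerank", "image2text", "speech2text", "tts"]


def build_factory_list(factories, llms):
    """Attach the supported model types to every factory that is not excluded."""
    types_by_factory = defaultdict(set)
    for llm in llms:
        # Only models whose status is VALID contribute a model type.
        if llm["status"] == VALID:
            types_by_factory[llm["fid"]].add(llm["model_type"])

    result = []
    for factory in factories:
        if factory["name"] in EXCLUDED:
            continue
        entry = dict(factory)
        found = types_by_factory.get(factory["name"])
        # Fall back to the broad default set when no types were found.
        entry["model_types"] = sorted(found) if found else list(DEFAULT_TYPES)
        result.append(entry)
    return result


FACTORIES = [{"name": "OpenAI"}, {"name": "BAAI"}, {"name": "Custom"}]
LLMS = [
    {"fid": "OpenAI", "model_type": "chat", "status": VALID},
    {"fid": "OpenAI", "model_type": "embedding", "status": VALID},
    {"fid": "OpenAI", "model_type": "tts", "status": "invalid"},
]
RESULT = build_factory_list(FACTORIES, LLMS)
```

In this sketch, `BAAI` is dropped by name, `OpenAI` keeps only the types of its valid models, and `Custom` receives the default set because no models are registered for it.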

Returns

JSON object with a list of factories, each containing details and supported model types.

Example Usage

GET /factories
Authorization: Bearer <token>

2. `set_api_key()`

Route: /set_api_key
Method: POST
Authentication: Required
Request Validation: Requires JSON fields llm_factory and api_key

Purpose

Set or update the API key for a tenant's LLM factory and validate the key by testing it against supported model types (embedding, chat, rerank).

Parameters

JSON body with fields:
- llm_factory (str): The provider/factory name.
- api_key (str): API key to authenticate with the factory.
- Optional: base_url, model_type, llm_name.

Functionality

Iterates over all LLM models for the specified factory.
For each model type (embedding, chat, rerank), it attempts to instantiate the corresponding RAG model class with the API key and make a test call.
Captures any errors during test calls and aggregates error messages.
If any test passes, clears error messages and proceeds.
If all fail, returns an error result with detailed messages.
Updates or inserts tenant-specific LLM configuration with API key and other details.
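
The test-and-aggregate logic described above might look like the following sketch. The `test_call` callable, the stub providers, and the error wording are hypothetical stand-ins for the real model classes, and the sketch assumes that testing stops at the first success.

```python
def validate_api_key(api_key, models, test_call):
    """Return "" when at least one test call succeeds, else the aggregated errors.

    models: iterable of (model_type, model_name) pairs for one factory.
    test_call: hypothetical callable that raises if the provider rejects the key.
    """
    errors = []
    for model_type, model_name in models:
        try:
            test_call(api_key, model_type, model_name)
        except Exception as exc:
            errors.append(
                f"Fail to access {model_type} model({model_name}) using this api key. {exc}"
            )
        else:
            # One successful call is enough: drop earlier errors and stop.
            errors.clear()
            break
    return "\n".join(errors)


def chat_only(api_key, model_type, model_name):
    """Stub provider call: only chat models accept the key."""
    if model_type != "chat":
        raise RuntimeError("401 Unauthorized")


def reject_all(api_key, model_type, model_name):
    """Stub provider call: every model rejects the key."""
    raise RuntimeError("401 Unauthorized")


MODELS = [("embedding", "embed-1"), ("chat", "chat-1"), ("rerank", "rerank-1")]
PARTIAL = validate_api_key("sk-demo", MODELS, chat_only)
FAILED = validate_api_key("sk-demo", MODELS, reject_all)
```

With `chat_only`, the embedding failure is recorded and then discarded once the chat call succeeds, so `PARTIAL` is empty. With `reject_all`, `FAILED` holds one line per tested model, which is the kind of message the endpoint would return to the client.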

Returns

JSON result indicating success (true) or failure with error messages.

Example Usage

POST /set_api_key
Content-Type: application/json
Authorization: Bearer <token>

{
  "llm_factory": "OpenAI",
  "api_key": "sk-xxxxxx",
  "base_url": "https://api.openai.com"
}
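
For callers scripting against the API, the same request can be assembled with the Python standard library. This is only a sketch: the server address is a placeholder, any route prefix the deployment adds is not shown, and the helper name `build_set_api_key_request` is hypothetical.

```python
import json
import urllib.request


def build_set_api_key_request(server_url, token, llm_factory, api_key, **optional):
    """Build (but do not send) a POST request for /set_api_key."""
    payload = {"llm_factory": llm_factory, "api_key": api_key, **optional}
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/set_api_key",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder server address and token; replace with real values.
request = build_set_api_key_request(
    "http://localhost:8000",
    "<token>",
    "OpenAI",
    "sk-xxxxxx",
    base_url="https://api.openai.com",
)
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.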

3. `add_llm()`

Route: /add_llm
Method: POST
Authentication: Required
Request Validation: Requires JSON field llm_factory

Purpose

Add a new tenant-specific LLM configuration, with support for a variety of factories and their special authentication schemes.

Parameters

JSON body with fields:
- llm_factory (str)
- model_type (str)
- Optional fields like api_key, llm_name, api_base, max_tokens, and various provider-specific credentials.

Functionality

Handles special API key assembly for specific factories (e.g., VolcEngine, Tencent Hunyuan, Bedrock, LocalAI, HuggingFace, etc.).
Constructs a tenant LLM dictionary with all needed info.
Validates the API key by creating an instance of the appropriate model type from the RAG library and making a test call.
Aggregates error messages if any test fails.
If validation passes, saves or updates the tenant LLM configuration in the database.
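
The key-assembly step can be illustrated as follows. The factory name `ExampleCloud` and its credential fields are invented for the example; the real per-provider field names are not given here, so treat the mapping as a placeholder.

```python
import json

# Hypothetical mapping: factory name -> credential fields folded into one key.
COMPOSITE_KEY_FIELDS = {
    "ExampleCloud": ["access_key", "secret_key", "region"],
}


def assemble_api_key(req):
    """Return the string stored as the API key for this request body."""
    fields = COMPOSITE_KEY_FIELDS.get(req["llm_factory"])
    if fields is None:
        # Ordinary providers: the key is used as supplied.
        return req.get("api_key", "")
    # Multi-credential providers: JSON-encode the relevant fields together.
    return json.dumps({name: req.get(name, "") for name in fields})


composite = assemble_api_key({
    "llm_factory": "ExampleCloud",
    "access_key": "ak",
    "secret_key": "sk",
    "region": "us-east-1",
})
plain = assemble_api_key({"llm_factory": "OpenAI", "api_key": "sk-xxxx"})
```

Encoding several credentials into one string lets the rest of the flow (validation and storage) treat every provider as having a single API key field.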

Returns

JSON result indicating success (true) or failure with detailed error messages.

Example Usage

POST /add_llm
Content-Type: application/json
Authorization: Bearer <token>

{
  "llm_factory": "OpenAI",
  "model_type": "chat",
  "api_key": "sk-xxxx",
  "llm_name": "gpt-4",
  "max_tokens": 2048
}

4. `delete_llm()`

Route: /delete_llm
Method: POST
Authentication: Required
Request Validation: Requires JSON fields llm_factory and llm_name

Purpose

Delete a tenant-specific LLM configuration by factory and model name.

Parameters

JSON body fields:
- llm_factory (str)
- llm_name (str)

Functionality

Calls TenantLLMService.filter_delete to remove matching LLM configurations belonging to the current user.
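
The effect of a tenant-scoped delete can be modelled on an in-memory list. The function `delete_matching` and the row fields below are illustrative assumptions and do not reproduce the signature of `TenantLLMService.filter_delete`.

```python
def delete_matching(rows, tenant_id, llm_factory, llm_name=None):
    """Return the rows left after removing the tenant's matching configurations.

    When llm_name is None, every model of the factory is removed for that tenant.
    """
    def matches(row):
        return (
            row["tenant_id"] == tenant_id
            and row["llm_factory"] == llm_factory
            and (llm_name is None or row["llm_name"] == llm_name)
        )

    return [row for row in rows if not matches(row)]


ROWS = [
    {"tenant_id": "t1", "llm_factory": "OpenAI", "llm_name": "gpt-4"},
    {"tenant_id": "t1", "llm_factory": "OpenAI", "llm_name": "gpt-3.5-turbo"},
    {"tenant_id": "t2", "llm_factory": "OpenAI", "llm_name": "gpt-4"},
]
AFTER_MODEL_DELETE = delete_matching(ROWS, "t1", "OpenAI", "gpt-4")
AFTER_FACTORY_DELETE = delete_matching(ROWS, "t1", "OpenAI")
```

Because the tenant ID is always part of the match, tenant `t2` keeps its `gpt-4` entry in both cases.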

Returns

JSON result indicating success (true).

5. `delete_factory()`

Route: /delete_factory
Method: POST
Authentication: Required
Request Validation: Requires JSON field llm_factory

Purpose

Delete all tenant-specific LLMs belonging to a particular LLM factory.

Parameters

JSON body field:
- llm_factory (str)

Functionality

Calls TenantLLMService.filter_delete with tenant ID and factory name.

Returns

JSON result indicating success (true).

6. `my_llms()`

Route: /my_llms
Method: GET
Authentication: Required

Purpose

Retrieve all LLMs configured for the current tenant, optionally including detailed information.

Parameters

Query parameter:
- include_details (bool, optional) - If true, includes details about each LLM and factory tags.

Functionality

If include_details is true:
- Queries tenant LLM configurations and all valid factories.
- Groups LLMs by factory with tags and detailed fields (type, name, used tokens, API base, max tokens).
Otherwise:
- Returns a simpler summary grouped by factory with minimal fields.
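
The grouping in both modes can be sketched as below. The row fields, the output key names, and the choice of type, name, and used tokens as the "minimal fields" are assumptions made for the example.

```python
def group_my_llms(tenant_llms, factory_tags=None, include_details=False):
    """Group a tenant's LLM rows by factory, optionally with extra detail."""
    grouped = {}
    for row in tenant_llms:
        factory = row["llm_factory"]
        entry = grouped.setdefault(factory, {"llm": []})
        item = {
            "type": row["model_type"],
            "name": row["llm_name"],
            "used_token": row["used_tokens"],
        }
        if include_details:
            # Detailed mode adds factory tags and per-model connection settings.
            entry["tags"] = (factory_tags or {}).get(factory, "")
            item["api_base"] = row.get("api_base", "")
            item["max_tokens"] = row.get("max_tokens")
        entry["llm"].append(item)
    return grouped


ROWS = [
    {"llm_factory": "OpenAI", "model_type": "chat", "llm_name": "gpt-4",
     "used_tokens": 10, "api_base": "https://api.openai.com", "max_tokens": 2048},
    {"llm_factory": "OpenAI", "model_type": "embedding", "llm_name": "embed-1",
     "used_tokens": 0},
]
SUMMARY = group_my_llms(ROWS)
DETAILED = group_my_llms(ROWS, {"OpenAI": "LLM,TEXT EMBEDDING"}, include_details=True)
```

Both calls return one entry per factory; only the detailed form carries `tags`, `api_base`, and `max_tokens`.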

Returns

JSON object grouping LLMs by factory with relevant details.

7. `list_app()`

Route: /list
Method: GET
Authentication: Required

Purpose

List all LLMs available to the current tenant, marking their availability and filtering by optional model type.

Parameters

Query parameter:
- model_type (str, optional) - Filters LLMs by their model type.

Functionality

Defines sets of self-deployed factories and weighted factories depending on global settings.
Queries tenant LLMs and all valid LLM models.
Marks an LLM as available if the tenant has an API key, if it belongs to a self-deployed factory, or if it is one of a few specific known models.
Merges tenant LLMs not included in the general LLM list to ensure completeness.
Groups LLMs by factory and applies model type filters if provided.
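
A simplified version of this flow is sketched below. It leaves out the "specific known models" rule and the settings-dependent factory sets, and all field names (`fid`, `llm_name`, `model_type`, `api_key`) are assumed for illustration.

```python
def list_llms(catalog, tenant_llms, self_deployed, model_type=None):
    """Group catalog and tenant models by factory and flag which are usable."""
    keyed_factories = {t["llm_factory"] for t in tenant_llms if t.get("api_key")}

    entries, seen = [], set()
    for llm in catalog:
        entry = dict(llm)
        entry["available"] = llm["fid"] in keyed_factories or llm["fid"] in self_deployed
        entries.append(entry)
        seen.add((llm["fid"], llm["llm_name"]))

    # Merge tenant models that the general catalog does not list.
    for t in tenant_llms:
        key = (t["llm_factory"], t["llm_name"])
        if key not in seen:
            entries.append({"fid": t["llm_factory"], "llm_name": t["llm_name"],
                            "model_type": t["model_type"], "available": True})
            seen.add(key)

    grouped = {}
    for entry in entries:
        if model_type and entry["model_type"] != model_type:
            continue
        grouped.setdefault(entry["fid"], []).append(entry)
    return grouped


CATALOG = [
    {"fid": "OpenAI", "llm_name": "gpt-4", "model_type": "chat"},
    {"fid": "Other", "llm_name": "other-chat", "model_type": "chat"},
    {"fid": "Local", "llm_name": "local-embed", "model_type": "embedding"},
]
TENANT = [
    {"llm_factory": "OpenAI", "llm_name": "gpt-4", "model_type": "chat", "api_key": "sk-x"},
    {"llm_factory": "Local", "llm_name": "my-chat", "model_type": "chat", "api_key": ""},
]
ALL_MODELS = list_llms(CATALOG, TENANT, self_deployed={"Local"})
CHAT_ONLY = list_llms(CATALOG, TENANT, self_deployed={"Local"}, model_type="chat")
```

Here `Other` appears in the result but is flagged unavailable, while the tenant's own `my-chat` model is merged into the `Local` group even though the catalog does not list it.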

Returns

JSON object mapping factories to their list of LLM models with availability info.

Important Implementation Details & Algorithms

Model Validation: For API key validation, the module dynamically selects the appropriate model class from rag.llm submodules (EmbeddingModel, ChatModel, RerankModel, CvModel, TTSModel) based on the factory and model type. It then performs lightweight test calls (e.g., encoding a test string, sending a chat message, similarity scoring) to verify API key validity.
Special Authentication Handling: Some providers require assembling API keys from multiple fields or special formats (e.g., VolcEngine, Tencent Hunyuan, Bedrock). This is done by concatenating or JSON-encoding specific fields before passing to the model.
Tenant Isolation: All tenant-specific data manipulations are filtered by the current logged-in user (current_user.id) to ensure data isolation and security.
Error Aggregation: When multiple models are tested for API key validity, errors are aggregated into a message string returned to the client to aid troubleshooting.
Model Type Enumeration: Uses LLMType enum to differentiate between model types (CHAT, EMBEDDING, RERANK, IMAGE2TEXT, SPEECH2TEXT, TTS) to handle them appropriately.
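
The dynamic selection of a model class by type and factory can be pictured as a lookup in dictionary registries followed by a small test call. The sketch below uses stub classes and hypothetical method names (`encode`, `chat`, `similarity`); it does not claim to match the signatures of the real rag.llm classes.

```python
from enum import Enum


class LLMType(Enum):
    """Hypothetical subset of the model-type enum."""
    CHAT = "chat"
    EMBEDDING = "embedding"
    RERANK = "rerank"


class StubModel:
    def __init__(self, api_key, model_name):
        self.api_key, self.model_name = api_key, model_name


class StubEmbedding(StubModel):
    def encode(self, texts):
        return [[0.0, 0.0, 0.0] for _ in texts]


class StubChat(StubModel):
    def chat(self, message):
        return f"echo: {message}"


class StubRerank(StubModel):
    def similarity(self, query, texts):
        return [1.0 for _ in texts]


# One registry per model type, keyed by factory name (factory name is made up).
EMBEDDING_REGISTRY = {"ExampleCloud": StubEmbedding}
CHAT_REGISTRY = {"ExampleCloud": StubChat}
RERANK_REGISTRY = {"ExampleCloud": StubRerank}

# Each model type maps to its registry and a lightweight test call.
PROBES = {
    LLMType.EMBEDDING: (EMBEDDING_REGISTRY, lambda m: m.encode(["ping"])),
    LLMType.CHAT: (CHAT_REGISTRY, lambda m: m.chat("ping")),
    LLMType.RERANK: (RERANK_REGISTRY, lambda m: m.similarity("ping", ["ping"])),
}


def probe(factory, model_type, api_key, model_name):
    """Instantiate the model class for (factory, model_type) and run its test call."""
    registry, test_call = PROBES[model_type]
    if factory not in registry:
        raise ValueError(f"{factory} has no {model_type.value} model")
    return test_call(registry[factory](api_key, model_name))
```

Keeping the per-type test call next to its registry means supporting another model type only requires one more entry in `PROBES`.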

Interaction with Other System Components

Database Services: Uses service layers (LLMFactoriesService, TenantLLMService, LLMService) to abstract database operations on tenant LLMs, LLM factories, and general LLM models.
RAG Library: Integrates with the RAG library's model implementations to instantiate and validate LLMs from different providers.
Authentication: Relies on flask_login for user authentication and context (current_user).
Utility Modules: Uses utility functions for API response formatting (get_json_result, server_error_response, get_data_error_result) and request validation (validate_request).
Settings: Reads global application settings (e.g., settings.LIGHTEN) to adjust filtering of weighted LLM factories.

Visual Diagram

classDiagram
    class LLMApp {
        <<module>>
        +factories()
        +set_api_key()
        +add_llm()
        +delete_llm()
        +delete_factory()
        +my_llms()
        +list_app()
    }

    class LLMFactoriesService {
        +get_all()
        +query()
    }
    class TenantLLMService {
        +filter_update()
        +save()
        +filter_delete()
        +query()
        +get_my_llms()
    }
    class LLMService {
        +get_all()
        +query()
    }
    class EmbeddingModel {
        <<dict>>
    }
    class ChatModel {
        <<dict>>
    }
    class RerankModel {
        <<dict>>
    }
    class CvModel {
        <<dict>>
    }
    class TTSModel {
        <<dict>>
    }

    LLMApp --> LLMFactoriesService : uses
    LLMApp --> TenantLLMService : uses
    LLMApp --> LLMService : uses
    LLMApp --> EmbeddingModel : instantiates
    LLMApp --> ChatModel : instantiates
    LLMApp --> RerankModel : instantiates
    LLMApp --> CvModel : instantiates
    LLMApp --> TTSModel : instantiates

Summary

llm_app.py manages tenant-specific LLM configurations and factory metadata in InfiniFlow. It provides endpoints for creating, updating, and deleting LLM configurations, validating API keys through test calls, and retrieving the available LLMs and factories. Tenant data is isolated by filtering on the current user, and providers with custom authentication schemes are supported. The module relies on the database service layers and the RAG library for LLM interactions.

This file is the backend management interface through which tenants configure and use LLMs within the InfiniFlow platform.
