llm_service.py


Overview

The llm_service.py module is the part of the InfiniFlow system that manages Large Language Models (LLMs) on a per-tenant basis. It initializes tenant LLM configurations, wraps model instances with additional functionality, tracks token usage, and exposes the LLM-powered capabilities used by the rest of the application: encoding, similarity scoring, image description, transcription, text-to-speech (TTS), and conversational chat.

This file acts as a bridge between raw LLM models and the tenant-aware application logic, ensuring proper usage accounting, support for model tools, multi-modal inputs, and integration with telemetry and tracing systems like Langfuse.
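The wrapping and usage-accounting pattern described above can be sketched as follows. This is a minimal illustration rather than the module's actual code: `FakeEmbedder`, `UsageLedger`, and `TrackedModel` are hypothetical names, and the `(result, token_count)` return convention is assumed from the method signatures listed later in this document.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FakeEmbedder:
    """Hypothetical stand-in for a raw embedding model."""

    def encode(self, texts: List[str]) -> Tuple[List[List[float]], int]:
        # Toy embedding: one dimension holding the text length.
        vectors = [[float(len(t))] for t in texts]
        # Toy accounting: one token per whitespace-separated word.
        tokens = sum(len(t.split()) for t in texts)
        return vectors, tokens


class UsageLedger:
    """Hypothetical in-memory substitute for a persistent usage counter."""

    def __init__(self) -> None:
        self.used: Dict[str, int] = defaultdict(int)

    def increase(self, tenant_id: str, tokens: int) -> None:
        self.used[tenant_id] += tokens


class TrackedModel:
    """Wraps a raw model so that every call is charged to one tenant."""

    def __init__(self, tenant_id: str, model: FakeEmbedder, ledger: UsageLedger) -> None:
        self.tenant_id = tenant_id
        self.model = model
        self.ledger = ledger

    def encode(self, texts: List[str]) -> Tuple[List[List[float]], int]:
        vectors, tokens = self.model.encode(texts)
        # Record usage before handing the result back to the caller.
        self.ledger.increase(self.tenant_id, tokens)
        return vectors, tokens


ledger = UsageLedger()
tracked = TrackedModel("tenant123", FakeEmbedder(), ledger)
vectors, used_tokens = tracked.encode(["hello world", "hi"])
```

The caller sees the same return shape as the raw model, while the ledger accumulates the per-tenant total on the side.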


Classes and Functions

Class: LLMService

Usage:
LLMService is a thin service class for querying and updating LLM records in the database. It wraps the common database operations for the LLM model.


Function: get_init_tenant_llm(user_id)

Implementation Details:

Usage Example:

tenant_llms = get_init_tenant_llm(user_id="tenant123")
for llm_conf in tenant_llms:
    print(llm_conf["llm_name"], llm_conf["llm_factory"])

Class: LLMBundle


Methods

bind_tools(toolcall_session, tools)

encode(texts: list) -> Tuple[List[float], int]

encode_queries(query: str) -> Tuple[List[float], int]

similarity(query: str, texts: list) -> Tuple[List[float], int]

describe(image, max_tokens=300) -> str

describe_with_prompt(image, prompt) -> str

transcription(audio) -> str

tts(text: str) -> Generator[bytes, None, None]

_remove_reasoning_content(txt: str) -> str
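A helper with this signature plausibly strips a model's intermediate reasoning from its final answer. The sketch below assumes the reasoning is delimited by `<think>` and `</think>` tags; the tag format and the regex approach are assumptions made for illustration, not details confirmed by this document.

```python
import re

# Assumed delimiter format: <think> ... </think>, possibly spanning lines.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def remove_reasoning_content(txt: str) -> str:
    """Drop every <think>...</think> span and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", txt).strip()


cleaned = remove_reasoning_content("<think>2 + 2\nequals 4</think>\nThe answer is 4.")
```

Text that contains no reasoning tags passes through unchanged apart from the whitespace trim.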

_clean_param(chat_partial, **kwargs) -> dict
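One way a helper with this signature could work is to keep only the keyword arguments that the target callable accepts, so that unsupported options are dropped before the call. The sketch below uses the standard `inspect` module; `clean_param`, `fake_chat`, and the filtering rule are illustrative assumptions, not the module's confirmed behavior.

```python
import inspect
from functools import partial
from typing import Any, Callable, Dict


def clean_param(chat_partial: Callable[..., Any], **kwargs: Any) -> Dict[str, Any]:
    """Return only the kwargs that chat_partial can accept."""
    # inspect.signature understands functools.partial and omits
    # positional parameters that are already bound.
    params = inspect.signature(chat_partial).parameters
    # A **kwargs parameter on the target means every keyword is acceptable.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    allowed = {name for name, p in params.items() if p.kind in keyword_kinds}
    return {k: v for k, v in kwargs.items() if k in allowed}


def fake_chat(system, history, gen_conf, images=None):
    """Hypothetical model call used only to demonstrate the filter."""
    return f"{system}:{len(history)}:{images}"


chat_partial = partial(fake_chat, "You are helpful.", [], {})
safe_kwargs = clean_param(chat_partial, images=["a.png"], stream=True)
reply = chat_partial(**safe_kwargs)
```

Here `stream` is discarded because `fake_chat` has no such parameter, so the final call does not raise a `TypeError`.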

chat(system: str, history: list, gen_conf: dict = {}, **kwargs) -> str

chat_streamly(system: str, history: list, gen_conf: dict = {}, **kwargs)
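The signature above gives no return type. A streaming chat method is commonly written as a generator, and the sketch below shows one such shape under stated assumptions: the underlying model yields text deltas followed by a final integer token count, and the wrapper yields the accumulated answer after each delta. `fake_model_stream`, `stream_accumulated`, and the `on_usage` callback are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, List, Union


def fake_model_stream() -> Iterator[Union[str, int]]:
    """Hypothetical raw model: text deltas first, then a final token count."""
    yield "Hel"
    yield "lo"
    yield "!"
    yield 3


def stream_accumulated(
    chunks: Iterable[Union[str, int]],
    on_usage: Callable[[int], None],
) -> Iterator[str]:
    """Yield the answer-so-far for each delta; report token usage separately."""
    answer = ""
    for chunk in chunks:
        if isinstance(chunk, int):
            # Assumed convention: an int marks the total tokens consumed.
            on_usage(chunk)
            continue
        answer += chunk
        yield answer


usage_log: List[int] = []
partial_answers = list(stream_accumulated(fake_model_stream(), usage_log.append))
```

Yielding the accumulated text rather than raw deltas lets a consumer redraw the whole answer on every update without tracking state itself.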

Important Implementation Details


Interaction with Other System Components


Visual Diagram

classDiagram
    class LLMService {
        +model: LLM
    }

    class LLMBundle {
        -tenant_id
        -llm_type
        -llm_name
        -lang
        -mdl
        -langfuse
        +__init__(tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs)
        +bind_tools(toolcall_session, tools)
        +encode(texts: list) -> (list, int)
        +encode_queries(query: str) -> (list, int)
        +similarity(query: str, texts: list) -> (list, int)
        +describe(image, max_tokens=300) -> str
        +describe_with_prompt(image, prompt) -> str
        +transcription(audio) -> str
        +tts(text: str) -> Generator[bytes, None, None]
        -_remove_reasoning_content(txt: str) -> str
        -_clean_param(chat_partial, **kwargs) -> dict
        +chat(system: str, history: list, gen_conf: dict = {}, **kwargs) -> str
        +chat_streamly(system: str, history: list, gen_conf: dict = {}, **kwargs)
    }

    LLM4Tenant <|-- LLMBundle
    LLMBundle ..> TenantLLMService : uses

Summary

llm_service.py provides tenant-aware LLM management within InfiniFlow. It connects database models, tenant services, and LLM models, and wraps them in a single API that covers embedding, similarity scoring, multi-modal inputs, transcription, TTS, and interactive chat, with token usage tracked and telemetry reported along the way. The design favors extensibility (tool support), robustness (parameter cleaning), and observability (Langfuse), which suits auditable LLM-powered applications in a multi-tenant environment.