prompts.py

Overview

prompts.py is a utility and prompt management module within the InfiniFlow system designed to facilitate the construction, formatting, and management of natural language prompts for interaction with large language models (LLMs). It provides functionality for generating, formatting, and managing prompts related to tasks such as keyword extraction, question proposal, content tagging, multi-language translation, task analysis, next step planning, reflection, and summarization.

The file leverages Jinja2 templating to dynamically render prompt templates, integrates token counting and message length fitting to comply with LLM input limits, and wraps interaction patterns with LLM chat models.

Key features include:

Formatting and truncating messages to fit token limits.
Preparing knowledge base content for prompt inclusion.
Loading and rendering multiple prompt templates.
Wrapping various prompt types into callable functions that interact with chat models.
Managing tool schemas and task workflows via prompts.
JSON repair and parsing for robust prompt result handling.

This module is central to how InfiniFlow constructs and manages prompt engineering workflows that drive LLM-based reasoning and task execution.

Detailed Explanation of Functions

Utility Functions

`get_value(d: dict, k1: str, k2: str) -> any`

Returns the value for key k1 in dictionary d, or if not found, returns the value for key k2.

Parameters:
- d — Dictionary to lookup.
- k1 — Primary key to search.
- k2 — Secondary key fallback.
Returns: The value for k1 if present, else for k2, else None.
Usage Example:

value = get_value(chunk, "chunk_id", "id")

`chunks_format(reference: dict) -> list[dict]`

Formats a "reference" dictionary containing chunks of knowledge into a normalized list of chunk dictionaries with consistent keys.

Parameters:
- reference — Dictionary with a chunks key containing list of chunk dicts.
Returns: List of chunk dictionaries with keys like id, content, document_id, document_name, etc.
Details: Uses get_value to support multiple possible keys for the same semantic data.
Usage Example:

formatted_chunks = chunks_format(reference)

`message_fit_in(msg: list[dict], max_length: int=4000) -> Tuple[int, list[dict]]`

Truncates or filters a list of message dicts (with role and content) to fit under a specified token limit.

Parameters:
- msg — List of messages, each a dict with keys role and content.
- max_length — Maximum token length allowed (default 4000).
Returns: Tuple of (token count, possibly truncated/filtered message list).
Algorithm: Counts tokens using num_tokens_from_string; tries to preserve system messages and last message, truncates content if necessary.
Usage Example:

token_count, trimmed_msg = message_fit_in(messages, max_length=3000)

`kb_prompt(kbinfos: dict, max_tokens: int, hash_id: bool=False) -> list[str]`

Formats knowledge base chunks into a prompt-friendly list of strings, each representing a chunk with metadata and content.

Parameters:
- kbinfos — Dict containing knowledge chunks under key "chunks".
- max_tokens — Maximum tokens allowed in the prompt.
- hash_id — Whether to hash chunk IDs into integer IDs.
Returns: List of formatted knowledge strings ready for prompt inclusion.
Implementation Notes:
- Limits number of chunks to fit in max_tokens.
- Retrieves document metadata.
- Formats metadata with hierarchical "tree" style lines.
Usage Example:

knowledge_strings = kb_prompt(kbinfos, max_tokens=2000)

Prompt Rendering Functions

These functions render specific prompt templates using Jinja2 and interact with chat models.

`citation_prompt(user_defined_prompts: dict={}) -> str`

Renders the citation prompt template, optionally overridden by user-defined prompts.

Parameters:
- user_defined_prompts — Dictionary of user override templates.
Returns: Rendered citation prompt string.
Usage Example:

prompt_str = citation_prompt()

`citation_plus(sources: str) -> str`

Renders an enhanced citation prompt including example citations and provided sources.

Parameters:
- sources — String listing sources to cite.
Returns: Rendered citation prompt string incorporating sources.
Usage Example:

prompt_str = citation_plus("Source 1, Source 2")

`keyword_extraction(chat_mdl, content: str, topn: int=3) -> str`

Uses a chat model to extract the top N keywords from content.

Parameters:
- chat_mdl — Chat model instance supporting .chat() method.
- content — Text content to extract keywords from.
- topn — Number of keywords to extract.
Returns: Extracted keywords as string.
Implementation Details:
- Renders keyword extraction prompt.
- Fits message to token limit.
- Calls chat model with low temperature for deterministic output.
- Cleans output from internal markers and errors.
Usage Example:

keywords = keyword_extraction(chat_model, "The quick brown fox jumps...", topn=5)

`question_proposal(chat_mdl, content: str, topn: int=3) -> str`

Proposes questions based on given content, similar to keyword extraction.

Parameters: Same as keyword_extraction.
Returns: Proposed questions as a string.
Usage Example:

questions = question_proposal(chat_model, "Some article content...")

`full_question(tenant_id=None, llm_id=None, messages=[], language=None, chat_mdl=None) -> str`

Generates a full question prompt based on a conversation history.

Parameters:
- tenant_id — Tenant identifier.
- llm_id — ID of the language model to use.
- messages — List of message dicts representing conversation.
- language — Language code.
- chat_mdl — Optional pre-initialized chat model instance.
Returns: Generated question string or fallback to last user message on error.
Implementation Details:
- Determines or creates chat model instance.
- Formats conversation into a text block.
- Renders prompt with dates and conversation.
- Calls chat model and cleans output.
Usage Example:

question = full_question(tenant_id=1, llm_id="gpt-4", messages=chat_history)

`cross_languages(tenant_id, llm_id, query, languages=[]) -> str`

Translates or processes a query across multiple languages using a chat model.

Parameters:
- tenant_id, llm_id — Identifiers for tenant and model.
- query — Input query string.
- languages — List of target language codes.
Returns: Processed multilingual output or original query on error.
Usage Example:

translations = cross_languages(1, "gpt-4", "Hello world", ["fr", "de"])

`content_tagging(chat_mdl, content: str, all_tags: list, examples: list, topn: int=3) -> dict`

Tags content using a chat model and returns a dictionary of tag counts.

Parameters:
- chat_mdl — Chat model instance.
- content — Content to tag.
- all_tags — List of possible tags.
- examples — Example data to guide tagging.
- topn — Number of tags to consider.
Returns: Dictionary mapping tag names to integer counts.
Implementation Details:
- Renders content tagging template with examples.
- Calls chat model and attempts robust JSON repair on output.
- Filters and returns tags with positive counts.
Usage Example:

tags = content_tagging(chat_model, article_text, all_tags_list, example_data)

`vision_llm_describe_prompt(page=None) -> str`

Generates prompt for vision LLM describing a page.

Parameters:
- page — Optional page metadata.
Returns: Rendered prompt string.

`vision_llm_figure_describe_prompt() -> str`

Generates a prompt for vision LLM to describe a figure.

Returns: Rendered prompt string.

Tool and Task Management Functions

`tool_schema(tools_description: list[dict], complete_task=False) -> str`

Formats tool descriptions into a Markdown + JSON schema string for LLM consumption.

Parameters:
- tools_description — List of tool dicts with function metadata.
- complete_task — Whether to include a special "complete_task" function.
Returns: Combined Markdown string describing all tools.
Usage Example:

schema_str = tool_schema(tools_list, complete_task=True)

`form_history(history: list, limit: int = -6) -> str`

Formats recent chat history messages into a textual context string.

Parameters:
- history — List of message dicts.
- limit — Number of recent messages to include (negative index).
Returns: Concatenated string with roles and truncated content.

`analyze_task(chat_mdl, prompt: str, task_name: str, tools_description: list[dict], user_defined_prompts: dict={}) -> str`

Analyzes a task description and tools using the chat model.

Parameters:
- chat_mdl — Chat model instance.
- prompt — Agent prompt string.
- task_name — Name of the task.
- tools_description — List of tool dicts.
- user_defined_prompts — Optional dict to override templates.
Returns: Analysis result string or empty on error.

`next_step(chat_mdl, history: list, tools_description: list[dict], task_desc: str, user_defined_prompts: dict={}) -> Tuple[str, int]`

Determines the next tool to call or completion in a task workflow.

Parameters:
- chat_mdl — Chat model instance.
- history — Conversation history.
- tools_description — Tools available.
- task_desc — Task description string.
- user_defined_prompts — Template overrides.
Returns: Tuple of (JSON string with next step, token count).

`reflect(chat_mdl, history: list, tool_call_res: list[Tuple], user_defined_prompts: dict={}) -> str`

Reflects on prior tool calls and history to generate observations and reflections.

Parameters:
- chat_mdl — Chat model.
- history — Chat history.
- tool_call_res — List of tuples (tool_name, result).
- user_defined_prompts — Template overrides.
Returns: Formatted string with observations and reflection text.

`form_message(system_prompt: str, user_prompt: str) -> list`

Forms a standardized message list with system and user prompts.

Returns: List of two dict messages.

`tool_call_summary(chat_mdl, name: str, params: dict, result: str, user_defined_prompts: dict={}) -> str`

Generates a summary of a tool call result for memory or logging.

Parameters:
- chat_mdl — Chat model instance.
- name — Tool name.
- params — Parameters used in tool call.
- result — Result string.
- user_defined_prompts — Template overrides.
Returns: Summary string.

`rank_memories(chat_mdl, goal: str, sub_goal: str, tool_call_summaries: list[str], user_defined_prompts: dict={}) -> str`

Ranks multiple memory summaries according to goal relevance.

Parameters:
- chat_mdl — Chat model.
- goal — Main goal string.
- sub_goal — Sub goal string.
- tool_call_summaries — List of summary strings.
- user_defined_prompts — Template overrides.
Returns: Ranking output string.

`gen_meta_filter(chat_mdl, meta_data: dict, query: str) -> list`

Generates metadata filters applicable to a user query.

Parameters:
- chat_mdl — Chat model.
- meta_data — Dictionary of metadata keys.
- query — User question string.
Returns: List of generated filter strings or empty list on failure.

Important Implementation Details and Algorithms

Token Counting and Message Trimming:
The message_fit_in function carefully counts tokens across messages and attempts multiple heuristics to trim or prioritize system and last user messages to fit within token limits imposed by the LLM APIs.
Prompt Rendering with Jinja2:
The module uses a globally configured Jinja2 environment (PROMPT_JINJA_ENV) to render prompt templates dynamically, allowing for flexible prompt customization.
Robust JSON Handling:
Outputs from LLMs that are expected to be JSON are parsed with the json_repair library to recover from common formatting errors, using fallback heuristics if direct parsing fails.
Tool Schema Construction:
Tools are described in a JSON schema format embedded in Markdown, enabling LLMs to understand available functions and their parameters, facilitating function calling workflows.

Interactions with Other System Components

Chat Models (chat_mdl):
This file expects a chat model abstraction exposing a .chat() method to send prompts and receive responses. This abstraction is linked with services such as LLMBundle that manage specific LLM types and tenants.
Prompt Templates:
Loads prompt templates via load_prompt from rag.prompts.prompt_template.
Database Services:
Imports DocumentService from api.db.services.document_service to fetch document metadata related to knowledge chunks.
Utility Modules:
Uses utilities such as num_tokens_from_string for token counting, and encoder for encoding/decoding text to tokens.
JSON Repair:
Uses the json_repair library to handle and recover malformed JSON responses from LLMs.

Visual Diagram: Class / Function Structure

This file is a utility module primarily composed of standalone functions managing prompts and message formatting. Below is a flowchart representing the main functional relationships:

flowchart TD
    A[get_value]
    B[chunks_format]
    C[message_fit_in]
    D[kb_prompt]
    E[citation_prompt]
    F[citation_plus]
    G[keyword_extraction]
    H[question_proposal]
    I[full_question]
    J[cross_languages]
    K[content_tagging]
    L[vision_llm_describe_prompt]
    M[vision_llm_figure_describe_prompt]
    N[tool_schema]
    O[form_history]
    P[analyze_task]
    Q[next_step]
    R[reflect]
    S[form_message]
    T[tool_call_summary]
    U[rank_memories]
    V[gen_meta_filter]

    C --> G
    C --> H
    C --> K
    C --> T
    C --> U
    D --> DocumentService[DocumentService]
    I --> LLMBundle[LLMBundle]
    J --> LLMBundle
    P --> N
    Q --> N
    R --> C
    T --> C
    U --> C
    V --> chat_mdl[chat_mdl]

    style chat_mdl fill:#f9f,stroke:#333,stroke-width:1px
    style LLMBundle fill:#bbf,stroke:#333,stroke-width:1px
    style DocumentService fill:#bbf,stroke:#333,stroke-width:1px

Summary

prompts.py is a comprehensive prompt engineering utility module designed to support complex LLM-based workflows by managing prompt templates, message formatting, token limits, and interactions with chat models. It enables flexible, dynamic generation of prompts for a variety of tasks such as keyword extraction, content tagging, multilingual translation, task analysis, and reflection, while providing robust handling of LLM responses and integration with other system components like document services and LLM bundles.

This module is essential for orchestrating how the InfiniFlow system communicates and reasons via large language models.