prompts.py
Overview
prompts.py is a utility and prompt management module within the InfiniFlow system designed to facilitate the construction, formatting, and management of natural language prompts for interaction with large language models (LLMs). It provides functionality for generating, formatting, and managing prompts related to tasks such as keyword extraction, question proposal, content tagging, multi-language translation, task analysis, next step planning, reflection, and summarization.
The file leverages Jinja2 templating to dynamically render prompt templates, integrates token counting and message length fitting to comply with LLM input limits, and wraps interaction patterns with LLM chat models.
Key features include:
Formatting and truncating messages to fit token limits.
Preparing knowledge base content for prompt inclusion.
Loading and rendering multiple prompt templates.
Wrapping various prompt types into callable functions that interact with chat models.
Managing tool schemas and task workflows via prompts.
JSON repair and parsing for robust prompt result handling.
This module is central to how InfiniFlow constructs and manages prompt engineering workflows that drive LLM-based reasoning and task execution.
Detailed Explanation of Functions
Utility Functions
get_value(d: dict, k1: str, k2: str) -> any
Returns the value for key k1 in dictionary d, or if not found, returns the value for key k2.
Parameters:
d— Dictionary to lookup.k1— Primary key to search.k2— Secondary key fallback.
Returns: The value for
k1if present, else fork2, elseNone.Usage Example:
value = get_value(chunk, "chunk_id", "id")
chunks_format(reference: dict) -> list[dict]
Formats a "reference" dictionary containing chunks of knowledge into a normalized list of chunk dictionaries with consistent keys.
Parameters:
reference— Dictionary with achunkskey containing list of chunk dicts.
Returns: List of chunk dictionaries with keys like
id,content,document_id,document_name, etc.Details: Uses
get_valueto support multiple possible keys for the same semantic data.Usage Example:
formatted_chunks = chunks_format(reference)
message_fit_in(msg: list[dict], max_length: int=4000) -> Tuple[int, list[dict]]
Truncates or filters a list of message dicts (with role and content) to fit under a specified token limit.
Parameters:
msg— List of messages, each a dict with keysroleandcontent.max_length— Maximum token length allowed (default 4000).
Returns: Tuple of (token count, possibly truncated/filtered message list).
Algorithm: Counts tokens using
num_tokens_from_string; tries to preserve system messages and last message, truncates content if necessary.Usage Example:
token_count, trimmed_msg = message_fit_in(messages, max_length=3000)
kb_prompt(kbinfos: dict, max_tokens: int, hash_id: bool=False) -> list[str]
Formats knowledge base chunks into a prompt-friendly list of strings, each representing a chunk with metadata and content.
Parameters:
kbinfos— Dict containing knowledge chunks under key"chunks".max_tokens— Maximum tokens allowed in the prompt.hash_id— Whether to hash chunk IDs into integer IDs.
Returns: List of formatted knowledge strings ready for prompt inclusion.
Implementation Notes:
Limits number of chunks to fit in
max_tokens.Retrieves document metadata.
Formats metadata with hierarchical "tree" style lines.
Usage Example:
knowledge_strings = kb_prompt(kbinfos, max_tokens=2000)
Prompt Rendering Functions
These functions render specific prompt templates using Jinja2 and interact with chat models.
citation_prompt(user_defined_prompts: dict={}) -> str
Renders the citation prompt template, optionally overridden by user-defined prompts.
Parameters:
user_defined_prompts— Dictionary of user override templates.
Returns: Rendered citation prompt string.
Usage Example:
prompt_str = citation_prompt()
citation_plus(sources: str) -> str
Renders an enhanced citation prompt including example citations and provided sources.
Parameters:
sources— String listing sources to cite.
Returns: Rendered citation prompt string incorporating sources.
Usage Example:
prompt_str = citation_plus("Source 1, Source 2")
keyword_extraction(chat_mdl, content: str, topn: int=3) -> str
Uses a chat model to extract the top N keywords from content.
Parameters:
chat_mdl— Chat model instance supporting.chat()method.content— Text content to extract keywords from.topn— Number of keywords to extract.
Returns: Extracted keywords as string.
Implementation Details:
Renders keyword extraction prompt.
Fits message to token limit.
Calls chat model with low temperature for deterministic output.
Cleans output from internal markers and errors.
Usage Example:
keywords = keyword_extraction(chat_model, "The quick brown fox jumps...", topn=5)
question_proposal(chat_mdl, content: str, topn: int=3) -> str
Proposes questions based on given content, similar to keyword extraction.
Parameters: Same as
keyword_extraction.Returns: Proposed questions as a string.
Usage Example:
questions = question_proposal(chat_model, "Some article content...")
full_question(tenant_id=None, llm_id=None, messages=[], language=None, chat_mdl=None) -> str
Generates a full question prompt based on a conversation history.
Parameters:
tenant_id— Tenant identifier.llm_id— ID of the language model to use.messages— List of message dicts representing conversation.language— Language code.chat_mdl— Optional pre-initialized chat model instance.
Returns: Generated question string or fallback to last user message on error.
Implementation Details:
Determines or creates chat model instance.
Formats conversation into a text block.
Renders prompt with dates and conversation.
Calls chat model and cleans output.
Usage Example:
question = full_question(tenant_id=1, llm_id="gpt-4", messages=chat_history)
cross_languages(tenant_id, llm_id, query, languages=[]) -> str
Translates or processes a query across multiple languages using a chat model.
Parameters:
tenant_id,llm_id— Identifiers for tenant and model.query— Input query string.languages— List of target language codes.
Returns: Processed multilingual output or original query on error.
Usage Example:
translations = cross_languages(1, "gpt-4", "Hello world", ["fr", "de"])
content_tagging(chat_mdl, content: str, all_tags: list, examples: list, topn: int=3) -> dict
Tags content using a chat model and returns a dictionary of tag counts.
Parameters:
chat_mdl— Chat model instance.content— Content to tag.all_tags— List of possible tags.examples— Example data to guide tagging.topn— Number of tags to consider.
Returns: Dictionary mapping tag names to integer counts.
Implementation Details:
Renders content tagging template with examples.
Calls chat model and attempts robust JSON repair on output.
Filters and returns tags with positive counts.
Usage Example:
tags = content_tagging(chat_model, article_text, all_tags_list, example_data)
vision_llm_describe_prompt(page=None) -> str
Generates prompt for vision LLM describing a page.
Parameters:
page— Optional page metadata.
Returns: Rendered prompt string.
vision_llm_figure_describe_prompt() -> str
Generates a prompt for vision LLM to describe a figure.
Returns: Rendered prompt string.
Tool and Task Management Functions
tool_schema(tools_description: list[dict], complete_task=False) -> str
Formats tool descriptions into a Markdown + JSON schema string for LLM consumption.
Parameters:
tools_description— List of tool dicts withfunctionmetadata.complete_task— Whether to include a special "complete_task" function.
Returns: Combined Markdown string describing all tools.
Usage Example:
schema_str = tool_schema(tools_list, complete_task=True)
form_history(history: list, limit: int = -6) -> str
Formats recent chat history messages into a textual context string.
Parameters:
history— List of message dicts.limit— Number of recent messages to include (negative index).
Returns: Concatenated string with roles and truncated content.
analyze_task(chat_mdl, prompt: str, task_name: str, tools_description: list[dict], user_defined_prompts: dict={}) -> str
Analyzes a task description and tools using the chat model.
Parameters:
chat_mdl— Chat model instance.prompt— Agent prompt string.task_name— Name of the task.tools_description— List of tool dicts.user_defined_prompts— Optional dict to override templates.
Returns: Analysis result string or empty on error.
next_step(chat_mdl, history: list, tools_description: list[dict], task_desc: str, user_defined_prompts: dict={}) -> Tuple[str, int]
Determines the next tool to call or completion in a task workflow.
Parameters:
chat_mdl— Chat model instance.history— Conversation history.tools_description— Tools available.task_desc— Task description string.user_defined_prompts— Template overrides.
Returns: Tuple of (JSON string with next step, token count).
reflect(chat_mdl, history: list, tool_call_res: list[Tuple], user_defined_prompts: dict={}) -> str
Reflects on prior tool calls and history to generate observations and reflections.
Parameters:
chat_mdl— Chat model.history— Chat history.tool_call_res— List of tuples (tool_name, result).user_defined_prompts— Template overrides.
Returns: Formatted string with observations and reflection text.
form_message(system_prompt: str, user_prompt: str) -> list
Forms a standardized message list with system and user prompts.
Returns: List of two dict messages.
tool_call_summary(chat_mdl, name: str, params: dict, result: str, user_defined_prompts: dict={}) -> str
Generates a summary of a tool call result for memory or logging.
Parameters:
chat_mdl— Chat model instance.name— Tool name.params— Parameters used in tool call.result— Result string.user_defined_prompts— Template overrides.
Returns: Summary string.
rank_memories(chat_mdl, goal: str, sub_goal: str, tool_call_summaries: list[str], user_defined_prompts: dict={}) -> str
Ranks multiple memory summaries according to goal relevance.
Parameters:
chat_mdl— Chat model.goal— Main goal string.sub_goal— Sub goal string.tool_call_summaries— List of summary strings.user_defined_prompts— Template overrides.
Returns: Ranking output string.
gen_meta_filter(chat_mdl, meta_data: dict, query: str) -> list
Generates metadata filters applicable to a user query.
Parameters:
chat_mdl— Chat model.meta_data— Dictionary of metadata keys.query— User question string.
Returns: List of generated filter strings or empty list on failure.
Important Implementation Details and Algorithms
Token Counting and Message Trimming:
Themessage_fit_infunction carefully counts tokens across messages and attempts multiple heuristics to trim or prioritize system and last user messages to fit within token limits imposed by the LLM APIs.Prompt Rendering with Jinja2:
The module uses a globally configured Jinja2 environment (PROMPT_JINJA_ENV) to render prompt templates dynamically, allowing for flexible prompt customization.Robust JSON Handling:
Outputs from LLMs that are expected to be JSON are parsed with thejson_repairlibrary to recover from common formatting errors, using fallback heuristics if direct parsing fails.Tool Schema Construction:
Tools are described in a JSON schema format embedded in Markdown, enabling LLMs to understand available functions and their parameters, facilitating function calling workflows.
Interactions with Other System Components
Chat Models (
chat_mdl):
This file expects a chat model abstraction exposing a.chat()method to send prompts and receive responses. This abstraction is linked with services such asLLMBundlethat manage specific LLM types and tenants.Prompt Templates:
Loads prompt templates viaload_promptfromrag.prompts.prompt_template.Database Services:
ImportsDocumentServicefromapi.db.services.document_serviceto fetch document metadata related to knowledge chunks.Utility Modules:
Uses utilities such asnum_tokens_from_stringfor token counting, andencoderfor encoding/decoding text to tokens.JSON Repair:
Uses thejson_repairlibrary to handle and recover malformed JSON responses from LLMs.
Visual Diagram: Class / Function Structure
This file is a utility module primarily composed of standalone functions managing prompts and message formatting. Below is a flowchart representing the main functional relationships:
flowchart TD
A[get_value]
B[chunks_format]
C[message_fit_in]
D[kb_prompt]
E[citation_prompt]
F[citation_plus]
G[keyword_extraction]
H[question_proposal]
I[full_question]
J[cross_languages]
K[content_tagging]
L[vision_llm_describe_prompt]
M[vision_llm_figure_describe_prompt]
N[tool_schema]
O[form_history]
P[analyze_task]
Q[next_step]
R[reflect]
S[form_message]
T[tool_call_summary]
U[rank_memories]
V[gen_meta_filter]
C --> G
C --> H
C --> K
C --> T
C --> U
D --> DocumentService[DocumentService]
I --> LLMBundle[LLMBundle]
J --> LLMBundle
P --> N
Q --> N
R --> C
T --> C
U --> C
V --> chat_mdl[chat_mdl]
style chat_mdl fill:#f9f,stroke:#333,stroke-width:1px
style LLMBundle fill:#bbf,stroke:#333,stroke-width:1px
style DocumentService fill:#bbf,stroke:#333,stroke-width:1px
Summary
prompts.py is a comprehensive prompt engineering utility module designed to support complex LLM-based workflows by managing prompt templates, message formatting, token limits, and interactions with chat models. It enables flexible, dynamic generation of prompts for a variety of tasks such as keyword extraction, content tagging, multilingual translation, task analysis, and reflection, while providing robust handling of LLM responses and integration with other system components like document services and LLM bundles.
This module is essential for orchestrating how the InfiniFlow system communicates and reasons via large language models.