prompts.py


Overview

prompts.py is a utility and prompt management module within the InfiniFlow system designed to facilitate the construction, formatting, and management of natural language prompts for interaction with large language models (LLMs). It provides functionality for generating, formatting, and managing prompts related to tasks such as keyword extraction, question proposal, content tagging, multi-language translation, task analysis, next step planning, reflection, and summarization.

The file leverages Jinja2 templating to dynamically render prompt templates, integrates token counting and message length fitting to comply with LLM input limits, and wraps interaction patterns with LLM chat models.

Key features include:

This module is central to how InfiniFlow constructs and manages prompt engineering workflows that drive LLM-based reasoning and task execution.


Detailed Explanation of Functions

Utility Functions


get_value(d: dict, k1: str, k2: str) -> any

Returns the value for key k1 in dictionary d, or if not found, returns the value for key k2.

value = get_value(chunk, "chunk_id", "id")

chunks_format(reference: dict) -> list[dict]

Formats a "reference" dictionary containing chunks of knowledge into a normalized list of chunk dictionaries with consistent keys.

formatted_chunks = chunks_format(reference)

message_fit_in(msg: list[dict], max_length: int=4000) -> Tuple[int, list[dict]]

Truncates or filters a list of message dicts (with role and content) to fit under a specified token limit.

token_count, trimmed_msg = message_fit_in(messages, max_length=3000)

kb_prompt(kbinfos: dict, max_tokens: int, hash_id: bool=False) -> list[str]

Formats knowledge base chunks into a prompt-friendly list of strings, each representing a chunk with metadata and content.

knowledge_strings = kb_prompt(kbinfos, max_tokens=2000)

Prompt Rendering Functions

These functions render specific prompt templates using Jinja2 and interact with chat models.


citation_prompt(user_defined_prompts: dict={}) -> str

Renders the citation prompt template, optionally overridden by user-defined prompts.

prompt_str = citation_prompt()

citation_plus(sources: str) -> str

Renders an enhanced citation prompt including example citations and provided sources.

prompt_str = citation_plus("Source 1, Source 2")

keyword_extraction(chat_mdl, content: str, topn: int=3) -> str

Uses a chat model to extract the top N keywords from content.

keywords = keyword_extraction(chat_model, "The quick brown fox jumps...", topn=5)

question_proposal(chat_mdl, content: str, topn: int=3) -> str

Proposes questions based on given content, similar to keyword extraction.

questions = question_proposal(chat_model, "Some article content...")

full_question(tenant_id=None, llm_id=None, messages=[], language=None, chat_mdl=None) -> str

Generates a full question prompt based on a conversation history.

question = full_question(tenant_id=1, llm_id="gpt-4", messages=chat_history)

cross_languages(tenant_id, llm_id, query, languages=[]) -> str

Translates or processes a query across multiple languages using a chat model.

translations = cross_languages(1, "gpt-4", "Hello world", ["fr", "de"])

content_tagging(chat_mdl, content: str, all_tags: list, examples: list, topn: int=3) -> dict

Tags content using a chat model and returns a dictionary of tag counts.

tags = content_tagging(chat_model, article_text, all_tags_list, example_data)

vision_llm_describe_prompt(page=None) -> str

Generates prompt for vision LLM describing a page.


vision_llm_figure_describe_prompt() -> str

Generates a prompt for vision LLM to describe a figure.


Tool and Task Management Functions


tool_schema(tools_description: list[dict], complete_task=False) -> str

Formats tool descriptions into a Markdown + JSON schema string for LLM consumption.

schema_str = tool_schema(tools_list, complete_task=True)

form_history(history: list, limit: int = -6) -> str

Formats recent chat history messages into a textual context string.


analyze_task(chat_mdl, prompt: str, task_name: str, tools_description: list[dict], user_defined_prompts: dict={}) -> str

Analyzes a task description and tools using the chat model.


next_step(chat_mdl, history: list, tools_description: list[dict], task_desc: str, user_defined_prompts: dict={}) -> Tuple[str, int]

Determines the next tool to call or completion in a task workflow.


reflect(chat_mdl, history: list, tool_call_res: list[Tuple], user_defined_prompts: dict={}) -> str

Reflects on prior tool calls and history to generate observations and reflections.


form_message(system_prompt: str, user_prompt: str) -> list

Forms a standardized message list with system and user prompts.


tool_call_summary(chat_mdl, name: str, params: dict, result: str, user_defined_prompts: dict={}) -> str

Generates a summary of a tool call result for memory or logging.


rank_memories(chat_mdl, goal: str, sub_goal: str, tool_call_summaries: list[str], user_defined_prompts: dict={}) -> str

Ranks multiple memory summaries according to goal relevance.


gen_meta_filter(chat_mdl, meta_data: dict, query: str) -> list

Generates metadata filters applicable to a user query.


Important Implementation Details and Algorithms


Interactions with Other System Components


Visual Diagram: Class / Function Structure

This file is a utility module primarily composed of standalone functions managing prompts and message formatting. Below is a flowchart representing the main functional relationships:

flowchart TD
    A[get_value]
    B[chunks_format]
    C[message_fit_in]
    D[kb_prompt]
    E[citation_prompt]
    F[citation_plus]
    G[keyword_extraction]
    H[question_proposal]
    I[full_question]
    J[cross_languages]
    K[content_tagging]
    L[vision_llm_describe_prompt]
    M[vision_llm_figure_describe_prompt]
    N[tool_schema]
    O[form_history]
    P[analyze_task]
    Q[next_step]
    R[reflect]
    S[form_message]
    T[tool_call_summary]
    U[rank_memories]
    V[gen_meta_filter]

    C --> G
    C --> H
    C --> K
    C --> T
    C --> U
    D --> DocumentService[DocumentService]
    I --> LLMBundle[LLMBundle]
    J --> LLMBundle
    P --> N
    Q --> N
    R --> C
    T --> C
    U --> C
    V --> chat_mdl[chat_mdl]

    style chat_mdl fill:#f9f,stroke:#333,stroke-width:1px
    style LLMBundle fill:#bbf,stroke:#333,stroke-width:1px
    style DocumentService fill:#bbf,stroke:#333,stroke-width:1px

Summary

prompts.py is a comprehensive prompt engineering utility module designed to support complex LLM-based workflows by managing prompt templates, message formatting, token limits, and interactions with chat models. It enables flexible, dynamic generation of prompts for a variety of tasks such as keyword extraction, content tagging, multilingual translation, task analysis, and reflection, while providing robust handling of LLM responses and integration with other system components like document services and LLM bundles.

This module is essential for orchestrating how the InfiniFlow system communicates and reasons via large language models.