prompts.py

Overview

The prompts.py file defines string constants, plus one integer limit, that serve as prompt templates and delimiter tokens for an AI-driven reasoning and information extraction system within the InfiniFlow project. They give an AI agent structured instructions for multi-step question answering that relies on iterative web search and fact extraction.

The file defines no classes or functions; it only supplies the text templates and tokens that other parts of the system use to orchestrate reasoning workflows. It contains:

Markers for delimiting search queries and results within interaction logs.
A reasoning prompt instructing the AI how to break down questions, perform searches step-by-step, and synthesize final answers.
An extraction prompt that guides a downstream module to distill a precise fact from search results.

The design supports a chain-of-thought style interaction with external search tools, enabling multi-hop question answering and precise information retrieval.

Detailed Explanation of Constants and Prompt Templates

Special Tokens and Limits

Constant	Description
`BEGIN_SEARCH_QUERY`	Token marking the start of a search query issued by the AI.
`END_SEARCH_QUERY`	Token marking the end of the search query.
`BEGIN_SEARCH_RESULT`	Token marking the start of the returned search results from the system.
`END_SEARCH_RESULT`	Token marking the end of the search results.
`MAX_SEARCH_LIMIT`	Integer limit (6) indicating the maximum number of search queries the AI is allowed to issue.

These tokens are used to clearly delimit search queries and results embedded within prompt-response cycles, helping the system parse and separate actions from observations.
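
As a minimal sketch, the constants in the table could be declared as follows. The token strings are the ones quoted elsewhere in this document; the two helper functions are hypothetical and only illustrate how the delimiters and the limit might be used by calling code.

```python
# Token strings as quoted in this document; the real file may differ in detail.
BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
END_SEARCH_QUERY = "<|end_search_query|>"
BEGIN_SEARCH_RESULT = "<|begin_search_result|>"
END_SEARCH_RESULT = "<|end_search_result|>"
MAX_SEARCH_LIMIT = 6


def wrap_search_result(text: str) -> str:
    """Hypothetical helper: delimit an observation before appending it to the log."""
    return f"\n\n{BEGIN_SEARCH_RESULT}{text}{END_SEARCH_RESULT}\n\n"


def search_budget_left(queries_issued: int) -> bool:
    """Hypothetical helper: True while another search query is still allowed."""
    return queries_issued < MAX_SEARCH_LIMIT
```

Keeping the delimiters in named constants means the prompt text and the parsing code can refer to the same values instead of repeating string literals.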

`REASON_PROMPT`

A multi-line string prompt that instructs the reasoning AI agent on how to approach answering user questions by decomposing them, querying a search system, analyzing results, and synthesizing a final answer.

Key elements:

Task Description: The AI is described as an "advanced reasoning agent" that must break down questions into verifiable steps.
Stepwise Instructions:
1. Analyze the question.
2. Issue search queries enclosed in special tokens.
3. Review search results.
4. Repeat until sufficient facts are collected.
5. Synthesize and give the final answer.
Tool Usage: Emphasizes that the agent MUST wrap every search query in the search query tokens to invoke the search tool.
Search Limit: Specifies the maximum allowed search queries.
Example Workflows: Two detailed examples demonstrate:
- Multi-hop question answering (with multiple dependent searches).
- Simple fact retrieval (with a single or few queries).
Important Rules:
- Decompose the question so that each query targets one fact.
- Formulate precise queries.
- Provide final answers only after all necessary searches.
- Maintain language consistency.

Usage Example

An AI agent receiving this prompt would:

Parse the user question.

Frame a search query like:

<|begin_search_query|>who founded craigslist?<|end_search_query|>

Receive results delimited by <|begin_search_result|> and <|end_search_result|>.
Extract facts, and iterate with further queries as needed.
Finally, produce a concise answer.
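
The calling code has to pull the query back out of the model's output. One way to do that, sketched here under the assumption that the tokens appear literally in the generated text, is a non-greedy regular expression. The function name `extract_search_queries` is hypothetical.

```python
import re

BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
END_SEARCH_QUERY = "<|end_search_query|>"

# re.escape is required because "|" is a regex metacharacter.
_QUERY_RE = re.compile(
    re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY),
    re.DOTALL,
)


def extract_search_queries(model_output: str) -> list[str]:
    """Return every non-empty query found between the query tokens, in order."""
    return [q.strip() for q in _QUERY_RE.findall(model_output) if q.strip()]


output = (
    "I need the founder first. "
    "<|begin_search_query|>who founded craigslist?<|end_search_query|>"
)
print(extract_search_queries(output))  # ['who founded craigslist?']
```

Output without a well-formed token pair yields an empty list, which the caller can treat as "no search requested".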

`RELEVANT_EXTRACTION_PROMPT`

A prompt template used by an information extraction module that processes search results to extract the single most relevant fact answering a specific query.

Key elements:

Task Description: Extract the most relevant, concise fact from provided web pages that answers the current search query.
Focus: Ignore previous reasoning steps except for context; concentrate solely on the current search query.
Output Formats:
- If relevant information is found, respond starting with Final Information followed by the concise fact.
- If no relevant information is found, respond with Final Information followed by the phrase No helpful information found.
Contextual Inputs:
- Previous Reasoning Steps: Contextual background (for information only).
- Current Search Query: The query to be answered.
- Searched Web Pages: The text from which to extract the fact.
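
How the three inputs reach the template is not shown in this document. One plausible approach is `str.format` with named placeholders. The template text and the placeholder names (`prev_reasoning`, `search_query`, `document`) below are illustrative assumptions, not the contents of the real constant.

```python
# Illustrative stand-in for the real template; wording and placeholder names are assumed.
EXTRACTION_TEMPLATE = (
    "Previous Reasoning Steps:\n{prev_reasoning}\n\n"
    "Current Search Query:\n{search_query}\n\n"
    "Searched Web Pages:\n{document}\n"
)


def build_extraction_prompt(prev_reasoning: str, search_query: str, document: str) -> str:
    """Fill the three contextual inputs into the (assumed) template."""
    return EXTRACTION_TEMPLATE.format(
        prev_reasoning=prev_reasoning,
        search_query=search_query,
        document=document,
    )


prompt = build_extraction_prompt(
    prev_reasoning="Step 1: identify the director.",
    search_query="Where is Martin Campbell from?",
    document="Martin Campbell ... is a New Zealand film and television director.",
)
```

Braces inside the substituted values are safe, because `str.format` does not re-parse them. Literal braces in the template itself would need to be doubled (`{{` and `}}`).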

Usage Example

Given:

Current Search Query: "Where is Martin Campbell from?"
Searched Web Pages: Contains a sentence "Martin Campbell ... is a New Zealand film and television director."

The module should output:

Final Information
Martin Campbell is a New Zealand film and television director.
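
Because the output always begins with the same marker, the caller can separate the fact from the failure notice with plain string handling. The parser below is a sketch under that assumption. The function name and the tolerance for surrounding `*` or `:` characters are choices made for this example only.

```python
from typing import Optional

FINAL_MARKER = "Final Information"
NO_INFO = "No helpful information found"


def parse_extraction_output(response: str) -> Optional[str]:
    """Return the extracted fact, or None if the marker is absent or nothing helpful was found."""
    _, marker, tail = response.partition(FINAL_MARKER)
    if not marker:
        return None  # the model did not follow the output format
    # Drop whitespace and any emphasis or colon characters around the fact.
    fact = tail.strip(" \t\r\n*:")
    if not fact or fact.lower().startswith(NO_INFO.lower()):
        return None
    return fact


reply = "Final Information\nMartin Campbell is a New Zealand film and television director."
print(parse_extraction_output(reply))
```

Returning `None` for both a missing marker and the explicit failure phrase lets the caller handle "no usable fact" in one place.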

Important Implementation Details and Algorithms

Prompt Engineering: The prompts pair explicit, structured instructions with worked examples so that an AI model can carry out multi-hop search and reasoning.
Token Delimitation: Use of unique tokens (<|begin_search_query|>, etc.) to mark boundaries between queries, results, and reasoning steps is crucial for parsing and controlling the flow of information.
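Stepwise Reasoning: The reasoning prompt instructs the model to work iteratively, states the maximum number of search queries, and encourages decomposition of complex questions into simpler facts. Because a prompt is only an instruction, any hard cap on searches has to be checked by the calling code.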
Information Extraction Logic: The extraction prompt mandates a strict output format, enabling deterministic parsing of extracted facts or failure notices.

Interaction with Other Parts of the System

Search System Integration: The special tokens in this file are used to interface with an external or internal search engine that retrieves web page snippets or documents.
AI Reasoning Agent: The REASON_PROMPT guides the main reasoning AI component, telling it how to utilize search capabilities and when to produce answers.
Information Extraction Module: The RELEVANT_EXTRACTION_PROMPT is used by a dedicated extraction subcomponent to parse search results and identify precise answers.
Overall Flow:
1. User question → Reasoning agent (with REASON_PROMPT)
2. Reasoning agent issues queries (delimited by tokens)
3. Search system returns results (delimited by tokens)
4. Extraction module processes results (using RELEVANT_EXTRACTION_PROMPT)
5. Reasoning agent synthesizes final answer.

This modular design supports extensibility, allowing the system to plug in different search backends or reasoning engines by consistently using these prompts and tokens.
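
The flow above can be sketched as a small loop in which the reasoning model, the search backend, and the extraction step are passed in as callables. Everything here is an assumption made for illustration: the function `run_reasoning_loop`, its parameter names, the fallback message, and the canned stubs are not taken from the project.

```python
import re
from typing import Callable

BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
END_SEARCH_QUERY = "<|end_search_query|>"
BEGIN_SEARCH_RESULT = "<|begin_search_result|>"
END_SEARCH_RESULT = "<|end_search_result|>"
MAX_SEARCH_LIMIT = 6

QUERY_RE = re.compile(
    re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY), re.DOTALL
)


def run_reasoning_loop(
    question: str,
    reason: Callable[[str], str],             # hypothetical: transcript -> next model output
    search: Callable[[str], str],             # hypothetical: query -> raw page text
    extract: Callable[[str, str, str], str],  # hypothetical: (transcript, query, pages) -> fact
) -> str:
    transcript = question
    # One extra pass lets the model answer after its last allowed search.
    for searches_done in range(MAX_SEARCH_LIMIT + 1):
        step = reason(transcript)
        transcript += "\n" + step
        match = QUERY_RE.search(step)
        if match is None:
            return step  # no query issued, so treat the step as the final answer
        if searches_done == MAX_SEARCH_LIMIT:
            break  # the model asked for more searches than allowed
        query = match.group(1).strip()
        fact = extract(transcript, query, search(query))
        transcript += f"\n{BEGIN_SEARCH_RESULT}{fact}{END_SEARCH_RESULT}"
    return "Search limit reached without a final answer."


# Canned stubs standing in for a real model and search backend.
def fake_reason(transcript: str) -> str:
    if BEGIN_SEARCH_RESULT in transcript:
        return "Craig Newmark founded craigslist."
    return f"{BEGIN_SEARCH_QUERY}who founded craigslist?{END_SEARCH_QUERY}"


def fake_search(query: str) -> str:
    return "Craigslist was founded by Craig Newmark in 1995."


def fake_extract(transcript: str, query: str, pages: str) -> str:
    return pages


answer = run_reasoning_loop("Who founded craigslist?", fake_reason, fake_search, fake_extract)
print(answer)  # Craig Newmark founded craigslist.
```

Passing the three components as callables mirrors the extensibility point made above: a different search backend or model can be swapped in without touching the loop or the tokens.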

Mermaid Diagram: File Structure and Prompt Relationships

flowchart TD
    A[User Question Input] --> B[Reasoning Agent]
    B -->|Issues Search Query| C[Search System]
    C -->|Returns Search Results| D[Extraction Module]
    D -->|Extracted Fact| B
    B -->|Final Answer| E[User]

    subgraph prompts.py Content
        direction TB
        REASON_PROMPT["REASON_PROMPT\n- Instructions for multi-hop reasoning\n- Search query/result token usage\n- Examples and rules"]
        EXTRACTION_PROMPT["RELEVANT_EXTRACTION_PROMPT\n- Instructions for fact extraction\n- Output formatting rules"]
        TOKENS["Special Tokens & Limits\n- BEGIN/END_SEARCH_QUERY\n- BEGIN/END_SEARCH_RESULT\n- MAX_SEARCH_LIMIT"]
    end

    B --> REASON_PROMPT
    D --> EXTRACTION_PROMPT
    B --> TOKENS
    C --> TOKENS

Summary

prompts.py defines the prompt templates and delimiter tokens that the InfiniFlow AI system uses to orchestrate multi-step reasoning and fact extraction through iterative web search. It contains no executable logic; it provides the natural language scaffolding that lets AI components call search tools and extract precise answers. The result is a modular, explainable, and controlled workflow for question answering applications.