deep_research.py
Overview
The deep_research.py file defines the DeepResearcher class, a core component designed to perform iterative, multi-step reasoning and information retrieval to answer complex questions. It integrates with multiple knowledge sources—including knowledge bases, knowledge graphs, and external web retrieval services—to gather relevant information based on dynamically generated search queries. The class uses a large language model (LLM) interface to generate reasoning steps, extract search queries, and summarize retrieved information in a conversational, streaming manner.
Key features include:
Iterative reasoning with automatic query generation and result analysis.
Multi-source information retrieval (knowledge base, knowledge graph, web).
Management of retrieved chunks and document aggregations with deduplication.
Streaming output of intermediate reasoning states for interactive applications.
Tag-based content extraction and cleaning to handle templated prompts and results.
This file is likely a central component in a system that supports deep, explainable question answering or agentic reasoning workflows.
Classes and Methods
Class: DeepResearcher
The main class responsible for orchestrating deep, multi-step research via reasoning and retrieval.
Initialization
def __init__(self,
chat_mdl: LLMBundle,
prompt_config: dict,
kb_retrieve: partial = None,
kg_retrieve: partial = None)
Parameters:
chat_mdl(LLMBundle): An instance of a language model wrapper used for chat-based interactions and reasoning generation.prompt_config(dict): Configuration dictionary including API keys and feature flags.kb_retrieve(partial, optional): Partial function for knowledge base retrieval given a query.kg_retrieve(partial, optional): Partial function for knowledge graph retrieval given a query.
Purpose:
Sets up the researcher with the LLM, prompt configurations, and retrieval functions.
Static and Helper Methods
_remove_tags
def _remove_tags(text: str, start_tag: str, end_tag: str) -> str
Parameters:
text(str): Input text containing tags.start_tag(str): Starting delimiter tag.end_tag(str): Ending delimiter tag.
Returns:
str- Text with all substrings betweenstart_tagandend_tag(inclusive) removed.Description:
General utility method to strip out tagged sections from text using regex.
_remove_query_tags
@staticmethod
def _remove_query_tags(text: str) -> str
Parameters:
text(str): Text possibly containing search query tags.Returns:
Text with all search query tags (defined byBEGIN_SEARCH_QUERYandEND_SEARCH_QUERY) removed.Usage:
Used to clean reasoning output by removing query-specific markup before concatenation or display.
_remove_result_tags
@staticmethod
def _remove_result_tags(text: str) -> str
Parameters:
text(str): Text possibly containing search result tags.Returns:
Text with all search result tags (defined byBEGIN_SEARCH_RESULTandEND_SEARCH_RESULT) removed.Usage:
Cleans extracted results for clearer presentation or further processing.
_generate_reasoning
def _generate_reasoning(self, msg_history)
Parameters:
msg_history(list): List of chat message dictionaries representing conversation history.Yields:
Streaming strings representing the LLM-generated reasoning content.Description:
Invokes the LLM to continue reasoning based on conversation history. Handles insertion of a prompting user message if last role is not "user". Filters out any prefix content before the</think>marker in the LLM response.Example Usage:
for reasoning_step in deep_researcher._generate_reasoning(msg_history): print(reasoning_step)
_extract_search_queries
def _extract_search_queries(self, query_think, question, step_index)
Parameters:
query_think(str): The reasoning text containing embedded search queries.question(str): The original user question.step_index(int): Current reasoning step index.
Returns:
list[str]- Extracted search queries enclosed betweenBEGIN_SEARCH_QUERYandEND_SEARCH_QUERY.
If no queries found on the first step, returns a list containing the original question.Description:
Parses reasoning output to find embedded search queries for information retrieval.
_truncate_previous_reasoning
def _truncate_previous_reasoning(self, all_reasoning_steps)
Parameters:
all_reasoning_steps(list[str]): List of all prior reasoning step strings.Returns:
str- A truncated string representing a summarized or condensed version of previous reasoning steps, maintaining important tagged steps and limiting length.Purpose:
Prevents excessive prompt length when feeding prior reasoning into the LLM by selectively including key steps and inserting ellipses (...) to indicate omitted content.
_retrieve_information
def _retrieve_information(self, search_query)
Parameters:
search_query(str): A query string for information retrieval.Returns:
dict- A dictionary with keys"chunks"(list of content chunks) and"doc_aggs"(list of document aggregations) containing retrieved information.Description:
Retrieves relevant information from multiple sources:Knowledge base via
self._kb_retrieve.Web via the Tavily API if configured.
Knowledge graph via
self._kg_retrieveif enabled.
Error Handling:
Logs errors without interrupting the retrieval process.
_update_chunk_info
def _update_chunk_info(self, chunk_info, kbinfos)
Parameters:
chunk_info(dict): Existing chunk info dictionary to update.kbinfos(dict): Newly retrieved info dictionary.
Returns:
None (updateschunk_infoin place).Purpose:
Merges newly retrieved chunks and document aggregations into existing data, avoiding duplicates based onchunk_idanddoc_id.
_extract_relevant_info
def _extract_relevant_info(self, truncated_prev_reasoning, search_query, kbinfos)
Parameters:
truncated_prev_reasoning(str): Condensed prior reasoning text.search_query(str): The current search query string.kbinfos(dict): Retrieved content chunks and documents.
Yields:
Streaming strings of summarized relevant information extracted by the LLM.Description:
Uses the LLM to analyze retrieved documents against the search query and prior reasoning, extracting useful information to progress reasoning.
thinking
def thinking(self, chunk_info: dict, question: str)
Parameters:
chunk_info(dict): Dictionary accumulating retrieved chunks and document aggregates. Modified in place.question(str): The user's original question.
Yields:
Streaming dictionaries containing partial or final reasoning outputs, in the form:{ "answer": str, # The current reasoning text including <think> tags "reference": dict, # Placeholder for citation/reference data (currently empty) "audio_binary": None # Placeholder (currently None) }Description:
The core iterative workflow method that performs the following in a loop until a maximum step count (MAX_SEARCH_LIMIT) or stopping condition:Generate reasoning steps via LLM.
Extract search queries from reasoning.
For each query, retrieve relevant information.
Update local chunk info with retrieved data.
Extract and summarize relevant results.
Stream intermediate results and reasoning states.
Important Details:
Prevents duplicate searches by tracking executed queries.
Truncates previous reasoning to keep prompts manageable.
Stops if no new queries are found or maximum steps are reached.
Streaming yields enable interactive or asynchronous consumption of reasoning progress.
Example Usage:
chunk_info = {"chunks": [], "doc_aggs": []} question = "What are the latest advances in quantum computing?" for step in deep_researcher.thinking(chunk_info, question): print(step["answer"]) # Stream reasoning progress and final answer
Implementation Details and Algorithms
Tag-Based Text Processing:
The class uses explicit begin/end tags (e.g.,BEGIN_SEARCH_QUERY,END_SEARCH_QUERY) embedded in LLM outputs to segment reasoning into queries and results. Methods_remove_tags,_remove_query_tags, and_remove_result_tagsclean these tags when presenting or processing content.Iterative Reasoning and Retrieval:
Thethinkingmethod implements a loop capped byMAX_SEARCH_LIMIT, where each iteration:Generates reasoning steps from conversation history.
Extracts search queries from reasoning output.
Retrieves multi-source data for each query.
Summarizes relevant info and adds it back into the reasoning context.
This cycle enables a form of agentic reasoning, where the LLM acts as an internal researcher dynamically querying and analyzing evidence.
Streaming Interface:
Both reasoning generation and relevant info extraction usechat_streamlymethod of the LLM bundle, yielding partial outputs as they're generated. This design supports real-time UI updates or progressive answer building.Deduplication of Retrieved Content:
When merging newly retrieved chunks and documents into existing collections, duplicates are avoided by checking unique IDs (chunk_id,doc_id).Truncation Strategy for Prompt Management:
To keep the prompt size manageable for the LLM,_truncate_previous_reasoningselectively keeps the first step, the last four steps, and any steps containing query/result tags, inserting ellipses otherwise.Integration with External APIs and Services:
Uses a knowledge base retrieval function possibly backed by an internal database or vector store.
Optionally integrates with Tavily API for web-based chunk retrieval if
tavily_api_keyis configured.Optionally queries a knowledge graph if enabled and a retrieval function is provided.
Interaction with Other System Components
LLMBundle (
chat_mdl)
The class heavily depends on an LLM interface (LLMBundle) for generating reasoning, extracting relevant info, and parsing queries. This bundle is likely an abstraction over OpenAI or similar LLM APIs customized for chat streaming.Retrieval Services
Knowledge Base retrieval (
kb_retrieve): Expected to be a function returning document chunks and metadata given a query.Knowledge Graph retrieval (
kg_retrieve): Optional function for querying a knowledge graph.Tavily Web Retrieval: Utilizes the
Tavilyclass for web search and chunk retrieval, integrated via the API key in prompt configuration.
Prompts and Constants
Uses predefined prompt templates and tag constants (BEGIN_SEARCH_QUERY,END_SEARCH_QUERY, etc.) imported fromagentic_reasoning.promptsandrag.prompts.Utility Functions
Usesextract_betweenfromrag.nlpfor substring extraction between tags andkb_promptfromrag.promptsto format knowledge base chunks.
Visual Diagram
classDiagram
class DeepResearcher {
-chat_mdl: LLMBundle
-prompt_config: dict
-_kb_retrieve: partial
-_kg_retrieve: partial
+__init__(chat_mdl, prompt_config, kb_retrieve=None, kg_retrieve=None)
+thinking(chunk_info: dict, question: str)
-_generate_reasoning(msg_history)
-_extract_search_queries(query_think, question, step_index)
-_truncate_previous_reasoning(all_reasoning_steps)
-_retrieve_information(search_query)
-_update_chunk_info(chunk_info, kbinfos)
-_extract_relevant_info(truncated_prev_reasoning, search_query, kbinfos)
+_remove_query_tags(text: str)
+_remove_result_tags(text: str)
-_remove_tags(text: str, start_tag: str, end_tag: str)
}
DeepResearcher ..> LLMBundle : uses
DeepResearcher ..> "partial" : kb_retrieve, kg_retrieve functions
DeepResearcher ..> Tavily : for web retrieval
Summary
The deep_research.py file encapsulates a sophisticated reasoning and retrieval engine that incrementally builds answers to complex questions by:
Leveraging LLMs for reasoning and query generation.
Querying multiple data sources.
Summarizing and integrating findings.
Streaming intermediate results for responsiveness.
It is designed for extensibility (plugging in different retrieval backends) and real-time interaction, making it suitable for advanced AI assistants, research bots, or explainable AI systems.