duckduckgo.py
Overview
The duckduckgo.py file provides an integration component for performing search queries using the DuckDuckGo search engine through the duckduckgo_search Python library. It is designed as a tool within a larger agent or AI assistant framework (likely InfiniFlow), enabling privacy-focused web and news searches. The component supports configurable parameters, such as search query keywords and search channel (general web or news), and returns structured search results for downstream consumption.
Key features:
Executes privacy-preserving search queries via DuckDuckGo.
Supports two search channels: general (web search) and news (news articles).
Limits results to a configurable number (top_n).
Implements retry and error handling logic.
Formats and exposes search results in JSON and formalized content form.
Integrates with the broader agent tool infrastructure through base classes and parameter validation.
Classes, Methods, and Functions
Class: DuckDuckGoParam
Defines and validates input parameters specific to the DuckDuckGo search component.
Description
DuckDuckGoParam extends the base parameter class ToolParamBase to specify the configuration schema and validation for the DuckDuckGo search tool.
Properties
meta: ToolMeta. Metadata describing the component, including name, description, and parameter schema.
Parameters include:
query (string, required): Keywords for the search query.
channel (string, optional): Search category; either "general" or "news".
top_n: int. Number of top search results to retrieve. Default is 10.
channel: str. Internal representation of the search channel. Default is "text" (likely an internal mapping, overridden at runtime).
Methods
__init__(self): Initializes the parameters, sets default values, and configures metadata.
check(self) -> None: Validates parameter values:
Ensures top_n is a positive integer.
Ensures channel is either "text" or "news".
get_input_form(self) -> dict[str, dict]: Returns a dictionary describing the UI/input form layout expected for this tool's parameters:
{
  "query": { "name": "Query", "type": "line" },
  "channel": { "name": "Channel", "type": "options", "value": "general", "options": ["general", "news"] }
}
Usage Example
param = DuckDuckGoParam()
param.top_n = 5
param.channel = "news"
param.check() # Validates parameters
input_form = param.get_input_form()
print(input_form)
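The validation rules listed for check() can be illustrated with a small standalone function. This is only a sketch of the two documented rules: the function name check_params and the use of ValueError are assumptions made for illustration, and the real method may rely on helpers inherited from ToolParamBase instead.

```python
def check_params(top_n, channel):
    """Raise ValueError if the values break the two documented rules.

    Hypothetical helper: mirrors the rules described for check(),
    not the actual implementation.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_n, bool) or not isinstance(top_n, int) or top_n <= 0:
        raise ValueError(f"top_n must be a positive integer, got {top_n!r}")
    # The documented internal channel values are "text" and "news".
    if channel not in ("text", "news"):
        raise ValueError(f"channel must be 'text' or 'news', got {channel!r}")


check_params(5, "news")  # valid values: returns None without raising
```

Note that the form exposes "general" while the validator accepts "text"; under the rules as documented, passing "general" directly to such a validator would be rejected.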
Class: DuckDuckGo
The main DuckDuckGo search tool component. It extends ToolBase and also derives from ABC.
Description
DuckDuckGo encapsulates the logic to perform DuckDuckGo searches, handle retries, parse results, and output them in usable formats.
Class Attributes
component_name = "DuckDuckGo"Identifies the component name within the agent/tool framework.
Methods
_invoke(self, **kwargs) -> str: Core method that performs the search query.
Parameters:
query (str, required): Search keywords.
topic (str, optional): Channel type, either "general" or "news". Defaults to "general".
Behavior:
If no query is provided, sets an empty output and returns empty string.
Attempts the search up to max_retries + 1 times.
Depending on the topic, calls either:
DDGS().text() for general text/web search.
DDGS().news() for news search.
Processes results by extracting title, URL, and content.
Stores results as JSON output and also provides a "formalized_content" output.
Implements delay and logging on exceptions.
Returns error message if all retries fail.
Return Value:
On success: formatted string content representing search results.
On failure: error message string.
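The channel selection and result extraction described above can be sketched without network access. Everything here is illustrative: FakeClient is a hypothetical stand-in for DDGS, and search and normalize are made-up helper names, not functions from duckduckgo.py. The sketch assumes text results carry the link under "href" and news results under "url".

```python
def normalize(results):
    """Map raw result dicts to uniform title/url/content records."""
    return [
        {
            "title": r.get("title", ""),
            # Assumption: text results use "href", news results use "url".
            "url": r.get("href") or r.get("url", ""),
            "content": r.get("body", ""),
        }
        for r in results
    ]


def search(client, query, topic="general", top_n=10):
    """Dispatch to the text or news method depending on the topic."""
    if not query:
        return []  # no query: nothing to search for
    fetch = client.text if topic == "general" else client.news
    return normalize(fetch(query, max_results=top_n))


class FakeClient:
    """Hypothetical offline stand-in for DDGS with canned results."""

    def text(self, query, max_results=10):
        return [{"title": "T", "href": "https://example.com/a", "body": "B"}][:max_results]

    def news(self, query, max_results=10):
        return [{"title": "N", "url": "https://example.com/n", "body": "NB"}][:max_results]


print(search(FakeClient(), "latest AI news", topic="news"))
```

Keeping normalization separate from dispatch means both channels feed the same downstream format even though their raw keys differ.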
Usage Example:
duck_tool = DuckDuckGo()
duck_tool._param = DuckDuckGoParam()
duck_tool._param.top_n = 5
output = duck_tool._invoke(query="latest AI news", topic="news")
print(output)
thoughts(self) -> str: Returns a string representing internal "thoughts" or reasoning about the current query, useful for debugging, logging, or AI reasoning chains.
Example output:
Keywords: latest AI news
Looking for the most relevant articles.
Implementation Details and Algorithms
Retry Logic: The _invoke method attempts to execute the search multiple times (max_retries + 1) on failure, with delay intervals (delay_after_error), to improve robustness against transient errors.
Result Processing: Uses the DDGS context manager from the duckduckgo_search library to perform queries. The results are expected as lists of dictionaries with keys like title, href/url, and body.
Flexible Channel Selection: Supports two channels, "general" for regular web search and "news" for news articles, by calling the corresponding DDGS methods (text vs news).
Output Formatting: The tool stores the raw JSON from DDGS and also produces a "formalized_content" output, presumably a cleaned or summarized text representation for downstream use.
Timeout Handling: The _invoke method is decorated with @timeout to limit execution time, configurable via the environment variable COMPONENT_EXEC_TIMEOUT.
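The retry behavior described under Retry Logic can be sketched as a generic helper. The parameter names max_retries and delay_after_error come from the description above; the helper name call_with_retries, the label argument, and the exact error-string format are assumptions for illustration only.

```python
import logging
import time


def call_with_retries(fn, max_retries=2, delay_after_error=0.0, label="DuckDuckGo"):
    """Call fn up to max_retries + 1 times; return an error string if all fail.

    Hypothetical helper illustrating the documented retry loop.
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:  # any failure counts as a retryable error here
            last_error = exc
            logging.warning("attempt %d failed: %s", attempt + 1, exc)
            time.sleep(delay_after_error)  # pause before the next attempt
    # Assumed message format; the real component may word this differently.
    return f"{label} error: {last_error}"
```

Returning an error string instead of raising matches the documented contract that _invoke yields an error message when all retries fail, so the caller always receives a string.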
Interaction with Other Parts of the System
Inherits from ToolBase and uses ToolParamBase and ToolMeta from the agent.tools.base module, indicating it is part of a modular agent toolkit architecture.
Uses the timeout decorator from api.utils.api_utils to enforce execution time limits.
Relies on the external duckduckgo_search library for actual search queries.
Communicates results and errors via the self.set_output() and self.output() methods, presumably managed by ToolBase for integration with the larger system.
Designed to be invoked by higher-level agent components or orchestrators that handle user queries, parameter passing, and result consumption.
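To make the execution time limit concrete, here is one way such a timeout decorator could be built. This is not the decorator from api.utils.api_utils, whose implementation is not shown here; it is a generic sketch based on a worker thread, and the fallback value of 30 seconds is a placeholder rather than the project's default.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from functools import wraps


def timeout(seconds):
    """Hypothetical decorator: raise TimeoutError if fn runs longer than seconds."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            pool = ThreadPoolExecutor(max_workers=1)
            future = pool.submit(fn, *args, **kwargs)
            try:
                # float() accepts both numbers and numeric strings from the environment.
                return future.result(timeout=float(seconds))
            except FutureTimeout:
                raise TimeoutError(f"{fn.__name__} exceeded {seconds}s") from None
            finally:
                # Do not block on the worker; it keeps running in the background.
                pool.shutdown(wait=False)
        return wrapper
    return decorator


# Limit read from the environment variable named in the text; 30 is a placeholder.
@timeout(os.environ.get("COMPONENT_EXEC_TIMEOUT", 30))
def quick_search():
    time.sleep(0.01)  # stands in for a network call
    return "results"


print(quick_search())
```

A thread-based limit only stops the caller from waiting; it does not cancel the underlying work, which is one reason a real implementation may take a different approach.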
Mermaid Class Diagram
classDiagram
class DuckDuckGoParam {
+meta: ToolMeta
+top_n: int
+channel: str
+__init__()
+check()
+get_input_form() dict
}
class DuckDuckGo {
+component_name: str
+_invoke(**kwargs) str
+thoughts() str
}
ToolParamBase <|-- DuckDuckGoParam
ToolBase <|-- DuckDuckGo
DuckDuckGo ..|> ABC
Summary
The duckduckgo.py file implements a privacy-focused search tool component that integrates DuckDuckGo's web and news search capabilities into an AI assistant framework. It provides a well-defined parameter structure, invocation logic with retries and timeouts, and structured output formatting. The component gives agents a way to retrieve external, up-to-date information in applications that prioritize user data privacy.