agent_with_tools.py
Overview
The agent_with_tools.py file implements an Agent component designed to perform complex task-oriented conversations by leveraging a Large Language Model (LLM) combined with multiple tool plugins. This agent orchestrates multi-turn dialogue sessions, interacts with external tools or microservices, manages conversation memory, and generates context-aware responses for user prompts.
Key functionalities include:
Managing and invoking a collection of tools (including external MCP servers and internal components).
Orchestrating multi-round conversations with reasoning, context, and user prompts.
Streaming incremental responses while invoking tools dynamically based on model-generated function calls.
Generating citations and references based on retrieved knowledge chunks.
Maintaining conversation memory and optimizing multi-turn interactions.
Handling errors gracefully and supporting configurable execution timeouts.
This file integrates tightly with other system components such as LLM services, tenant-specific LLM configurations, MCP servers, and the canvas framework that manages component graphs and memory.
Detailed Explanation of Classes and Methods
Class: AgentParam(LLMParam, ToolParamBase)
Defines the parameters and metadata for configuring an Agent instance.
Attributes:
meta(ToolMeta): Metadata dictionary describing the agent's purpose, expected parameters, and their types.function_name(str): Default function name identifier, set to"agent".tools(list): List of tool component configurations associated with this agent.mcp(list): List of MCP (microservice) configurations.max_rounds(int): Maximum number of reasoning rounds the agent should execute.description(str): Optional description of the agent.
Usage Example:
param = AgentParam()
param.user_prompt = "Explain quantum computing."
param.reasoning = "User wants a detailed explanation."
param.context = "Relevant scientific background."
param.max_rounds = 7
Class: Agent(LLM, ToolBase)
The main agent class that extends LLM capabilities and integrates tool-based function calling.
Initialization: __init__(self, canvas, id, param: LLMParam)
canvas: The component canvas managing the execution graph and shared state.id: Unique identifier for this agent instance.param: An instance ofAgentParamor compatible LLMParam subclass.
Initialization Steps:
Loads tool components specified in the agent parameters.
Builds an LLM bundle (
LLMBundle) configured for tenant-specific LLM usage.Loads MCP server tools and registers their metadata.
Creates a callback function for tool usage logging.
Sets up a
LLMToolPluginCallSessionfor managing tool invocation.
Method: _load_tool_obj(self, cpn: dict) -> object
Loads a tool component object given a component configuration dictionary.
cpn: Dictionary specifying a tool component, including its name and parameters.Returns: An instantiated tool component object.
Raises exceptions if the configuration is invalid.
Method: get_meta(self) -> dict[str, Any]
Returns the metadata dictionary for this agent, including function parameters and descriptions. Injects the current user prompt if available.
Method: get_input_form(self) -> dict[str, dict]
Returns a dictionary describing the input form schema this agent accepts, including inputs from all sub-tools that are LLM-based.
Method: _invoke(self, **kwargs)
Main entry point to execute the agent with the given inputs.
Accepts keyword arguments matching input parameters (e.g.,
user_prompt,reasoning,context).Constructs a combined system prompt incorporating reasoning and context.
If no tools configured, delegates to base LLM invocation.
Otherwise, prepares prompt variables and manages streaming output with tool-use support.
Handles multi-turn conversation with tool calls.
Sets outputs such as
"content"(final response) and"use_tools"(list of invoked tools).
Returns the final answer string.
Timeout: Decorated with a configurable timeout (default 20 minutes).
Method: stream_output_with_tools(self, prompt, msg, user_defined_prompt={})
Streams agent responses incrementally while supporting tool calls.
Yields partial response text chunks.
Handles tool invocation errors by yielding error messages or default fallback values.
Updates outputs
"content"and"use_tools"accordingly.
Method: _gen_citations(self, text)
Generates citation text for the provided text using retrieved knowledge chunks.
Uses the knowledge base prompt generator to create formatted references.
Streams citation content incrementally.
Method: _react_with_tools_streamly(self, prompt, history: list[dict], use_tools, user_defined_prompt={})
Core interaction loop for multi-round reasoning with dynamic tool invocation.
Arguments:
prompt: Initial system prompt.history: List of past message dicts (role,content).use_tools: List to collect tool usage metadata.user_defined_prompt: Additional user-defined prompts or overrides.
Implements the ReAct pattern:
Generates next-step responses using the LLM.
Parses JSON function call lists from the response.
Executes tool calls concurrently using thread pool.
Reflects on tool outputs and appends further user content.
Handles task completion logic, including citation generation.
Enforces a maximum round limit.
Yields tuples of
(response_delta_text, token_count)incrementally.
Method: get_useful_memory(self, goal: str, sub_goal:str, topn=3, user_defined_prompt:dict={}) -> str
Retrieves and ranks conversational memories relevant to the current goal and sub-goal.
Uses a ranking prompt to identify top relevant memory snippets.
Returns concatenated formatted memory text as strings.
Returns error message if ranking fails.
Important Implementation Details and Algorithms
ReAct Framework: The agent implements a ReAct-style interaction where the LLM generates JSON-encoded function calls (tool invocations). The agent parses these, calls the tools concurrently, and feeds back the results into the conversation for iterative reasoning.
Streaming Generation: Responses, including tool calls and citations, are streamed incrementally to enable real-time consumption by clients.
Tool Abstraction: Tools can be local components or remote MCP microservices. They are wrapped in a unified calling session (
LLMToolPluginCallSessionorMCPToolCallSession) to standardize invocation.Multi-turn Memory and Reflection: The agent maintains conversation history and applies reflective summarization after tool responses to guide the next reasoning step.
Citation Generation: If enabled, the agent appends citations referencing retrieved knowledge chunks, improving answer credibility.
Concurrency: Uses
ThreadPoolExecutorto parallelize multiple tool calls per reasoning step.Timeout Handling: Uses a decorator to enforce maximum execution time for the agent invocation.
System Interaction and Integration
Canvas Framework: The agent executes within a
canvasenvironment that manages component graphs, conversation memory, and references.LLM Services: Utilizes tenant-specific LLM bundles (
LLMBundle) configured via database services (TenantLLMService).MCP Servers: Integrates external MCP servers via
MCPServerServiceand their tool metadata.Tool Plugins: Loads internal tool components dynamically at runtime via
component_classfactory.RAG Prompts and Utils: Uses retrieval-augmented generation prompts and utilities for message length fitting, citation formatting, and memory ranking.
API Utility: Uses a timeout utility from
api.utils.api_utilsfor execution control.Logging and Error Handling: Employs extensive logging for debugging and error traceability.
Usage Example
from agent_with_tools import Agent, AgentParam
# Initialize agent parameters
param = AgentParam()
param.user_prompt = "Summarize the latest market trends."
param.reasoning = "User requires a concise and up-to-date summary."
param.context = "Provide background on recent stock movements."
param.max_rounds = 3
# Simulated canvas and component ID
canvas = ... # Provided by execution environment
component_id = "agent_001"
# Create agent instance
agent = Agent(canvas, component_id, param)
# Invoke the agent with input parameters
response = agent._invoke(
user_prompt=param.user_prompt,
reasoning=param.reasoning,
context=param.context
)
print("Agent response:", response)
Mermaid Class Diagram
classDiagram
class AgentParam {
+meta: ToolMeta
+function_name: str
+tools: list
+mcp: list
+max_rounds: int
+description: str
+__init__()
}
class Agent {
+component_name: str
+tools: dict
+chat_mdl: LLMBundle
+tool_meta: list
+callback: function
+toolcall_session: LLMToolPluginCallSession
+__init__(canvas, id, param)
+_load_tool_obj(cpn) object
+get_meta() dict
+get_input_form() dict
+_invoke(**kwargs) string
+stream_output_with_tools(prompt, msg, user_defined_prompt) generator
+_gen_citations(text) generator
+_react_with_tools_streamly(prompt, history, use_tools, user_defined_prompt) generator
+get_useful_memory(goal, sub_goal, topn, user_defined_prompt) string
}
AgentParam <|-- Agent
Agent --|> LLM
Agent --|> ToolBase
Summary
The agent_with_tools.py file provides a powerful and extensible Agent implementation that integrates LLM-based reasoning with tool invocation capabilities. It is designed for complex multi-turn tasks requiring access to external knowledge, tools, and services, while maintaining a conversational memory and delivering streaming results. The architecture supports modular tool plugins, MCP microservices, and tenant-aware LLM configurations, making it a key component in the InfiniFlow AI orchestration platform.