AI Agent Framework

The AI Agent Framework forms the foundational layer for creating, managing, and executing AI agents within the system. It provides a standardized interface and core implementations that enable agents to process inputs, invoke sub-agents, interact with session state, manage artifacts, and respond asynchronously through event streams. This framework abstracts the complexities involved in agent lifecycle management, invocation context propagation, callback handling, and multi-agent compositions.

Core Concepts and Purpose

At its core, the AI Agent Framework defines the Agent interface, which is the primary contract for all AI agents in the system. This interface and its supporting types enable developers to:

Create custom agents with well-defined names, descriptions, and behaviors.
Compose hierarchical agent trees by defining sub-agents that can be delegated tasks.
Manage the invocation lifecycle of agents, including pre- and post-run callbacks.
Provide context-rich invocation environments that carry session, memory, artifact, and user input data.
Emit events asynchronously as agents process input, call LLMs or tools, and produce outputs.
Support advanced control flow such as transfer of control between agents and early termination of invocations.

The framework solves the problem of standardizing how AI agents are constructed and invoked, ensuring consistent integration with sessions, artifacts, memory, and tools. It also promotes composability by allowing agents to delegate tasks to sub-agents with isolated contexts.

Agent Interface and Implementation

The Agent Interface

The central interface is Agent (defined in agent.go), which requires the following:

Name() string: Returns the unique name of the agent.
Description() string: A brief description of the agent’s capabilities.
Run(InvocationContext) iter.Seq2[*session.Event, error]: Runs the agent logic, producing a stream of session events or errors.
SubAgents() []Agent: Returns child agents to which this agent may delegate.
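internal() *agent: Provides internal access for framework use. Because this method is unexported, types outside the package cannot satisfy Agent on their own, so custom agents are built through the constructor described next.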

Agents are created via the New constructor, which accepts a Config struct specifying their name, description, sub-agents, callbacks, and the core run function defining the agent's behavior.

Agent Configuration and Lifecycle Callbacks

The Config struct includes:

Name and Description: Used for identification and informing LLMs about agent capabilities.
SubAgents: Enables hierarchical composition, allowing agents to transfer control to child agents.
BeforeAgentCallbacks: Functions executed sequentially before the main agent run logic. If any callback returns non-nil content or an error, the main run is skipped.
Run function: Defines the main agent behavior, invoked with an InvocationContext.
AfterAgentCallbacks: Functions executed sequentially after the main run completes. If a callback returns non-nil content or an error, the framework yields an additional event.

This callback mechanism allows injection of custom logic around agent execution, such as pre-processing user input, validation, or post-processing output.
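The short-circuit rule for before-callbacks can be sketched as a small loop. The Content type, the BeforeCallback signature (which takes the user text directly), and the runBefore helper are hypothetical simplifications for illustration; the framework's callbacks receive a CallbackContext instead.

```go
package main

import (
	"errors"
	"fmt"
)

// Content stands in for the framework's content type (hypothetical).
type Content struct{ Text string }

// BeforeCallback is a simplified callback signature for this sketch.
type BeforeCallback func(userText string) (*Content, error)

// runBefore calls each callback in order and stops at the first one that
// returns non-nil content or an error. The bool reports whether the main
// run should be skipped.
func runBefore(cbs []BeforeCallback, userText string) (*Content, bool, error) {
	for _, cb := range cbs {
		content, err := cb(userText)
		if content != nil || err != nil {
			return content, true, err
		}
	}
	return nil, false, nil
}

func main() {
	// Validation: reject empty input before the agent runs.
	rejectEmpty := func(userText string) (*Content, error) {
		if userText == "" {
			return nil, errors.New("empty input")
		}
		return nil, nil
	}
	// Pre-computed answer: reply without running the agent.
	cached := func(userText string) (*Content, error) {
		if userText == "ping" {
			return &Content{Text: "pong"}, nil
		}
		return nil, nil
	}
	cbs := []BeforeCallback{rejectEmpty, cached}

	for _, in := range []string{"ping", "hello", ""} {
		content, skip, err := runBefore(cbs, in)
		fmt.Printf("input=%q skip=%v content=%v err=%v\n", in, skip, content, err)
	}
}
```

Only the input "hello" reaches the main run in this sketch; "ping" is answered by the second callback and the empty string is rejected by the first.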

Event Streaming Model

The Run method returns an iterator (iter.Seq2[*session.Event, error]) that yields session events one at a time as the agent produces them, so the caller can consume results before the run finishes. This design supports multi-step invocations where an agent can emit partial results, invoke tools or LLMs multiple times, and transfer control to another agent.

Example of a simplified run sequence:

for event, err := range agent.Run(ctx) {
    if err != nil {
        // handle error, then stop consuming the stream
        break
    }
    // process event (e.g., send to client, update session)
}

The framework automatically sets the event author to the agent's name if not specified.

Invocation Context

Agents receive an InvocationContext (detailed below) encapsulating all runtime data and services needed during a single invocation call. This context includes session state, artifacts, memory, user content, and flags for controlling invocation flow.

Invocation Context

The InvocationContext interface (defined in context.go) represents all contextual information for an agent invocation. It abstracts:

Agent: The agent being invoked.
Session: The current session object, allowing access to conversation history and state.
Artifacts: Interface to save, list, and load session artifacts (e.g., files, blobs).
Memory: Interface for accessing user-scoped memory across sessions.
InvocationID: A unique identifier for the invocation.
Branch: A string representing the invocation branch path in hierarchical or parallel agent executions (e.g., "agent1.agent2").
UserContent: The initial user message that triggered this invocation.
RunConfig: Configuration parameters controlling runtime behavior.
EndInvocation / Ended: Methods to terminate the current invocation early and check if it has ended.

Agents call EndInvocation() on the context to signal that no further steps should be executed.
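The early-termination flag can be illustrated with a reduced context that holds nothing but the flag. The invocation type and runSteps helper are hypothetical; the real InvocationContext carries far more, and how the framework checks the flag between steps is assumed here.

```go
package main

import "fmt"

// invocation is a hypothetical, reduced context holding only the end flag.
type invocation struct {
	id    string
	ended bool
}

func (inv *invocation) EndInvocation() { inv.ended = true }
func (inv *invocation) Ended() bool    { return inv.ended }

// runSteps executes steps in order and stops before the next step once
// any step has ended the invocation.
func runSteps(inv *invocation, steps []func(*invocation) string) []string {
	var out []string
	for _, step := range steps {
		if inv.Ended() {
			break
		}
		out = append(out, step(inv))
	}
	return out
}

func main() {
	steps := []func(*invocation) string{
		func(*invocation) string { return "plan" },
		func(inv *invocation) string {
			inv.EndInvocation() // the answer is complete; skip later steps
			return "answer"
		},
		func(*invocation) string { return "never runs" },
	}
	fmt.Println(runSteps(&invocation{id: "inv-1"}, steps)) // prints [plan answer]
}
```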

Agent Lifecycle and Callbacks

The framework manages the agent lifecycle in three main stages:

BeforeAgentCallbacks: Executed sequentially before the agent's main Run logic. If any callback returns content or an error, the agent run is skipped, and the callback output is emitted as an event.
Run: The core agent logic runs, yielding events as it processes the invocation.
AfterAgentCallbacks: Executed sequentially after the main run completes. If a callback returns new content or an error, the framework yields an additional event.

Callbacks receive a CallbackContext which allows them to access the session state, artifacts, and invocation metadata. They can also modify the session state by returning state deltas that are applied as part of the event actions.

This lifecycle control enables flexible insertion of custom logic before and after agent execution, supporting use cases like validation, logging, or augmenting responses.
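The state-delta idea mentioned above can be sketched as follows: writes made through the callback context are recorded separately and only reach the session state when the resulting event is applied. All names here (callbackContext, Get, Set, StateDelta, apply) are assumptions chosen for the illustration and may differ from the framework's API.

```go
package main

import "fmt"

// callbackContext is a hypothetical stand-in. Reads fall through to the
// session state; writes are recorded in a delta instead of being applied
// immediately.
type callbackContext struct {
	state map[string]any
	delta map[string]any
}

// Get prefers a pending write over the stored session value.
func (c *callbackContext) Get(key string) (any, bool) {
	if v, ok := c.delta[key]; ok {
		return v, true
	}
	v, ok := c.state[key]
	return v, ok
}

// Set records a pending write without touching the session state.
func (c *callbackContext) Set(key string, value any) { c.delta[key] = value }

// event carries the recorded delta (field name is an assumption).
type event struct {
	Author     string
	StateDelta map[string]any
}

// apply merges an event's delta into the session state.
func apply(state map[string]any, ev event) {
	for k, v := range ev.StateDelta {
		state[k] = v
	}
}

func main() {
	state := map[string]any{"turns": 1}
	cb := &callbackContext{state: state, delta: map[string]any{}}

	// A callback that counts turns.
	n, _ := cb.Get("turns")
	cb.Set("turns", n.(int)+1)

	ev := event{Author: "example_agent", StateDelta: cb.delta}
	fmt.Println("before apply:", state["turns"]) // prints before apply: 1
	apply(state, ev)
	fmt.Println("after apply:", state["turns"]) // prints after apply: 2
}
```

Recording changes on the event rather than mutating state directly keeps the event history a complete record of how the session state evolved.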

Sub-Agent Composition and Invocation Branching

Agents can be composed into trees by specifying sub-agents in their configuration. The framework automatically manages parent-child relationships and allows agents to delegate parts of their tasks to sub-agents.

Each agent invocation carries a branch string identifying its position in the agent hierarchy. This branching mechanism is especially important for agents that run sub-agents in parallel, ensuring isolated conversation histories per sub-agent.

Branches are represented as dot-separated strings (e.g., agent1.agent2.agent3), supporting nested sub-agent calls and context isolation.
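A small sketch shows how dot-separated branches could be built and used to isolate histories. The childBranch and visible helpers are hypothetical, and the visibility rule (an event is seen from its own branch and from descendants of that branch, but not from siblings) is an assumption for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// childBranch appends a sub-agent name to the parent's branch.
func childBranch(parent, agentName string) string {
	if parent == "" {
		return agentName
	}
	return parent + "." + agentName
}

// visible reports whether an event recorded on eventBranch belongs to the
// history seen from current. Assumed rule: an event is visible when it has
// no branch, or when its branch is current or one of current's ancestors.
func visible(current, eventBranch string) bool {
	return eventBranch == "" ||
		current == eventBranch ||
		strings.HasPrefix(current, eventBranch+".")
}

func main() {
	root := childBranch("", "agent1")
	left := childBranch(root, "agent2")
	right := childBranch(root, "agent3")

	fmt.Println(left, right)          // prints agent1.agent2 agent1.agent3
	fmt.Println(visible(left, root))  // parent events are visible: true
	fmt.Println(visible(left, right)) // sibling events are not: false
}
```

Comparing against eventBranch plus a trailing dot avoids treating "agent1.agent20" as a descendant of "agent1.agent2".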

Interaction with Sessions, Artifacts, and Memory

The framework tightly integrates with the session and artifact management systems:

Session: Agents operate within a session context, maintaining conversation state and event history.
Artifacts: Agents can save, list, and load artifacts related to the session.
Memory: Agents can access user-scoped memory for cross-session knowledge retrieval.

This integration allows agents to maintain continuity, enrich responses with stored data, and share knowledge across multiple invocations.

Agent Loading and Management

The framework provides a Loader interface (in loader.go) to manage collections of agents. A loader can:

List all available agents by name.
Load a specific agent by name.
Provide a root agent to start invocation chains.

Two main implementations are provided:

SingleLoader: For projects with a single root agent.
MultiLoader: For projects with multiple agents, enforcing unique names and managing a map of agents.

This abstraction supports dynamic agent discovery and runtime flexibility.
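The loader contract can be sketched with a map-backed implementation that rejects duplicate names. The method names (ListAgents, LoadAgent, RootAgent), the NewMultiLoader signature, and the reduced Agent interface are assumptions for this example and may not match loader.go exactly.

```go
package main

import (
	"fmt"
	"sort"
)

// Agent is reduced to its name for this sketch.
type Agent interface{ Name() string }

type named string

func (n named) Name() string { return string(n) }

// Loader mirrors the three capabilities listed above.
type Loader interface {
	ListAgents() []string
	LoadAgent(name string) (Agent, error)
	RootAgent() Agent
}

type multiLoader struct {
	root   Agent
	agents map[string]Agent
}

// NewMultiLoader registers root plus any other agents and rejects
// duplicate names.
func NewMultiLoader(root Agent, others ...Agent) (Loader, error) {
	m := &multiLoader{root: root, agents: map[string]Agent{root.Name(): root}}
	for _, a := range others {
		if _, dup := m.agents[a.Name()]; dup {
			return nil, fmt.Errorf("duplicate agent name %q", a.Name())
		}
		m.agents[a.Name()] = a
	}
	return m, nil
}

// ListAgents returns the registered names in sorted order.
func (m *multiLoader) ListAgents() []string {
	names := make([]string, 0, len(m.agents))
	for name := range m.agents {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func (m *multiLoader) LoadAgent(name string) (Agent, error) {
	a, ok := m.agents[name]
	if !ok {
		return nil, fmt.Errorf("agent %q not found", name)
	}
	return a, nil
}

func (m *multiLoader) RootAgent() Agent { return m.root }

func main() {
	loader, err := NewMultiLoader(named("router"), named("billing"), named("support"))
	if err != nil {
		panic(err)
	}
	fmt.Println(loader.ListAgents(), loader.RootAgent().Name())

	_, err = NewMultiLoader(named("router"), named("router"))
	fmt.Println(err)
}
```

A single-agent loader would be the degenerate case of the same interface: the list contains one name and the root agent is the only agent.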

Example: Agent Run Flow

The following flowchart illustrates the step-by-step execution of an agent invocation within the framework, highlighting callback execution and event yielding:

flowchart TD
    Start[Start Invocation]
    BeforeCB[Run BeforeAgentCallbacks]
    CheckBefore{Callback returned content or error?}
    RunAgent[Execute Run Function]
    AfterCB[Run AfterAgentCallbacks]
    EmitEvent[Yield Event]
    End[End Invocation]
    Start --> BeforeCB --> CheckBefore
    CheckBefore -- Yes --> EmitEvent
    CheckBefore -- No --> RunAgent --> AfterCB --> EmitEvent
    EmitEvent --> End

The invocation begins by executing all before-agent callbacks.
If any before callback produces content or an error, the main agent run is skipped, and the callback event is yielded.
Otherwise, the main agent Run function is executed, streaming events.
After completion, after-agent callbacks are executed, and any resulting event is yielded.
Invocation ends after all steps complete or if EndInvocation() is called during execution.

Code Snippet: Agent Creation and Run Invocation

// Create a new custom agent with before and after callbacks.
// The variable is named exampleAgent so it does not shadow the agent package.
exampleAgent, err := agent.New(agent.Config{
    Name:        "example_agent",
    Description: "An example AI agent",
    Run: func(ctx agent.InvocationContext) iter.Seq2[*session.Event, error] {
        // Agent logic here...
        return func(yield func(*session.Event, error) bool) {
            // Yield an initial event.
            yield(&session.Event{
                Author: "example_agent",
                LLMResponse: model.LLMResponse{
                    Content: genai.NewContentFromText("Hello from example agent", genai.RoleModel),
                },
            }, nil)
        }
    },
    BeforeAgentCallbacks: []agent.BeforeAgentCallback{
        func(ctx agent.CallbackContext) (*genai.Content, error) {
            // Optional pre-run logic; returning nil, nil lets the main run proceed.
            return nil, nil
        },
    },
    AfterAgentCallbacks: []agent.AfterAgentCallback{
        func(ctx agent.CallbackContext) (*genai.Content, error) {
            // Optional post-run logic; returning nil, nil adds no extra event.
            return nil, nil
        },
    },
})
if err != nil {
    // Handle the construction error.
    log.Fatalf("failed to create agent: %v", err)
}

// Running the agent with an invocation context.
for event, err := range exampleAgent.Run(invocationCtx) {
    if err != nil {
        // Handle error, then stop consuming the stream.
        break
    }
    // Process event (e.g., send to client).
    log.Printf("event from %s", event.Author)
}

Interactions with Other Modules

The AI Agent Framework interacts closely with several other parts of the system:

Session Management: Provides the session and its state for each invocation.
Artifact Management: Enables agents to persist and retrieve session-related artifacts.
Memory Service: Supplies user-scoped memory for knowledge retrieval.
LLM Integration and Agents: Specialized LLM-based agents extend this framework to perform natural language understanding and generation.
Agent Workflow Management: Workflow agents orchestrate execution of multiple sub-agents, leveraging the sub-agent composition capabilities of this framework.
Agent Execution Runner: Coordinates the full lifecycle of agent execution within sessions.
Remote Agent Communication (A2A): Remote agents implement the Agent interface, allowing distributed multi-agent scenarios.

Summary of Key Types

| Type | Description |
| --- | --- |
| `Agent` | Interface defining agent behavior and composition. |
| `InvocationContext` | Provides context and services for a single agent call. |
| `BeforeAgentCallback` | Callback invoked before the main agent run. |
| `AfterAgentCallback` | Callback invoked after the main agent run. |
| `Loader` | Interface for loading and managing agents. |
| `Artifacts` | Interface for artifact operations within a session. |
| `Memory` | Interface for user-scoped memory access. |

This framework enables modular, composable AI agents with lifecycle management and contextual awareness. It serves as the foundation for the specialized agent implementations in the system, such as LLM agents, workflow agents, and remote agents.

For detailed usage of invocation context and lifecycle callbacks, see the subtopics:

[Agent Invocation Context](/Agent Invocation Context)
[Agent Lifecycle and Callbacks](/Agent Lifecycle and Callbacks)