test_agent_runner.go

Overview

The test_agent_runner.go file provides utilities for testing agents that interact with large language models (LLMs) within a session-based framework. It defines the TestAgentRunner type, which simplifies creating, managing, and running agent sessions with a configurable initial state, and MockModel, which simulates an LLM for controlled test scenarios. The file also includes helper functions that collect events, parts, and text parts from streamed session events.

These utilities support unit and integration tests of agents in the system: tests can drive controlled execution flows and validate agent behavior without depending on live LLM backends or persistent storage.

Detailed Explanation

TestAgentRunner

TestAgentRunner is a helper struct designed to run agent sessions in tests with minimal boilerplate. It encapsulates an agent, session service, runner, and session state management.

Structure

type TestAgentRunner struct {
    agent            agent.Agent
    sessionService   session.Service
    lastSession      session.Session
    initSessionState map[string]any
    appName          string
    runner           *runner.Runner
}

agent: The agent under test.
sessionService: The session service to create and get sessions; here, an in-memory service is used.
lastSession: Caches the last session used for efficiency.
initSessionState: Initial state for new sessions.
appName: Application namespace for sessions.
runner: The agent execution runner that drives the agent logic.

Methods

session(t *testing.T, appName, userID, sessionID string) (session.Session, error)
Returns the cached session if one exists and its ID matches sessionID; otherwise creates a new session seeded with initSessionState through the in-memory sessionService. This keeps session retrieval and creation consistent across calls within a test.
SetInitSessionState(state map[string]any)
Sets the initial session state used when creating new sessions. Useful for configuring test scenarios with specific starting conditions.
Run(t *testing.T, sessionID, newMessage string) iter.Seq2[*session.Event, error]
Runs the agent with a new user message string. Converts the message into genai.Content with RoleUser and delegates to RunContent. Returns a stream of session events.
RunContent(t *testing.T, sessionID string, content *genai.Content) iter.Seq2[*session.Event, error]
Runs the agent with a specified LLM content message. Uses default run configuration. Returns a stream of session events.
RunContentWithConfig(t *testing.T, sessionID string, content *genai.Content, cfg agent.RunConfig) iter.Seq2[*session.Event, error]
Runs the agent with a specified LLM content and run configuration (agent.RunConfig). Retrieves or creates a session, then calls the internal runner to execute the agent logic.
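The get-or-create behavior described for session can be sketched with standard-library types only. Session and sessionCache below are hypothetical stand-ins, not the real session.Session or TestAgentRunner, and copying the initial state into each new session is an assumption of this sketch rather than something the source confirms.

```go
package main

import "fmt"

// Session is a hypothetical stand-in for session.Session.
type Session struct {
	ID    string
	State map[string]any
}

// sessionCache is a hypothetical stand-in for the caching part of TestAgentRunner.
type sessionCache struct {
	last      *Session
	initState map[string]any
	created   int // number of sessions created, kept only for illustration
}

// get returns the cached session when the ID matches; otherwise it creates
// a new session seeded with a copy of initState and caches it.
func (c *sessionCache) get(sessionID string) *Session {
	if c.last != nil && c.last.ID == sessionID {
		return c.last
	}
	state := make(map[string]any, len(c.initState))
	for k, v := range c.initState {
		state[k] = v
	}
	c.last = &Session{ID: sessionID, State: state}
	c.created++
	return c.last
}

func main() {
	c := &sessionCache{initState: map[string]any{"lang": "en"}}
	a := c.get("s1")
	b := c.get("s1") // same ID: served from the cache
	d := c.get("s2") // different ID: a new session replaces the cached one
	fmt.Println(a == b, a == d, c.created)
}
```

Because only the last session is cached, alternating between two session IDs in one test would create a new session on every call in this sketch.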

Constructor

NewTestAgentRunner(t *testing.T, agent agent.Agent) *TestAgentRunner
Creates a new TestAgentRunner instance for the given agent. It sets up an in-memory session service and a new internal runner with the app name "test_app". Any initialization error fails the test immediately.

Usage Example

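func TestAgent(t *testing.T) {
    ag := NewMyAgent() // hypothetical constructor returning an agent.Agent implementation
    r := NewTestAgentRunner(t, ag) // named r to avoid shadowing the runner package

    eventsStream := r.Run(t, "session1", "Hello, agent!")
    events, err := CollectEvents(eventsStream)
    if err != nil {
        t.Fatal(err)
    }
    // Assert on events, for example that the agent produced at least one.
    if len(events) == 0 {
        t.Fatal("expected at least one event")
    }
}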

MockModel

MockModel is a mock implementation of the model.LLM interface, simulating an LLM for testing without external calls.

Structure

type MockModel struct {
    Requests             []*model.LLMRequest
    Responses            []*genai.Content
    StreamResponsesCount int
}

Requests: Stores all LLM requests received.
Responses: Predefined LLM content responses to return when generating content.
StreamResponsesCount: Number of stream responses to simulate in streaming mode.

Methods

GenerateContent(ctx context.Context, req *model.LLMRequest, stream bool) iter.Seq2[*model.LLMResponse, error]
Implements model.LLM. If stream is true, delegates to GenerateStream, otherwise calls Generate and yields a single response.
Generate(ctx context.Context, req *model.LLMRequest) (*model.LLMResponse, error)
Records the request and returns the first predefined response or an error if none are available.
GenerateStream(ctx context.Context, req *model.LLMRequest) iter.Seq2[*model.LLMResponse, error]
Yields multiple streaming LLM responses based on StreamResponsesCount. Uses an internal aggregator (llminternal.NewStreamingResponseAggregator) to process responses before yielding.
Name() string
Returns the model name "mock".

Errors

Returns errNoModelData if no responses are preset when generating content.

Usage

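mock := &MockModel{
    Responses: []*genai.Content{
        // genai defines RoleUser and RoleModel; model output uses RoleModel.
        genai.NewContentFromText("Hello from mock", genai.RoleModel),
    },
}

ctx := context.Background()
req := &model.LLMRequest{} // an empty request suffices here; the mock records it
// err is errNoModelData when Responses is empty.
resp, err := mock.Generate(ctx, req)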

Utility Functions for Event Collection

These functions simplify collecting session events or content parts from streaming session responses.

CollectEvents(stream iter.Seq2[*session.Event, error]) ([]*session.Event, error)
Collects all session events from the stream, stopping at the first error or when the stream ends. It also checks that each event's content is non-empty.
CollectParts(stream iter.Seq2[*session.Event, error]) ([]*genai.Part, error)
Collects all genai.Part objects from the content of each event.
CollectTextParts(stream iter.Seq2[*session.Event, error]) ([]string, error)
Collects textual parts from the streamed events' content parts.

Important Implementation Details and Algorithms

Session Caching in TestAgentRunner: The session method caches the last session to avoid unnecessary session creation or retrieval. It uses a fixed app name and user ID ("test_app", "test_user") to keep tests simple.
Streaming Response Aggregation: The MockModel uses llminternal.NewStreamingResponseAggregator() to simulate incremental streaming responses, processing each generated chunk before yielding.
In-memory Session Service: The runner uses session.InMemoryService() for session management, suitable for ephemeral testing scenarios without persistence.
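Use of iter.Seq2: Streams are expressed as iter.Seq2 values, the Go standard library's two-value iterator type (Go 1.23 and later). Each iteration yields a result together with an error, and the consumer reads them with a range loop, consistent with how the rest of the system exposes streaming operations.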

Interactions with Other Parts of the System

Agent and Runner Integration: TestAgentRunner wraps the runner.Runner from the Agent Execution Runner topic. It orchestrates session creation/retrieval, message processing, and agent invocation.
Session Management: Uses the Session Management in-memory implementation to maintain session state and events during tests.
Model Interface: MockModel implements the model.LLM interface from LLM Integration and Agents, enabling mock LLM behavior for testing without external calls.
LLM Content and Roles: Utilizes genai.Content and roles for message representation aligned with the system's messaging model.
Streaming Utilities: The file relies on the iter.Seq2 streaming abstraction for handling event streams, consistent with the system-wide pattern.

Mermaid Diagram

classDiagram
class TestAgentRunner {
-agent: Agent
-sessionService: SessionService
-lastSession: Session
-initSessionState: map[string]any
-appName: string
-runner: Runner
-session()
+SetInitSessionState()
+Run()
+RunContent()
+RunContentWithConfig()
}
class MockModel {
+Requests: []*LLMRequest
+Responses: []*Content
+StreamResponsesCount: int
+GenerateContent()
+Generate()
+GenerateStream()
+Name()
}
class Utilities {
+CollectEvents()
+CollectParts()
+CollectTextParts()
}
TestAgentRunner --> "1" agent.Agent
TestAgentRunner --> "1" session.Service
TestAgentRunner --> "1" runner.Runner
MockModel ..|> model.LLM
Utilities ..> session.Event
Utilities ..> genai.Part

This documentation references concepts such as the agent and runner system (Agent Execution Runner), session management (Session Management), and LLM integration (LLM Integration and Agents) for detailed background on those components.