long_running_function_test.go

Overview

This file contains automated tests for long-running function tools in the agent framework. It covers the creation, execution, and event lifecycle of tools flagged as long-running through the functiontool package. The tests drive an LLM agent (llmagent) against a mock LLM model and verify that function calls, function responses, and event streams match the expected long-running flow.

The tests employ generics, mock utilities, and event collectors to verify the correctness of function invocation, ID propagation, and multi-step workflows typical in asynchronous or deferred processing scenarios.

The file primarily tests the interaction between:

functiontool — for creating and managing function tools marked as long-running,
llmagent — to run agents that use these tools,
testutil — for mocking and validating model responses and agent runs,
genai — for representing conversational and function call content.

Detailed Explanation of Classes, Functions, and Methods

1. `TestNewLongRunningFunctionTool(t *testing.T)`

Purpose:
Verifies that a function tool configured as long-running is created correctly with the expected name, description, and long-running flag.

Key Steps:

Defines input/output structs (SumArgs, SumResult) for the tool handler.
Creates a handler function that returns a fixed "Processing sum" result.
Calls functiontool.New with IsLongRunning: true.
Validates:
- Tool's name and description,
- IsLongRunning flag is true,
- The internal function tool's declaration description contains a note about long-running operations.

Usage Example:

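// The handler returns a fixed placeholder result instead of computing a sum.
handler := func(ctx tool.Context, input SumArgs) (SumResult, error) {
    return SumResult{Result: "Processing sum"}, nil
}
sumTool, err := functiontool.New(functiontool.Config{
    Name:          "sum",
    Description:   "sums two integers",
    IsLongRunning: true, // marks the tool as a long-running operation
}, handler)
if err != nil {
    t.Fatalf("functiontool.New failed: %v", err)
}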

2. `NewContentFromFunctionResponseWithID(name string, response map[string]any, id, role string) *genai.Content`

Purpose:
Constructs a genai.Content object from a function response, manually assigning an ID and role to the response part.

Parameters:

name: Function name.
response: Map representing the function response data.
id: The unique ID to assign to the function response.
role: The role string (e.g., "user", "model").

Returns:
A pointer to a genai.Content instance with the ID set on the function response part.

Usage: Used primarily in tests to simulate function responses with specific IDs, aiding in event stream validation.
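
As a rough illustration of what such a helper does, the sketch below builds a single-part content value and sets the ID on its function response. The Content, Part, and FunctionResponse types are simplified local stand-ins so that the example is self-contained; they are assumptions for illustration and not the real genai definitions. The tool name and response values in main are likewise made up.

```go
package main

import "fmt"

// Simplified stand-ins for the content types. Field names follow the
// description above but are assumptions, not the real genai definitions.
type FunctionResponse struct {
	ID       string
	Name     string
	Response map[string]any
}

type Part struct {
	FunctionResponse *FunctionResponse
}

type Content struct {
	Role  string
	Parts []*Part
}

// newContentFromFunctionResponseWithID builds a single-part content whose
// function response carries the caller-supplied ID and role.
func newContentFromFunctionResponseWithID(name string, response map[string]any, id, role string) *Content {
	return &Content{
		Role: role,
		Parts: []*Part{{
			FunctionResponse: &FunctionResponse{ID: id, Name: name, Response: response},
		}},
	}
}

func main() {
	// Illustrative values only.
	c := newContentFromFunctionResponseWithID(
		"increaseByOne",
		map[string]any{"status": "still waiting"},
		"call-1",
		"user",
	)
	fmt.Println(c.Role, c.Parts[0].FunctionResponse.ID)
}
```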

3. `TestLongRunningFunctionFlow(t *testing.T)`

Purpose:
Tests the general workflow of a long-running function tool whose handler returns a map[string]string result indicating a status.

Details:

Defines increaseByOne handler returning a map with status "pending".
Uses a helper testLongRunningFunctionFlow to execute and validate the flow.
Checks that the function is called exactly once and that LLM requests and event streams match expectations.

4. `TestLongRunningStringFunctionFlow(t *testing.T)`

Purpose:
Similar to TestLongRunningFunctionFlow, but tests a long-running function tool whose handler returns a string result instead of a map.

Details:

The handler returns "pending" as a string.
Delegates to the same helper testLongRunningFunctionFlow with adjusted expected keys.
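
The two tests differ only in the handler's return type and in the key the helper looks for. One way to picture why a key is needed for a plain string is the sketch below, which normalizes a handler result into a response map. The toResponse function is hypothetical, and wrapping non-map values under "result" is an assumption made for this example rather than documented behavior of functiontool.

```go
package main

import "fmt"

// toResponse converts a handler result into a response map. Maps are passed
// through; wrapping any other value under "result" is an assumption made for
// this sketch.
func toResponse[Out any](out Out) map[string]any {
	switch v := any(out).(type) {
	case map[string]any:
		return v
	case map[string]string:
		m := make(map[string]any, len(v))
		for k, s := range v {
			m[k] = s
		}
		return m
	default:
		return map[string]any{"result": v}
	}
}

func main() {
	// Map-returning handler result, as in TestLongRunningFunctionFlow.
	fmt.Println(toResponse(map[string]string{"status": "pending"}))
	// String-returning handler result, as in TestLongRunningStringFunctionFlow.
	fmt.Println(toResponse("pending"))
}
```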

5. `testLongRunningFunctionFlow[Out any](t *testing.T, increaseByOne func(ctx tool.Context, x IncArgs) (Out, error), resultKey string, callCount *int)`

Purpose:
Generic test helper to execute and verify the workflow of a long-running function tool with any output type.

Parameters:

t: Test handle.
increaseByOne: The function handler to test.
resultKey: The key under which the handler's result is expected in the function response ("status" or "result").
callCount: Pointer to an integer tracking number of handler invocations.

Workflow:

Creates a slice of mock genai.Content responses simulating function call and subsequent LLM text responses.
Creates a mock LLM model (testutil.MockModel) initialized with these responses.
Creates a long-running function tool with the given handler.
Instantiates an LLM agent (llmagent) with the mock model and tool.
Runs the agent via a test runner (testutil.NewTestAgentRunner).
Collects and validates the initial event stream and LLM requests.
Extracts the function call event ID to simulate follow-up function responses with different result contents.
Runs multiple subtests simulating continued polling of the function result, checking:
- Number of LLM requests issued,
- Event stream contents,
- Proper merging and propagation of function call IDs and responses.
Verifies the handler is only called once, confirming no redundant executions.

Key Assertions:

Correct request sequence and content sent to the LLM.
Proper event parts generated and streamed.
Long-running function IDs are preserved across event updates.
Function call handler invoked only once despite emulated polling.
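
The "called exactly once" assertion relies on the caller owning a counter that the handler increments, which is why callCount is a pointer. The sketch below shows that shape in isolation. It is an assumption-heavy illustration: fakeAgent, startCall, deliverResponse, and runFlow are hypothetical names, the handler signature drops the tool.Context parameter, and no mock model or real llmagent is involved.

```go
package main

import "fmt"

// IncArgs is the empty input type named in the helper's signature.
type IncArgs struct{}

// fakeAgent is a hypothetical stand-in for the agent under test. It runs the
// handler when a call starts and only tracks IDs afterwards.
type fakeAgent[Out any] struct {
	handler func(IncArgs) (Out, error)
	pending map[string]bool // IDs of calls still awaiting a final result
	nextID  int
}

// startCall simulates the model issuing a function call: the handler runs
// once and the generated call ID is remembered as pending.
func (a *fakeAgent[Out]) startCall() (string, error) {
	if _, err := a.handler(IncArgs{}); err != nil {
		return "", err
	}
	a.nextID++
	id := fmt.Sprintf("call-%d", a.nextID)
	a.pending[id] = true
	return id, nil
}

// deliverResponse simulates a follow-up function response for a known call
// ID. It validates the ID and never invokes the handler.
func (a *fakeAgent[Out]) deliverResponse(id string, response map[string]any) error {
	if !a.pending[id] {
		return fmt.Errorf("unknown call ID %q", id)
	}
	return nil
}

// runFlow starts one call, replays the updates against the same ID, and then
// checks the caller-owned counter through the pointer.
func runFlow[Out any](handler func(IncArgs) (Out, error), callCount *int, updates []map[string]any) error {
	agent := &fakeAgent[Out]{handler: handler, pending: map[string]bool{}}
	id, err := agent.startCall()
	if err != nil {
		return err
	}
	for _, u := range updates {
		if err := agent.deliverResponse(id, u); err != nil {
			return err
		}
	}
	if *callCount != 1 {
		return fmt.Errorf("handler called %d times, want 1", *callCount)
	}
	return nil
}

func main() {
	calls := 0
	handler := func(IncArgs) (map[string]string, error) {
		calls++
		return map[string]string{"status": "pending"}, nil
	}
	updates := []map[string]any{{"status": "still waiting"}, {"result": 2}}
	fmt.Println(runFlow(handler, &calls, updates), calls)
}
```

Because Out is a type parameter, the same runFlow accepts a handler returning a string without any change to the helper.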

6. `TestLongRunningToolIDsAreSet(t *testing.T)`

Purpose:
Verifies that the long-running function tool correctly sets unique IDs on function call events and that these IDs are propagated properly in the agent event stream.

Workflow:

Sets up a mock model with prepared function call and text response.
Defines a handler returning a "pending" status.
Creates a long-running function tool and an LLM agent using the mock model.
Runs the agent once with an initial prompt.
Collects the generated events and checks:
- The function call event includes a non-nil LongRunningToolIDs slice with exactly one ID.
- The function response and LLM text events carry no long-running tool IDs (LongRunningToolIDs is nil or empty).
- The ID in LongRunningToolIDs matches the function call part's generated ID.
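
The three checks above can be expressed as one pass over the collected events. The sketch below uses a simplified Event type with only the fields needed for the check; it is a local stand-in, not the framework's event type, and checkLongRunningIDs is a hypothetical helper. It also assumes, as in this test, that every function call in the stream belongs to a long-running tool.

```go
package main

import "fmt"

// FunctionCall and Event are simplified stand-ins with only the fields the
// check needs. They are assumptions, not the framework's real types.
type FunctionCall struct {
	ID   string
	Name string
}

type Event struct {
	FunctionCall       *FunctionCall // nil for response and text events
	LongRunningToolIDs []string
}

// checkLongRunningIDs verifies that each function call event lists exactly
// one long-running ID equal to the call's own ID, and that all other events
// list none.
func checkLongRunningIDs(events []Event) error {
	for i, ev := range events {
		ids := ev.LongRunningToolIDs
		if ev.FunctionCall == nil {
			if len(ids) != 0 {
				return fmt.Errorf("event %d: unexpected long-running IDs %v", i, ids)
			}
			continue
		}
		if len(ids) != 1 {
			return fmt.Errorf("event %d: got %d long-running IDs, want 1", i, len(ids))
		}
		if ids[0] != ev.FunctionCall.ID {
			return fmt.Errorf("event %d: ID %q does not match call ID %q", i, ids[0], ev.FunctionCall.ID)
		}
	}
	return nil
}

func main() {
	events := []Event{
		{FunctionCall: &FunctionCall{ID: "call-1", Name: "increaseByOne"}, LongRunningToolIDs: []string{"call-1"}},
		{}, // function response event
		{}, // model text event
	}
	fmt.Println(checkLongRunningIDs(events))
}
```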

Important Implementation Details and Algorithms

Long-running flag handling:
The tests ensure that the IsLongRunning flag in the function tool config results in a tool description containing a note about long-running operations. This influences tool behavior and documentation.
Function call ID propagation:
The tests check that function call events have unique IDs that are preserved in subsequent function response events. This is critical for correctly tracking asynchronous or multi-step function executions within the agent event stream.
Generic testing function with type parameter:
The testLongRunningFunctionFlow uses a generic type parameter [Out any] to support testing handlers returning different output types (maps or strings). This allows reuse of test logic for various function signatures.
Mock LLM model with predefined responses:
The mock model returns a sequence of contents representing function calls and user/model text responses, simulating an LLM conversation involving function execution and intermediate results.
Event collection and assertions:
The tests rely on utility functions (testutil.CollectParts, testutil.CollectEvents) to gather streamed event parts or full events, which are then compared using the go-cmp package for structural equality, ignoring non-critical fields like IDs where appropriate.
Incremental function response simulation:
The test simulates a polling pattern for long-running functions by sending updated function responses with different statuses or results. It checks that the agent correctly handles these updates and does not re-invoke the function handler unnecessarily.
Interaction with llmagent and functiontool:
The tests exercise the integration between the generic function tool wrapper and the LLM agent, which manages tool execution, message passing, and event propagation.
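
To make the "mock model with predefined responses" idea concrete, here is a minimal sketch of a scripted model that returns canned replies in order and records what it was asked. It is not testutil.MockModel, whose fields and methods are not shown in this document; the mockModel type, its Generate method, and the use of plain strings for messages are all assumptions made to keep the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// mockModel is a hypothetical scripted model: it hands out canned replies in
// order and records the history passed with each call.
type mockModel struct {
	responses []string
	requests  [][]string
}

// Generate records a copy of the history (including for calls that fail),
// then returns the next canned reply or an error once the script is used up.
func (m *mockModel) Generate(history []string) (string, error) {
	m.requests = append(m.requests, append([]string(nil), history...))
	if len(m.responses) == 0 {
		return "", errors.New("mock model: no responses left")
	}
	next := m.responses[0]
	m.responses = m.responses[1:]
	return next, nil
}

func main() {
	m := &mockModel{responses: []string{"function call: increaseByOne", "final text"}}

	first, _ := m.Generate([]string{"user prompt"})
	second, _ := m.Generate([]string{"user prompt", first, "function response: pending"})

	fmt.Println(first)
	fmt.Println(second)
	fmt.Println(len(m.requests))
}
```

Recording the requests is what lets a test assert on the number and content of LLM requests after each simulated polling step, in addition to the events the agent emits.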

Interaction with Other Parts of the System

functiontool Package:
The primary subject under test, responsible for wrapping Go functions as callable tools with JSON schema inference and long-running operation support.
llmagent Package:
Manages the lifecycle of LLM-based agents that can invoke tools including long-running function tools. It orchestrates message exchange with the LLM and tool execution.
testutil Package:
Provides utilities for mocking LLM models (MockModel), running agents in tests (TestAgentRunner), and collecting events or parts from event streams.
genai Package:
Defines content types for representing function calls, responses, and text messages within the agent-LLM interaction.
toolinternal Package:
Used internally to access function tool declarations for validation.

This file verifies how these components cooperate during the execution of long-running functions in an agent-driven environment, ensuring that IDs, event streams, and tool invocations behave as expected.

Visual Diagram of File Structure and Workflow

flowchart TD
TestNewLongRunningFunctionTool -->|Creates| functiontool.New
TestLongRunningFunctionFlow --> testLongRunningFunctionFlow
TestLongRunningStringFunctionFlow --> testLongRunningFunctionFlow
testLongRunningFunctionFlow -->|Uses| MockModel
testLongRunningFunctionFlow -->|Creates| longRunningTool
testLongRunningFunctionFlow -->|Creates| llmagent.New
testLongRunningFunctionFlow -->|Runs| TestAgentRunner.Run
testLongRunningFunctionFlow -->|Validates| EventStream
TestLongRunningToolIDsAreSet -->|Creates| longRunningTool
TestLongRunningToolIDsAreSet -->|Creates| llmagent.New
TestLongRunningToolIDsAreSet -->|Runs| TestAgentRunner.Run
TestLongRunningToolIDsAreSet -->|Validates| EventsWithIDs
NewContentFromFunctionResponseWithID -->|Helper for| testLongRunningFunctionFlow

Summary of Key Functions and Their Relationships

Test Functions:
- TestNewLongRunningFunctionTool: Validates tool creation.
- TestLongRunningFunctionFlow and TestLongRunningStringFunctionFlow: Test workflows with different handler return types.
- TestLongRunningToolIDsAreSet: Tests ID propagation on events.
Helper Functions:
- testLongRunningFunctionFlow: Core generic test logic for long-running function flow.
- NewContentFromFunctionResponseWithID: Utility to create contents with assigned IDs.
Supporting Types:
- IncArgs: Empty struct used as input argument type for test handlers.

Together, these tests check that long-running function tools integrate properly with LLM agents and that event streams reflect the correct states and IDs across consecutive interactions, covering the asynchronous function invocation patterns used in the agent framework.
