long_running_function_test.go
Overview
This file contains automated tests for validating the behavior and integration of long-running function tools within the agent framework. Specifically, it focuses on the creation, execution, and event lifecycle of tools flagged as long-running operations via the functiontool package. The tests simulate interactions with a mock LLM model and an LLM agent (llmagent), verifying that function calls, responses, and event streams correctly reflect the expected long-running flow.
The tests employ generics, mock utilities, and event collectors to verify the correctness of function invocation, ID propagation, and multi-step workflows typical in asynchronous or deferred processing scenarios.
The file primarily tests the interaction between:
functiontool: creates and manages function tools marked as long-running.
llmagent: runs agents that use these tools.
testutil: mocks and validates model responses and agent runs.
genai: represents conversational and function call content.
Detailed Explanation of Classes, Functions, and Methods
1. TestNewLongRunningFunctionTool(t *testing.T)
Purpose:
Verifies that a function tool configured as long-running is created correctly with the expected name, description, and long-running flag.
Key Steps:
Defines input/output structs (SumArgs, SumResult) for the tool handler.
Creates a handler function that returns a fixed "Processing sum" result.
Calls functiontool.New with IsLongRunning: true.
Validates:
  The tool's name and description,
  That the IsLongRunning flag is true,
  That the internal function tool's declaration description contains a note about long-running operations.
Usage Example:
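// The handler returns a fixed placeholder result instead of computing a sum.
handler := func(ctx tool.Context, input SumArgs) (SumResult, error) {
	return SumResult{Result: "Processing sum"}, nil
}
// IsLongRunning marks the tool as a long-running operation.
sumTool, err := functiontool.New(functiontool.Config{
	Name:          "sum",
	Description:   "sums two integers",
	IsLongRunning: true,
}, handler)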
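The description check in this test can be pictured with a small standalone sketch. Everything here is an assumption made for illustration: Config is a local stand-in for functiontool.Config, declarationDescription is a hypothetical function, and the note wording is a placeholder rather than the text the framework actually appends.

```go
package main

import (
	"fmt"
	"strings"
)

// Config is a hypothetical stand-in for functiontool.Config.
type Config struct {
	Name          string
	Description   string
	IsLongRunning bool
}

// longRunningNote is placeholder wording, not the framework's actual text.
const longRunningNote = "NOTE: this is a long-running operation."

// declarationDescription returns the description a model would see,
// appending the note only when the tool is flagged as long-running.
func declarationDescription(cfg Config) string {
	if !cfg.IsLongRunning {
		return cfg.Description
	}
	return cfg.Description + "\n\n" + longRunningNote
}

func main() {
	cfg := Config{Name: "sum", Description: "sums two integers", IsLongRunning: true}
	desc := declarationDescription(cfg)
	// A test in the style described above would assert that the note is present.
	fmt.Println(strings.Contains(desc, longRunningNote))
}
```

A substring check of this kind keeps the test independent of the rest of the description, so the user-supplied text can change without breaking it.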
2. NewContentFromFunctionResponseWithID(name string, response map[string]any, id, role string) *genai.Content
Purpose:
Constructs a genai.Content object from a function response, manually assigning an ID and role to the response part.
Parameters:
name: Function name.
response: Map representing the function response data.
id: The unique ID to assign to the function response.
role: The role string (e.g., "user", "model").
Returns:
A pointer to a genai.Content instance with the ID set on the function response part.
Usage: Used primarily in tests to simulate function responses with specific IDs, aiding in event stream validation.
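A minimal sketch of what such a helper does, assuming simplified local types in place of the real genai ones. Content, Part, and FunctionResponse below are stand-ins defined only for this example, and the lower-case function name marks it as an illustration rather than the helper from the test file.

```go
package main

import "fmt"

// FunctionResponse, Part, and Content are simplified stand-ins for the
// corresponding genai types; only the fields needed here are modeled.
type FunctionResponse struct {
	ID       string
	Name     string
	Response map[string]any
}

type Part struct {
	FunctionResponse *FunctionResponse
}

type Content struct {
	Role  string
	Parts []*Part
}

// newContentFromFunctionResponseWithID builds a single-part content whose
// function response carries the caller-supplied ID.
func newContentFromFunctionResponseWithID(name string, response map[string]any, id, role string) *Content {
	return &Content{
		Role: role,
		Parts: []*Part{{
			FunctionResponse: &FunctionResponse{ID: id, Name: name, Response: response},
		}},
	}
}

func main() {
	// The function name, ID, and payload are illustrative values.
	c := newContentFromFunctionResponseWithID(
		"increaseByOne", map[string]any{"status": "still waiting"}, "call-1", "user")
	fr := c.Parts[0].FunctionResponse
	fmt.Println(c.Role, fr.Name, fr.ID, fr.Response["status"])
}
```

Setting the ID explicitly lets a test tie a simulated response back to the function call event it is meant to answer.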
3. TestLongRunningFunctionFlow(t *testing.T)
Purpose:
Tests the general workflow of a long-running function tool whose handler returns a map[string]string result indicating a status.
Details:
Defines an increaseByOne handler returning a map with status "pending".
Uses a helper, testLongRunningFunctionFlow, to execute and validate the flow.
Checks that the function is called exactly once and that LLM requests and event streams match expectations.
4. TestLongRunningStringFunctionFlow(t *testing.T)
Purpose:
Similar to TestLongRunningFunctionFlow, but tests a long-running function tool whose handler returns a string result instead of a map.
Details:
The handler returns "pending" as a string.
Delegates to the same helper, testLongRunningFunctionFlow, with adjusted expected keys.
5. testLongRunningFunctionFlow[Out any](t *testing.T, increaseByOne func(ctx tool.Context, x IncArgs) (Out, error), resultKey string, callCount *int)
Purpose:
Generic test helper to execute and verify the workflow of a long-running function tool with any output type.
Parameters:
t: Test handle.
increaseByOne: The function handler to test.
resultKey: The key in the function response map or string result to check ("status" or "result").
callCount: Pointer to an integer tracking the number of handler invocations.
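The signature above can be illustrated with a much-reduced generic helper. This is a sketch under assumptions, not the helper from the test file: checkFlow and toResponse are hypothetical names, and wrapping a non-map result under a "result" key is assumed from the description of resultKey rather than taken from the framework.

```go
package main

import "fmt"

// IncArgs mirrors the empty argument struct used by the test handlers.
type IncArgs struct{}

// toResponse turns a handler result into a response map. Wrapping non-map
// values under "result" is an assumption made for this sketch.
func toResponse(out any) map[string]any {
	if m, ok := out.(map[string]string); ok {
		resp := make(map[string]any, len(m))
		for k, v := range m {
			resp[k] = v
		}
		return resp
	}
	return map[string]any{"result": out}
}

// checkFlow is a hypothetical reduction of the generic helper: it invokes
// the handler once, looks up resultKey, and checks the call counter.
func checkFlow[Out any](handler func(IncArgs) (Out, error), resultKey string, callCount *int) error {
	out, err := handler(IncArgs{})
	if err != nil {
		return err
	}
	resp := toResponse(out)
	if resp[resultKey] != "pending" {
		return fmt.Errorf("response[%q] = %v, want %q", resultKey, resp[resultKey], "pending")
	}
	if *callCount != 1 {
		return fmt.Errorf("handler called %d times, want 1", *callCount)
	}
	return nil
}

func main() {
	mapCalls := 0
	mapHandler := func(IncArgs) (map[string]string, error) {
		mapCalls++
		return map[string]string{"status": "pending"}, nil
	}
	strCalls := 0
	strHandler := func(IncArgs) (string, error) {
		strCalls++
		return "pending", nil
	}
	// Out is inferred from each handler's return type.
	fmt.Println(checkFlow(mapHandler, "status", &mapCalls))
	fmt.Println(checkFlow(strHandler, "result", &strCalls))
}
```

Because the type parameter is inferred from the handler, both test functions can share one body and differ only in the key they expect.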
Workflow:
Creates a slice of mock genai.Content responses simulating a function call and subsequent LLM text responses.
Creates a mock LLM model (testutil.MockModel) initialized with these responses.
Creates a long-running function tool with the given handler.
Instantiates an LLM agent (llmagent) with the mock model and tool.
Runs the agent via a test runner (testutil.NewTestAgentRunner).
Collects and validates the initial event stream and LLM requests.
Extracts the function call event ID to simulate follow-up function responses with different result contents.
Runs multiple subtests simulating continued polling of the function result, checking:
  Number of LLM requests issued,
  Event stream contents,
  Proper merging and propagation of function call IDs and responses.
Verifies the handler is only called once, confirming no redundant executions.
Key Assertions:
Correct request sequence and content sent to the LLM.
Proper event parts generated and streamed.
Long-running function IDs are preserved across event updates.
Function call handler invoked only once despite emulated polling.
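The "invoked only once" assertion can be sketched in isolation. The toolRunner type below is a hypothetical stand-in for the agent's tool-execution step, and the response payloads are illustrative; the point is only that follow-up responses reuse the original call ID and never reach the handler.

```go
package main

import "fmt"

// FunctionResponse is a simplified stand-in for a function response part.
type FunctionResponse struct {
	ID       string
	Response map[string]any
}

// toolRunner is a hypothetical stand-in for the agent's tool execution.
type toolRunner struct {
	handler   func() map[string]any
	forwarded []FunctionResponse // responses that would be sent to the model
}

// Call runs the handler for a fresh function call and records its result.
func (r *toolRunner) Call(id string) {
	r.forwarded = append(r.forwarded, FunctionResponse{ID: id, Response: r.handler()})
}

// Update records a caller-supplied response for an earlier call.
// The handler is deliberately not invoked again.
func (r *toolRunner) Update(resp FunctionResponse) {
	r.forwarded = append(r.forwarded, resp)
}

func main() {
	calls := 0
	r := &toolRunner{handler: func() map[string]any {
		calls++
		return map[string]any{"status": "pending"}
	}}

	r.Call("call-1")
	// Emulated polling: later responses reuse the ID of the original call.
	r.Update(FunctionResponse{ID: "call-1", Response: map[string]any{"status": "still waiting"}})
	r.Update(FunctionResponse{ID: "call-1", Response: map[string]any{"result": 2}})

	fmt.Println("handler calls:", calls, "responses forwarded:", len(r.forwarded))
}
```

Counting invocations through a closure variable, as the test does with callCount, makes an accidental re-execution visible as a simple integer mismatch.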
6. TestLongRunningToolIDsAreSet(t *testing.T)
Purpose:
Verifies that the long-running function tool correctly sets unique IDs on function call events and that these IDs are propagated properly in the agent event stream.
Workflow:
Sets up a mock model with prepared function call and text response.
Defines a handler returning a "pending" status.
Creates a long-running function tool and an LLM agent using the mock model.
Runs the agent once with an initial prompt.
Collects the generated events and checks:
  The function call event includes a non-nil LongRunningToolIDs slice with exactly one ID.
  The function response and LLM events have appropriate LongRunningToolIDs (none or empty).
  The ID in LongRunningToolIDs matches the function call part's generated ID.
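These checks can be expressed as a small validation function. The Event and Part types are simplified stand-ins, checkLongRunningIDs is a hypothetical name, and the sketch assumes the function call event is the first one in the collected stream.

```go
package main

import (
	"errors"
	"fmt"
)

// Part and Event are simplified stand-ins for the real event types.
type Part struct {
	FunctionCallID string // empty when the part is not a function call
}

type Event struct {
	Author             string
	Parts              []Part
	LongRunningToolIDs []string
}

// checkLongRunningIDs assumes events[0] is the function call event and
// every later event should carry no long-running tool IDs.
func checkLongRunningIDs(events []Event) error {
	if len(events) == 0 {
		return errors.New("no events collected")
	}
	call := events[0]
	if len(call.LongRunningToolIDs) != 1 {
		return fmt.Errorf("got %d long-running IDs, want 1", len(call.LongRunningToolIDs))
	}
	if len(call.Parts) == 0 || call.Parts[0].FunctionCallID != call.LongRunningToolIDs[0] {
		return errors.New("long-running ID does not match the function call part's ID")
	}
	for i, ev := range events[1:] {
		if len(ev.LongRunningToolIDs) != 0 {
			return fmt.Errorf("event %d unexpectedly carries long-running IDs", i+1)
		}
	}
	return nil
}

func main() {
	events := []Event{
		{Author: "agent", Parts: []Part{{FunctionCallID: "call-1"}}, LongRunningToolIDs: []string{"call-1"}},
		{Author: "agent", Parts: []Part{{}}}, // function response event
		{Author: "agent", Parts: []Part{{}}}, // final text event
	}
	fmt.Println(checkLongRunningIDs(events))
}
```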
Important Implementation Details and Algorithms
Long-running flag handling:
The tests ensure that the IsLongRunning flag in the function tool config results in a tool description containing a note about long-running operations. This influences tool behavior and documentation.
Function call ID propagation:
The tests check that function call events have unique IDs that are preserved in subsequent function response events. This is critical for correctly tracking asynchronous or multi-step function executions within the agent event stream.
Generic testing function with type parameter:
testLongRunningFunctionFlow uses a generic type parameter [Out any] to support testing handlers returning different output types (maps or strings). This allows reuse of test logic for various function signatures.
Mock LLM model with predefined responses:
The mock model returns a sequence of contents representing function calls and user/model text responses, simulating an LLM conversation involving function execution and intermediate results.
Event collection and assertions:
The tests rely on utility functions (testutil.CollectParts, testutil.CollectEvents) to gather streamed event parts or full events, which are then compared using the go-cmp package for structural equality, ignoring non-critical fields like IDs where appropriate.
Incremental function response simulation:
The test simulates a polling pattern for long-running functions by sending updated function responses with different statuses or results. It checks that the agent correctly handles these updates and does not re-invoke the function handler unnecessarily.
Interaction with llmagent and functiontool:
The tests exercise the integration between the generic function tool wrapper and the LLM agent, which manages tool execution, message passing, and event propagation.
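The mock-model idea described above can be sketched as a queue of canned responses that also records what it was asked. This is an illustration only: mockModel and Generate are hypothetical names, and plain strings stand in for genai.Content values; it does not reproduce testutil.MockModel.

```go
package main

import (
	"errors"
	"fmt"
)

// mockModel is a hypothetical stand-in for a mock LLM: it replays canned
// responses in order and records every request it receives.
type mockModel struct {
	responses []string
	requests  [][]string
}

// Generate records a copy of the conversation history and returns the next
// canned response, or an error once the queue is exhausted.
func (m *mockModel) Generate(history []string) (string, error) {
	m.requests = append(m.requests, append([]string(nil), history...))
	if len(m.responses) == 0 {
		return "", errors.New("mock model: no responses left")
	}
	next := m.responses[0]
	m.responses = m.responses[1:]
	return next, nil
}

func main() {
	// Illustrative script: a function call followed by a text reply.
	model := &mockModel{responses: []string{"function_call:increaseByOne", "text:response1"}}

	history := []string{"user:test1"}
	first, _ := model.Generate(history)
	history = append(history, first, "function_response:pending")
	second, _ := model.Generate(history)

	fmt.Println(first, second, len(model.requests))
}
```

Recording the requests is what allows a test to assert on the number of LLM calls and on the exact content sent in each one.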
Interaction with Other Parts of the System
functiontool Package:
The primary subject under test, responsible for wrapping Go functions as callable tools with JSON schema inference and long-running operation support.
llmagent Package:
Manages the lifecycle of LLM-based agents that can invoke tools, including long-running function tools. It orchestrates message exchange with the LLM and tool execution.
testutil Package:
Provides utilities for mocking LLM models (MockModel), running agents in tests (TestAgentRunner), and collecting events or parts from event streams.
genai Package:
Defines content types for representing function calls, responses, and text messages within the agent-LLM interaction.
toolinternal Package:
Used internally to access function tool declarations for validation.
This file verifies how these components cooperate during the execution of long-running functions in an agent-driven environment, ensuring that IDs, event streams, and tool invocations behave as expected.
Visual Diagram of File Structure and Workflow
flowchart TD
TestNewLongRunningFunctionTool -->|Creates| functiontool.New
TestLongRunningFunctionFlow --> testLongRunningFunctionFlow
TestLongRunningStringFunctionFlow --> testLongRunningFunctionFlow
testLongRunningFunctionFlow -->|Uses| MockModel
testLongRunningFunctionFlow -->|Creates| longRunningTool
testLongRunningFunctionFlow -->|Creates| llmagent.New
testLongRunningFunctionFlow -->|Runs| TestAgentRunner.Run
testLongRunningFunctionFlow -->|Validates| EventStream
TestLongRunningToolIDsAreSet -->|Creates| longRunningTool
TestLongRunningToolIDsAreSet -->|Creates| llmagent.New
TestLongRunningToolIDsAreSet -->|Runs| TestAgentRunner.Run
TestLongRunningToolIDsAreSet -->|Validates| EventsWithIDs
NewContentFromFunctionResponseWithID -->|Helper for| testLongRunningFunctionFlow
Summary of Key Functions and Their Relationships
Test Functions:
TestNewLongRunningFunctionTool: Validates tool creation.
TestLongRunningFunctionFlow and TestLongRunningStringFunctionFlow: Test workflows with different handler return types.
TestLongRunningToolIDsAreSet: Tests ID propagation on events.
Helper Functions:
testLongRunningFunctionFlow: Core generic test logic for long-running function flow.
NewContentFromFunctionResponseWithID: Utility to create contents with assigned IDs.
Supporting Types:
IncArgs: Empty struct used as input argument type for test handlers.
This file ensures that the long-running function tools integrate properly with LLM agents and that event streams reflect the correct states and IDs through consecutive interactions. It plays a crucial role in validating asynchronous function invocation patterns in the agent framework.