gemini.go

Overview

The gemini.go file implements the [model.LLM] interface for Gemini models on top of genai.Client. It handles both synchronous and streaming content generation, bridging the generic LLM interface used throughout the system and the Gemini API: request construction, header management, response conversion, and streaming aggregation.

Key responsibilities include:

- Initializing a Gemini client for a specified model.
- Managing request content, including appending user prompts when necessary.
- Adding required HTTP headers for API calls.
- Providing synchronous and streaming model generation capabilities.
- Converting Gemini API responses to the system's LLM response format.

This implementation lets Gemini models act as pluggable LLM backends within the broader AI agent and tooling framework.

Types and Functions

Type: `geminiModel`

type geminiModel struct {
	client             *genai.Client
	name               string
	versionHeaderValue string
}

Description:
Represents a Gemini LLM model instance. It holds the Gemini API client, the model's name identifier, and a precomputed version string for HTTP headers.
Fields:
- client: The Gemini API client used for requests.
- name: The specific Gemini model name (e.g., "gemini-2.5-flash").
- versionHeaderValue: A version string built from the system version and the Go runtime version, sent in the x-goog-api-client and user-agent headers.

Function: `NewModel`

func NewModel(ctx context.Context, modelName string, cfg *genai.ClientConfig) (model.LLM, error)

Purpose:
Constructs and returns a new geminiModel instance implementing [model.LLM]. It initializes the underlying genai.Client with the provided configuration and prepares the version header string.
Parameters:
- ctx: Context for cancellation and timeouts.
- modelName: Name of the Gemini model to target (e.g., "gemini-2.5-flash").
- cfg: Configuration options for the Gemini client.
Returns:
- A model.LLM instance backed by Gemini.
- An error if client initialization fails.

Usage Example:

// The variable is named llm rather than model so it does not shadow the model package.
llm, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{...})
if err != nil {
    // handle error
}

Method: `(*geminiModel) Name`

func (m *geminiModel) Name() string

Purpose:
Returns the Gemini model's name identifier.
Returns:
The model name string.

Method: `(*geminiModel) GenerateContent`

func (m *geminiModel) GenerateContent(ctx context.Context, req *model.LLMRequest, stream bool) iter.Seq2[*model.LLMResponse, error]

Purpose:
Executes a text generation request on the Gemini model. Supports both synchronous (non-streaming) and streaming modes.
Parameters:
- ctx: Execution context.
- req: The LLM request containing prompt contents and config.
- stream: If true, returns a streaming sequence of responses; if false, returns a single response.
Returns:
An iter.Seq2 that yields a single (response, error) pair in non-streaming mode, or one pair per streamed response in streaming mode.
Implementation Details:
1. Calls maybeAppendUserContent so that the request contents end with a user-role message.
2. Initializes default configuration objects if nil.
3. Adds required HTTP headers to the request config.
4. Delegates to generate for synchronous or generateStream for streaming execution.

Method: `(*geminiModel) addHeaders`

func (m *geminiModel) addHeaders(headers http.Header)

Purpose:
Adds the x-goog-api-client and user-agent headers with the precomputed version string to the HTTP headers map.
Parameters:
- headers: The HTTP headers map to modify.
Usage:
Called internally before API requests to inject client version info, aiding telemetry and compatibility.
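
A minimal sketch of this header injection, using only the standard library, might look like the following. The versionHeader and newHeaders helpers, the "example-lib/... gl-go/..." layout, and the use of Set (which replaces any existing value) rather than Add are assumptions made for illustration; the source only states that both headers carry the precomputed version string.

```go
package main

import (
	"fmt"
	"net/http"
	"runtime"
	"strings"
)

// versionHeader builds an identification string from a library version and
// the Go runtime version. The exact layout is an assumption for this sketch.
func versionHeader(libVersion string) string {
	goVersion := strings.TrimPrefix(runtime.Version(), "go")
	return fmt.Sprintf("example-lib/%s gl-go/%s", libVersion, goVersion)
}

// addHeaders writes the same value into both headers. http.Header
// canonicalizes keys, so "user-agent" is stored as "User-Agent".
func addHeaders(headers http.Header, value string) {
	headers.Set("x-goog-api-client", value)
	headers.Set("user-agent", value)
}

// newHeaders is a hypothetical convenience wrapper that returns a fresh
// header map with both entries populated.
func newHeaders(value string) http.Header {
	headers := http.Header{}
	addHeaders(headers, value)
	return headers
}

func main() {
	headers := newHeaders(versionHeader("0.1.0"))
	fmt.Println(headers.Get("User-Agent"))
	fmt.Println(headers.Get("X-Goog-Api-Client"))
}
```

Computing the string once and reusing it, as the versionHeaderValue field does, avoids reformatting it on every request.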

Method: `(*geminiModel) generate`

func (m *geminiModel) generate(ctx context.Context, req *model.LLMRequest) (*model.LLMResponse, error)

Purpose:
Performs a synchronous content generation request against the Gemini API.
Parameters:
- ctx: Context for request lifecycle.
- req: The LLM request with prompt contents and generation config.
Returns:
- The first candidate response converted to [model.LLMResponse].
- An error if the API call or response processing fails.
Implementation Details:
- Calls m.client.Models.GenerateContent with the model name, request contents, and config.
- Returns an error if the response contains no candidates, since an empty candidate list is unexpected.
- Converts Gemini API response to the system's LLM response format using converters.Genai2LLMResponse.
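
The call, check, and convert sequence above can be sketched with stand-in types. Everything here is hypothetical: candidate, apiResponse, and llmResponse replace the genai and model types, the call parameter replaces m.client.Models.GenerateContent, and the error messages are illustrative rather than the ones the file uses.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the genai response and model.LLMResponse types.
type candidate struct{ Text string }
type apiResponse struct{ Candidates []candidate }
type llmResponse struct{ Text string }

// firstCandidate calls the backend, wraps any failure with context, rejects
// an empty candidate list, and converts the first candidate.
func firstCandidate(call func() (*apiResponse, error)) (*llmResponse, error) {
	resp, err := call()
	if err != nil {
		return nil, fmt.Errorf("failed to call model: %w", err)
	}
	if len(resp.Candidates) == 0 {
		return nil, errors.New("empty response: no candidates returned")
	}
	return &llmResponse{Text: resp.Candidates[0].Text}, nil
}

// Three fake backends used to exercise each branch.
func okCall() (*apiResponse, error) {
	return &apiResponse{Candidates: []candidate{{Text: "hello"}, {Text: "unused"}}}, nil
}

func emptyCall() (*apiResponse, error) {
	return &apiResponse{}, nil
}

func failingCall() (*apiResponse, error) {
	return nil, errors.New("backend unavailable")
}

func main() {
	for _, call := range []func() (*apiResponse, error){okCall, emptyCall, failingCall} {
		resp, err := firstCandidate(call)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("text:", resp.Text)
	}
}
```

Wrapping with %w keeps the original error available to errors.Is and errors.As while adding context about where the failure occurred.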

Method: `(*geminiModel) generateStream`

func (m *geminiModel) generateStream(ctx context.Context, req *model.LLMRequest) iter.Seq2[*model.LLMResponse, error]

Purpose:
Returns a stream of partial responses from the Gemini API, suitable for real-time or chunked generation scenarios.
Parameters:
- ctx: Context for managing the streaming lifecycle.
- req: The LLM request.
Returns:
- An iterator function yielding incremental [model.LLMResponse] objects and errors.
Implementation Details:
- Creates a StreamingResponseAggregator to handle incremental assembly of streaming responses.
- Iterates over the sequence of partial responses returned by the Gemini client's GenerateContentStream.
- Processes each partial response through the aggregator, yielding assembled LLM responses.
- Yields an aggregated closing response when the stream ends.

Method: `(*geminiModel) maybeAppendUserContent`

func (m *geminiModel) maybeAppendUserContent(req *model.LLMRequest)

Purpose:
Ensures that the request content includes a final user role message to prompt the model to continue or finalize output.
Parameters:
- req: The LLM request whose contents may be modified.
Implementation Details:
- If req.Contents is empty, appends a default user content instructing the model to handle requests per system instructions.
- If the last content's role is not "user", appends a user content prompting continuation or exit.
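
The two cases above can be sketched as follows. The content and request types are hypothetical stand-ins for the genai and model types, and the two prompt strings are placeholders: the source describes what the appended messages instruct but does not quote their wording.

```go
package main

import "fmt"

// Hypothetical stand-ins for genai content and model.LLMRequest.
type content struct {
	Role string
	Text string
}

type request struct {
	Contents []content
}

// Placeholder prompt texts; the real wording is not given in this document.
const (
	defaultPrompt  = "<handle the request per the system instructions>"
	continuePrompt = "<continue the output, or exit if finished>"
)

// maybeAppendUserContent makes sure the contents end with a user-role message.
func maybeAppendUserContent(req *request) {
	if len(req.Contents) == 0 {
		req.Contents = append(req.Contents, content{Role: "user", Text: defaultPrompt})
		return
	}
	if last := req.Contents[len(req.Contents)-1]; last.Role != "user" {
		req.Contents = append(req.Contents, content{Role: "user", Text: continuePrompt})
	}
}

func main() {
	req := &request{Contents: []content{{Role: "model", Text: "partial answer"}}}
	maybeAppendUserContent(req)
	for _, c := range req.Contents {
		fmt.Printf("%s: %s\n", c.Role, c.Text)
	}
}
```

Note that the function mutates the request in place, so a caller that reuses the same request across calls will see the appended message.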

Important Implementation Details

Version Header Construction:
The user-agent and API client header string is constructed once during model creation using the system version and Go runtime version. This is added to all outgoing requests for identification and telemetry.
Streaming Response Aggregation:
Streaming responses from Gemini are incremental and require processing to assemble partial outputs into coherent LLM responses. This is managed by llminternal.NewStreamingResponseAggregator, which buffers partial data and yields aggregated results.
Request Content Management:
Before each call, the implementation makes sure the request contents end with a user-role message, so that the Gemini model does not terminate output generation prematurely. This is a heuristic to maintain conversational continuity.
Error Handling:
The synchronous path returns an error when the response contains no candidates, and errors from the underlying API calls are propagated, wrapped with context for easier debugging.
Dependencies:
This file depends on internal packages such as llminternal for streaming aggregation and converters for response format translation, as well as the external genai package for Gemini API client interactions.

Interactions with Other Components

- Implements the [model.LLM] interface, making it compatible with systems that consume generic LLMs, including agents and tooling frameworks (LLM Integration and Agents).
- Utilizes the genai.Client to communicate with Gemini API endpoints, abstracting network and protocol details.
- Uses internal converters to translate Gemini API responses to the system's expected LLM response format, allowing seamless integration with other components expecting [model.LLMResponse] objects.
- Streaming response aggregation leverages internal streaming utilities to provide a consistent streaming interface compatible with other LLM implementations.
- The model's output and streaming capabilities can be used by higher-level agents, workflows, or tools that require Gemini model responses.

Diagram: geminiModel Structure and Method Relationships

classDiagram
class geminiModel {
-client: genai.Client
-name: string
-versionHeaderValue: string
+Name()
+GenerateContent()
-addHeaders()
-generate()
-generateStream()
-maybeAppendUserContent()
}

This class diagram shows the geminiModel struct with its unexported fields, its two exported methods (Name and GenerateContent), and its unexported helper methods. Together they fulfill the [model.LLM] interface, supporting both synchronous and streaming content generation for Gemini models.
