OpenTelemetry Integration

Purpose

This subtopic covers observability for large language model (LLM) calls and tool invocations within the agent framework. The parent topic (Telemetry and Observability) covers how tracing data is collected and managed overall; this subtopic focuses on registering span processors and recording span attributes for LLM and tool interactions. The resulting structured telemetry lets developers and operators monitor, analyze, and debug agent behavior and performance.

This integration is essential for understanding the internal workings of agent execution flows, such as when an LLM is called or when a tool function is executed. It supports the broader goal of the telemetry system to provide end-to-end traceability of AI agent operations.

Functionality

Span Processor Registration

Before any telemetry data is generated, span processors must be registered with the local OpenTelemetry tracer provider. This ensures that span data, which represents units of work such as LLM calls or tool executions, is processed and exported correctly.

The function AddSpanProcessor allows adding custom span processors to the local tracer configuration.
The RegisterTelemetry function initializes the tracer provider with all registered span processors, ensuring thread-safe one-time setup.

// AddSpanProcessor adds a span processor to the local tracer config.
func AddSpanProcessor(processor sdktrace.SpanProcessor) { ... }

// RegisterTelemetry sets up the local tracer with registered processors.
func RegisterTelemetry() { ... }
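
The registration pattern described above can be sketched with the standard library alone. This is a minimal sketch, not the framework's implementation: the SpanProcessor interface here is a one-method stand-in for sdktrace.SpanProcessor, the registry type and countingProcessor are hypothetical, and the functions are written as methods so the example stays self-contained. It assumes that "thread-safe one-time setup" means a mutex around the processor list plus a sync.Once around initialization.

```go
package main

import "sync"

// SpanProcessor is a hypothetical stand-in for sdktrace.SpanProcessor,
// reduced to one method so the sketch compiles without the SDK.
type SpanProcessor interface {
	OnEnd(spanName string)
}

// registry holds processors queued before the one-time setup runs.
type registry struct {
	mu         sync.Mutex
	once       sync.Once
	pending    []SpanProcessor
	registered []SpanProcessor // snapshot the local provider would be built from
}

// AddSpanProcessor queues a processor; safe for concurrent callers.
func (r *registry) AddSpanProcessor(p SpanProcessor) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.pending = append(r.pending, p)
}

// RegisterTelemetry snapshots the queued processors exactly once.
// Later calls are no-ops in this sketch.
func (r *registry) RegisterTelemetry() {
	r.once.Do(func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.registered = append([]SpanProcessor(nil), r.pending...)
	})
}

// countingProcessor is a hypothetical processor that counts ended spans.
type countingProcessor struct{ ended int }

func (c *countingProcessor) OnEnd(string) { c.ended++ }
```

In this sketch a processor added after RegisterTelemetry has run is never picked up, which illustrates why processors should be added before setup. Whether the framework behaves the same way is not stated here.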

Dual Tracer Usage

To accommodate both local and global telemetry configurations, two tracers are used in parallel when starting a trace:

A local tracer configured with span processors specific to this integration.
The global tracer provided by OpenTelemetry, which may be configured externally.

The method StartTrace returns two spans, one from each tracer, allowing simultaneous tracing in both contexts. This dual approach provides flexibility and ensures compatibility with different telemetry setups.
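
The dual-tracer idea can be illustrated as follows. This is a sketch under stated assumptions: Span and Tracer are simplified stand-ins for the OpenTelemetry trace.Span and trace.Tracer interfaces, recordingSpan and recordingTracer are hypothetical in-memory implementations, and the startTrace signature is an assumption, since the actual signature of StartTrace is not shown here.

```go
package main

import "context"

// Span and Tracer are hypothetical stand-ins for the OpenTelemetry
// trace.Span and trace.Tracer interfaces.
type Span interface {
	SetAttribute(key, value string)
	End()
}

type Tracer interface {
	Start(ctx context.Context, name string) (context.Context, Span)
}

// recordingSpan keeps its attributes in memory for inspection.
type recordingSpan struct {
	name  string
	attrs map[string]string
	ended bool
}

func (s *recordingSpan) SetAttribute(k, v string) { s.attrs[k] = v }
func (s *recordingSpan) End()                     { s.ended = true }

// recordingTracer remembers every span it starts.
type recordingTracer struct{ spans []*recordingSpan }

func (t *recordingTracer) Start(ctx context.Context, name string) (context.Context, Span) {
	s := &recordingSpan{name: name, attrs: map[string]string{}}
	t.spans = append(t.spans, s)
	return ctx, s
}

// startTrace starts one span per tracer and returns both, local first.
func startTrace(ctx context.Context, local, global Tracer, name string) (context.Context, []Span) {
	ctx, localSpan := local.Start(ctx, name)
	ctx, globalSpan := global.Start(ctx, name)
	return ctx, []Span{localSpan, globalSpan}
}
```

Callers then treat the returned spans as a unit: every attribute is set on both, and both are ended together, so the local and global pipelines see the same data.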

Tracing LLM Calls

The TraceLLMCall function instruments spans with detailed attributes about the LLM request and response:

Model name and configuration parameters (e.g., TopP, MaxOutputTokens).
Unique IDs for invocation, session, and event.
Serialized LLM request and response content, filtered to exclude sensitive inline data.

These attributes provide rich context for performance monitoring and debugging at the LLM call level.
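
As an illustration of the attribute set listed above, the sketch below flattens request metadata into string attributes. The types llmConfig, llmRequest, and callIDs are hypothetical simplifications. The gen_ai.* keys are taken from the OpenTelemetry GenAI semantic conventions and the example.* keys are placeholders; the keys the framework actually emits are not stated here.

```go
package main

import "strconv"

// llmConfig and llmRequest are hypothetical, simplified request types.
type llmConfig struct {
	TopP            float64
	MaxOutputTokens int
}

type llmRequest struct {
	Model  string
	Config llmConfig
}

// callIDs groups the identifiers attached to every LLM span.
type callIDs struct {
	InvocationID string
	SessionID    string
	EventID      string
}

// llmCallAttributes flattens request metadata into string attributes.
// Keys are illustrative; see the note above the block.
func llmCallAttributes(ids callIDs, req llmRequest) map[string]string {
	return map[string]string{
		"gen_ai.request.model":      req.Model,
		"gen_ai.request.top_p":      strconv.FormatFloat(req.Config.TopP, 'f', -1, 64),
		"gen_ai.request.max_tokens": strconv.Itoa(req.Config.MaxOutputTokens),
		"example.invocation_id":     ids.InvocationID,
		"example.session_id":        ids.SessionID,
		"example.event_id":          ids.EventID,
	}
}
```

Keeping the attribute construction in one pure function makes it easy to apply the same map to both the local and the global span.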

Tracing Tool Calls

Similarly, tool executions are traced via the TraceToolCall and TraceMergedToolCalls functions.

Attributes capture tool name, description, execution arguments, and response data.
For merged tool calls, a special span is created with consolidated metadata.

The trace spans for tool calls complement the LLM call spans to provide comprehensive observability of agent workflows that combine model invocations and external tool executions.
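
A tool-call trace can be sketched in the same spirit. Everything in this block is hypothetical: toolInfo, attrSpan, the example.tool.* attribute keys, and the traceToolCall signature, which differs from the TraceToolCall(spans, tool, args, event) call shown in the diagram. It assumes arguments and response arrive already serialized and only shows the fan-out of one attribute set to every span in the pair.

```go
package main

// toolInfo is a hypothetical description of the tool being traced.
type toolInfo struct {
	Name        string
	Description string
}

// attrSpan is a hypothetical in-memory stand-in for an OpenTelemetry span.
type attrSpan struct {
	attrs map[string]string
	ended bool
}

// newAttrSpan returns a span with an initialized attribute map.
func newAttrSpan() *attrSpan {
	return &attrSpan{attrs: map[string]string{}}
}

// traceToolCall writes the same attributes to every span, then ends it.
// argsJSON and responseJSON are expected to be serialized by the caller.
func traceToolCall(spans []*attrSpan, tool toolInfo, argsJSON, responseJSON string) {
	for _, s := range spans {
		s.attrs["example.tool.name"] = tool.Name
		s.attrs["example.tool.description"] = tool.Description
		s.attrs["example.tool.args"] = argsJSON
		s.attrs["example.tool.response"] = responseJSON
		s.ended = true
	}
}
```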

Serialization and Filtering

To avoid exposing sensitive or irrelevant information, the telemetry integration carefully serializes and filters trace data:

The safeSerialize method guards against serialization errors.
The llmRequestToTrace method removes inline data parts from LLM content before tracing.

This careful handling ensures that telemetry remains useful without compromising data integrity or privacy.
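
The two safeguards above can be sketched as follows. This is a minimal illustration, not the framework's code: the part and content types are hypothetical simplifications, withoutInlineData stands in for the filtering step of llmRequestToTrace, and the placeholder string returned by safeSerialize on failure is an assumption.

```go
package main

import "encoding/json"

// part and content are hypothetical, simplified message types.
type part struct {
	Text       string `json:"text,omitempty"`
	InlineData []byte `json:"inline_data,omitempty"`
}

type content struct {
	Role  string `json:"role"`
	Parts []part `json:"parts"`
}

// withoutInlineData returns a copy of c that keeps only the parts with
// no inline payload. The input value is left unmodified.
func withoutInlineData(c content) content {
	kept := make([]part, 0, len(c.Parts))
	for _, p := range c.Parts {
		if len(p.InlineData) == 0 {
			kept = append(kept, p)
		}
	}
	return content{Role: c.Role, Parts: kept}
}

// safeSerialize never returns an error: a value that cannot be
// marshaled becomes a fixed placeholder string (an assumed fallback).
func safeSerialize(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "<not serializable>"
	}
	return string(b)
}
```

Filtering on a copy keeps the original request intact for the actual LLM call, and a serializer that cannot fail means a tracing problem never interrupts agent execution.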

Integration

This subtopic integrates tightly with the parent topic (Telemetry and Observability) by providing the concrete implementation for registering span processors and instrumenting trace spans specifically for LLM and tool calls.

It complements other telemetry components that may trace agent lifecycle events, session interactions, or remote agent communications.
It relies on interfaces and data types from the agent, session, tool, and model packages to extract the context and details needed for tracing.
The registration functions are called during system initialization, ensuring that all subsequent LLM and tool invocations are traced automatically.
Tools and LLM agents leverage these tracing functions to create spans around their operations, linking telemetry data to the agent's execution context.

By registering span processors and generating spans with detailed attributes, this subtopic gives LLM and tool calls a single telemetry pipeline that supports monitoring and diagnostics.

Diagram

sequenceDiagram
    participant Agent as Agent
    participant Telemetry as OpenTelemetry Integration
    participant LLM as LLM Model
    participant Tool as Tool Execution
    participant SpanProcessor as Span Processor
    Agent->>Telemetry: StartTrace(context, "llm_call")
    Telemetry->>Telemetry: Create local and global spans
    Agent->>LLM: Invoke LLM call
    LLM-->>Agent: Return LLM response
    Agent->>Telemetry: TraceLLMCall(spans, invocationContext, llmRequest, event)
    Telemetry->>SpanProcessor: Set span attributes and end spans
    alt Tool call detected
        Agent->>Telemetry: StartTrace(context, "tool_call")
        Telemetry->>Telemetry: Create spans
        Agent->>Tool: Execute tool function
        Tool-->>Agent: Return tool response
        Agent->>Telemetry: TraceToolCall(spans, tool, args, event)
        Telemetry->>SpanProcessor: Set attributes and end spans
    end

This sequence diagram shows the core flow: the agent starts spans for LLM and tool calls, the tracing functions annotate and end those spans, and the span processors registered in this subtopic then process the finished spans.