metrics.rs

Overview

This file defines the metrics collection and reporting facilities for block production, networking, routing, and asynchronous task execution within the node. Its main type is the BlockProductionMetrics struct, which holds the performance and operational metrics for block production. The metrics are built with the OpenTelemetry API and use counters, gauges, histograms, and up-down counters to record timings, queue sizes, event counts, and error rates.

The file also declares several constants representing channel names and Aerospike object types used elsewhere in the system for telemetry tagging and metrics correlation. Additionally, it defines the top-level Metrics struct aggregating metrics from multiple subsystems (NetMetrics, BlockProductionMetrics, RoutingMetrics, and TokioMetrics), providing a centralized entry point for metrics instrumentation across the node.


Structs and Their Functionality

BlockProductionMetrics

A cloneable wrapper around an Arc pointing to the internal struct BlockProductionMetricsInner, which holds the actual OpenTelemetry metric instruments. Because every clone shares the same Arc, all clones report into the same instruments. BlockProductionMetrics provides the methods that report block production metrics.
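The wrapper pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the file's actual code: InnerSketch and MetricsHandleSketch are hypothetical stand-ins, an AtomicU64 replaces the OpenTelemetry instrument, and tx_aborted_total exists only so the sketch can read its own counter back.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Stand-in for the inner struct; a plain atomic replaces the
// OpenTelemetry instrument so the sketch needs no external crates.
struct InnerSketch {
    tx_aborted: AtomicU64,
}

// Cloning copies only the Arc pointer, so every clone reports into
// the same underlying counter.
#[derive(Clone)]
struct MetricsHandleSketch(Arc<InnerSketch>);

impl MetricsHandleSketch {
    fn new() -> Self {
        Self(Arc::new(InnerSketch {
            tx_aborted: AtomicU64::new(0),
        }))
    }

    // Hypothetical reporting method: bumps the shared counter by one.
    fn report_tx_aborted(&self) {
        self.0.tx_aborted.fetch_add(1, Ordering::Relaxed);
    }

    // Read-back helper used only by this sketch.
    fn tx_aborted_total(&self) -> u64 {
        self.0.tx_aborted.load(Ordering::Relaxed)
    }
}

// Reports once from each of two threads, each holding its own clone.
fn shared_total_after_two_threads() -> u64 {
    let handle = MetricsHandleSketch::new();
    let workers: Vec<_> = (0..2)
        .map(|_| {
            let h = handle.clone();
            std::thread::spawn(move || h.report_tx_aborted())
        })
        .collect();
    for w in workers {
        w.join().expect("worker thread panicked");
    }
    handle.tx_aborted_total()
}
```

Keeping the instruments behind an Arc makes a clone cheap (a reference-count increment), which suits handing a metrics handle to many threads or tasks.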

Internal Structure: BlockProductionMetricsInner

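Contains a field for each block production metric instrument, including thread_load (Gauge); block_production_time, block_apply_time, and finalization_time (Histogram); block_finalized, tx_finalized, tx_aborted, and ext_tx_aborted (Counter); and thread_count (UpDownCounter), among many others.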

These metric instruments are created and configured in the BlockProductionMetrics::new constructor using the OpenTelemetry Meter.

Key Methods

Trait Implementations


Metrics

This struct aggregates metrics from four subsystems: net (NetMetrics), node (BlockProductionMetrics), routing (RoutingMetrics), and tokio (TokioMetrics).
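As a rough sketch of this aggregation, the code below builds every subsystem handle from one shared meter in a single constructor. All types here are hypothetical stand-ins: MeterSketch replaces the OpenTelemetry Meter, SubsystemSketch replaces the four real metric types, and the scope strings exist only to make the wiring observable.

```rust
// Stand-in for the OpenTelemetry Meter; it only carries a name here.
struct MeterSketch {
    name: &'static str,
}

// Stand-in for a subsystem metrics type (NetMetrics, RoutingMetrics, ...).
struct SubsystemSketch {
    scope: String,
}

impl SubsystemSketch {
    // Assumed constructor shape: each subsystem derives its state from the meter.
    fn new(meter: &MeterSketch, subsystem: &str) -> Self {
        Self {
            scope: format!("{}.{}", meter.name, subsystem),
        }
    }
}

// Aggregate with the four field names used in this document.
struct MetricsSketch {
    net: SubsystemSketch,
    node: SubsystemSketch,
    routing: SubsystemSketch,
    tokio: SubsystemSketch,
}

impl MetricsSketch {
    // One entry point builds all subsystem handles from the same meter.
    fn new(meter: &MeterSketch) -> Self {
        Self {
            net: SubsystemSketch::new(meter, "net"),
            node: SubsystemSketch::new(meter, "node"),
            routing: SubsystemSketch::new(meter, "routing"),
            tokio: SubsystemSketch::new(meter, "tokio"),
        }
    }

    // Lists the scope of each subsystem, in field order.
    fn scopes(&self) -> [&str; 4] {
        [
            self.net.scope.as_str(),
            self.node.scope.as_str(),
            self.routing.scope.as_str(),
            self.tokio.scope.as_str(),
        ]
    }
}
```

Callers then reach a subsystem through its field, for example metrics.node for block production metrics.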

Methods


Constants

Channel Names

Used as identifiers/tags for telemetry related to inter-component communication channels:

Aerospike Object Types

Used to tag Aerospike-related metrics with specific object types:
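As a rough illustration of tagging with a named constant, the sketch below groups write durations by object type. The constant name AEROSPIKE_OBJECT_TYPE_INT_MESSAGES is an assumption made for this example (only the value "int_messages" appears in this document's usage example), and AerospikeWritesSketch is a hypothetical in-memory stand-in for the real instrument.

```rust
use std::collections::HashMap;

// Hypothetical constant name; the value matches the tag used in the usage example.
pub const AEROSPIKE_OBJECT_TYPE_INT_MESSAGES: &str = "int_messages";

// In-memory stand-in for a histogram keyed by an object-type tag.
#[derive(Default)]
struct AerospikeWritesSketch {
    durations_by_type: HashMap<&'static str, Vec<f64>>,
}

impl AerospikeWritesSketch {
    // Records one write duration under the given object-type tag.
    fn report_aerospike_write(&mut self, value: f64, object_type: &'static str) {
        self.durations_by_type
            .entry(object_type)
            .or_default()
            .push(value);
    }

    // Number of samples recorded for a tag; zero if the tag was never used.
    fn sample_count(&self, object_type: &str) -> usize {
        self.durations_by_type
            .get(object_type)
            .map_or(0, Vec::len)
    }
}
```

Using a shared constant rather than a repeated string literal keeps the tag spelling consistent, so samples from different call sites land under the same key.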


Important Implementation Details and Algorithms


Interactions with Other Parts of the System


Usage Examples

use opentelemetry::metrics::Meter;
use crate::metrics::Metrics;
use crate::types::ThreadIdentifier;

fn example_usage(meter: &Meter, thread_id: ThreadIdentifier) {
    // Initialize metrics
    let metrics = Metrics::new(meter);

    // Report block production time with correction
    metrics.node.report_block_production_time_and_correction(250, -5, &thread_id);

    // Report a finalized block with transaction count
    metrics.node.report_finalization(1000, 15, &thread_id);

    // Increment aborted transaction count
    metrics.node.report_tx_aborted(&thread_id);

    // Report Aerospike write duration for internal messages
    metrics.node.report_aerospike_write(1200.5, "int_messages");

    // Record internal message queue length
    metrics.node.report_internal_message_queue_length(42);
}

Visual Diagram

classDiagram
    class Metrics {
        +net: NetMetrics
        +node: BlockProductionMetrics
        +routing: RoutingMetrics
        +tokio: TokioMetrics
        +new(meter)
    }
    class BlockProductionMetrics {
        +new(meter)
        +report_block_production_time_and_correction()
        +report_block_apply_time()
        +report_finalization()
        +report_thread_count()
        +report_error()
        ...
    }
    class BlockProductionMetricsInner {
        -thread_load: Gauge
        -block_production_time: Histogram
        -block_apply_time: Histogram
        -finalization_time: Histogram
        -block_finalized: Counter
        -tx_finalized: Counter
        -tx_aborted: Counter
        -ext_tx_aborted: Counter
        -thread_count: UpDownCounter
        -... (many more metrics fields)
    }
    Metrics "1" *-- "1" BlockProductionMetrics
    BlockProductionMetrics "1" *-- "1" BlockProductionMetricsInner

Helper Functions


This file is central to operational telemetry for the block production subsystem and plugs into the system-wide metrics collection framework, supporting performance monitoring, troubleshooting, and analytics. It follows the instrumentation practices described in Telemetry and Monitoring and works closely with the network and routing metrics described in Network Metrics and Routing Metrics.