overview.json
Overview
`overview.json` is a Grafana dashboard configuration file designed to provide a comprehensive monitoring overview for a multi-blockchain infrastructure platform, specifically the ShapeShift Unchained platform. This JSON file defines a rich set of visual panels and layouts within Grafana to visualize the health, performance, and resource utilization of various blockchain coinstacks (e.g., Arbitrum, Avalanche, Bitcoin, Ethereum, etc.) and their related Kubernetes-managed services.
The dashboard integrates Prometheus as the primary data source, querying a wide range of metrics such as pod readiness, CPU and memory usage, request counts, WebSocket connection counts, volume storage utilization, and container restarts. It offers operators and developers an at-a-glance view of system status, enabling proactive monitoring and troubleshooting.
Purpose and Functionality
Multi-Coinstack Monitoring: Provides granular metrics per blockchain coinstack, including APIs, StatefulSets, daemons, and indexers.
Health Indicators: Uses gauges to show pod readiness percentages, highlighting service health (Healthy, Degraded, Down).
Resource Utilization: Displays timeseries graphs for CPU and memory usage relative to resource limits.
Operational Metrics: Tracks request counts, average request durations, WebSocket connections, and pod restart counts.
Storage Monitoring: Visualizes volume space usage for persistent storage associated with daemons and indexers.
Annotations: Supports Grafana built-in annotations and alerts integration for event correlation.
Dynamic Panels: Uses repeat features and templating (e.g., `$coinstack`) to dynamically adapt panels to different coinstacks.
Structure and Key Components
The dashboard JSON is structured as follows:
Top-Level Fields
annotations: Contains built-in annotations for alerts.
editable: Allows dashboard editing.
fiscalYearStartMonth: Configures fiscal year start (0 = January).
graphTooltip: Tooltip behavior setting.
links: External links (empty here).
liveNow: Live streaming status.
panels: Array of panel objects defining dashboard content.
Panels
Panels are grouped mainly into **rows**, each corresponding to a blockchain coinstack or overview section:
Row Panels: These are containers grouping related panels visually (e.g., "Overview," "Arbitrum One," "Avalanche," "Bitcoin," etc.).
Gauge Panels: Show health metrics like deployment or StatefulSet pod readiness percentages.
Timeseries Panels: Display resource metrics such as CPU usage, memory usage, request counts, request durations, and restarts.
Stat Panels: Show summary statistics like WebSocket connection counts.
Field Configurations: Define visual styling, threshold colors, units (percent, none, seconds), and mappings for different metric states.
Targets: Prometheus queries that fetch the data visualized by the panels.
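To make the panel structure concrete, the following is a minimal sketch of how the panel types in a dashboard file could be tallied. The `sample` document is a hypothetical, heavily trimmed stand-in for `overview.json` (the panel titles are assumptions), and `count_panel_types` is an illustrative helper, not part of Grafana.

```python
import json
from collections import Counter


def count_panel_types(dashboard: dict) -> Counter:
    """Count panels by their "type" field, descending into nested panels."""
    counts = Counter()
    stack = list(dashboard.get("panels", []))
    while stack:
        panel = stack.pop()
        counts[panel.get("type", "unknown")] += 1
        # Row panels may carry their children in a nested "panels" array.
        stack.extend(panel.get("panels", []))
    return counts


# Hypothetical, trimmed stand-in for overview.json (titles are assumptions).
sample = json.loads("""
{
  "editable": true,
  "panels": [
    {"id": 19, "type": "row", "title": "Overview", "panels": []},
    {"id": 6, "type": "gauge", "title": "Health"},
    {"id": 157, "type": "timeseries", "title": "CPU Usage"},
    {"id": 158, "type": "timeseries", "title": "Memory Usage"}
  ]
}
""")

print(count_panel_types(sample))
```

Running the same helper against the real file (loaded with `json.load`) would give a quick inventory of rows, gauges, timeseries, and stat panels.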
Detailed Explanation of Important Panels and Their Configuration
1. Overview Row (id: 19)
Type: Row
Purpose: Serves as a header or section divider titled "Overview".
Contains: Empty nested panels array, mainly for layout.
2. Health Gauge Panels
Example: Panel with `id: 6` (Gauge for coinstack API/STS readiness)
Datasource: Prometheus
Gauge Metric Expression:
For deployments:
(kube_deployment_status_replicas_ready{namespace="$namespace", deployment=~"$coinstack-api-.*"} / kube_deployment_status_replicas{namespace="$namespace", deployment=~"$coinstack-api-.*"}) * 100
For StatefulSets:
(kube_statefulset_status_replicas_ready{namespace="$namespace", statefulset=~"$coinstack-sts"} / kube_statefulset_replicas{namespace="$namespace", statefulset=~"$coinstack-sts"}) * 100
Purpose: Show the percentage of ready replicas for API and StatefulSet components.
Color Mappings:
0% = Red ("Down")
1-99% = Orange ("Degraded")
100% = Green ("Healthy")
Repeat Feature: Repeats the panel horizontally for each coinstack.
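The readiness calculation and its color mappings can be sketched as plain arithmetic. This is an illustration only: the function names are hypothetical, and treating any value strictly between 0 and 100 as "Degraded" is an assumption about how the 1-99% band is applied.

```python
import math
from typing import Optional


def readiness_percent(ready: float, desired: float) -> float:
    """Ready replicas as a percentage of desired replicas."""
    if desired == 0:
        # Division by zero yields NaN in PromQL; mirror that here.
        return float("nan")
    return ready / desired * 100


def health_label(percent: Optional[float]) -> str:
    """Map a readiness percentage to the dashboard's status text."""
    if percent is None or math.isnan(percent):
        return "Down"      # missing or invalid data is shown as Down
    if percent >= 100:
        return "Healthy"   # green
    if percent > 0:
        return "Degraded"  # orange (assumed to cover everything between 0 and 100)
    return "Down"          # red


print(health_label(readiness_percent(2, 3)))
```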
3. CPU Usage Timeseries Panel
Example: Panel `id: 157`
Expression:
avg(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod=~".*$coinstack-.*"}[5m])) by (pod, container) / avg(kube_pod_container_resource_limits{namespace="$namespace",pod=~".*$coinstack-.*",resource="cpu"}) by (pod, container) * 100
Purpose: Displays CPU usage as a percentage relative to pod container resource limits.
Legend: Shows min, mean, max CPU usage per pod/container.
Unit: Percent
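As a rough illustration of what the CPU expression computes, the sketch below derives a per-second rate from two samples of a CPU-seconds counter and expresses it as a percentage of a CPU limit. It is a simplification: Prometheus `rate()` also handles counter resets and extrapolates to the window boundaries, which is omitted here, and the sample values are invented.

```python
def cpu_percent_of_limit(first: float, last: float,
                         window_seconds: float, limit_cores: float) -> float:
    """CPU usage over a window as a percentage of the container's CPU limit."""
    # CPU-seconds consumed per wall-clock second equals cores in use.
    cores_used = (last - first) / window_seconds
    return cores_used / limit_cores * 100


# Invented samples: the counter grows by 150 CPU-seconds over a 5 minute window.
print(cpu_percent_of_limit(first=1200.0, last=1350.0,
                           window_seconds=300, limit_cores=2))
```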
4. Memory Usage Timeseries Panel
Example: Panel `id: 158`
Expression:
avg(container_memory_working_set_bytes{namespace="$namespace",pod=~".*$coinstack-.*"}) by (pod, container) / sum(kube_pod_container_resource_limits{namespace="$namespace",pod=~".*$coinstack-.*",resource="memory"}) by (pod, container) * 100
Purpose: Displays memory usage as a percentage of allocated limits.
Legend: Shows min, mean, max memory usage per pod/container.
Unit: Percent
5. Restarts Timeseries Panel
Example: Panel `id: 171`
Expression:
sum(increase(kube_pod_container_status_restarts_total{namespace="$namespace", pod=~".*$coinstack-.*"}[$__rate_interval])) by (pod, container)
Purpose: Shows the count of container restarts to track stability issues.
Legend: Shows last restart count.
Unit: None
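The idea behind `increase()` on a restart counter can be sketched as summing the deltas between consecutive samples while treating any drop as a counter reset. This is a simplified model with invented sample values; the real Prometheus function additionally extrapolates to the edges of the range window.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Total growth of a counter across samples, tolerating resets."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # The counter dropped, so it restarted from zero; count from zero.
            total += cur
    return total


# Invented restart-counter samples within one window.
print(counter_increase([3, 3, 4, 5]))
```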
6. WebSocket Connection Count Stat Panel
Example: Panel `id: 311` (in "Arbitrum One" row)
Expression:
sum(unchained_ws_client_count{coinstack="arbitrum", namespace="$namespace"})
Purpose: Displays the number of active WebSocket connections per coinstack.
Unit: None
7. Request Count Timeseries Panel
Shows the count of HTTP requests by endpoint, segmented by status codes and routes.
Uses the Prometheus `increase()` function on `unchained_http_request_count` metrics with label replacements to group by API endpoints.
8. Average Request Duration Timeseries Panel
Shows average latency of HTTP API requests by endpoint.
Calculated by dividing the rate of request duration sums by counts.
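The averaging step can be sketched as follows. Because both rates are taken over the same window, the window length cancels and the result is simply the growth of the duration sum divided by the growth of the request count. The function name and sample values are hypothetical.

```python
from typing import Optional


def average_duration(sum_start: float, sum_end: float,
                     count_start: float, count_end: float) -> Optional[float]:
    """Average seconds per request between two scrapes of sum and count."""
    requests = count_end - count_start
    if requests == 0:
        # No requests in the window: no meaningful average (NaN in PromQL).
        return None
    return (sum_end - sum_start) / requests


# Invented samples: 30 requests took 6 seconds in total during the window.
print(average_duration(sum_start=12.0, sum_end=18.0,
                       count_start=100, count_end=130))
```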
9. Volume Space Usage Gauge Panels
Shows percentage used of persistent storage volumes associated with daemons and indexers.
Uses the `kubelet_volume_stats_*` metrics exposed by the Kubernetes kubelet.
Implementation Details and Algorithms
Dynamic Panel Repetition: The dashboard uses Grafana's `repeat` feature on the coinstack variable to generate similar panels for each blockchain coinstack dynamically.
Prometheus Query Expressions: Queries use Prometheus aggregation operators (`sum`, `avg`) and range functions (`rate`, `increase`) combined with label selectors and regex matching to filter metrics per coinstack and deployment type.
Thresholds and Mappings: Panels define color thresholds and text mappings to visually indicate status states like Healthy, Degraded, and Down.
Use of Label Replace: Advanced regex label replacements in Prometheus expressions group metrics by endpoint patterns, enabling detailed request analysis.
Null and NaN Handling: Special value mappings ensure missing or invalid data is treated as "Down" or zero, preventing misleading visualization.
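The label-replacement technique can be approximated in Python to show how raw routes collapse into endpoint groups. This sketch mimics the semantics of PromQL `label_replace` (fully anchored regex, `$1` capture references) using Python's `re` module, whose syntax differs slightly from the RE2 engine Prometheus uses. The `route` and `endpoint` label names, the route value, and the regex are assumptions for illustration, not taken from the dashboard.

```python
import re
from typing import Dict


def label_replace(labels: Dict[str, str], dst: str, replacement: str,
                  src: str, regex: str) -> Dict[str, str]:
    """Return a copy of labels with dst set when regex fully matches labels[src]."""
    match = re.fullmatch(regex, labels.get(src, ""))
    if match is None:
        # Like PromQL, leave the series untouched when the regex does not match.
        return dict(labels)
    out = dict(labels)
    # PromQL writes capture groups as $1; Python's expand() expects \1.
    template = re.sub(r"\$(\d+)", r"\\\1", replacement)
    out[dst] = match.expand(template)
    return out


# Hypothetical series labels and grouping regex.
series = {"route": "/api/v1/account/0xabc/txs", "statusCode": "200"}
grouped = label_replace(series, "endpoint", "$1", "route", r"(/api/v1/[a-z]+)/.*")
print(grouped["endpoint"])
```

Grouping by the derived label rather than the raw route keeps per-address or per-transaction paths from producing one series each.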
Interaction with Other System Components
Prometheus: The dashboard queries metrics from a Prometheus server configured to scrape Kubernetes and application metrics.
Kubernetes Cluster: Metrics relate to Kubernetes resources such as Deployments, StatefulSets, Pods, PersistentVolumeClaims, and containers.
Blockchain Coinstacks: Each coinstack (like Arbitrum, Avalanche, Bitcoin, etc.) runs multiple services exposing Prometheus metrics. The dashboard aggregates these metrics per coinstack.
Grafana: This JSON file is imported into Grafana, which renders the dashboard UI for users.
Alerting System: The dashboard supports annotations and alerts from Grafana's alerting framework integrated with Prometheus Alertmanager.
Infrastructure as Code: The dashboard is deployed via Pulumi scripts that provision Prometheus, Grafana, and configure data sources and dashboards.
Usage Example
To use this dashboard:
Deploy Prometheus and Grafana with the appropriate configurations (including this dashboard file).
Set template variables in Grafana like `$namespace` (Kubernetes namespace) and `$coinstack` (blockchain coinstack name).
The dashboard will render health and performance metrics for all configured coinstacks dynamically.
Operators can drill down into detailed resource usage, request metrics, and storage utilization.
Alerts and annotations provide context for observed anomalies.
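To see what setting the template variables does to a query, the sketch below substitutes `$namespace` and `$coinstack` into one of the dashboard's expressions. It is a simplified model of Grafana's interpolation: the `${var}` and `[[var]]` forms and multi-value formatting are not handled, and the values `unchained` and `bitcoin` are assumed examples.

```python
import re
from typing import Dict


def interpolate(expr: str, variables: Dict[str, str]) -> str:
    """Replace $name references with values; unknown names are left as-is."""
    def substitute(match: re.Match) -> str:
        return variables.get(match.group(1), match.group(0))

    # \w+ stops at "-", so "$coinstack-sts" resolves the variable "coinstack".
    return re.sub(r"\$(\w+)", substitute, expr)


expr = ('kube_statefulset_status_replicas_ready'
        '{namespace="$namespace", statefulset=~"$coinstack-sts"}')
# Assumed example values for the two template variables.
print(interpolate(expr, {"namespace": "unchained", "coinstack": "bitcoin"}))
```

Built-in variables such as `$__rate_interval` are not in the mapping, so this sketch leaves them untouched for Grafana to resolve.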
Mermaid Diagram: Structure and Relationships of Main Functions in overview.json
Since `overview.json` is a configuration file defining dashboard panels and their Prometheus query targets (not code classes or functions), a **flowchart** representing the main panel categories and their relationships is most appropriate.
flowchart TD
A[overview.json Dashboard]
A --> B[Overview Row]
A --> C[Arbitrum One Row]
A --> D[Avalanche Row]
A --> E[Bitcoin Row]
A --> F[Bitcoin Cash Row]
A --> G[BNB Smart Chain Row]
A --> H[Dogecoin Row]
A --> I[Ethereum Row]
A --> J[Gnosis Row]
A --> K[Cosmos Row]
A --> L[Litecoin Row]
A --> M[Optimism Row]
subgraph Panel Types
X1[Health Gauges]
X2[CPU Usage Timeseries]
X3[Memory Usage Timeseries]
X4[Request Count Timeseries]
X5[Average Request Duration Timeseries]
X6[WebSocket Connection Stat]
X7[Volume Space Usage Gauges]
X8[Restarts Timeseries]
end
B --> X1
B --> X2
B --> X3
B --> X4
B --> X5
B --> X6
B --> X7
B --> X8
C --> X1
C --> X2
C --> X3
C --> X4
C --> X5
C --> X6
C --> X7
C --> X8
D --> X1
D --> X2
D --> X3
D --> X4
D --> X5
D --> X6
D --> X7
D --> X8
%% Similarly for other coin stacks...
Summary
`overview.json` is a comprehensive Grafana dashboard JSON configuration for monitoring multiple blockchain coinstacks running in Kubernetes.
It provides detailed health, resource utilization, and operational metrics visualized through gauges, timeseries, and stat panels.
Uses Prometheus queries extensively to extract and aggregate metrics per coinstack.
Supports dynamic rendering via repeated panels and template variables.
Integrated with alerting and annotations for operational context.
Key for providing centralized observability across a complex multi-blockchain infrastructure.
This file is a critical part of the monitoring subsystem in the ShapeShift Unchained platform, enabling effective tracking and analysis of system performance and health.