readiness.sh
Overview
The `readiness.sh` script is a **readiness probe** designed to monitor the synchronization and connectivity status of a Tendermint-based blockchain node daemon. This script is intended to be executed periodically by Kubernetes to determine if the daemon is fully synchronized, connected to peers, and ready to serve requests.
By layering three checks (local node sync status, peer connectivity, and cross-validation against external reference nodes), the script ensures that traffic is routed only to nodes that are fully synced and connected, never to lagging or isolated ones.
Detailed Explanation
Script Purpose
Checks if the blockchain node is currently syncing or caught up.
Ensures the node has peers connected.
Validates the node's latest block height against reference nodes, allowing a configurable tolerance.
Supports disabling readiness checks via a special file for operational flexibility.
Exits with status code `0` if ready; otherwise exits with `1`.
Key Functional Sections
1. Disable Readiness Probe Check
DISABLE_READINESS_PROBE=/root/disable_readiness
if [[ -f "$DISABLE_READINESS_PROBE" ]]; then
echo "readiness probe disabled"
exit 0
fi
Purpose: Allows operators to temporarily disable readiness checks by placing a file at `/root/disable_readiness`.
Behavior: If the file exists, the script immediately exits with success (`0`), so Kubernetes treats the pod as ready without running any of the checks below.
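As a rough illustration of the override, the sketch below wraps the same file test in a helper and toggles it against a temporary file. The helper name `probe_disabled` and the temporary path are assumptions made for the example; the script itself checks `/root/disable_readiness` directly.

```shell
# Hypothetical helper (not part of readiness.sh): same -f test as the probe,
# with the flag path passed in so it can be tried outside the container.
probe_disabled() {
  [ -f "$1" ]
}

# Stand-in for /root/disable_readiness so the example is safe to run anywhere.
flag_file="$(mktemp -d)/disable_readiness"

touch "$flag_file"            # operator disables the probe
if probe_disabled "$flag_file"; then echo "readiness probe disabled"; fi

rm -f "$flag_file"            # operator re-enables the probe
if ! probe_disabled "$flag_file"; then echo "readiness checks active"; fi
```

On a live pod the equivalent would be creating or removing `/root/disable_readiness` inside the container, for example through `kubectl exec`.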
2. Environment Setup
source /tendermint.sh
BLOCK_HEIGHT_TOLERANCE=5
Sources the `/tendermint.sh` script, which provides utility functions, most notably `get_best_reference_block_height_eval` and `reference_validation`.
Defines a block height tolerance of 5 blocks, allowing minor lag behind reference nodes.
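To make the tolerance concrete, here is a minimal sketch of the comparison it implies. The function name `height_within_tolerance` is hypothetical, and treating a node that is ahead of the reference as acceptable is an assumption; the real check lives in `reference_validation` in `/tendermint.sh`, which is not reproduced in this document.

```shell
BLOCK_HEIGHT_TOLERANCE=5

# Hypothetical stand-in for the comparison done by reference_validation.
# Succeeds when the local node lags the reference by at most $3 blocks.
height_within_tolerance() {
  local_height="$1"
  reference_height="$2"
  tolerance="$3"
  lag=$((reference_height - local_height))
  [ "$lag" -le "$tolerance" ]
}

# 5 blocks behind: inside the tolerance
if height_within_tolerance 1000 1005 "$BLOCK_HEIGHT_TOLERANCE"; then
  echo "within tolerance"
fi

# 6 blocks behind: outside the tolerance
if ! height_within_tolerance 1000 1006 "$BLOCK_HEIGHT_TOLERANCE"; then
  echo "too far behind"
fi
```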
3. Local Node Status Queries
SYNCING=$(curl -sf http://localhost:1317/cosmos/base/tendermint/v1beta1/syncing) || exit 1
NET_INFO=$(curl -sf http://localhost:27147/net_info) || exit 1
STATUS=$(curl -sf http://localhost:27147/status) || exit 1
Queries the local node's REST and RPC endpoints:
`/syncing` (REST, port 1317) indicates whether the node is currently syncing.
`/net_info` (RPC, port 27147) provides network information, including the peer count.
`/status` (RPC, port 27147) returns the node status with detailed sync info.
If any query fails, the script exits immediately with code `1`, indicating probe failure.
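The `VAR=$(command) || exit 1` pattern works because an assignment from a command substitution takes the exit status of the substituted command, and `curl -f` returns a non-zero status on HTTP error responses. The sketch below shows the same idea with a hypothetical wrapper, `query_or_fail`, and uses `echo` and `false` as stand-ins for successful and failing `curl` calls so that it runs without a node.

```shell
# Hypothetical wrapper: run the given command, print its output on success,
# and propagate a failure status otherwise.
query_or_fail() {
  output=$("$@") || return 1
  printf '%s\n' "$output"
}

# Stand-in for a successful query such as:
#   curl -sf http://localhost:27147/status
STATUS=$(query_or_fail echo '{"result":{}}') || echo "probe would exit 1"
echo "STATUS=$STATUS"

# Stand-in for a failed query (connection refused, timeout, HTTP error with -f)
NET_INFO=$(query_or_fail false) || echo "probe would exit 1"
```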
4. Parsing Status Values
IS_SYNCING=$(echo $SYNCING | jq -r '.syncing')
CATCHING_UP=$(echo $STATUS | jq -r '.result.sync_info.catching_up')
NUM_PEERS=$(echo $NET_INFO | jq -r '.result.n_peers')
Extracts:
`IS_SYNCING`: Boolean from the REST `syncing` field, indicating whether the node is syncing blocks.
`CATCHING_UP`: Boolean from the RPC status, indicating whether the node is catching up.
`NUM_PEERS`: Number of connected peers.
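The extraction step can be tried offline against canned responses. The JSON below is an illustrative, trimmed-down sample that contains only the fields the script reads; real responses carry many more fields, and the values are made up. Unlike the script, the sketch quotes its variables and uses `printf`, which avoids word splitting of the JSON text.

```shell
# Illustrative sample responses (values are invented for the example).
SYNCING='{"syncing": false}'
STATUS='{"result": {"sync_info": {"latest_block_height": "1000", "catching_up": false}}}'
NET_INFO='{"result": {"n_peers": "3"}}'

# jq must be available in the container image for the probe to work.
if command -v jq >/dev/null 2>&1; then
  IS_SYNCING=$(printf '%s' "$SYNCING" | jq -r '.syncing')
  CATCHING_UP=$(printf '%s' "$STATUS" | jq -r '.result.sync_info.catching_up')
  NUM_PEERS=$(printf '%s' "$NET_INFO" | jq -r '.result.n_peers')
  echo "syncing=$IS_SYNCING catching_up=$CATCHING_UP peers=$NUM_PEERS"
else
  echo "jq not found" >&2
fi
```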
5. External Reference Nodes Setup
status_curls=(
"curl -sf -m 3 https://rpc.ninerealms.com/status"
"curl -sf -m 3 -H \"Referer: https://app.thorswap.finance\" https://rpc.thorswap.net/status"
)
Defines an array of curl commands to query trusted external RPC nodes.
These nodes act as reference points to validate the local node's reported block height.
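Because each entry is a complete command line stored as a string, it has to be run through `eval` (which is presumably why the helper's name ends in `_eval`). The sketch below shows that mechanism with a hypothetical `run_reference_commands` function and harmless stand-in commands instead of real `curl` calls. Skipping entries that fail is an assumption; how `/tendermint.sh` treats an unreachable reference node is not shown in this document.

```shell
# Hypothetical helper: evaluate each command string and print the output of
# those that succeed, one per line. Failing commands are skipped (assumption).
run_reference_commands() {
  for cmd in "$@"; do
    if out=$(eval "$cmd" 2>/dev/null); then
      printf '%s\n' "$out"
    fi
  done
}

# Stand-ins for entries such as "curl -sf -m 3 https://rpc.ninerealms.com/status".
# The embedded quotes survive because eval re-parses the string as shell input.
run_reference_commands \
  "echo \"reference one\"" \
  "false" \
  "echo 'reference two'"
```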
6. Readiness Decision Logic
if [[ $IS_SYNCING == false && $CATCHING_UP == false ]]; then
  if (( $NUM_PEERS > 0 )); then
    latest_block_height=$(echo $STATUS | jq -r '.result.sync_info.latest_block_height')
    best_reference_block_height=$(get_best_reference_block_height_eval "${status_curls[@]}")
    reference_validation $latest_block_height $best_reference_block_height $BLOCK_HEIGHT_TOLERANCE
    echo "daemon is synced with $NUM_PEERS peers"
    exit 0
  fi
  echo "daemon is synced, but has no peers"
  exit 1
fi
echo "daemon is still syncing"
exit 1
If the node is neither syncing nor catching up:
Checks that it has at least 1 peer connected.
Retrieves the node's latest block height.
Fetches the best block height from reference nodes using `get_best_reference_block_height_eval`.
Calls `reference_validation` to verify the local block height is within the allowed tolerance.
If all checks pass, outputs a success message and exits with `0`.
If no peers are connected: outputs a warning and exits with `1`.
If syncing or catching up: outputs the syncing status and exits with `1`.
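The branching above can be condensed into a small pure function, which makes the three outcomes easy to exercise without a running node. This is a sketch only: `readiness_verdict` is a hypothetical name, it returns a status instead of calling `exit`, and it leaves out the reference-height validation step.

```shell
# Hypothetical condensation of the decision logic.
# Arguments: is_syncing, catching_up, num_peers (as extracted by jq).
readiness_verdict() {
  is_syncing="$1"
  catching_up="$2"
  num_peers="$3"

  if [ "$is_syncing" != "false" ] || [ "$catching_up" != "false" ]; then
    echo "daemon is still syncing"
    return 1
  fi
  if [ "$num_peers" -le 0 ]; then
    echo "daemon is synced, but has no peers"
    return 1
  fi
  echo "daemon is synced with $num_peers peers"
  return 0
}

# One call per outcome; "|| true" keeps the demo going after a failing verdict.
readiness_verdict false false 3 || true
readiness_verdict false true 3 || true
readiness_verdict false false 0 || true
```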
Functions (from sourced /tendermint.sh)
The script sources `/tendermint.sh` which provides the following key functions:
`get_best_reference_block_height_eval`
Executes multiple curl commands passed as arguments, extracts the block height from each reference node, and returns the highest block height observed.
`reference_validation`
Compares the local node's block height against the best reference block height, allowing for a block height tolerance. If the local height lags the reference by more than the tolerance, the function causes the script to exit with failure.
These functions encapsulate the logic for external reference validation, critical for avoiding false readiness signals due to local node misreporting or network delays.
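For illustration, selecting the "best" reference height amounts to taking a maximum over whatever heights were collected. The sketch below uses a hypothetical `max_block_height` function that ignores empty or non-numeric values (an assumption about how unreachable references might be handled); it is not the code from `/tendermint.sh`.

```shell
# Hypothetical helper: print the largest numeric argument, or 0 if none.
max_block_height() {
  best=0
  for height in "$@"; do
    case "$height" in
      ''|*[!0-9]*) continue ;;   # skip empty or non-numeric entries
    esac
    if [ "$height" -gt "$best" ]; then
      best="$height"
    fi
  done
  echo "$best"
}

# Three usable heights plus two unusable entries; prints 1003
max_block_height 1000 "" 1003 null 1001
```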
Usage Example
The script is designed to be used as a Kubernetes readiness probe command:
readinessProbe:
  exec:
    command:
      - /bin/bash
      - /path/to/readiness.sh
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3
On invocation, the script performs all checks and exits with appropriate status codes to inform Kubernetes whether the node is ready.
Important Implementation Details
Uses curl with the silent and fail flags (`-sf`) to perform HTTP queries.
Uses jq for reliable JSON parsing.
Uses exit codes (`0` success, `1` failure) to integrate smoothly with Kubernetes probe semantics.
Implements a fail-fast strategy where any inability to contact the node or external references results in immediate probe failure.
Includes a disable file mechanism for manual override.
Employs block height tolerance to accommodate minor synchronization lags.
Cross-validates node status with external reference nodes to enhance robustness.
Interaction with Other System Components
Kubernetes: The script runs inside the blockchain node container as a readiness probe, controlling pod availability in the cluster.
Tendermint Node Daemon: Directly queries Tendermint RPC and REST endpoints running locally.
Reference RPC Nodes: Queries external RPC endpoints of trusted nodes for block height validation.
Utility Script (`/tendermint.sh`): Relies on sourced utility functions for external block height fetching and validation.
Operators: Can disable readiness checks temporarily by placing a file at `/root/disable_readiness`.
Mermaid Flowchart: Readiness Probe Workflow
flowchart TD
A[Kubernetes Probe Triggered] --> B{Disable File Present?}
B -- Yes --> C[Exit 0: Probe Disabled]
B -- No --> D[Query Local Syncing Status]
D --> E[Query Local Peer Info]
E --> F[Query Local Node Status]
F --> G{Is Node Synced?}
G -- No --> H[Exit 1: Node Syncing]
G -- Yes --> I{Peers > 0?}
I -- No --> J[Exit 1: No Peers]
I -- Yes --> K[Get Latest Local Block Height]
K --> L[Query External Reference Nodes]
L --> M[Get Best Reference Block Height]
M --> N[Validate Local Height vs Reference]
N -- Fail --> O[Exit 1: Validation Failed]
N -- Pass --> P[Exit 0: Node Ready]
Summary
`readiness.sh` is a robust readiness probe script customized for Tendermint blockchain nodes. It combines local sync checks, peer connectivity validation, and cross-node block height verification using external references to confidently determine node readiness. This script is critical in Kubernetes environments to prevent routing traffic to out-of-sync or isolated nodes, thereby maintaining service stability and integrity.
Appendix: Key Variables and Constants
| Name | Description |
|---|---|
| `DISABLE_READINESS_PROBE` | File path to disable readiness checks when present |
| `BLOCK_HEIGHT_TOLERANCE` | Maximum allowed block height difference between local and reference nodes (5) |
| `SYNCING` | JSON response indicating if node is syncing |
| `NET_INFO` | JSON response containing network and peer info |
| `STATUS` | General node status containing sync info |
| `IS_SYNCING` | Boolean extracted from `SYNCING` |
| `CATCHING_UP` | Boolean extracted from `STATUS` indicating catch-up state |
| `NUM_PEERS` | Number of connected peers |
| `status_curls` | Array of curl commands for querying external reference nodes |
If you require further details on the sourced utility functions in `/tendermint.sh` or related probe scripts for other chains, please refer to their respective documentation.