liveness.sh
Overview
`liveness.sh` is a Bash script designed to act as a **liveness probe** for a daemon process that interacts with an Ethereum node via JSON-RPC. Its primary function is to determine if the daemon is actively processing new Ethereum blocks or if it has stalled.
It achieves this by querying the Ethereum node's current block number and comparing it with a previously recorded block number stored on disk. If the block number has increased since the last check, the daemon is considered "running." If not, the daemon is deemed "stalled." Additionally, the script supports disabling the liveness check through a flag file.
This script is typically used in containerized environments (e.g., Kubernetes) or monitoring systems that rely on liveness probes to restart or alert on unresponsive services.
Detailed Breakdown
Variables and Files
- `DISABLE_LIVENESS_PROBE`: Path to a flag file (`/data/disable_liveness`) that, if present, disables the liveness probe check.
- `FILE`: Path to a file (`/data/.block_number`) that stores the last observed Ethereum block number.
Script Execution Flow
Check for Disabled Probe

    if [[ -f "$DISABLE_LIVENESS_PROBE" ]]; then
      echo "liveness probe disabled"
      exit 0
    fi

- If the file `/data/disable_liveness` exists, the script reports that the liveness probe is disabled and exits with status 0 (success).
- This allows operators to disable the liveness check manually without changing the script or the container configuration.
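
The flag-file check can be tried in isolation. The following is a minimal sketch, not part of `liveness.sh`: `probe_disabled` is a hypothetical helper, and a temporary directory stands in for `/data`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the flag file exists.
probe_disabled() {
  [ -f "$1" ]
}

# Assumption for this demo: a temporary directory stands in for /data.
demo_dir=$(mktemp -d)
flag="$demo_dir/disable_liveness"

touch "$flag"                       # operator disables the probe
probe_disabled "$flag" && echo "liveness probe disabled"

rm -f "$flag"                       # operator re-enables the probe
probe_disabled "$flag" || echo "liveness probe enabled"
```

Removing the flag file is enough to re-enable the check; no other state is involved.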
Fetch Current Ethereum Block Number

    ETH_BLOCK_NUMBER=$(curl -sf -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' -H 'Content-Type: application/json' http://localhost:8545) || exit 1

- Sends a JSON-RPC request to the local Ethereum node at `http://localhost:8545` using `curl`.
- The request method is `eth_blockNumber`, which returns the latest block number as a hexadecimal string.
- The `-s` flag suppresses the progress meter and error messages; `-f` makes curl exit with a non-zero status, without printing the response body, when the server returns an HTTP error (status 400 or above).
- If the request fails, the script exits with status 1 (failure) without printing a message.
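
A probe command should return quickly even when the node hangs. The sketch below wraps the same request in a function and adds curl's `--max-time` option; the function name `fetch_block_number` and the 5-second limit are assumptions for illustration, not something `liveness.sh` is known to use.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the request shown above.
# $1: JSON-RPC endpoint (defaults to the URL used by liveness.sh).
fetch_block_number() {
  local url="${1:-http://localhost:8545}"
  # --max-time bounds the whole transfer so a hung node cannot block the probe.
  curl -sf --max-time 5 \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "$url"
}

# Usage: print the raw JSON response, or report the failure on stderr.
if response=$(fetch_block_number); then
  echo "$response"
else
  echo "request failed" >&2
fi
```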
Parse the Block Number

    CURRENT_BLOCK_NUMBER_HEX=$(echo $ETH_BLOCK_NUMBER | jq -r '.result')
    CURRENT_BLOCK_NUMBER=$(($CURRENT_BLOCK_NUMBER_HEX))

- Uses `jq` to parse the JSON response and extract the `.result` field, which contains the block number as a hex string (e.g., "0x10d4f").
- Converts the hex string to a decimal integer using Bash arithmetic expansion, which interprets the `0x` prefix as base 16.
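
The conversion can be sketched with input validation. Bash arithmetic treats a bare word such as `null` (what `jq -r` prints when `.result` is missing) as a variable name, which evaluates to 0 when unset, so a malformed response would otherwise turn silently into block 0. The helper name `hex_to_dec` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a 0x-prefixed hex quantity to decimal.
# Returns non-zero for anything that is not a hex quantity (e.g. "null").
hex_to_dec() {
  local digits
  case "$1" in
    0x*) digits="${1#0x}" ;;
    *) return 1 ;;
  esac
  case "$digits" in
    ''|*[!0-9a-fA-F]*) return 1 ;;
  esac
  # Arithmetic expansion reads the 0x prefix as base 16.
  echo "$(($1))"
}

hex_to_dec "0x10d4f"                                   # prints 68943
hex_to_dec "null" || echo "invalid block number" >&2
```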
Initialize or Read Previous Block Number

    if [[ ! -f "$FILE" ]]; then
      echo $CURRENT_BLOCK_NUMBER > $FILE
      exit 1
    fi
    PREVIOUS_BLOCK_NUMBER=$(cat $FILE)
    echo $CURRENT_BLOCK_NUMBER > $FILE

- If the file `/data/.block_number` does not exist, the script creates it by writing the current block number and exits with status 1. The first probe therefore always fails, because there is no earlier value to compare against.
- Otherwise, it reads the previous block number from the file.
- It then updates the file with the current block number for the next probe.
Compare Block Numbers to Determine Liveness

    if (( $CURRENT_BLOCK_NUMBER > $PREVIOUS_BLOCK_NUMBER )); then
      echo "daemon is running"
      exit 0
    fi
    echo "daemon is stalled"
    exit 1

- If the current block number is greater than the previous one, the daemon is considered active (`exit 0`).
- Otherwise, it is considered stalled, and the script exits with failure (`exit 1`).
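
The state-file and comparison steps can be exercised together without a node. The sketch below is illustrative only: `check_progress` is a hypothetical helper that mirrors the logic described above, a temporary directory stands in for `/data`, and the block numbers are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper reproducing the state-file logic for one probe run.
# $1: state file, $2: current block number (decimal).
check_progress() {
  local file="$1" current="$2" previous
  if [ ! -f "$file" ]; then
    echo "$current" > "$file"      # first run: nothing to compare against
    return 1
  fi
  previous=$(cat "$file")
  echo "$current" > "$file"        # remember the value for the next probe
  [ "$current" -gt "$previous" ]
}

state_dir=$(mktemp -d)             # stands in for /data in this demo
state="$state_dir/.block_number"
for block in 100 100 101 101; do
  if check_progress "$state" "$block"; then
    echo "block $block: daemon is running"
  else
    echo "block $block: first run or daemon is stalled"
  fi
done
rm -rf "$state_dir"
```

Only the third simulated probe passes: the first has no stored value, and the second and fourth see the same block number as the probe before them.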
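Usage Examples

Run the script manually to check daemon status:

    ./liveness.sh

Disable the liveness probe by creating the disable file:

    touch /data/disable_liveness
    ./liveness.sh
    # Output: liveness probe disabled

Typical use in Kubernetes as a liveness probe command:

    livenessProbe:
      exec:
        command:
          - /bin/bash
          - -c
          - /path/to/liveness.sh
      initialDelaySeconds: 30
      periodSeconds: 10

Because the probe passes only when the block number has advanced since the previous run, `periodSeconds` should be longer than the chain's typical block interval; otherwise two consecutive probes can observe the same block and report a stall.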
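
When running the probe by hand, the exit status matters as much as the printed message. The following sketch shows one way to surface it; `describe_probe` is a hypothetical wrapper, and `true` and `false` stand in for `./liveness.sh` so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a probe command and describe its exit status.
describe_probe() {
  "$@"
  local rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "probe passed (exit 0)"
  else
    echo "probe failed (exit $rc)"
  fi
  return "$rc"
}

# Stand-ins for ./liveness.sh (assumption for this demo).
describe_probe true                 # prints: probe passed (exit 0)
describe_probe false || true        # prints: probe failed (exit 1)
```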
Implementation Details and Considerations
- Hex to Decimal Conversion: The script converts the hex block number returned by Ethereum's JSON-RPC into decimal for numeric comparison.
- File-Based State Persistence: Block number tracking uses a simple file (`/data/.block_number`) to remember state between runs. This requires storage that persists between probe executions.
- Fail-Fast Behavior: If the Ethereum node is unreachable or returns an HTTP error, the script exits with failure immediately. Because `curl -f` only inspects the HTTP status, a JSON-RPC error returned with a successful HTTP status is not caught at this step.
- Disabling Mechanism: The presence of the `/data/disable_liveness` file allows operators to temporarily disable the probe without changing deployment specs.
- Exit Codes:
  - `0`: Liveness probe succeeded (daemon active or probe disabled).
  - `1`: Liveness probe failed (daemon stalled, first run, or unable to fetch the block number).
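
The state file is rewritten on every probe run. As a hedged variation on the file-based persistence described above (not what `liveness.sh` does), the sketch below writes the new value to a temporary file in the same directory and renames it into place, so a reader never sees a partially written value. The helper name `write_state` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the state file atomically.
# $1: state file, $2: block number to store.
write_state() {
  local file="$1" value="$2" tmp
  # Create the temp file next to the target so mv is a same-filesystem rename.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  echo "$value" > "$tmp" && mv -f "$tmp" "$file"
}

state_dir=$(mktemp -d)             # stands in for /data in this demo
write_state "$state_dir/.block_number" 68943
cat "$state_dir/.block_number"     # prints 68943
rm -rf "$state_dir"
```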
Interaction with Other System Components
- Ethereum Node: The script depends on a local Ethereum node JSON-RPC endpoint running on `localhost:8545`. This node must be reachable and responsive.
- Persistent Storage Volume: The `/data` directory is expected to be a persistent volume mounted inside the container, used for storing the block number and the disable flag.
- Monitoring / Orchestration Systems: This script is designed to be called repeatedly by Kubernetes or other container orchestration systems as a liveness probe. Based on the exit status, the orchestrator can restart the pod or alert operators.
- Dependencies:
  - `curl`: For HTTP requests.
  - `jq`: For JSON parsing.

  Both must be available in the container environment.
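
A quick way to confirm the dependencies inside a container image is to look them up on `PATH`. This is a minimal sketch; `missing_deps` is a hypothetical helper, not part of `liveness.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each command that is not on PATH.
missing_deps() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

missing=$(missing_deps curl jq)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "missing dependencies:" $missing >&2
else
  echo "all dependencies present"
fi
```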
Mermaid Flowchart Diagram
The following flowchart illustrates the main logic flow and decision points within `liveness.sh`:
flowchart TD
Start([Start])
CheckDisable{Does /data/disable_liveness exist?}
Disabled["Print 'liveness probe disabled'\nExit 0"]
FetchBlock["Fetch current block number\nfrom Ethereum node (curl)"]
FetchFail{"Did curl fail?"}
ParseBlock["Parse hex block number\nconvert to decimal"]
CheckFile{Does /data/.block_number exist?}
InitFile["Write current block number\nto /data/.block_number\nExit 1"]
ReadPrev["Read previous block number\nfrom /data/.block_number"]
WriteCurr["Write current block number\nto /data/.block_number"]
CompareBlocks{"Is current block number > previous?"}
Running["Print 'daemon is running'\nExit 0"]
Stalled["Print 'daemon is stalled'\nExit 1"]
Start --> CheckDisable
CheckDisable -- Yes --> Disabled
CheckDisable -- No --> FetchBlock
FetchBlock --> FetchFail
FetchFail -- Yes --> FetchExit["Exit 1 (no message)"]
FetchFail -- No --> ParseBlock
ParseBlock --> CheckFile
CheckFile -- No --> InitFile
CheckFile -- Yes --> ReadPrev
ReadPrev --> WriteCurr
WriteCurr --> CompareBlocks
CompareBlocks -- Yes --> Running
CompareBlocks -- No --> Stalled
Summary
| Aspect | Description |
|---|---|
| **Purpose** | Detect if Ethereum daemon is live or stalled |
| **Input** | Ethereum JSON-RPC block number via localhost |
| **State Persistence** | `/data/.block_number` file |
| **Disable Mechanism** | Presence of `/data/disable_liveness` file |
| **Exit Codes** | 0 = live or disabled, 1 = stalled or error |
| **Dependencies** | `curl`, `jq` |
| **Typical Usage** | Kubernetes liveness probe |
| **Storage Requirement** | Persistent volume mounted at `/data` |
This documentation should provide a clear understanding of `liveness.sh` for developers, operators, and system integrators who need to maintain or integrate Ethereum daemon liveness checks within their infrastructure.