liveness.sh


Overview

`liveness.sh` is a lightweight Bash script designed to serve as a **liveness probe** for an Ethereum node running inside a Kubernetes pod. Its primary function is to verify that the Ethereum node daemon is actively progressing by confirming that the node’s current block number increases over time. This ensures Kubernetes can detect if the node has stalled and take corrective action such as restarting the container.

The script interacts with the Ethereum JSON-RPC interface (`eth_blockNumber` method) exposed locally (typically on `http://localhost:8545`), compares the current block number to a previously recorded value stored on disk, and determines if the node is alive (block height advancing) or stalled (no block progress).

It also supports a manual override mechanism via a disable flag file that allows temporarily bypassing the liveness check, useful during maintenance or debugging.
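
As a rough illustration of the override, the sketch below toggles the flag file. The helper names and the `FLAG_FILE` variable are hypothetical and not part of `liveness.sh`; only the default path comes from this document.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for toggling the disable flag; not part of liveness.sh.
# FLAG_FILE defaults to the path the probe checks and can be overridden for local testing.
FLAG_FILE="${FLAG_FILE:-/data/disable_liveness}"

# Create the flag file: the probe then exits 0 without checking block progress.
disable_probe() { touch "$FLAG_FILE"; }

# Remove the flag file: the probe resumes its normal checks.
enable_probe() { rm -f "$FLAG_FILE"; }

# Succeeds (status 0) while the flag file exists.
probe_disabled() { [ -e "$FLAG_FILE" ]; }
```

From outside the pod, the same effect could be had with something like `kubectl exec <pod-name> -- touch /data/disable_liveness`, where `<pod-name>` is a placeholder for the actual pod.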


Detailed Explanation

Script Behavior Summary

  1. Disable Flag Check
    Checks if the file /data/disable_liveness exists. If yes, the script prints "liveness probe disabled" and exits successfully (exit 0), skipping further checks.

  2. Fetch Current Block Number
    Queries the Ethereum node JSON-RPC endpoint at http://localhost:8545 for the current block number using the eth_blockNumber RPC method.

  3. Block Number Persistence and Comparison

    • Reads the last recorded block number from /data/.block_number.

    • If the file does not exist, writes the current block number to it and exits with failure (exit 1). The first probe therefore always fails, which counts as one failure toward the probe's `failureThreshold`.

    • If the file exists, compares the current block number with the previous one and, as the flowchart below shows, overwrites the file with the current value.

      • If the current block number is greater, the node is considered alive (exit 0).

      • Otherwise, the node is considered stalled (exit 1).
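
The steps above can be sketched as a single function. This is a minimal illustration of the decision logic, not the original script: the function name `evaluate_liveness` and its arguments are assumptions, the block number is passed in rather than fetched, and the printed messages and the write-after-compare step follow the flowchart later in this document.

```shell
#!/usr/bin/env bash
# Sketch of the decision logic only; not the original liveness.sh.
# Usage: evaluate_liveness CURRENT_BLOCK STATE_FILE FLAG_FILE
# Returns 0 when the probe is disabled or the block height advanced, 1 otherwise.
evaluate_liveness() {
  local current="$1" state_file="$2" flag_file="$3" previous

  # Step 1: manual override via the flag file.
  if [ -e "$flag_file" ]; then
    echo "liveness probe disabled"
    return 0
  fi

  # Step 3, first run: nothing to compare against yet, so record and fail.
  if [ ! -f "$state_file" ]; then
    echo "$current" > "$state_file"
    return 1
  fi

  # Step 3, later runs: read the stored value, record the new one, then compare.
  previous="$(cat "$state_file")"
  echo "$current" > "$state_file"
  if [ "$current" -gt "$previous" ]; then
    echo "daemon is running"
    return 0
  fi
  echo "daemon is stalled"
  return 1
}
```

Keeping the comparison in a function that returns a status, instead of calling `exit` directly, makes the logic easy to exercise against a temporary directory before wiring it into a probe.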


File-level Variables

| Variable Name | Description |
| --- | --- |
| `DISABLE_LIVENESS_PROBE` | Path to the disable flag file `/data/disable_liveness`. If this file exists, liveness checking is bypassed. |
| `FILE` | Path to the file `/data/.block_number` storing the last observed Ethereum block number. |


Key Commands and Logic
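
The commands the script actually uses are not reproduced here. As a hedged sketch, the RPC call and result parsing might look like the following. `eth_blockNumber` returns the height as a hex-encoded string (for example `"0x4b7"`), so it has to be converted before a numeric comparison. The function names, the `RPC_URL` variable, and the choice of `curl` plus `sed` are assumptions; a tool such as `jq` would work equally well for the extraction.

```shell
#!/usr/bin/env bash
# Assumed endpoint, taken from the description above; adjust if the node listens elsewhere.
RPC_URL="${RPC_URL:-http://localhost:8545}"

# Send an eth_blockNumber JSON-RPC request and print the raw JSON response.
fetch_block_response() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "$RPC_URL"
}

# Extract the hex "result" field from a JSON-RPC response (first argument)
# and print it as a decimal number. Returns 1 if no hex result is present.
parse_block_number() {
  local hex
  hex="$(printf '%s' "$1" |
    sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F][0-9a-fA-F]*\)".*/\1/p')"
  [ -n "$hex" ] || return 1
  printf '%d\n' "$hex"   # Bash's printf accepts 0x-prefixed hex for %d
}
```

A probe could then combine the two with `current="$(parse_block_number "$(fetch_block_response)")"` and treat a non-zero status as a failed check.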


Usage Example

This script is typically referenced in the Kubernetes pod spec for the Ethereum node container as the liveness probe command:

livenessProbe:
  exec:
    command:
      - /bin/bash
      - /path/to/liveness.sh
  initialDelaySeconds: 30
  periodSeconds: 15
  failureThreshold: 3
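
To get a feel for these settings, the sketch below estimates how long the block height must stay flat before Kubernetes restarts the container. Kubernetes acts only after `failureThreshold` consecutive failed probes, so the window is roughly `periodSeconds` multiplied by `failureThreshold`. The function name `stall_window` is hypothetical, and the figure is an approximation that ignores probe timeouts and the exact moment the last block arrived.

```shell
#!/usr/bin/env bash
# Values copied from the probe spec above.
period_seconds=15
failure_threshold=3

# Approximate number of seconds without a new block before a restart:
# one probe period per consecutive failure.
stall_window() { echo $(( $1 * $2 )); }

stall_window "$period_seconds" "$failure_threshold"
```

If the chain's block interval is close to or longer than `periodSeconds`, an individual probe can see no new block even on a healthy node, so the threshold should leave room for that.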

Implementation Details and Algorithms


Interaction with Other Parts of the System


Mermaid Diagram: Flowchart of Script Logic

flowchart TD
    Start[Start Script]
    CheckDisable{"Disable flag file<br/>(/data/disable_liveness) exists?"}
    DisableExit["Print 'liveness probe disabled'<br/>Exit 0"]
    FetchBlock["Fetch current block number<br/>via eth_blockNumber RPC"]
    ReadFile{"Does /data/.block_number exist?"}
    ReadPrev[Read previous block number from file]
    CompareBlocks{"Is current > previous?"}
    WriteFirst["Write current block number to /data/.block_number"]
    WriteAlive["Write current block number to /data/.block_number"]
    WriteStalled["Write current block number to /data/.block_number"]
    ExitSuccess["Print 'daemon is running'<br/>Exit 0 (healthy)"]
    ExitFailFirstRun["Exit 1 (first run - not ready)"]
    ExitFailStalled["Print 'daemon is stalled'<br/>Exit 1 (stalled)"]

    Start --> CheckDisable
    CheckDisable -- Yes --> DisableExit
    CheckDisable -- No --> FetchBlock
    FetchBlock --> ReadFile
    ReadFile -- No --> WriteFirst --> ExitFailFirstRun
    ReadFile -- Yes --> ReadPrev
    ReadPrev --> CompareBlocks
    CompareBlocks -- Yes --> WriteAlive --> ExitSuccess
    CompareBlocks -- No --> WriteStalled --> ExitFailStalled

Summary

The `liveness.sh` script is a short liveness probe for Ethereum nodes running in Kubernetes. It tracks block number progression through the Ethereum JSON-RPC interface and fails when the block height stops advancing, so Kubernetes can detect a stalled node and restart the container automatically. The check shows only that the node is making progress; a node that keeps advancing while still far behind the chain head also passes.

