liveness.sh
Overview
`liveness.sh` is a Bash script designed to serve as a liveness probe for a daemon process, typically running in a containerized environment such as Kubernetes. The script checks whether the daemon is actively progressing by querying its latest block height from a local HTTP status endpoint. If the block height is increasing over time, the daemon is considered alive and healthy. If the block height stalls or the daemon cannot be reached, the script signals failure, prompting container orchestration systems to take remedial action (e.g., restart the container).
Additionally, the probe can be disabled at runtime: if the file `/root/disable_liveness` exists, the script exits successfully without checking the daemon.
Detailed Explanation
Script Workflow
Check for Liveness Probe Disabling File
If the file `/root/disable_liveness` exists, the script immediately exits successfully, effectively disabling the liveness check.
Fetch Current Block Height
The script sends an HTTP GET request to `http://localhost:27147/status` to retrieve the current status of the daemon. It parses the JSON response to extract `latest_block_height` using the `jq` tool.
Initialize or Update Block Height Record
The script maintains a file, `/root/.latest_block_height`, that stores the previously observed block height.
If the file does not exist, this indicates the first run; the script writes the current block height to the file and exits with a failure code (1), since no comparison can be made yet.
If the file exists, it reads the previous block height and updates the file with the current block height.
Compare Block Heights
The script compares the current and previous block heights:
If the current block height is greater than the previous, the daemon is considered alive and the script exits successfully (0).
Otherwise, the daemon is considered stalled, and the script exits with failure (1).
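The record-and-compare steps above can be sketched as a small Bash function. This is a minimal sketch under stated assumptions, not the script's actual source: the function name `check_progress` and its two arguments are hypothetical, the state file path is passed in rather than hard-coded to `/root/.latest_block_height`, and the integer validation is an added safeguard that the description above does not mention.

```bash
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if the height advanced since the last call,
# 1 on the first run, on invalid input, or when the height did not increase.
check_progress() {
    local current="$1"      # current block height
    local state_file="$2"   # file holding the previously observed height

    # Added safeguard (assumption): accept only plain non-negative integers.
    [[ "$current" =~ ^[0-9]+$ ]] || return 1

    # First run: nothing to compare against, so record the height and fail.
    if [[ ! -f "$state_file" ]]; then
        echo "$current" > "$state_file"
        return 1
    fi

    local previous
    previous=$(<"$state_file")
    echo "$current" > "$state_file"   # always record the newest height

    [[ "$previous" =~ ^[0-9]+$ ]] || return 1

    # Healthy only if the height strictly increased.
    # The 10# prefix forces base 10 so leading zeros are not read as octal.
    (( 10#$current > 10#$previous ))
}
```

A caller would typically end with something like `check_progress "$height" /root/.latest_block_height; exit $?`, so that the function's result becomes the probe's exit code.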
Key Variables and Files
| Variable/File | Description |
|---|---|
| `/root/disable_liveness` | File whose presence disables the probe. |
| `/root/.latest_block_height` | File storing the last observed block height. |
| `STATUS` | JSON response from the daemon's status HTTP endpoint. |
| Current block height | Value of `latest_block_height` parsed from `STATUS`. |
| Previous block height | Block height stored from the previous check. |
Commands and Tools Used
`curl`: Used to fetch daemon status from a local HTTP endpoint.
Flags: `-s` (silent: no progress meter or error messages), `-f` (fail on HTTP server errors: return a non-zero exit status instead of printing the error page).
`jq`: Command-line JSON processor used to parse the JSON and extract `latest_block_height`.
Standard Bash Constructs:
`[[ -f ... ]]` to check for file existence.
Arithmetic comparison `(( ... ))` to compare numeric values.
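As an illustration of how these tools fit together, the fetch-and-parse step might look like the following. This is a sketch, not the script's actual code: the function names `extract_height` and `fetch_height` are hypothetical, and the `jq` flags `-e` and `-r` are an assumption about how one could make a missing field fail cleanly. The URL and the JSON path `result.sync_info.latest_block_height` come from this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads the status JSON on stdin and prints the height.
# -r prints the raw value without JSON quotes; -e makes jq exit non-zero
# when the result is null (for example, when the field is missing).
extract_height() {
    jq -er '.result.sync_info.latest_block_height'
}

# Hypothetical helper: fetches the status document and prints the height.
# Returns non-zero if the request fails or the field cannot be extracted.
fetch_height() {
    local url="${1:-http://localhost:27147/status}"
    local status
    status=$(curl -sf "$url") || return 1
    extract_height <<< "$status"
}
```

A probe built on these helpers could then use `height=$(fetch_height) || exit 1` to treat an unreachable daemon or a malformed response as a failure.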
Parameters and Exit Codes
This script does not take any command-line arguments. It relies on fixed file paths and network endpoints.
| Exit Code | Meaning |
|---|---|
| `0` | Liveness probe succeeded: daemon is alive or probe is disabled. |
| `1` | Liveness probe failed: daemon is unreachable, stalled, or first run without prior data. |
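When running the script by hand, it can help to translate the exit code back into the meanings listed in the table. The helper below is purely illustrative: `describe_exit` is a hypothetical name and is not part of `liveness.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps a liveness.sh exit code to its documented meaning.
describe_exit() {
    case "$1" in
        0) echo "alive or probe disabled" ;;
        1) echo "unreachable, stalled, or first run" ;;
        *) echo "unexpected exit code: $1" ;;
    esac
}
```

For a manual check one might run `bash /path/to/liveness.sh; describe_exit $?`. Note that an exec probe in Kubernetes treats any non-zero exit code as a failure.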
Usage Example
This script is typically used as a liveness probe command in Kubernetes container definitions:

    livenessProbe:
      exec:
        command:
          - /bin/bash
          - /path/to/liveness.sh
      initialDelaySeconds: 30
      periodSeconds: 10

The orchestration system runs this script periodically to determine if the daemon is healthy.
Implementation Details and Algorithms
State Persistence via File:
The script uses a simple file, `/root/.latest_block_height`, to persist the last known block height between invocations. This allows the script to detect progress over time.
Stall Detection Algorithm:
The core logic checks if the block height is increasing. If the block height remains the same or decreases, the daemon is considered stalled.
Probe Disabling Mechanism:
By placing a file at `/root/disable_liveness`, administrators or other processes can disable the probe without modifying the script or container configuration.
Robustness:
The script exits with failure if the HTTP request fails or the required JSON field is not retrievable, signaling an unhealthy state.
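The probe disabling mechanism amounts to creating or removing a marker file. A minimal sketch of helpers for toggling it follows; the function names and the `DISABLE_FILE` variable are hypothetical, and only the default path `/root/disable_liveness` is taken from this document.

```bash
#!/usr/bin/env bash
# Marker file path from the text; the variable name itself is an assumption.
DISABLE_FILE="${DISABLE_FILE:-/root/disable_liveness}"

# Succeeds (returns 0) when the marker file exists, i.e. the probe is disabled.
probe_disabled() { [[ -f "$DISABLE_FILE" ]]; }

# Create the marker file so subsequent probe runs exit 0 immediately.
disable_probe() { touch "$DISABLE_FILE"; }

# Remove the marker file so the block height check resumes.
enable_probe() { rm -f "$DISABLE_FILE"; }
```

From outside the container, the same effect could be achieved with `kubectl exec <pod> -- touch /root/disable_liveness`, where `<pod>` is a placeholder for the pod name. This can be useful during maintenance, when a deliberately paused daemon should not trigger a restart.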
Interaction with Other System Components
Daemon:
The script depends on the daemon exposing an HTTP status API at `localhost:27147/status`. The daemon must produce JSON output containing `result.sync_info.latest_block_height`.
Container Orchestration:
The script is typically invoked by container runtime or orchestration tools (such as Kubernetes) as a liveness probe to determine container health.
File System:
Persistence of the previous block height is done via a file stored in `/root/`, which must be writable by the script.
Utilities:
The script requires the `curl` and `jq` utilities to be present in the container or environment.
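These requirements could be verified when building or debugging the container image. The following is a sketch of such a preflight check; `missing_tools` and `state_dir_writable` are hypothetical helpers that are not part of `liveness.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the name of each given tool that is not on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Hypothetical helper: succeeds if the given directory exists and is writable.
state_dir_writable() {
    [[ -d "$1" && -w "$1" ]]
}
```

For example, `missing_tools curl jq` prints nothing when both utilities are installed, and `state_dir_writable /root` confirms that the block height file can be written.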
Mermaid Flowchart Diagram
The following diagram illustrates the script's main flow and decision points:
flowchart TD
Start["Start Script"]
CheckDisable["Check if /root/disable_liveness exists"]
Disabled["Liveness probe disabled\n(exit 0)"]
FetchStatus["Fetch daemon status\n(curl http://localhost:27147/status)"]
FetchFail["Failed to fetch status\n(exit 1)"]
ParseHeight["Parse latest_block_height using jq"]
CheckFileExists["Check if /root/.latest_block_height exists"]
FileNotExist["File does not exist\nWrite current height and exit 1"]
ReadPrevHeight["Read previous block height from file"]
UpdateFile["Write current height to file"]
CompareHeights["Is current height > previous height?"]
Alive["Daemon is running\n(exit 0)"]
Stalled["Daemon is stalled\n(exit 1)"]
Start --> CheckDisable
CheckDisable -->|Yes| Disabled
CheckDisable -->|No| FetchStatus
FetchStatus -->|Fail| FetchFail
FetchStatus -->|Success| ParseHeight
ParseHeight --> CheckFileExists
CheckFileExists -->|No| FileNotExist
CheckFileExists -->|Yes| ReadPrevHeight
ReadPrevHeight --> UpdateFile
UpdateFile --> CompareHeights
CompareHeights -->|Yes| Alive
CompareHeights -->|No| Stalled
Summary
`liveness.sh` is a simple yet effective liveness probe script that monitors a daemon's progress via its block height metric. It integrates tightly with container orchestration systems to ensure automated health checks and recovery mechanisms, improving system reliability. The script uses basic shell tools and a file-based persistence strategy to detect whether the daemon is stalled or running smoothly.