Snapshot & Restore

Purpose

Snapshot & Restore addresses the critical need for **data durability and recovery** within the Kubernetes-managed blockchain node and indexer services. StatefulSets in these coinstacks rely on persistent volumes (EBS volumes in AWS) to store blockchain data, indexes, and node state. This subtopic provides the **procedures and tooling** to create reliable snapshots of these volumes and restore them in case of data corruption, failure, or migration. Ensuring consistent snapshots and accurate restores minimizes downtime and data loss risk, which is essential for maintaining blockchain service integrity and continuity.

Functionality

The Snapshot & Restore process involves **manual but well-defined steps** to safely capture and recover the persistent state of blockchain services running in Kubernetes. The key workflows are:

Snapshot Workflow

  1. Scale down the target StatefulSet
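    Reducing replicas to zero ensures no pod is writing to the persistent volumes, which prevents an inconsistent snapshot. Scaling down by only one replica stops just the highest-ordinal pod, so only that pod's volume is safe to snapshot.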

  2. Create EBS volume snapshots in AWS
    Using the AWS console, snapshots are taken of the volumes that were attached to the scaled-down StatefulSet pods. Once the pods are gone, these volumes are detached and show the "available" state.

  3. Scale up the StatefulSet
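    Restoring the service to its original replica count allows normal operation to resume as soon as the snapshots have been initiated. An EBS snapshot captures the volume's state at the moment it is started, so there is no need to wait for it to complete.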

This ensures point-in-time capture of persistent storage without impacting service availability longer than necessary.
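
The snapshot steps above could also be scripted instead of clicked through in the console. The following is a minimal sketch that only builds the `kubectl` and `aws` command lines and prints them; it does not run anything. Using the AWS CLI rather than the console is an assumption, and the StatefulSet name `ethereum-sts`, the namespace `dev`, and the volume ID are hypothetical placeholders.

```python
import shlex


def snapshot_commands(statefulset, namespace, volume_ids, replicas):
    """Return the command lines for one snapshot run, in execution order."""
    scale = ["kubectl", "scale", "statefulset", statefulset, "--namespace", namespace]
    commands = [scale + ["--replicas=0"]]  # step 1: stop every writer
    for volume_id in volume_ids:
        # Block until the volume is detached ("available") before snapshotting.
        commands.append(["aws", "ec2", "wait", "volume-available", "--volume-ids", volume_id])
        # Step 2: start the snapshot of the detached volume.
        commands.append([
            "aws", "ec2", "create-snapshot",
            "--volume-id", volume_id,
            "--description", f"{statefulset} manual snapshot",
        ])
    commands.append(scale + [f"--replicas={replicas}"])  # step 3: resume service
    return commands


if __name__ == "__main__":
    # Hypothetical values; replace with the real StatefulSet, namespace, and volume IDs.
    for cmd in snapshot_commands("ethereum-sts", "dev", ["vol-0123456789abcdef0"], 1):
        print(shlex.join(cmd))
```

Printing the commands first gives the operator a chance to review the exact volume IDs and replica count before anything is executed.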

Restore Workflow

  1. Scale down the StatefulSet
    Stops pods to safely detach persistent volumes.

  2. Delete existing PersistentVolumeClaim (PVC) and PersistentVolume (PV)
    Removes references to old volumes, preparing for volume replacement.

  3. Create new EBS volumes from snapshots
    New volumes are instantiated from the previously taken snapshots, preserving data state.

  4. Edit and apply Kubernetes YAML (restore-pvc.yaml)
    This YAML defines the PV and PVC, binding the new volume to the StatefulSet pod.

  5. Scale up the StatefulSet
    Pods restart, now using the restored volumes, rehydrating the blockchain node or indexer state.

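These steps must be executed carefully: the PV capacity must match the size of the new volume, the volume must be created in an availability zone where the pod can be scheduled, and the PVC name must follow the StatefulSet convention (`<volumeClaimTemplate name>-<StatefulSet name>-<ordinal>`) so that Kubernetes binds the new volume to the intended replica.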
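
The restore sequence can be sketched the same way. The helper below derives the PVC name that the StatefulSet controller expects for a replica and lists the commands in order without running them. All argument values in the example (`ethereum-sts`, `dev`, `data`, the old PV name, the snapshot ID) are hypothetical, and the AWS CLI call stands in for the console step as an assumption.

```python
import shlex


def pvc_name(claim_template, statefulset, ordinal):
    """PVC name the StatefulSet controller expects for a given replica."""
    return f"{claim_template}-{statefulset}-{ordinal}"


def restore_commands(statefulset, namespace, claim_template, ordinal,
                     old_pv, snapshot_id, zone, replicas):
    """Return the restore command lines in execution order (nothing is run)."""
    pvc = pvc_name(claim_template, statefulset, ordinal)
    scale = ["kubectl", "scale", "statefulset", statefulset, "--namespace", namespace]
    return [
        scale + ["--replicas=0"],
        ["kubectl", "delete", "pvc", pvc, "--namespace", namespace],
        # With a Delete reclaim policy the PV may already be gone at this point.
        ["kubectl", "delete", "pv", old_pv, "--ignore-not-found"],
        # The zone must be one where the pod's nodes run.
        ["aws", "ec2", "create-volume", "--snapshot-id", snapshot_id,
         "--availability-zone", zone],
        # Edit restore-pvc.yaml with the new volume ID before this step.
        ["kubectl", "apply", "-f", "restore-pvc.yaml"],
        scale + [f"--replicas={replicas}"],
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for cmd in restore_commands("ethereum-sts", "dev", "data", 0, "old-pv-name",
                                "snap-0123456789abcdef0", "us-east-2a", 1):
        print(shlex.join(cmd))
```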

Integration

Snapshot & Restore is an essential complement to the **Health and Readiness Probes** subtopic, which monitors service health but cannot recover from data corruption or volume failure. It also integrates closely with the **Deployment Automation** subtopic by providing a manual fallback for state recovery outside automated CI/CD pipelines.

While Deployment Automation manages stateful service rollout, Snapshot & Restore ensures **state persistence and recovery**, critical for blockchain nodes and indexers holding large datasets. It applies to all blockchain coinstacks because each relies on the same AWS EBS volumes and Kubernetes volume binding, which makes it a unified manual recovery mechanism.

Unlike automated probes and deployments, this subtopic introduces **manual intervention procedures** not covered elsewhere, giving operators control over backup and restore operations.

Code Interaction Highlights

Example snippet from `restore-pvc.yaml` template (placeholders replaced per restore operation):

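apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-ethereum-sts-0-pv-dev
spec:
  capacity:
    storage: 200Gi   # must match the size of the restored EBS volume
  accessModes:       # required by the PersistentVolume API
    - ReadWriteOnce
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0abcd1234efgh5678   # placeholder: ID of the volume created from the snapshot
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
          - key: topology.ebs.csi.aws.com/zone
            operator: In   # required field of a match expression
            values:
              - us-east-2a   # availability zone of the restored volume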

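This manifest points Kubernetes at the restored EBS volume through `volumeHandle`, and the `nodeAffinity` rule restricts scheduling to nodes in the volume's availability zone so that the volume can be attached to the pod replica.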
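
The snippet above shows only the PersistentVolume, while `restore-pvc.yaml` is described as defining the PVC as well. As a hedged illustration of what the claim half might look like, the sketch below builds a PVC that binds statically to the PV through `spec.volumeName`. The namespace `dev` and storage class `gp3` are assumptions, not values taken from the actual template, and the PV is assumed to carry the same storage class.

```python
import json


def restore_pvc(name, namespace, pv_name, size, storage_class):
    """Build a PVC that binds statically to an existing PV via spec.volumeName."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],   # typical for EBS-backed volumes
            "storageClassName": storage_class,  # must equal the PV's storageClassName
            "volumeName": pv_name,              # pins the claim to the restored PV
            "resources": {"requests": {"storage": size}},  # must not exceed PV capacity
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace and storage class; names follow the PV snippet above.
    claim = restore_pvc("data-ethereum-sts-0", "dev",
                        "data-ethereum-sts-0-pv-dev", "200Gi", "gp3")
    # kubectl apply -f accepts JSON as well as YAML.
    print(json.dumps(claim, indent=2))
```

The claim name `data-ethereum-sts-0` is what lets replica 0 of the StatefulSet pick up this PVC when it is scaled back up.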

Diagram

flowchart TD
  A[Scale Down StatefulSet] --> B[Detach EBS Volumes]
  B --> C["Create EBS Snapshots (AWS Console)"]
  C --> D[Scale Up StatefulSet]
  D --> E[Normal Operation]

  subgraph Restore Process
    F[Scale Down StatefulSet] --> G[Delete PVC & PV]
    G --> H[Create EBS Volumes from Snapshots]
    H --> I[Edit & Apply restore-pvc.yaml]
    I --> J[Scale Up StatefulSet]
    J --> K[Restored StatefulSet Running]
  end

This flowchart visualizes the sequential steps for snapshot creation and restoration, emphasizing the critical scaling operations and volume management stages needed for data consistency and recovery.


Snapshot & Restore provides a robust manual mechanism to safeguard blockchain node and indexer data integrity within Kubernetes by leveraging AWS EBS snapshots and careful volume management, complementing automated health checks and deployments.