elasticsearch.yaml
Overview
The elasticsearch.yaml file is a Kubernetes manifest template that deploys and manages an Elasticsearch instance within a Kubernetes cluster. It is designed to be rendered by Helm, a Kubernetes package manager, and is included only when the env.DOC_ENGINE Helm value is set to "elasticsearch".
This file provisions the following Kubernetes resources essential for running Elasticsearch:
PersistentVolumeClaim (PVC): For durable storage of Elasticsearch data.
StatefulSet: To manage the Elasticsearch pods with stable network identities and persistent storage.
Service: To expose the Elasticsearch pods inside the cluster or externally, depending on the configured service type.
The manifest utilizes Helm templating syntax to dynamically configure resource names, labels, storage classes, resource requests, container images, and environment variables based on user-provided Helm values.
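As a rough illustration, the values this template reads could be supplied in a values file like the one below. The key paths (env.DOC_ENGINE, elasticsearch.storage.className, elasticsearch.storage.capacity, elasticsearch.deployment.resources) are the ones referenced in this document; the concrete values are placeholders chosen for the example and should not be read as the chart's defaults.

```yaml
# values.yaml (illustrative values only; not the chart's defaults)
env:
  # The Elasticsearch resources are rendered only when this is "elasticsearch".
  DOC_ENGINE: elasticsearch

elasticsearch:
  storage:
    # Optional. When omitted, storageClassName is left out of the PVC,
    # so the cluster's default StorageClass (if one exists) is used.
    className: fast-storage
    capacity: 50Gi
  deployment:
    # Standard Kubernetes resource requests/limits for the container.
    resources:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        memory: 4Gi
```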
Detailed Explanation of Resources
1. PersistentVolumeClaim (PVC)
Purpose
Defines persistent storage for Elasticsearch data to ensure data durability across pod restarts or rescheduling.
Key Fields
metadata.name: Constructed using the Helm helper ragflow.fullname with the suffix -es-data.
annotations: Includes "helm.sh/resource-policy": keep to prevent PVC deletion during Helm uninstall.
labels: Standard application labels plus app.kubernetes.io/component: elasticsearch.
spec.storageClassName: Conditionally set from .Values.elasticsearch.storage.className.
spec.accessModes: Set to ReadWriteOnce, allowing the volume to be mounted as read-write by a single node.
spec.resources.requests.storage: Storage size requested, from .Values.elasticsearch.storage.capacity.
Usage Example
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-es-data
spec:
  storageClassName: fast-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
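For comparison with the rendered manifest above, a template that produces it might look roughly like this. It is a sketch reconstructed from the field descriptions in this section, not a copy of the chart's source; the whitespace control and exact layout are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ include "ragflow.fullname" . }}-es-data
  annotations:
    # Tells Helm to leave this PVC in place on uninstall.
    "helm.sh/resource-policy": keep
  labels:
    {{- include "ragflow.labels" . | nindent 4 }}
    app.kubernetes.io/component: elasticsearch
spec:
  # storageClassName is emitted only when a class name is provided.
  {{- with .Values.elasticsearch.storage.className }}
  storageClassName: {{ . }}
  {{- end }}
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.elasticsearch.storage.capacity }}
```

Because of the keep resource policy, the claim (and its data) outlives a helm uninstall and has to be deleted by hand if it is no longer wanted.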
2. StatefulSet
Purpose
Manages Elasticsearch pods, providing stable network IDs and persistent storage, critical for Elasticsearch clustering and data integrity.
Key Fields
metadata.name: Helm-generated name with the suffix -es.
spec.replicas: Defaults to 1; can be customized.
spec.selector.matchLabels: Selects the pods that the StatefulSet manages.
spec.strategy: Update strategy, optionally configured via Helm values. (The apps/v1 StatefulSet API names this field spec.updateStrategy; spec.strategy is the Deployment field name.)
spec.template:
  metadata.labels and annotations: Include checksums of config files for pod restart triggers.
  spec.imagePullSecrets: Supports image pull secrets from global or Elasticsearch-specific Helm values.
  spec.initContainers:
    fix-data-volume-permissions: Ensures correct file ownership on the data volume (UID 1000).
    sysctl: Sets vm.max_map_count=262144, required by Elasticsearch for memory mapping.
  spec.containers:
    elasticsearch container:
      Uses a configurable image and tag.
      Environment variables sourced from secrets and configmaps.
      Ports: 9200 (HTTP) and 9300 (transport).
      Volume mounts for persistent data.
      Resource requests/limits configurable.
      Security context restricts privileges and adds the IPC_LOCK capability.
  spec.volumes: Mounts the PVC created above.
Usage Example Snippet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp-es
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      component: elasticsearch
  template:
    metadata:
      labels:
        app: myapp
        component: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.10.0
          ports:
            - containerPort: 9200
            - containerPort: 9300
          volumeMounts:
            - name: es-data
              mountPath: /usr/share/elasticsearch/data
      volumes:
        - name: es-data
          persistentVolumeClaim:
            claimName: myapp-es-data
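The snippet above leaves out the init containers, environment sources, named ports, and security context listed under Key Fields. A minimal sketch of how those parts of the pod spec could look once rendered is given below. The image names without tags, the exact commands, the myapp-env-config and myapp-es-config object names, and the allowPrivilegeEscalation setting are assumptions made for illustration; only the container names, UID 1000, the vm.max_map_count value, the ports, and the IPC_LOCK capability come from this document.

```yaml
# Fragment of spec.template.spec in rendered form (illustrative sketch).
initContainers:
  - name: fix-data-volume-permissions
    image: alpine                      # tag not specified in this document
    # Assumed command: hand the data directory to UID 1000.
    command: ["sh", "-c", "chown -R 1000 /usr/share/elasticsearch/data"]
    volumeMounts:
      - name: es-data
        mountPath: /usr/share/elasticsearch/data
  - name: sysctl
    image: busybox                     # tag not specified in this document
    # Assumed command: raise the node's mmap count limit.
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
    securityContext:
      privileged: true                 # required to change a kernel parameter
containers:
  - name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.0
    envFrom:
      - secretRef:
          name: myapp-env-config       # hypothetical rendered Secret name
      - configMapRef:
          name: myapp-es-config        # hypothetical rendered ConfigMap name
    ports:
      - name: http                     # the Service's targetPort uses this name
        containerPort: 9200
      - name: transport
        containerPort: 9300
    securityContext:
      runAsUser: 1000
      allowPrivilegeEscalation: false  # assumed; not stated in this document
      capabilities:
        add: ["IPC_LOCK"]
    volumeMounts:
      - name: es-data
        mountPath: /usr/share/elasticsearch/data
```

Init containers run to completion in order before the main container starts, so the volume ownership and the kernel setting are in place by the time Elasticsearch boots.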
3. Service
Purpose
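Exposes the Elasticsearch HTTP API (port 9200) to other services in the cluster or, depending on the service type, to external clients. Only the HTTP port is mapped; the transport port 9300 is not listed in the Service's ports.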
Key Fields
metadata.name: Helm-generated with the suffix -es.
spec.selector: Matches the Elasticsearch pods.
spec.ports: Maps TCP port 9200 to the container port named http.
spec.type: Configurable service type (ClusterIP, NodePort, LoadBalancer) from Helm values.
Usage Example
apiVersion: v1
kind: Service
metadata:
  name: myapp-es
spec:
  selector:
    app: myapp
    component: elasticsearch
  ports:
    - protocol: TCP
      port: 9200
      targetPort: http
  type: ClusterIP
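A templated form of this Service could be sketched as follows. The helper names are the ones this document mentions, but the way they are combined in the selector and labels is an assumption, and .Values.elasticsearch.service.type is a hypothetical key: the document only states that the type comes from Helm values, not where it lives.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ragflow.fullname" . }}-es
  labels:
    {{- include "ragflow.labels" . | nindent 4 }}
    app.kubernetes.io/component: elasticsearch
spec:
  selector:
    {{- include "ragflow.selectorLabels" . | nindent 4 }}
    app.kubernetes.io/component: elasticsearch
  ports:
    - protocol: TCP
      port: 9200
      targetPort: http   # resolves to the container port named "http"
  # Hypothetical values key; assumed to be set to ClusterIP, NodePort, or LoadBalancer.
  type: {{ .Values.elasticsearch.service.type }}
```

With type ClusterIP and the name from the example above, clients in the same namespace would reach Elasticsearch at http://myapp-es:9200.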
Important Implementation Details and Algorithms
Conditional Rendering: The entire file is wrapped in a conditional Helm statement checking if .Values.env.DOC_ENGINE equals "elasticsearch". This enables modular deployment configurations.
Hash-based Pod Restart Triggers: The StatefulSet pod template metadata contains annotations with SHA256 checksums of configuration files (elasticsearch-config.yaml and env.yaml). Changing these files triggers pod restarts to ensure configuration updates are applied.
Init Containers:
  fix-data-volume-permissions: Uses an Alpine Linux container to change ownership of the persistent data directory to UID 1000, matching Elasticsearch's user.
  sysctl: Runs a privileged BusyBox container to set the vm.max_map_count kernel parameter required by Elasticsearch for proper memory mapping.
Security Context: The Elasticsearch container runs as UID 1000 with limited privileges, plus the IPC_LOCK capability to lock memory segments and prevent swapping.
Volume Management: Uses a PersistentVolumeClaim for data durability and explicitly mounts the volume inside the container.
Resource Configuration: Resource requests and limits can be configured via Helm values under .Values.elasticsearch.deployment.resources.
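The conditional wrapper and the checksum annotations described above can be sketched together as below. The include/sha256sum construction is the pattern documented by Helm for rolling pods on config changes; the annotation key names (checksum/config-es, checksum/config-env) are assumptions, and most of the StatefulSet spec is omitted for brevity.

```yaml
{{- if eq .Values.env.DOC_ENGINE "elasticsearch" }}
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "ragflow.fullname" . }}-es
spec:
  # ... replicas, selector, and the rest of the spec omitted ...
  template:
    metadata:
      annotations:
        # Each value is the SHA256 of another template's rendered output.
        # When that output changes, the annotation changes, the pod template
        # differs from the running one, and Kubernetes replaces the pods.
        checksum/config-es: {{ include (print $.Template.BasePath "/elasticsearch-config.yaml") . | sha256sum }}
        checksum/config-env: {{ include (print $.Template.BasePath "/env.yaml") . | sha256sum }}
{{- end }}
```

When DOC_ENGINE has any other value, the if block renders nothing, so none of the three resources is created.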
Interaction with Other System Components
Helm Chart Integration: This file is part of a Helm chart and integrates with other templates and value files:
  Uses Helm helper templates like ragflow.fullname, ragflow.labels, and ragflow.selectorLabels for consistent naming and labeling.
  Depends on external configmaps (-es-config) and secrets (-env-config) for environment variables and Elasticsearch configuration.
  References other YAML files like elasticsearch-config.yaml and env.yaml for configuration data.
Kubernetes Ecosystem: Deploys Elasticsearch as a stateful service with persistent storage, enabling it to serve as the document engine within the larger application stack.
Application Stack: Likely serves as the indexing and search backend, storing documents and enabling querying capabilities for the application.
Visual Diagram
flowchart TD
    subgraph PVCGroup[PVC]
        PVC["PersistentVolumeClaim: es-data"]
    end
    subgraph StatefulSet
        STS["StatefulSet: elasticsearch"]
        IC1["InitContainer: fix-data-volume-permissions"]
        IC2["InitContainer: sysctl"]
        C["Container: elasticsearch"]
        VolumeMount["VolumeMount: es-data"]
        Env["EnvFrom: secret & configMap"]
        Ports["Ports: 9200 (http), 9300 (transport)"]
    end
    subgraph Service
        SVC["Service: elasticsearch"]
    end
    PVC -->|Provides Storage| VolumeMount
    VolumeMount --> C
    IC1 --> STS
    IC2 --> STS
    Env --> C
    Ports --> C
    STS --> SVC
    SVC -->|Routes Traffic| C
Summary
The elasticsearch.yaml file is a Helm-templated Kubernetes manifest that deploys Elasticsearch (a single replica by default) with persistent storage, init containers for volume ownership and kernel settings, and a restricted security context. It defines three resources (PVC, StatefulSet, Service) and draws its names, labels, and configuration from Helm helpers, Helm values, and external configmaps and secrets, which keeps the deployment consistent with the rest of the chart.