opensearch.yaml
Overview
The opensearch.yaml file is a Helm template used for deploying an OpenSearch service within a Kubernetes cluster as part of the Ragflow application. It orchestrates the creation and configuration of multiple Kubernetes resources, including:
A PersistentVolumeClaim (PVC) for data storage
A StatefulSet for managing the OpenSearch pods with stable network IDs and storage
A Service to expose the OpenSearch HTTP API within the cluster
This template is conditionally rendered only if the .Values.env.DOC_ENGINE Helm value is set to "opensearch". It leverages Helm templating features such as includes, conditionals, and value injections to customize resource specifications dynamically.
Detailed Explanation of the Resources
1. PersistentVolumeClaim (PVC)
Purpose:
Allocates persistent storage for OpenSearch data to ensure data durability across pod restarts.
Key Fields:
metadata.name: Uses a Helm include to generate a unique name based on the release.
annotations: Contains the Helm annotation "helm.sh/resource-policy": keep, which instructs Helm to retain the PVC on uninstall rather than delete it.
labels: Includes the standard Ragflow labels plus a component label, app.kubernetes.io/component: opensearch.
spec.storageClassName: Dynamically assigned if .Values.opensearch.storage.className is set.
spec.accessModes: Set to ReadWriteOnce, meaning the volume can be mounted as read-write by a single node.
spec.resources.requests.storage: Requests the storage capacity defined by .Values.opensearch.storage.capacity.
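The PVC fields described above could be expressed roughly as follows. This is a minimal sketch reconstructed from the field descriptions, not the chart's actual source: the -opensearch-data name suffix is an assumption made for illustration, and the with guard is simply one common way to make storageClassName optional.

```yaml
# Sketch only: reconstructed from the field list, not copied from the chart.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # The "-opensearch-data" suffix is an assumed name for illustration.
  name: {{ include "ragflow.fullname" . }}-opensearch-data
  annotations:
    # Tells Helm to leave this PVC in place on uninstall.
    "helm.sh/resource-policy": keep
  labels:
    {{- include "ragflow.labels" . | nindent 4 }}
    app.kubernetes.io/component: opensearch
spec:
  # storageClassName is rendered only when a storage class is configured.
  {{- with .Values.opensearch.storage.className }}
  storageClassName: {{ . }}
  {{- end }}
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.opensearch.storage.capacity }}
```

When storageClassName is omitted, Kubernetes falls back to the cluster's default storage class, if one is defined.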
2. StatefulSet
Purpose:
Manages a set of OpenSearch pods with persistent identities and stable storage, enabling reliable cluster behavior and data persistence.
Key Fields:
metadata.name: Helm-generated full name with an -opensearch suffix.
labels: Standard Ragflow labels plus the OpenSearch component label.
spec.replicas: Fixed to 1 replica (single-node OpenSearch deployment).
spec.selector.matchLabels: Matches pods by the Ragflow selector labels and the component label.
spec.strategy: Optional update strategy injected from .Values.opensearch.deployment.strategy; note that the apps/v1 StatefulSet API calls the corresponding field updateStrategy.
spec.template.metadata.labels: Pod labels consistent with the StatefulSet selector.
spec.template.metadata.annotations: Checksums of the config map templates (opensearch-config.yaml and env.yaml), used to trigger pod restarts on config changes.
Pod Spec:
Image Pull Secrets:
Populated from both the global .Values.imagePullSecrets and .Values.opensearch.image.pullSecrets.
Init Containers:
fix-data-volume-permissions: Uses an Alpine image to chown the data directory to user 1000, ensuring proper permissions for OpenSearch.
sysctl: Uses a privileged BusyBox container to set the kernel parameter vm.max_map_count=262144, required by OpenSearch for memory mapping.
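The two init containers might look something like the fragment below. It is a hedged sketch rather than the template's real content: the image value paths are inferred from the values layout shown under Usage Example, the group ID in the chown call is an assumption (the description only mentions user 1000), and the exact commands in the chart may differ.

```yaml
# Sketch only: commands and value paths are inferred, not copied from the chart.
initContainers:
  - name: fix-data-volume-permissions
    image: "{{ .Values.opensearch.initContainers.alpine.repository }}:{{ .Values.opensearch.initContainers.alpine.tag }}"
    # Hand the data directory to UID 1000 so the OpenSearch process can write to it.
    # The ":1000" group part is an assumption.
    command: ["sh", "-c", "chown -R 1000:1000 /usr/share/opensearch/data"]
    volumeMounts:
      - name: opensearch-data
        mountPath: /usr/share/opensearch/data
  - name: sysctl
    image: "{{ .Values.opensearch.initContainers.busybox.repository }}:{{ .Values.opensearch.initContainers.busybox.tag }}"
    securityContext:
      # Privileged mode is required because vm.max_map_count is a node-level setting.
      privileged: true
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
```

Because vm.max_map_count is not namespaced, the sysctl container changes the value for the whole node, which is why it has to run privileged.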
OpenSearch Container:
image: Configured from .Values.opensearch.image.repository and .Values.opensearch.image.tag.
imagePullPolicy: Optional.
envFrom: Loads environment variables from a secret and a config map.
ports: Exposes container port 9201 for the HTTP API.
volumeMounts: Mounts the persistent volume claim at /usr/share/opensearch/data.
resources: Optional resource requests and limits from .Values.opensearch.deployment.resources.
securityContext: Adds the IPC_LOCK capability, runs as user 1000, and disables privilege escalation.
livenessProbe: Executes a curl command to check OpenSearch health on port 9201 with admin credentials, starting after 30 seconds and repeating at 10-second intervals.
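As an illustration of the security context and liveness probe settings, a container entry along these lines would match the description. Treat it as a sketch under assumptions: the OPENSEARCH_PASSWORD variable name, the plain-HTTP scheme, and the _cluster/health endpoint are placeholders chosen for the example, and the chart's actual probe command may be different.

```yaml
# Sketch only: probe command, scheme, and variable name are assumptions.
containers:
  - name: opensearch
    image: "{{ .Values.opensearch.image.repository }}:{{ .Values.opensearch.image.tag }}"
    ports:
      - name: http
        containerPort: 9201
    volumeMounts:
      - name: opensearch-data
        mountPath: /usr/share/opensearch/data
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]   # lets the JVM lock memory so it is not swapped out
      runAsUser: 1000
      allowPrivilegeEscalation: false
    livenessProbe:
      exec:
        command:
          - sh
          - -c
          # OPENSEARCH_PASSWORD is a hypothetical variable assumed to arrive via envFrom.
          - 'curl -s -f -u "admin:${OPENSEARCH_PASSWORD}" http://localhost:9201/_cluster/health'
      initialDelaySeconds: 30
      periodSeconds: 10
```

The -f flag makes curl exit non-zero on an HTTP error status, which is what lets the kubelet treat a failing health request as a failed probe.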
Volumes:
Defines the volume opensearch-data linked to the PVC created above.
3. Service
Purpose:
Creates a Kubernetes Service to expose OpenSearch pods internally within the cluster, allowing other components to communicate with OpenSearch.
Key Fields:
metadata.name: Helm-generated full name with an -opensearch suffix.
labels: Standard Ragflow labels plus the OpenSearch component label.
spec.selector: Selects pods with the Ragflow selector labels and the OpenSearch component label.
spec.ports: Exposes TCP port 9201, targeting the container port named http.
spec.type: Configurable service type (e.g., ClusterIP, NodePort) from .Values.opensearch.service.type.
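Put together, the Service fields could be written roughly like this. It is a minimal sketch based on the descriptions above; the helper name ragflow.selectorLabels is an assumption (the text only says "Ragflow selector labels"), so substitute whatever helper the chart actually defines.

```yaml
# Sketch only: "ragflow.selectorLabels" is an assumed helper name.
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ragflow.fullname" . }}-opensearch
  labels:
    {{- include "ragflow.labels" . | nindent 4 }}
    app.kubernetes.io/component: opensearch
spec:
  selector:
    {{- include "ragflow.selectorLabels" . | nindent 4 }}
    app.kubernetes.io/component: opensearch
  ports:
    - name: http
      protocol: TCP
      port: 9201
      # Refers to the container port by name rather than by number.
      targetPort: http
  type: {{ .Values.opensearch.service.type }}
```

Targeting the port by name means the container port number can change without touching the Service definition.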
Important Implementation Details and Algorithms
Conditional Rendering:
The entire manifest is wrapped in a conditional block that renders these resources only if .Values.env.DOC_ENGINE equals "opensearch". This enables flexible switching between different document engines.
Pod Restart on Config Change:
The pod template carries annotations with SHA256 checksums of the config map templates (opensearch-config.yaml and env.yaml). When the configuration changes, the checksums change, which alters the pod template and triggers a restart, so OpenSearch pods always run with the latest config.
Init Containers for Environment Preparation:
Two init containers prepare the environment before the main OpenSearch container starts: one fixes volume permissions, the other sets the required kernel parameter.
Security Context:
The OpenSearch container runs as non-root user 1000 with the IPC_LOCK capability added and privilege escalation disabled, which limits what the process can do on the node.
Liveness Probe:
A curl-based liveness probe checks that OpenSearch is responsive, which lets Kubernetes detect and restart unhealthy pods automatically.
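The conditional wrapper and the checksum annotations can be sketched together as below. This is a deliberately incomplete fragment that shows only those two mechanisms, assuming the config map templates sit in the chart's templates directory under the file names mentioned above; the annotation keys are made up for the example.

```yaml
# Sketch only: annotation keys are illustrative; most StatefulSet fields are omitted.
{{- if eq .Values.env.DOC_ENGINE "opensearch" }}
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "ragflow.fullname" . }}-opensearch
spec:
  template:
    metadata:
      annotations:
        # Each hash is computed over the rendered template file. A config change
        # produces a new hash, which changes the pod template and restarts the pod.
        checksum/config-opensearch: {{ include (print $.Template.BasePath "/opensearch-config.yaml") . | sha256sum }}
        checksum/config-env: {{ include (print $.Template.BasePath "/env.yaml") . | sha256sum }}
{{- end }}
```

With DOC_ENGINE set to any other value, the if block renders nothing, so none of the OpenSearch resources are created.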
Interaction with Other System Components
Helm Values Injection:
The file depends heavily on Helm values for customization, including storage classes, image repositories, tags, pull policies, resource allocations, and service types. It uses Helm helpers (e.g., ragflow.fullname, ragflow.labels) to maintain consistent naming and labeling conventions across the application.
ConfigMaps and Secrets:
The OpenSearch container loads environment variables and configuration from Kubernetes Secrets and ConfigMaps created elsewhere in the Ragflow Helm chart. These external resources provide sensitive credentials and configuration settings.
Persistent Storage:
The PVC ensures that OpenSearch data persists beyond the pod lifecycle, allowing data durability and recovery after pod restarts or rescheduling.
Other Ragflow Components:
The OpenSearch service is intended to integrate with other Ragflow components that require search or document indexing capabilities. The service exposes OpenSearch on port 9201 for internal cluster communication.
Usage Example
To deploy OpenSearch with this Helm template, ensure your values.yaml includes:
env:
  DOC_ENGINE: "opensearch"
opensearch:
  storage:
    className: "fast-ssd"   # Optional storage class
    capacity: "10Gi"        # Storage size for PVC
  image:
    repository: "opensearchproject/opensearch"
    tag: "2.9.0"
    pullPolicy: "IfNotPresent"
  initContainers:
    alpine:
      repository: "alpine"
      tag: "3.15"
      pullPolicy: "IfNotPresent"
    busybox:
      repository: "busybox"
      tag: "1.35"
      pullPolicy: "IfNotPresent"
  deployment:
    strategy:
      type: RollingUpdate
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "2Gi"
  service:
    type: ClusterIP
Then run:
helm install ragflow ./ragflow-chart -f values.yaml
Mermaid Diagram
flowchart TD
PVC[PersistentVolumeClaim]
StatefulSet[StatefulSet: opensearch]
Service[Service: opensearch]
PVC --> StatefulSet
StatefulSet --> Service
subgraph StatefulSet Pods
direction TB
InitContainer1["fix-data-volume-permissions"]
InitContainer2["sysctl (vm.max_map_count)"]
Container["opensearch container"]
end
StatefulSet --> InitContainer1
StatefulSet --> InitContainer2
StatefulSet --> Container
Summary
The opensearch.yaml Helm template defines a configurable, single-node OpenSearch deployment within the Ragflow Kubernetes environment. It provides persistent storage through a retained PVC, runs the container as a non-root user with a restricted security context, and connects to the rest of the system through a Service and shared Secrets and ConfigMaps. Init containers prepare the volume and kernel settings before startup, and a liveness probe lets Kubernetes restart the pod when OpenSearch stops responding.