entrypoint.sh
Overview
entrypoint.sh is a Bash script designed as the main startup entry point for a distributed task execution and web service environment. It orchestrates the initialization and launching of multiple components in a software system:
A web server stack consisting of nginx and the ragflow_server Python API server.
A set of task executor workers that continuously run Python task executors, each identified by a consumer ID.
An optional MCP (Model Context Protocol) server that can be enabled and configured at runtime.
The script parses command-line arguments to enable or disable components and to configure runtime parameters such as consumer ID ranges, worker counts, the host identifier, and MCP server connection details. It also generates a service configuration file from a template by substituting environment variables.
This script is intended to run inside a Linux environment with Python 3 and necessary dependencies installed. It is typically used as the container or host entrypoint to bootstrap the entire runtime environment for a Ragflow-based distributed compute platform.
Detailed Explanation
Global Variables and Defaults
ENABLE_WEBSERVER (default 1): Enables or disables launching the web server (nginx + ragflow_server).
ENABLE_TASKEXECUTOR (default 1): Enables or disables launching task executor workers.
ENABLE_MCP_SERVER (default 0): Enables or disables the MCP server.
CONSUMER_NO_BEG and CONSUMER_NO_END (default 0): Define the inclusive start and exclusive end of the consumer ID range for task executors.
WORKERS (default 1): Number of task executor workers to run if no consumer ID range is specified.
HOST_ID: Unique identifier for the host. Defaults to the system hostname if it is at most 32 characters long, otherwise to the MD5 hash of the hostname.
MCP_HOST, MCP_PORT, MCP_BASE_URL, MCP_SCRIPT_PATH, MCP_MODE, MCP_HOST_API_KEY: Configuration parameters for the MCP server.
MCP_TRANSPORT_SSE_FLAG, MCP_TRANSPORT_STREAMABLE_HTTP_FLAG, MCP_JSON_RESPONSE_FLAG: Flags controlling MCP server transport and response behavior.
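The HOST_ID default described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper name derive_host_id is hypothetical, uname -n stands in for the hostname lookup, and hashing the name without a trailing newline is an assumption.

```bash
#!/usr/bin/env bash
# Sketch of the HOST_ID default: hostname if short, MD5 digest otherwise.
# derive_host_id is a hypothetical helper name, not taken from entrypoint.sh.
derive_host_id() {
    local name="$1"
    if [ "${#name}" -le 32 ]; then
        # Short enough: use the hostname unchanged.
        printf '%s\n' "$name"
    else
        # Too long: fall back to the 32-character hex MD5 digest.
        # Assumption: the digest is computed without a trailing newline.
        printf '%s' "$name" | md5sum | cut -d ' ' -f 1
    fi
}

# An explicitly set HOST_ID wins; otherwise derive one from the hostname.
HOST_ID="${HOST_ID:-$(derive_host_id "$(uname -n)")}"
echo "HOST_ID=${HOST_ID}"
```

Because an MD5 hex digest is always 32 characters, the resulting identifier never exceeds that length, whichever branch is taken.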
Functions
usage()
Displays usage instructions and command line options, then exits the script.
Parameters: None
Returns: Exits with status 1.
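Usage Example:
./entrypoint.sh --no-such-option
Any unrecognized argument (--no-such-option is a placeholder here) prints the usage text and exits with status 1. A recognized option such as --disable-taskexecutor does not trigger usage().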
task_exe(consumer_id, host_id)
Starts a single task executor worker in an infinite loop, with the jemalloc library preloaded through LD_PRELOAD for memory optimization.
Parameters:
consumer_id (string|int): ID of the consumer instance.
host_id (string): Host identifier string.
Returns: Does not return; runs the task executor indefinitely.
Behavior:
Runs the Python script rag/svr/task_executor.py, passing the combined identifier ${host_id}_${consumer_id}. Uses jemalloc for memory management by setting LD_PRELOAD.
Usage Example:
task_exe 3 myhost123
Starts a task executor for consumer ID 3 on host "myhost123".
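The restart loop and identifier composition can be illustrated with a bounded sketch. The names task_exe_sketch and MAX_RESTARTS are hypothetical and exist only so that the example terminates; the function described above loops without limit, and the real Python invocation is shown only as a comment because the jemalloc library path is not given here.

```bash
#!/usr/bin/env bash
# Bounded illustration of a task_exe-style supervisor loop.
# MAX_RESTARTS is a hypothetical knob; the script itself uses "while true".
task_exe_sketch() {
    local consumer_id="$1" host_id="$2"
    local worker_name="${host_id}_${consumer_id}"
    local restarts=0
    while [ "$restarts" -lt "${MAX_RESTARTS:-3}" ]; do
        # The real loop body would be along the lines of:
        #   LD_PRELOAD=<path to jemalloc> python3 rag/svr/task_executor.py "$worker_name"
        # Here we only print what would be started.
        echo "starting worker ${worker_name}"
        restarts=$((restarts + 1))
    done
}

MAX_RESTARTS=2 task_exe_sketch 3 myhost123
```

Each pass through the loop corresponds to one (re)start of the worker, which is how an exited task executor gets replaced without outside supervision.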
start_mcp_server()
Starts the MCP server in the background with configured host, port, base URL, mode, API key, and transport flags.
Parameters: None (reads global shell variables for config)
Returns: Runs MCP server Python script in background.
Usage Example:
start_mcp_server
Starts the MCP server on configured parameters.
Command-Line Argument Parsing
The script supports the following command-line options:
| Argument | Description |
|---|---|
| --disable-webserver | Disables starting nginx and ragflow_server. |
| --disable-taskexecutor | Disables starting task executor workers. |
| --enable-mcpserver | Enables starting the MCP server. |
| --consumer-no-beg=<num> | Specifies the start (inclusive) of the consumer ID range for task executors. |
| --consumer-no-end=<num> | Specifies the end (exclusive) of the consumer ID range for task executors. |
| --workers=<num> | Specifies the number of task executors to run (if the consumer range is not set). |
| --host-id=<string> | Sets the unique host ID string. Defaults to the hostname or the hashed hostname. |
|  | MCP server host IP or DNS name. |
|  | MCP server port. |
|  | Base URL for MCP server. |
|  | MCP server mode (e.g., self-host). |
|  | API key for MCP host authorization. |
|  | Path to MCP server Python script. |
|  | Disable SSE transport flag for MCP server. |
|  | Disable streamable HTTP transport flag for MCP server. |
|  | Disable JSON response flag for MCP server. |
Unknown arguments will trigger the usage() function and exit.
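A for/case parser for the options that appear in this document's usage examples might look like the sketch below. It is an illustration under assumptions: parse_args is a hypothetical wrapper function, the MCP options are left out, and an unknown argument returns 1 rather than calling usage().

```bash
#!/usr/bin/env bash
# Sketch of for/case argument parsing; parse_args is a hypothetical name.
parse_args() {
    # Defaults as listed under "Global Variables and Defaults".
    ENABLE_WEBSERVER=1
    ENABLE_TASKEXECUTOR=1
    ENABLE_MCP_SERVER=0
    CONSUMER_NO_BEG=0
    CONSUMER_NO_END=0
    WORKERS=1
    local arg
    for arg in "$@"; do
        case "$arg" in
            --disable-webserver)    ENABLE_WEBSERVER=0 ;;
            --disable-taskexecutor) ENABLE_TASKEXECUTOR=0 ;;
            --enable-mcpserver)     ENABLE_MCP_SERVER=1 ;;
            # "${arg#*=}" strips everything up to the first "=".
            --consumer-no-beg=*)    CONSUMER_NO_BEG="${arg#*=}" ;;
            --consumer-no-end=*)    CONSUMER_NO_END="${arg#*=}" ;;
            --workers=*)            WORKERS="${arg#*=}" ;;
            --host-id=*)            HOST_ID="${arg#*=}" ;;
            *)
                # The script described here calls usage() at this point.
                echo "unknown argument: $arg" >&2
                return 1
                ;;
        esac
    done
}

parse_args --disable-webserver --workers=2 --host-id=myhost123
echo "webserver=${ENABLE_WEBSERVER} workers=${WORKERS} host=${HOST_ID}"
```

Since for arg in "$@" expands the argument list once before the loop starts, the iteration itself does not depend on shifting the positional parameters.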
Configuration File Generation
The script generates a final service_conf.yaml configuration file from a template service_conf.yaml.template located in /ragflow/conf/ by performing environment variable substitution. This allows dynamic configuration based on runtime environment and arguments.
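The substitution step can be sketched as a line-by-line eval over the template. The function name render_template and the DEMO_HOST and DEMO_PORT variables are hypothetical, and the demo writes to a temporary directory instead of /ragflow/conf/.

```bash
#!/usr/bin/env bash
# Sketch of template rendering via eval; render_template is a hypothetical name.
render_template() {
    local template="$1" output="$2" line
    : > "$output"   # start from an empty output file
    # The "|| [ -n "$line" ]" part keeps a last line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # eval re-parses the line inside double quotes, so ${VAR} and
        # ${VAR:-default} references are expanded by the shell.
        eval "echo \"$line\"" >> "$output"
    done < "$template"
}

# Demo with made-up variable names and a throwaway directory.
demo_dir="$(mktemp -d)"
printf '%s\n' 'ragflow:' '  host: ${DEMO_HOST:-0.0.0.0}' '  http_port: ${DEMO_PORT}' \
    > "${demo_dir}/service_conf.yaml.template"
DEMO_PORT=9380
render_template "${demo_dir}/service_conf.yaml.template" "${demo_dir}/service_conf.yaml"
cat "${demo_dir}/service_conf.yaml"
```

One consequence of this design is that the template is treated as shell input: a line containing a double quote, a backtick, or $(...) is interpreted by the shell as well, so templates rendered this way need to come from a trusted source.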
Component Startup Logic
Web Server (nginx + ragflow_server):
If enabled, starts the nginx daemon and continuously runs ragflow_server.py in a loop in the background.
MCP Server:
If enabled via --enable-mcpserver, starts the MCP server Python script with the configured parameters.
Task Executors:
If enabled, runs task executors either:
For a range of consumer IDs if --consumer-no-beg and --consumer-no-end are set (start inclusive, end exclusive), or
For a fixed number of workers otherwise.
Each task executor runs in the background.
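The choice between the two modes can be sketched as a small function that lists the consumer IDs to start. The name consumer_ids is hypothetical, the range test (end greater than begin) follows the flowchart later in this document, and numbering the fixed-count workers from 0 is an assumption.

```bash
#!/usr/bin/env bash
# Sketch: decide which consumer IDs to launch. consumer_ids is a hypothetical name.
consumer_ids() {
    local beg="$1" end="$2" workers="$3"
    local i stop
    if [ "$end" -gt "$beg" ]; then
        # Explicit range: begin is inclusive, end is exclusive.
        i="$beg"
        stop="$end"
    else
        # No usable range: fall back to a fixed number of workers.
        # Assumption: their IDs are numbered from 0.
        i=0
        stop="$workers"
    fi
    while [ "$i" -lt "$stop" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

# Range mode: --consumer-no-beg=0 --consumer-no-end=5
consumer_ids 0 5 1
```

With both range values left at their default of 0, the range test fails and the WORKERS count decides how many executors start.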
Finally, the script waits for all background jobs to finish (which is normally indefinite).
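The background-and-wait pattern can be illustrated with short-lived stand-ins for the real services. The function name run_workers_demo is hypothetical, and each "worker" here exits after a brief sleep so that the example terminates, whereas the real components loop indefinitely.

```bash
#!/usr/bin/env bash
# Sketch of launching background jobs and blocking on them with wait.
# run_workers_demo is a hypothetical name; the workers are short-lived stand-ins.
run_workers_demo() {
    local id
    for id in "$@"; do
        # "&" puts each worker in the background so the loop continues at once.
        ( sleep 0.1; echo "worker ${id} exited" ) &
    done
    # With no arguments, wait blocks until every background job has finished.
    wait
    echo "all background jobs finished"
}

run_workers_demo 0 1 2
```

When the background jobs are infinite restart loops, wait never returns, which keeps the entrypoint process (and therefore the container) alive.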
Important Implementation Details
Uses set -e to terminate immediately on any command failure.
Uses LD_PRELOAD to preload jemalloc for improved memory management in task executor processes.
Uses infinite loops (while true) for resiliency, restarting Python servers if they exit.
Parses command-line arguments using a for loop with case statements, shifting arguments on each recognized option.
Dynamically generates the service config file by reading a template line-by-line and evaluating variables with eval "echo \"$line\"".
Interaction with Other System Components
nginx: The script launches the nginx web server using the system binary /usr/sbin/nginx.
ragflow_server.py: The Python API server providing HTTP endpoints; this script runs it continuously if the web server is enabled.
task_executor.py: Script for executing distributed tasks; launched multiple times with different consumer IDs.
MCP server Python script: Model Context Protocol server, optionally started with configurable networking and security details.
Configuration Files: Reads /ragflow/conf/service_conf.yaml.template and generates /ragflow/conf/service_conf.yaml dynamically, so other components can consume the updated config.
Python interpreter: Uses Python 3 (python3) for all Python scripts.
Usage Examples
Disable task executor workers:
./entrypoint.sh --disable-taskexecutor
Disable the web server and start task executors for consumer IDs 0 through 4:
./entrypoint.sh --disable-webserver --consumer-no-beg=0 --consumer-no-end=5
Start two task executors on a host with ID myhost123, with the web server disabled:
./entrypoint.sh --disable-webserver --workers=2 --host-id=myhost123
Enable MCP server with default settings:
./entrypoint.sh --enable-mcpserver
Mermaid Flowchart Diagram
The following flowchart represents the main functions and workflow relationships in entrypoint.sh:
flowchart TD
A[Start Script] --> B[Parse Command-Line Arguments]
B --> C[Generate service_conf.yaml from template]
C --> D{"ENABLE_WEBSERVER == 1?"}
D -- Yes --> E[Start nginx]
E --> F[Start ragflow_server.py in infinite loop]
D -- No --> G[Skip webserver]
C --> H{"ENABLE_MCP_SERVER == 1?"}
H -- Yes --> I["start_mcp_server()"]
H -- No --> J[Skip MCP server]
C --> K{"ENABLE_TASKEXECUTOR == 1?"}
K -- Yes --> L{"CONSUMER_NO_END > CONSUMER_NO_BEG?"}
L -- Yes --> M["Start task_exe() for consumer IDs in range"]
L -- No --> N["Start fixed number of task_exe() workers"]
K -- No --> O[Skip task executors]
F & G & I & J & M & N & O --> P[wait for background processes]
Summary
entrypoint.sh is a robust, configurable startup script that serves as the main orchestration point for running a Ragflow-based distributed compute environment. It flexibly enables or disables components, dynamically configures the environment, and ensures that critical services and workers are launched and managed reliably. This script is central to bootstrapping the system and managing the lifecycle of web, task executor, and MCP server processes.