build_docker_image.mdx
Overview
The build_docker_image.mdx file is a user guide for building and deploying Docker images for the RAGFlow project. It is intended for developers and testers who need local Docker images for debugging, development, or testing. It covers two build options: one without embedding models (smaller image) and one that includes them (larger image). It also explains how to launch the RAGFlow service on macOS, targeting Apple M1/M2 (ARM64) machines.
The file is a Markdown/MDX document on the RAGFlow documentation website. It uses interactive tabs to separate the two sets of build instructions, so readers can follow only the path they need.
Detailed Content Description
Purpose and Functionality
Purpose: To guide users through building RAGFlow Docker images from source, tailored either for a lightweight build or a build including heavy embedding models.
Functionality: Provides step-by-step commands, prerequisites, and configuration tips to successfully build and run RAGFlow in Docker containers.
Target Audience: Developers adding features or debugging, ARM64 platform users, and testers exploring the latest RAGFlow features.
Sections & Usage
1. Target Audience
Defines who benefits from this document:
Developers modifying source code who need to test changes.
Developers targeting ARM64 platforms.
Testers exploring new features with a Dockerized environment.
2. Prerequisites
Lists minimum system requirements to build and run RAGFlow Docker images:
| Resource | Minimum Requirement |
|---|---|
| CPU | 4 cores or more |
| RAM | 16 GB or more |
| Disk Space | 50 GB or more |
| Docker | Version 24.0.0 or higher |
| Docker Compose | Version 2.26.1 or higher |
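The version minimums in the table can be checked with a small script. The following is a minimal sketch, not part of the original guide: the helper names version_ge and check_min_version are hypothetical, and the comparison assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether a detected version meets a minimum.
# $1 = tool name, $2 = detected version, $3 = minimum version
check_min_version() {
  if version_ge "$2" "$3"; then
    echo "$1 $2: ok"
  else
    echo "$1 $2: below minimum $3"
  fi
}

# Query Docker only when it is installed, so the script also runs on a bare host.
if command -v docker >/dev/null 2>&1; then
  check_min_version "Docker" "$(docker version --format '{{.Client.Version}}')" "24.0.0"
  check_min_version "Docker Compose" "$(docker compose version --short)" "2.26.1"
fi
```

CPU, RAM, and disk checks are left out because the commands for them differ between Linux and macOS.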
3. Build a Docker image
This section uses <Tabs> and <TabItem> components to separate instructions into two options:
a) Build a Docker image without embedding models (Lightweight, ~2GB)
Relies on external LLM and embedding services.
Suitable for faster builds and smaller image size.
Includes an important note for ARM64 builds:
Upgrade xgboost to version 1.6.0 in pyproject.toml.
Ensure unixODBC is installed.
Commands to build:
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
uv run download_deps.py
docker build -f Dockerfile.deps -t infiniflow/ragflow_deps .
docker build --build-arg LIGHTEN=1 -f Dockerfile -t infiniflow/ragflow:nightly-slim .
b) Build a Docker image including embedding models (~9GB)
Includes embedding models bundled inside the image.
Only external LLM services are required.
Same ARM64 notes as above.
Build commands:
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
uv run download_deps.py
docker build -f Dockerfile.deps -t infiniflow/ragflow_deps .
docker build -f Dockerfile -t infiniflow/ragflow:nightly .
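The two options differ only in the final docker build command: the LIGHTEN=1 build argument and the image tag. The sketch below makes that difference explicit. The function name ragflow_build_cmd and the variant labels "slim" and "full" are hypothetical, introduced here for illustration. The function prints the command rather than running it.

```shell
# Print the final build command for a variant.
# $1 = "slim" (no embedding models) or "full" (embedding models included)
ragflow_build_cmd() {
  case "$1" in
    slim)
      echo "docker build --build-arg LIGHTEN=1 -f Dockerfile -t infiniflow/ragflow:nightly-slim ."
      ;;
    full)
      echo "docker build -f Dockerfile -t infiniflow/ragflow:nightly ."
      ;;
    *)
      echo "unknown variant: $1" >&2
      return 1
      ;;
  esac
}

ragflow_build_cmd slim
ragflow_build_cmd full
```

Either printed command is meant to be run from the repository root, after the infiniflow/ragflow_deps image has been built.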
4. Launch a RAGFlow Service from Docker for MacOS
Provides instructions to deploy the built Docker image alongside dependent services such as Elasticsearch, MySQL, MinIO, and Redis using Docker Compose.
Example for Apple M2 Pro:
Edit Docker Compose Configuration
Modify the docker/.env file to point the RAGFLOW_IMAGE variable to the newly built image tag (e.g., infiniflow/ragflow:nightly-slim).
Launch the Service
cd docker
docker compose -f docker-compose-macos.yml up -d
Access the Service
Access via browser at http://127.0.0.1 or the respective server IP on port 80.
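The .env edit in the first step can also be scripted. This is a minimal sketch under stated assumptions: the helper name set_ragflow_image is hypothetical, and the file is assumed to use plain KEY=value lines. It writes through a temporary file instead of using sed -i, because the -i flag takes different arguments on GNU and BSD (macOS) sed.

```shell
# Point RAGFLOW_IMAGE in an env file at a given image tag.
# $1 = path to the env file, $2 = image tag
set_ragflow_image() {
  env_file="$1"
  image="$2"
  tmp_file="$(mktemp)"
  if grep -q '^RAGFLOW_IMAGE=' "$env_file"; then
    # Rewrite the existing assignment; "|" is the delimiter because tags contain "/".
    sed "s|^RAGFLOW_IMAGE=.*|RAGFLOW_IMAGE=${image}|" "$env_file" > "$tmp_file"
  else
    # No assignment yet: keep the file and append one.
    cat "$env_file" > "$tmp_file"
    echo "RAGFLOW_IMAGE=${image}" >> "$tmp_file"
  fi
  # Copy the contents back so the original file keeps its permissions.
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}
```

From the repository root, a call such as set_ragflow_image docker/.env infiniflow/ragflow:nightly-slim would select the lightweight image before running docker compose.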
Important Implementation Details
ARM64 Support:
RAGFlow does not officially maintain ARM64 images but supports building them on ARM64 host machines (linux/arm64 or darwin/arm64). For ARM64 compatibility, users must manually upgrade xgboost to 1.6.0 and install unixODBC.
Image Size Considerations:
The lightweight image (~2GB) leaves embedding models external, ideal for development cycles where smaller size and faster builds are preferable.
The full image (~9GB) includes embedding models, suitable when an all-in-one container is desired.
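Whether the ARM64 notes under ARM64 Support apply to a given host can be checked from the machine string. This is a small sketch; the helper name is_arm64 is hypothetical. macOS reports arm64 for Apple silicon, while Linux typically reports aarch64.

```shell
# Succeed when a machine string names a 64-bit ARM host.
# $1 = machine string, e.g. the output of `uname -m`
is_arm64() {
  case "$1" in
    arm64|aarch64) return 0 ;;
    *) return 1 ;;
  esac
}

if is_arm64 "$(uname -m)"; then
  echo "ARM64 host: upgrade xgboost to 1.6.0 in pyproject.toml and install unixODBC before building."
else
  echo "Not an ARM64 host: the ARM64 notes do not apply."
fi
```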
Docker Layers:
A separate Docker image (infiniflow/ragflow_deps) is built first using Dockerfile.deps, which caches dependencies.
The main image is then built on top with or without the LIGHTEN=1 build argument controlling embedding model inclusion.
Use of uv run download_deps.py:
This command downloads the dependencies needed by the build before the images are built, so they are available to the docker build steps.
Interaction with Other System Components
Docker Compose Services:
The Docker image built here is designed to run alongside multiple backend services:
Elasticsearch for document indexing and search.
MySQL for metadata and persistent storage.
MinIO for object storage.
Redis for caching and message brokering.
External LLM and Embedding Services:
The lightweight Docker image assumes embedding models are hosted externally, so it calls external APIs or services for both LLM and embedding functionality.
Configuration Files:
The documentation refers to environment configuration files (docker/.env) and Docker Compose YAML (docker-compose-macos.yml) that customize deployment parameters.
Usage Examples
Build lightweight Docker image
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
uv run download_deps.py
docker build -f Dockerfile.deps -t infiniflow/ragflow_deps .
docker build --build-arg LIGHTEN=1 -f Dockerfile -t infiniflow/ragflow:nightly-slim .
Build full Docker image with embedding models
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
uv run download_deps.py
docker build -f Dockerfile.deps -t infiniflow/ragflow_deps .
docker build -f Dockerfile -t infiniflow/ragflow:nightly .
Launch RAGFlow service on macOS (Apple M2 Pro)
# Edit docker/.env to update RAGFLOW_IMAGE
cd docker
docker compose -f docker-compose-macos.yml up -d
Open http://127.0.0.1 to access the RAGFlow UI.
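The containers can take some time to start, so it can help to poll the UI before opening it in a browser. The following is a minimal sketch, assuming curl is installed. The function name wait_for_url and its default attempt count and delay are hypothetical choices, not values from the guide.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP error statuses; the response body is discarded.
    if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

After docker compose up -d, a call such as wait_for_url http://127.0.0.1 would return once the UI responds, or fail after the attempts are exhausted.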
Visual Diagram
The flowchart below shows the workflow described in this documentation file, from cloning the repository to accessing the running service.
flowchart TD
A["Clone RAGFlow Repository"] --> B["Download Dependencies<br/>(uv run download_deps.py)"]
B --> C["Build Dependencies Image<br/>(Dockerfile.deps)"]
C --> D{"Build Options"}
D -->|Lightweight Image| E["Build Docker Image<br/>with LIGHTEN=1 (nightly-slim)"]
D -->|Full Image with Embeddings| F["Build Docker Image<br/>without LIGHTEN arg (nightly)"]
E --> G["Edit docker/.env<br/>Set RAGFLOW_IMAGE to the built tag"]
F --> G
G --> H["Launch Docker Compose<br/>(docker-compose-macos.yml)"]
H --> I["Access RAGFlow UI<br/>http://127.0.0.1"]
Summary
The build_docker_image.mdx file documents how to build and deploy RAGFlow Docker images from source. It covers two build configurations (with and without embedding models), the extra steps needed on ARM64 hosts, and the minimum system requirements. Tabs separate the two build paths, so developers and testers can follow only the steps relevant to them.
This file forms the bridge between RAGFlow’s source code and running, testable Docker containers, integrating closely with Docker Compose and system services like Elasticsearch and Redis to provide a full-stack local development or testing setup.