tracing.mdx
Overview
This documentation page explains the tracing functionality that RAGFlow provides through its integration with Langfuse, an observability and tracing platform. It describes how RAGFlow users can enable detailed tracing of retrieval-augmented generation (RAG) pipelines to inspect, debug, and analyze every step, from retrieval through generation, in near real time.
The primary purpose of this file is to guide users through:
Setting up Langfuse credentials for their projects.
Configuring RAGFlow to use those credentials.
Understanding how to execute pipelines and view trace data.
Leveraging Langfuse’s advanced visualizations and filtering features for trace analysis.
This page is written in Markdown with embedded HTML and images, intended to be part of RAGFlow’s documentation site.
Detailed Content Breakdown
1. Introduction to Langfuse Tracing Integration
What it does:
RAGFlow integrates with Langfuse to automatically emit trace data for every retrieval and generation step performed in pipelines.
Why it matters:
Trace data includes spans, prompts, retrieved documents, and LLM responses, enabling developers and operators to debug pipeline behavior and performance bottlenecks effectively.
Requirements:
RAGFlow version ≥ 0.20.5 (must include Langfuse connector)
A Langfuse workspace with Project Public and Secret Keys
Langfuse can be cloud-hosted or self-hosted.
2. Credentials Collection for Langfuse
Purpose: To obtain the authentication credentials required to connect RAGFlow to Langfuse.
Process:
Sign in to Langfuse dashboard.
Navigate to Settings ▸ Projects.
Create/select a project to get the Public Key and Secret Key.
Note the Langfuse host base URL (e.g., https://cloud.langfuse.com).
Key points:
Keys are scoped per project, not per environment.
One key pair is sufficient if multiple environments write to the same Langfuse project.
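Langfuse's public API authenticates with HTTP Basic auth, using the Public Key as the username and the Secret Key as the password. The sketch below shows how the collected key pair could be turned into that header. The helper name basic_auth_header is hypothetical, the keys are placeholders, and no request is sent.

```python
import base64


def basic_auth_header(public_key: str, secret_key: str) -> dict[str, str]:
    """Build an HTTP Basic auth header from a Langfuse key pair (hypothetical helper)."""
    if not public_key or not secret_key:
        raise ValueError("both the public key and the secret key are required")
    # Basic auth is base64("username:password"); here that is "public:secret".
    raw = f"{public_key}:{secret_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder keys, not real credentials.
headers = basic_auth_header("abc123publickey", "def456secretkey")
```

Because the key pair is scoped to a project, the same header would apply to every environment that writes to that project.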
3. Adding Langfuse Credentials to RAGFlow
Purpose: To configure RAGFlow with Langfuse credentials so it can send trace data.
How to configure:
Log in to RAGFlow UI.
Access API ▸ Langfuse Configuration section.
Enter Host, Public Key, and Secret Key.
Save the configuration.
Result:
RAGFlow begins emitting tracing data automatically without requiring code changes.
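Before saving, it can help to sanity-check the three values the form asks for. The following is a minimal sketch of such a check, not RAGFlow's own validation logic. The LangfuseSettings class and validate_settings function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class LangfuseSettings:
    """The three values the configuration form asks for (hypothetical container)."""
    host: str
    public_key: str
    secret_key: str


def validate_settings(settings: LangfuseSettings) -> list[str]:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    parsed = urlparse(settings.host)
    # The host must be a full base URL, including the scheme.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("host must be an http(s) URL such as https://cloud.langfuse.com")
    if not settings.public_key.strip():
        problems.append("public key is empty")
    if not settings.secret_key.strip():
        problems.append("secret key is empty")
    return problems


issues = validate_settings(
    LangfuseSettings("https://cloud.langfuse.com", "abc123publickey", "def456secretkey")
)
```

A check like this only catches malformed input. Whether the keys are actually accepted can only be confirmed against the Langfuse host itself.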
4. Running Pipelines and Viewing Traces
Purpose: To demonstrate how to observe trace data generated by RAGFlow pipelines.
Steps:
Run any chat or retrieval pipeline in RAGFlow (e.g., Quickstart demo).
Open the Langfuse project and navigate to Traces.
Filter traces by name prefix: ragflow-*.
What you see:
A trace representing the entire user request.
Multiple spans corresponding to retrieval, ranking, and generation steps.
Metadata including prompt payloads, retrieved documents, and generated LLM responses.
Tip:
Use Langfuse’s diff views and drill-down capabilities to compare prompt versions and identify bottlenecks.
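The name-prefix filter from the steps above can also be applied to trace records that have been exported or fetched elsewhere. The sketch below assumes each trace is a dictionary with a "name" field. The sample names such as ragflow-chat are invented for illustration and may not match the names RAGFlow actually emits.

```python
from fnmatch import fnmatchcase


def filter_traces(traces: list[dict], pattern: str = "ragflow-*") -> list[dict]:
    """Keep traces whose "name" matches a shell-style pattern (case-sensitive)."""
    return [t for t in traces if fnmatchcase(t.get("name", ""), pattern)]


# Illustrative records; real traces carry many more fields.
sample = [
    {"name": "ragflow-chat", "id": "t1"},
    {"name": "other-app", "id": "t2"},
    {"name": "ragflow-retrieval", "id": "t3"},
]
matching_ids = [t["id"] for t in filter_traces(sample)]
```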
Implementation Details and Algorithms
This documentation file does not contain executable code or implement algorithms directly. Instead, it provides user guidance on configuring and using Langfuse tracing with RAGFlow. The underlying Langfuse integration leverages OpenTelemetry or a similar tracing standard to emit spans and traces from the RAGFlow backend.
The tracing data model includes:
Traces: High-level request units (e.g., a user query).
Spans: Sub-operations such as retrieval, ranking, or generation.
Metadata: Contextual data like prompts and documents attached at span level.
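The three concepts above can be pictured as a small set of types. This is an illustrative sketch only, not Langfuse's or RAGFlow's actual schema. The field names (start_ms, end_ms, metadata) and the slowest_span helper are assumptions chosen to show how span-level timing can point to a bottleneck.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Span:
    """One sub-operation of a request, with metadata attached at span level."""
    name: str
    start_ms: float
    end_ms: float
    metadata: dict[str, Any] = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    """One high-level request, made up of spans."""
    name: str
    spans: list[Span] = field(default_factory=list)

    def slowest_span(self) -> Optional[Span]:
        # Returns None for a trace that has no spans yet.
        return max(self.spans, key=lambda s: s.duration_ms, default=None)


# Made-up timings for a single user query.
trace = Trace(
    name="ragflow-chat",
    spans=[
        Span("retrieval", 0.0, 120.0, {"documents": ["doc-1", "doc-2"]}),
        Span("ranking", 120.0, 150.0),
        Span("generation", 150.0, 900.0, {"prompt": "..."}),
    ],
)
bottleneck = trace.slowest_span()
```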
The workflow is:
RAGFlow executes a pipeline →
Tracing SDK captures relevant operations as spans →
Spans and trace data sent to Langfuse backend →
Langfuse UI visualizes and allows filtering/analysis.
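The capture-then-send flow above can be sketched with a small in-memory tracer. This is a stand-in written for illustration, not the Langfuse SDK and not RAGFlow's implementation. The InMemoryTracer class, its method names, and the stubbed pipeline steps are all hypothetical, and export only serializes the data where a real SDK would transmit it.

```python
import json
import time
from contextlib import contextmanager


class InMemoryTracer:
    """Collects spans for one request; a stand-in for a real tracing SDK."""

    def __init__(self, trace_name: str):
        self.trace_name = trace_name
        self.spans = []

    @contextmanager
    def span(self, name: str, **metadata):
        start = time.perf_counter()
        try:
            yield metadata  # the step may attach more metadata while it runs
        finally:
            # Record the span even if the step raised an exception.
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.spans.append(
                {"name": name, "duration_ms": elapsed_ms, "metadata": metadata}
            )

    def export(self) -> str:
        """Serialize the trace; a real SDK would send this to the backend."""
        return json.dumps({"name": self.trace_name, "spans": self.spans})


tracer = InMemoryTracer("ragflow-chat")
with tracer.span("retrieval", query="what is RAG?") as meta:
    meta["documents"] = ["doc-1"]  # stubbed retrieval result
with tracer.span("generation"):
    pass  # stubbed LLM call
payload = json.loads(tracer.export())
```

Wrapping each step in a context manager keeps timing and metadata capture out of the pipeline logic, which is in line with the page's statement that tracing requires no changes to user code.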
Interaction with Other System Components
RAGFlow Application:
The tracing integration is embedded in the RAGFlow backend components responsible for pipeline execution. It automatically emits trace data without additional user code.
Langfuse Backend:
Receives, stores, and indexes trace data for observability. Provides a web UI for trace exploration.
RAGFlow Web UI:
Provides the interface for users to input Langfuse project credentials, enabling trace data emission.
RAG Pipelines:
The retrieval and generation pipelines produce the data that is traced.
Usage Examples
Configuring Langfuse Credentials in RAGFlow UI
Navigate to API section in RAGFlow web UI.
Scroll to Langfuse Configuration.
Enter:
Host: https://cloud.langfuse.com
Public Key: abc123publickey
Secret Key: def456secretkey
Click Save.
Viewing Traces
Run a retrieval pipeline: POST /pipelines/run with a query.
Open Langfuse UI → Traces.
Filter: name ~ "ragflow-*"
Inspect trace spans for retrieval and generation details.
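The pipeline request in the first step could be prepared as follows. Only the POST /pipelines/run path comes from this page. The base URL ragflow.example, the JSON body shape with a "query" field, and the build_run_request helper are assumptions, so check them against the RAGFlow API reference before use. The request is built but not sent.

```python
import json
import urllib.request


def build_run_request(base_url: str, query: str) -> urllib.request.Request:
    """Prepare a POST to /pipelines/run (body shape is an assumption)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/pipelines/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host; replace with the address of a real RAGFlow deployment.
request = build_run_request(
    "http://ragflow.example/", "What is retrieval-augmented generation?"
)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```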
Visual Diagram
The file primarily describes a workflow involving user actions and data flow between RAGFlow and Langfuse. Below is a flowchart showing the key functions and interactions described in this documentation.
flowchart TD
A[User] -->|1. Get Langfuse keys| B[Langfuse Dashboard]
B -->|Project Public & Secret Keys| A
A -->|2. Configure keys| C[RAGFlow Web UI]
C -->|Save Keys| D[RAGFlow Backend]
D -->|Emit Trace Data| E[Langfuse Backend]
A -->|3. Run Pipeline| D
A -->|4. View Traces| E
E -->|Visualize & Analyze| A
Summary
This documentation file, tracing.mdx, serves as a comprehensive user guide for enabling and using Langfuse tracing within the RAGFlow platform. It covers credential setup, configuration, usage, and benefits of tracing for observability of retrieval-augmented generation pipelines. The integrated workflow empowers users to monitor, debug, and optimize their AI pipelines with minimal setup effort.