retrieval.mdx

Overview

The Retrieval component is a critical module within the RAGFlow system designed to extract relevant information from one or more configured knowledge bases (datasets). It plays a central role in Retrieval-Augmented Generation (RAG) workflows by querying datasets to gather contextually relevant data that is then passed to large language models (LLMs) for generating responses or content.

This component can function in two distinct modes:

Standalone module: As an independent workflow component that takes user queries and returns retrieved data.
Tool for an Agent component: Invoked autonomously by an Agent to perform retrieval as part of a broader decision-making or content generation workflow.

Detailed Explanation of Key Concepts and Configuration

Purpose and Functionality

Extracts relevant chunks of information from selected knowledge bases based on user queries.
Utilizes a hybrid similarity search method combining keyword similarity and vector similarity, optionally enhanced with a rerank model.
Supports cross-language retrieval by translating queries when necessary.
Can leverage knowledge graphs for advanced multi-hop question answering.

Core Configuration Parameters

Configuration	Description	Default / Notes
Query variables	Specifies the input query variable(s) used to search knowledge bases.	Defaults to `sys.query`.
Knowledge bases	One or more datasets selected as retrieval targets. If multiple are selected, they must all use the same embedding model.	Required, except when Empty response is left blank.
Similarity threshold	Minimum combined similarity score (keyword + vector) for chunks to be considered relevant.	0.2
Keyword similarity weight	Weight assigned to the keyword similarity component in the combined similarity score. The keyword and vector similarity weights sum to 1.	0.7 (implying a vector similarity weight of 0.3)
Top N	Number of top chunks to return to the LLM after retrieval.	8
Rerank model	Optional model to rerank retrieved chunks, replacing vector similarity with reranking scores. Improves quality but increases latency.	Disabled by default.
Empty response	Response to return if no relevant chunks are found. If blank, the LLM will improvise. Must be blank if no knowledge bases are selected.	Blank by default.
Cross-language search	Target languages for translating queries to ensure semantic matching across languages.	None by default.
Use knowledge graph	Enables multi-hop retrieval using pre-constructed knowledge graphs for complex queries. Significantly increases retrieval time.	Disabled by default.
Output	Global variable name for storing retrieval results, accessible by subsequent components.	User-defined.
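
As a rough illustration of how the defaults and constraints in the table fit together, the sketch below models them as a Python dataclass. The class name `RetrievalConfig`, its field names, and the placeholder output name are hypothetical and are not part of RAGFlow's API; only the default values and the two constraints (weights summing to 1, and Empty response being blank when no knowledge base is selected) come from the table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RetrievalConfig:
    """Hypothetical container mirroring the configuration table (not RAGFlow's API)."""

    query_variables: list = field(default_factory=lambda: ["sys.query"])
    knowledge_bases: list = field(default_factory=list)
    similarity_threshold: float = 0.2
    keyword_similarity_weight: float = 0.7
    top_n: int = 8
    rerank_model: Optional[str] = None          # disabled by default
    empty_response: str = ""                    # blank by default
    cross_languages: list = field(default_factory=list)
    use_knowledge_graph: bool = False
    output: str = "retrieved_chunks"            # user-defined; placeholder name

    @property
    def vector_similarity_weight(self) -> float:
        # The two weights sum to 1, so this one is derived rather than stored.
        return 1.0 - self.keyword_similarity_weight

    def validate(self) -> None:
        if not 0.0 <= self.keyword_similarity_weight <= 1.0:
            raise ValueError("keyword_similarity_weight must be in [0, 1]")
        if self.top_n < 1:
            raise ValueError("top_n must be at least 1")
        # Empty response must be blank if no knowledge bases are selected.
        if not self.knowledge_bases and self.empty_response:
            raise ValueError("empty_response must be blank when no knowledge base is selected")


cfg = RetrievalConfig(knowledge_bases=["kb_finance"])
cfg.validate()
```

Deriving the vector weight from the keyword weight, rather than storing both, makes it impossible for the two to drift out of sync.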

Usage and Workflow

Typical Workflow

Input Query: The component receives user input query variables (sys.query by default).
Select Knowledge Bases: The system queries one or more datasets.
Compute Similarity Scores: A weighted combination of keyword similarity and vector similarity scores is calculated for each chunk.
Optional Reranking: If a rerank model is configured, chunks are reranked based on the reranking score combined with keyword similarity.
Filter by Threshold: Chunks below the similarity threshold are excluded.
Select Top N: The top N chunks are selected and returned.
Optional Cross-language Query Translation: If cross-language search is enabled, the query is translated into the configured target languages before similarity scores are computed.
Optional Knowledge Graph Retrieval: Multi-hop retrieval is performed using knowledge graphs if enabled.
Output: Retrieved chunks are output to a global variable for downstream processing.
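
The threshold filter and Top N steps in the workflow above can be sketched in a few lines of Python. This is a minimal illustration, not RAGFlow's implementation: the function name `select_chunks` and the `(chunk_id, score)` tuple layout are assumptions, and the scores are made-up values standing in for already-computed combined scores.

```python
def select_chunks(scored, threshold=0.2, top_n=8):
    """Keep chunks at or above the threshold, then return the top_n best.

    `scored` is an iterable of (chunk_id, combined_score) pairs
    (a hypothetical layout chosen for this sketch).
    """
    kept = [(chunk_id, score) for chunk_id, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return kept[:top_n]


scored = [("c1", 0.41), ("c2", 0.15), ("c3", 0.62), ("c4", 0.20)]
print(select_chunks(scored, threshold=0.2, top_n=2))
# [('c3', 0.62), ('c1', 0.41)]
```

Here `c2` is dropped by the threshold, and `c4` passes the threshold but falls outside the top 2.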

Important Implementation Details

Similarity Scoring Algorithm: The retrieval process uses a linear combination of keyword similarity and vector cosine similarity:
\[
\text{Combined Score} = w_k \times \text{Keyword Similarity} + (1 - w_k) \times \text{Vector Similarity}
\]
where \( w_k \) is the keyword similarity weight.
Rerank Model Integration: When configured, the vector similarity component is replaced with the weighted reranking score in the combined score calculation.
Cross-language Search: Implements query translation models to translate user queries into the languages of the target knowledge bases, improving semantic retrieval across multilingual datasets.
Knowledge Graph Usage: Supports iterative multi-hop retrieval by traversing entities, relationships, and community reports in knowledge graphs constructed from datasets. This greatly enhances complex query handling at the cost of increased retrieval latency.
Performance Considerations:
- Rerank models and knowledge graph usage significantly increase response time.
- For rerankers, SaaS-based models are recommended for faster performance.
- Locally deployed rerankers require GPU-enabled environments (e.g., docker-compose-gpu.yml).
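
The scoring rule and the rerank substitution described above can be expressed as a small Python function. This is a sketch under stated assumptions: the function name `combined_score` and its parameters are hypothetical, and the example scores are invented. It follows the description literally, so when a rerank score is supplied it takes the place of the vector similarity term and keeps the same \( 1 - w_k \) weight.

```python
from typing import Optional


def combined_score(keyword_sim: float,
                   vector_sim: float,
                   keyword_weight: float = 0.7,
                   rerank_score: Optional[float] = None) -> float:
    """Linear combination of keyword similarity and a second component.

    The second component is the vector similarity, or the rerank score
    when one is provided (hypothetical helper, not RAGFlow's API).
    """
    if not 0.0 <= keyword_weight <= 1.0:
        raise ValueError("keyword_weight must be in [0, 1]")
    second = vector_sim if rerank_score is None else rerank_score
    return keyword_weight * keyword_sim + (1.0 - keyword_weight) * second


# Without a reranker: 0.7 * 0.5 + 0.3 * 0.8
print(round(combined_score(0.5, 0.8), 2))                    # 0.59
# With a reranker scoring the chunk 0.2: 0.7 * 0.5 + 0.3 * 0.2
print(round(combined_score(0.5, 0.8, rerank_score=0.2), 2))  # 0.41
```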

Interaction with Other System Components

Begin Component: Provides the default sys.query variable as the user input query source.
Agent Component: Can invoke the Retrieval component as a tool, controlling query timing and handling returned data autonomously.
Knowledge Base Configuration: Retrieval depends upon properly configured knowledge bases with embeddings and optionally knowledge graphs.
Downstream Components: Outputs retrieval results to a global variable that other components, such as LLM generators or reasoning modules, consume for further processing.
User Interface: The Retrieval component's configuration panel allows customization of query variables, dataset selection, retrieval parameters, and advanced settings.

Examples

Example 1: Standalone Retrieval Module

components:
  - name: Begin
    type: Begin
  - name: Retrieval
    type: Retrieval
    input_variables: ["sys.query"]
    knowledge_bases: ["kb_finance"]
    similarity_threshold: 0.25
    keyword_similarity_weight: 0.6
    top_n: 5
    output: "retrieved_chunks"

In this example, the Retrieval component queries the kb_finance knowledge base using the user query from sys.query, applying a similarity threshold of 0.25 and combining keyword similarity at 0.6 weight, returning the top 5 chunks to retrieved_chunks.
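
To make these settings concrete, consider two chunks with invented similarity scores (the numbers are purely illustrative). With a keyword similarity weight of \( w_k = 0.6 \), the vector similarity weight is 0.4, and the threshold is 0.25:

```latex
\[
\begin{aligned}
\text{Chunk A: } & 0.6 \times 0.5 + 0.4 \times 0.1 = 0.30 + 0.04 = 0.34 \ge 0.25 \quad \text{(kept)} \\
\text{Chunk B: } & 0.6 \times 0.2 + 0.4 \times 0.3 = 0.12 + 0.12 = 0.24 < 0.25 \quad \text{(excluded)}
\end{aligned}
\]
```

Chunk A is kept even though its vector similarity is low, because the keyword component carries more weight in this configuration.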

Example 2: Retrieval as a Tool for Agent

components:
  - name: Agent
    type: Agent
    user_prompt: "sys.query"
    tools:
      - Retrieval

Here, the Agent component autonomously invokes the Retrieval component with the user prompt as a query to gather information before generating a response.

Visual Diagram

flowchart TD
    A[Start: Receive Query Variables] --> B[Select Knowledge Bases to Query]
    B --> C["Translate Query (if Cross-language Search enabled)"]
    C --> D[Compute Keyword Similarity]
    D --> E[Compute Vector Cosine Similarity]
    E --> F{Rerank Model Configured?}
    F -- Yes --> G[Compute Reranking Score]
    G --> H["Calculate Combined Score (Keyword + Rerank)"]
    F -- No --> I["Calculate Combined Score (Keyword + Vector)"]
    H --> J[Filter Chunks by Similarity Threshold]
    I --> J
    J --> K[Select Top N Chunks]
    K --> L{Use Knowledge Graph?}
    L -- Yes --> M[Perform Multi-hop Retrieval via Knowledge Graph]
    L -- No --> N[Output Retrieved Chunks to Global Variable]
    M --> N

Frequently Asked Questions

How to reduce response time?

Disable the rerank model if one is enabled.
If reranking is required, prefer a SaaS reranker for lower latency.
Disable the Use knowledge graph option to avoid multi-hop retrieval overhead.
Ensure knowledge bases use consistent embedding models to avoid errors.

References

Configure Knowledge Base
Construct Knowledge Graph
Agent template example: Report Agent Using Knowledge Base

Summary

The Retrieval component is a foundational element in RAGFlow workflows, enabling precise and flexible querying of knowledge bases to provide relevant context for LLM-driven tasks. Its configurability for similarity scoring, reranking, cross-language search, and knowledge graph usage makes it adaptable to various use cases, from simple retrieval to complex multi-hop question answering. Proper configuration and understanding of its parameters are essential for balancing retrieval quality and system performance.