ragflow.py


Overview

ragflow.py defines the RAGFlow class, a Python client SDK for the InfiniFlow backend API. It provides an abstraction layer over the HTTP endpoints for datasets, chats, agents, and document retrieval.

The class wraps the HTTP verbs it needs (POST, GET, PUT, DELETE) and exposes high-level methods for managing datasets, chats, retrieval, and agents.

The file uses several domain entities imported from sibling modules (Agent, Chat, Chunk, DataSet) to represent and manipulate the data returned by the backend.


Classes and Methods

Class: RAGFlow

Purpose

Acts as an API client to communicate with the InfiniFlow backend, managing knowledge datasets, chat sessions, retrieval operations, and intelligent agents.


Initialization

def __init__(self, api_key, base_url, version="v1")
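
The class diagram later in this document lists user_key, api_url, and authorization_header as attributes, so the constructor presumably derives them from its arguments. The following is a minimal sketch of that idea, not the SDK's code. The URL layout (base_url + "/api/" + version) and the Bearer scheme are assumptions, and RAGFlowInitSketch is a hypothetical name.

```python
class RAGFlowInitSketch:
    """Illustrative stand-in for RAGFlow.__init__ (hypothetical, not the SDK's code)."""

    def __init__(self, api_key: str, base_url: str, version: str = "v1"):
        self.user_key = api_key
        # Assumed URL layout: <base_url>/api/<version>
        self.api_url = f"{base_url.rstrip('/')}/api/{version}"
        # Assumed auth scheme: HTTP Bearer token
        self.authorization_header = {"Authorization": f"Bearer {self.user_key}"}


client = RAGFlowInitSketch(api_key="your_api_key", base_url="http://localhost:8000")
print(client.api_url)
```

Deriving the URL and header once in the constructor keeps every request method free of string formatting.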

HTTP Request Methods

These methods wrap the requests library calls with appropriate headers and URL formatting.

def post(self, path, json=None, stream=False, files=None)
def get(self, path, params=None, json=None)
def delete(self, path, json)
def put(self, path, json)
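
To illustrate what "appropriate headers and URL formatting" might involve, the sketch below prepares a request without sending it. It uses the standard library's urllib.request only to stay dependency-free, whereas the real methods call the requests library. The build_request helper, the "/datasets" path, and the header values are hypothetical.

```python
import json as jsonlib
import urllib.request


def build_request(api_url, auth_header, method, path, json=None):
    """Prepare (but do not send) a request the way a thin wrapper might."""
    body = jsonlib.dumps(json).encode("utf-8") if json is not None else None
    headers = dict(auth_header)  # copy so the caller's dict is not mutated
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=api_url + path, data=body, headers=headers, method=method
    )


req = build_request(
    "http://localhost:8000/api/v1",           # assumed API root
    {"Authorization": "Bearer your_api_key"},  # assumed auth header
    "DELETE",
    "/datasets",                                # hypothetical endpoint path
    json={"ids": ["abc"]},
)
print(req.get_method(), req.full_url)
```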

Dataset Management

create_dataset

def create_dataset(self, name: str, avatar: Optional[str] = None, description: Optional[str] = None, embedding_model: Optional[str] = None, permission: str = "me", chunk_method: str = "naive", parser_config: Optional[DataSet.ParserConfig] = None) -> DataSet

delete_datasets

def delete_datasets(self, ids: list[str] | None = None)

get_dataset

def get_dataset(self, name: str) -> DataSet
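
A lookup by name can be expressed in terms of list_datasets, which accepts a name filter. The sketch below shows one plausible shape: filter by name, return the first match, raise otherwise. Whether the SDK does exactly this is an assumption, and DataSetStub, FakeClient, and the LookupError choice are hypothetical, included only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass
class DataSetStub:
    """Hypothetical stand-in for the DataSet entity."""
    id: str
    name: str


class FakeClient:
    """Hypothetical in-memory client used only to make the sketch runnable."""

    def __init__(self, datasets):
        self._datasets = datasets

    def list_datasets(self, name=None):
        # Return every dataset, or only those whose name matches exactly.
        return [d for d in self._datasets if name is None or d.name == name]

    def get_dataset(self, name):
        # Assumed behavior: first match wins; no match is an error.
        matches = self.list_datasets(name=name)
        if matches:
            return matches[0]
        raise LookupError(f"Dataset {name!r} not found")


fake = FakeClient([DataSetStub("1", "kb"), DataSetStub("2", "faq")])
print(fake.get_dataset("faq"))
```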

list_datasets

def list_datasets(self, page: int = 1, page_size: int = 30, orderby: str = "create_time", desc: bool = True, id: str | None = None, name: str | None = None) -> list[DataSet]
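
Because list_datasets returns one page at a time (page and page_size), callers that need every dataset have to loop. A minimal sketch of such a loop follows. The iter_all helper is hypothetical, and it assumes pages are 1-indexed and that only the last page is shorter than page_size.

```python
def iter_all(list_page, page_size=30):
    """Yield every item from a paged list method such as list_datasets."""
    page = 1
    while True:
        batch = list_page(page=page, page_size=page_size)
        yield from batch
        # Assumption: a short (or empty) page means there is nothing left.
        if len(batch) < page_size:
            break
        page += 1


# Stand-in for a bound method like ragflow.list_datasets, backed by a list.
items = list(range(65))


def fake_list(page, page_size):
    start = (page - 1) * page_size
    return items[start:start + page_size]


all_items = list(iter_all(fake_list))
print(len(all_items))
```

The same helper would work for list_chats and list_agents, since they share the page and page_size parameters.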

Chat Management

create_chat

def create_chat(self, name: str, avatar: str = "", dataset_ids=None, llm: Chat.LLM | None = None, prompt: Chat.Prompt | None = None) -> Chat

delete_chats

def delete_chats(self, ids: list[str] | None = None)

list_chats

def list_chats(self, page: int = 1, page_size: int = 30, orderby: str = "create_time", desc: bool = True, id: str | None = None, name: str | None = None) -> list[Chat]

Retrieval

retrieve

def retrieve(self, dataset_ids, document_ids=None, question="", page=1, page_size=30, similarity_threshold=0.2, vector_similarity_weight=0.3, top_k=1024, rerank_id: str | None = None, keyword: bool = False, cross_languages: list[str]|None = None, metadata_condition: dict | None = None)
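
The parameters above map naturally onto a JSON request body. The sketch below builds such a body with the documented defaults. The build_retrieval_payload helper is hypothetical, and the assumption that the wire keys match the parameter names, and that a missing document_ids is sent as an empty list, is not confirmed by this document.

```python
def build_retrieval_payload(
    dataset_ids,
    document_ids=None,
    question="",
    page=1,
    page_size=30,
    similarity_threshold=0.2,
    vector_similarity_weight=0.3,
    top_k=1024,
    rerank_id=None,
    keyword=False,
    cross_languages=None,
    metadata_condition=None,
):
    """Assemble a request body mirroring retrieve()'s parameters (assumed keys)."""
    return {
        "dataset_ids": list(dataset_ids),
        # Assumption: "no document filter" is sent as an empty list.
        "document_ids": list(document_ids) if document_ids is not None else [],
        "question": question,
        "page": page,
        "page_size": page_size,
        "similarity_threshold": similarity_threshold,
        "vector_similarity_weight": vector_similarity_weight,
        "top_k": top_k,
        "rerank_id": rerank_id,
        "keyword": keyword,
        "cross_languages": cross_languages,
        "metadata_condition": metadata_condition,
    }


payload = build_retrieval_payload(["ds1"], question="What is the refund policy?")
print(payload["top_k"], payload["similarity_threshold"])
```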

Agent Management

list_agents

def list_agents(self, page: int = 1, page_size: int = 30, orderby: str = "update_time", desc: bool = True, id: str | None = None, title: str | None = None) -> list[Agent]

create_agent

def create_agent(self, title: str, dsl: dict, description: str | None = None) -> None

update_agent

def update_agent(self, agent_id: str, title: str | None = None, description: str | None = None, dsl: dict | None = None) -> None

delete_agent

def delete_agent(self, agent_id: str) -> None
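
Since update_agent takes title, description, and dsl as optional arguments, a partial update is the likely intent: only the fields the caller supplies are sent. A minimal sketch of that filtering follows. The build_agent_update helper is hypothetical, and whether the SDK omits None fields in this way is an assumption.

```python
def build_agent_update(title=None, description=None, dsl=None):
    """Keep only the fields the caller actually supplied (assumed behavior)."""
    fields = {"title": title, "description": description, "dsl": dsl}
    return {key: value for key, value in fields.items() if value is not None}


update = build_agent_update(title="SupportBot v2")
print(update)
```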

Implementation Details and Algorithms
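
The summary notes that the class encapsulates error handling. One common pattern for JSON APIs is to check a status code in the response body and raise on failure. The sketch below assumes a response envelope with "code", "data", and "message" keys where 0 means success. That envelope, the unwrap helper, and the APIError type are assumptions for illustration, not details taken from this document.

```python
class APIError(Exception):
    """Hypothetical exception type for failed API calls."""


def unwrap(response: dict):
    """Return the payload of a successful response, else raise APIError."""
    # Assumed envelope: {"code": 0, "data": ...} on success,
    # {"code": <non-zero>, "message": "..."} on failure.
    if response.get("code") == 0:
        return response.get("data")
    raise APIError(response.get("message", "unknown error"))


data = unwrap({"code": 0, "data": [{"id": "1", "name": "kb"}]})
print(data)
```

Centralizing this check in one helper would let each high-level method focus on converting the returned data into DataSet, Chat, Chunk, or Agent objects.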


Interaction with Other Modules


Usage Summary

from ragflow import RAGFlow

# Initialize client
ragflow = RAGFlow(api_key="your_api_key", base_url="http://localhost:8000")

# Create dataset
dataset = ragflow.create_dataset(name="My Knowledge Base")

# Create chat linked to dataset
chat = ragflow.create_chat(name="Support Chat", dataset_ids=[dataset.id])

# Retrieve relevant chunks from dataset
chunks = ragflow.retrieve(dataset_ids=[dataset.id], question="What is the refund policy?")

# List agents
agents = ragflow.list_agents()

# Create an agent
ragflow.create_agent(title="SupportBot", dsl={"type": "faq_bot", "config": {}})

Visual Diagram

classDiagram
    class RAGFlow {
        - user_key: str
        - api_url: str
        - authorization_header: dict
        + __init__(api_key, base_url, version="v1")
        + post(path, json=None, stream=False, files=None)
        + get(path, params=None, json=None)
        + delete(path, json)
        + put(path, json)
        + create_dataset(name, avatar=None, description=None, embedding_model=None, permission="me", chunk_method="naive", parser_config=None) DataSet
        + delete_datasets(ids)
        + get_dataset(name) DataSet
        + list_datasets(page=1, page_size=30, orderby="create_time", desc=True, id=None, name=None) list~DataSet~
        + create_chat(name, avatar="", dataset_ids=None, llm=None, prompt=None) Chat
        + delete_chats(ids)
        + list_chats(page=1, page_size=30, orderby="create_time", desc=True, id=None, name=None) list~Chat~
        + retrieve(dataset_ids, document_ids=None, question="", page=1, page_size=30, similarity_threshold=0.2, vector_similarity_weight=0.3, top_k=1024, rerank_id=None, keyword=False, cross_languages=None, metadata_condition=None) list~Chunk~
        + list_agents(page=1, page_size=30, orderby="update_time", desc=True, id=None, title=None) list~Agent~
        + create_agent(title, dsl, description=None)
        + update_agent(agent_id, title=None, description=None, dsl=None)
        + delete_agent(agent_id)
    }

    RAGFlow --> DataSet : creates/manages
    RAGFlow --> Chat : creates/manages
    RAGFlow --> Chunk : retrieves
    RAGFlow --> Agent : creates/manages

Summary

ragflow.py implements a client interface for the InfiniFlow backend API covering knowledge dataset management, chat session handling, retrieval of relevant document chunks, and agent lifecycle management. It encapsulates HTTP communication, error handling, and domain entity manipulation, so applications can use InfiniFlow's retrieval-augmented generation capabilities without issuing raw HTTP requests.