conftest.py
Overview
The conftest.py file provides pytest fixtures and utility functions designed to support automated testing within the InfiniFlow project. It primarily facilitates the setup, teardown, and management of test resources related to knowledge bases (KBs), documents, and document chunks in the RAGFlow system via Web API authentication.
Key functionalities include:
Creating a variety of test files in different formats (e.g., DOCX, PDF, JSON) for upload and parsing tests.
Managing the lifecycle of datasets, documents, dialogs, and chunks by creating them and cleaning them up automatically during tests.
Providing authenticated Web API access for test cases.
Providing waiting and polling mechanisms to verify that asynchronous document parsing has completed.
This file is essential for integration and functional tests that require interaction with the backend APIs of the InfiniFlow system, ensuring test isolation and repeatability.
Contents Detail
Imported Modules
time.sleep: Used to pause execution, mainly to avoid race conditions.
pytest: Testing framework used for fixtures and test parameterization.
common: Module containing API wrapper functions for dataset, document, chunk, and dialog manipulation.
libs.auth.RAGFlowWebApiAuth: Handles authenticated Web API access.
utils.wait_for: A decorator that implements polling/waiting logic.
utils.file_utils: Helpers to create various file types used in tests.
Functions and Fixtures
condition(_auth, _kb_id)
Purpose:
Polls the backend to check whether all documents within a knowledge base (kb_id) have completed parsing (indicated by doc["run"] == "3").
Parameters:
_auth: Authenticated API client instance.
_kb_id: Knowledge base identifier string.
Returns:
True if all documents have run status "3" (parsing complete).
False otherwise.
Usage:
Used as a wait condition with the @wait_for decorator to repeatedly check parsing status for up to 30 seconds, polling every 1 second.
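The polling behavior described above can be sketched in a self-contained form. The wait_for decorator below is a hypothetical stand-in for utils.wait_for (its real signature is not shown in this document), list_documents is a stub for the corresponding wrapper in common, and the response shape is assumed purely for illustration.

```python
import time
from functools import wraps


def wait_for(timeout=30, interval=1, error_msg="Timeout"):
    """Hypothetical stand-in for utils.wait_for: retry until truthy or timed out."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            deadline = time.monotonic() + timeout
            while True:
                if func(*args, **kwargs):
                    return True
                if time.monotonic() >= deadline:
                    raise TimeoutError(error_msg)
                time.sleep(interval)
        return wrapper
    return decorator


def list_documents(auth, params):
    """Stub for the API wrapper in `common`; returns a canned, assumed response."""
    return {"data": {"docs": [{"id": "d1", "run": "3"}, {"id": "d2", "run": "3"}]}}


@wait_for(timeout=30, interval=1, error_msg="Document parsing timeout")
def condition(_auth, _kb_id):
    # Parsing is complete only when every document reports run == "3".
    res = list_documents(_auth, {"kb_id": _kb_id})
    return all(doc["run"] == "3" for doc in res["data"]["docs"])
```

Because the stub reports every document as parsed, the first poll succeeds and no sleep occurs. Against a real backend the wrapper would keep polling once per interval until the deadline passes.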
generate_test_files(request: FixtureRequest, tmp_path)
Purpose:
A parameterized pytest fixture that generates test files of various formats in a temporary directory for testing document uploads and parsing.
Parameters:
request: Pytest's test request object, used to access parameter values.
tmp_path: Temporary filesystem path provided by pytest.
Returns:
A dict mapping file types (e.g., 'pdf', 'docx') to corresponding generated file paths.
Usage Example:
@pytest.mark.parametrize("generate_test_files", ["pdf"], indirect=True)
def test_pdf_upload(generate_test_files):
    pdf_file_path = generate_test_files["pdf"]
    # Use pdf_file_path for upload test
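A minimal sketch of how such a fixture body might map file types to creator functions. The creators, the registry, and the file names below are hypothetical stand-ins for the helpers in utils.file_utils, and treating None as "create every type" is an assumption rather than documented behavior.

```python
import json
from pathlib import Path


def create_txt_file(path):
    # Hypothetical creator: writes a small plain-text sample.
    path.write_text("sample text for upload tests\n", encoding="utf-8")
    return path


def create_json_file(path):
    # Hypothetical creator: writes a small JSON sample.
    path.write_text(json.dumps({"title": "sample"}), encoding="utf-8")
    return path


# Hypothetical registry: file type -> (creator function, file name).
FILE_CREATORS = {
    "txt": (create_txt_file, "ragflow_test.txt"),
    "json": (create_json_file, "ragflow_test.json"),
}


def build_test_files(file_type, tmp_path):
    """Create the requested type, or every registered type when file_type is None."""
    files = {}
    for kind, (creator, name) in FILE_CREATORS.items():
        if file_type is None or file_type == kind:
            files[kind] = creator(Path(tmp_path) / name)
    return files
```

In a fixture, file_type would come from request.param and tmp_path from pytest's built-in tmp_path fixture, so each test receives freshly generated files in its own directory.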
ragflow_tmp_dir(request, tmp_path_factory)
Scope: Class-level fixture.
Purpose:
Creates a unique temporary directory for the test class, named after the class, to isolate test artifacts.
Parameters:
request: Pytest request object with information about the requesting test class.
tmp_path_factory: Pytest factory for creating temporary directories.
Returns:
A pathlib Path object representing the created temporary directory.
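The class-named directory can be illustrated without running pytest. In the sketch below the fixture body is written as a plain function, and FakeTmpPathFactory is a hypothetical stand-in for pytest's tmp_path_factory. The real factory's mktemp also appends a numeric suffix by default, which this stand-in omits.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def ragflow_tmp_dir(request, tmp_path_factory):
    # In conftest.py this function would carry @pytest.fixture(scope="class").
    class_name = request.cls.__name__
    return tmp_path_factory.mktemp(class_name)


class FakeTmpPathFactory:
    """Hypothetical stand-in for pytest's tmp_path_factory, for illustration only."""

    def __init__(self):
        self._base = Path(tempfile.mkdtemp())

    def mktemp(self, basename):
        path = self._base / basename
        path.mkdir()
        return path


class TestDocumentUpload:
    """Placeholder test class whose name becomes the directory name."""


# request.cls is the attribute pytest exposes for the requesting test class.
request = SimpleNamespace(cls=TestDocumentUpload)
tmp_dir = ragflow_tmp_dir(request, FakeTmpPathFactory())
```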
WebApiAuth(auth)
Scope: Session-level fixture.
Purpose:
Provides an authenticated Web API client instance wrapping the given auth object.
Parameters:
auth: Base authentication object (likely from a higher-level fixture).
Returns:
An instance of RAGFlowWebApiAuth configured for the session.
clear_datasets(request: FixtureRequest, WebApiAuth: RAGFlowWebApiAuth)
Scope: Function-level fixture.
Purpose:
Cleans up all knowledge bases (datasets) after each test function runs to ensure test isolation.
Parameters:
request: Pytest request object.
WebApiAuth: Authenticated API client.
Implementation Detail:
Registers a finalizer that lists all KBs and removes them one by one.
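The finalizer pattern can be sketched as follows. FakeRequest is a hypothetical stand-in for pytest's request object, and list_kbs and rm_kb are in-memory stubs whose signatures and response shape are assumed. The real fixture calls the API wrappers in common.

```python
class FakeRequest:
    """Hypothetical stand-in for pytest's request; records and runs finalizers."""

    def __init__(self):
        self._finalizers = []

    def addfinalizer(self, func):
        self._finalizers.append(func)

    def finish(self):
        # pytest runs finalizers in reverse registration order.
        while self._finalizers:
            self._finalizers.pop()()


# In-memory stand-in for the backend's knowledge bases.
KBS = {"kb_1": "first", "kb_2": "second"}


def list_kbs(auth):
    # Stub: assumed response shape, built as a list so deletion is safe while looping.
    return {"data": {"kbs": [{"id": kb_id} for kb_id in KBS]}}


def rm_kb(auth, payload):
    # Stub: removes one knowledge base by ID.
    KBS.pop(payload["kb_id"], None)


def clear_datasets(request, auth):
    def cleanup():
        for kb in list_kbs(auth)["data"]["kbs"]:
            rm_kb(auth, {"kb_id": kb["id"]})

    # Nothing is deleted here; cleanup runs only at teardown.
    request.addfinalizer(cleanup)


request = FakeRequest()
clear_datasets(request, auth=None)
datasets_during_test = len(KBS)  # the finalizer has not run yet
request.finish()                 # simulate pytest teardown
```

Registering the cleanup through addfinalizer means it runs even when the test body fails, which is what keeps one test's datasets from leaking into the next.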
clear_dialogs(request, WebApiAuth)
Scope: Function-level fixture.
Purpose:
Deletes all dialogs after each test function to maintain a clean test environment.
Parameters:
request: Pytest request object.
WebApiAuth: Authenticated API client.
Implementation Detail:
Registers a finalizer that calls the delete_dialogs API.
add_dataset(request: FixtureRequest, WebApiAuth: RAGFlowWebApiAuth) -> str
Scope: Class-level fixture.
Purpose:
Creates a single knowledge base dataset before tests in a class and ensures cleanup after tests.
Parameters:
request: Pytest request object.
WebApiAuth: Authenticated API client.
Returns:
The ID (string) of the created dataset.
Implementation Detail:
Registers a finalizer to delete all datasets when tests complete.
add_dataset_func(request: FixtureRequest, WebApiAuth: RAGFlowWebApiAuth) -> str
Scope: Function-level fixture.
Purpose:
Similar to add_dataset but scoped per test function.
Returns:
The ID of the created knowledge base dataset.
add_document(request, WebApiAuth, add_dataset, ragflow_tmp_dir)
Scope: Class-level fixture.
Purpose:
Adds documents to a dataset and uploads them in bulk for testing document-related APIs.
Parameters:
request: Pytest request object.
WebApiAuth: Authenticated API client.
add_dataset: Dataset ID from the add_dataset fixture.
ragflow_tmp_dir: Temporary directory for file storage.
Returns:
Tuple of (dataset_id, document_id) representing the dataset and newly uploaded document.
Note:
The cleanup code to remove documents is commented out, likely because deleting the dataset cascades to its documents.
add_chunks(request, WebApiAuth, add_document)
Scope: Class-level fixture.
Purpose:
Adds chunks to a document after parsing and manages cleanup after tests.
Parameters:
request: Pytest request object.
WebApiAuth: Authenticated API client.
add_document: Tuple (kb_id, document_id) from the add_document fixture.
Returns:
Tuple (kb_id, document_id, chunk_ids) where chunk_ids is a list of IDs for the added chunks.
Implementation Detail:
Parses the document with run="1".
Waits for parsing completion using the condition function.
Adds chunks in batch (4 chunks).
Sleeps 1 second to avoid race conditions (issue #6487).
Registers a finalizer to delete chunks after tests.
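The steps above can be sketched end to end with in-memory stubs. Everything below is illustrative: FakeRequest stands in for pytest's request object, the stub signatures and payload shapes are assumed, delete_chunks is a hypothetical name for the chunk-removal call, and the settle_seconds parameter exists only so the sketch can skip the 1-second pause.

```python
import time


class FakeRequest:
    """Hypothetical stand-in for pytest's request; records and runs finalizers."""

    def __init__(self):
        self._finalizers = []

    def addfinalizer(self, func):
        self._finalizers.append(func)

    def finish(self):
        while self._finalizers:
            self._finalizers.pop()()


# In-memory stand-in for one uploaded, not yet parsed document.
DOCS = {"doc_1": {"run": "0", "chunks": []}}


def parse_documents(auth, payload):
    # Stub: the real backend parses asynchronously; here it finishes at once.
    for doc_id in payload["doc_ids"]:
        DOCS[doc_id]["run"] = "3"


def condition(auth, kb_id):
    # Single check; the real fixture polls this through @wait_for.
    return all(doc["run"] == "3" for doc in DOCS.values())


def batch_add_chunks(auth, doc_id, num):
    ids = [f"{doc_id}_chunk_{i}" for i in range(num)]
    DOCS[doc_id]["chunks"].extend(ids)
    return ids


def delete_chunks(auth, payload):
    # Hypothetical cleanup call: drops the listed chunk IDs from the document.
    doc = DOCS[payload["doc_id"]]
    doc["chunks"] = [c for c in doc["chunks"] if c not in payload["chunk_ids"]]


def add_chunks(request, auth, add_document, settle_seconds=1.0):
    kb_id, doc_id = add_document
    parse_documents(auth, {"doc_ids": [doc_id], "run": "1"})
    if not condition(auth, kb_id):
        raise TimeoutError("Document parsing timeout")
    chunk_ids = batch_add_chunks(auth, doc_id, 4)
    time.sleep(settle_seconds)  # pause before tests touch the new chunks
    request.addfinalizer(
        lambda: delete_chunks(auth, {"doc_id": doc_id, "chunk_ids": chunk_ids})
    )
    return kb_id, doc_id, chunk_ids


request = FakeRequest()
kb_id, doc_id, chunk_ids = add_chunks(request, None, ("kb_1", "doc_1"), settle_seconds=0)
chunks_before_cleanup = list(DOCS["doc_1"]["chunks"])
request.finish()  # simulate pytest teardown
```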
Important Implementation Details
The @wait_for decorator is used on condition() to repeatedly poll document parsing status with a timeout of 30 seconds. This ensures tests can wait for asynchronous processing to complete before proceeding.
Cleanup of datasets, dialogs, and chunks is consistently implemented via request.addfinalizer() to guarantee resource removal after tests, preventing side effects between tests.
The file creators in generate_test_files leverage external utility functions to create sample files in multiple formats, facilitating extensive testing of document ingestion pipelines.
The add_chunks fixture's workflow demonstrates a typical test setup pattern: parse documents, wait for completion, add chunks, and clean up after testing, reflecting real-world usage of the RAGFlow backend.
Interaction with Other System Components
common module: Provides API client functions to interact with backend services for datasets, documents, chunks, and dialogs.
libs.auth.RAGFlowWebApiAuth: Wraps authentication and API request logic, ensuring secure access to backend endpoints.
utils.wait_for: Enables the waiting/polling mechanisms critical for async operations.
utils.file_utils: Supplies file generation utilities for creating various test input files.
Test Suites: This file is imported automatically by pytest and its fixtures are injected into test functions across the project, enabling modular and reusable test setups.
Visual Diagram
flowchart TD
A[conftest.py] --> B["condition()"]
A --> C[generate_test_files]
A --> D[ragflow_tmp_dir]
A --> E[WebApiAuth]
A --> F[clear_datasets]
A --> G[clear_dialogs]
A --> H[add_dataset]
A --> I[add_dataset_func]
A --> J[add_document]
A --> K[add_chunks]
J --> L[Uses add_dataset]
J --> M[Uses ragflow_tmp_dir]
K --> N[Uses add_document]
K --> O[Calls parse_documents]
K --> P["Calls condition (wait_for)"]
K --> Q[Calls batch_add_chunks]
K --> R[Deletes chunks on cleanup]
F --> S[Calls list_kbs]
F --> T[Calls rm_kb for each KB]
G --> U[Calls delete_dialogs]
E --> V[Wraps auth with RAGFlowWebApiAuth]
C --> W[Creates multiple file types using utils.file_utils]
Summary
conftest.py is a pytest configuration file providing a suite of fixtures and helper functions to set up, tear down, and manage knowledge bases, documents, dialogs, and chunks for automated testing of the InfiniFlow RAGFlow backend. It abstracts API interactions, manages asynchronous waits, and supplies diverse test data so that test cases stay isolated and repeatable. This makes it a critical part of the testing infrastructure that supports reliable CI/CD workflows for the project.