conftest.py
Overview
conftest.py is a configuration and utility module used within the pytest testing framework for the InfiniFlow project. Its primary purpose is to provide reusable test fixtures and helper functions that support the testing of document parsing and chunk management functionalities. This file integrates with the system’s document handling API by utilizing common utilities for adding, listing, parsing, and deleting document chunks. It also implements a custom wait condition to ensure asynchronous document parsing completes before proceeding with tests.
Detailed Explanations
Imports
pytest: The core testing framework used to define fixtures and manage test execution.
common: A local module providing API wrappers for document and chunk operations:
add_chunk
delete_chunks
list_documnets (note: likely a typo; should be list_documents)
parse_documnets (note: likely a typo; should be parse_documents)
libs.utils: Contains utility functions, including wait_for, which is used for polling with a timeout.
Functions and Fixtures
condition(_auth, _dataset_id)
@wait_for(30, 1, "Document parsing timeout")
def condition(_auth, _dataset_id):
    res = list_documnets(_auth, _dataset_id)
    for doc in res["data"]["docs"]:
        if doc["run"] != "DONE":
            return False
    return True
Purpose: A helper function that serves as the condition checked by the wait_for utility. Each call checks whether all documents in the given dataset have finished parsing (run status is "DONE"); wait_for repeats the call until it succeeds or the timeout expires.
Parameters:
_auth (object): Authentication credentials or token needed for API calls.
_dataset_id (str/int): Identifier for the dataset whose documents are being checked.
Returns:
True if all documents have status "DONE".
False otherwise, causing wait_for to retry until timeout.
Usage:
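Decorated with wait_for(30, 1, "Document parsing timeout"), which polls this condition every 1 second for up to 30 seconds. Callers invoke condition(auth, dataset_id) directly, as the add_chunks_func fixture does.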
Important Details:
Makes an API call via list_documnets to retrieve current document statuses.
Ensures tests depending on fully parsed documents do not proceed prematurely.
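The completion check itself can be isolated from the API call. The following is a minimal sketch, assuming response payloads shaped like res["data"]["docs"] in the function above; the helper name all_documents_done and the "RUNNING" status value are hypothetical and used only for illustration.

```python
def all_documents_done(res):
    """Return True only if every listed document reports run == "DONE"."""
    return all(doc["run"] == "DONE" for doc in res["data"]["docs"])


# Hypothetical payloads; only the "run" field matters for the check.
still_parsing = {"data": {"docs": [{"run": "DONE"}, {"run": "RUNNING"}]}}
finished = {"data": {"docs": [{"run": "DONE"}, {"run": "DONE"}]}}
empty = {"data": {"docs": []}}

still_parsing_result = all_documents_done(still_parsing)  # False
finished_result = all_documents_done(finished)            # True
empty_result = all_documents_done(empty)                  # True
```

Note that a dataset with no documents passes the check, which matches the loop in condition: with nothing to iterate over, it falls through to return True.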
add_chunks_func(request, get_http_api_auth, add_document)
@pytest.fixture(scope="function")
def add_chunks_func(request, get_http_api_auth, add_document):
    dataset_id, document_id = add_document
    parse_documnets(get_http_api_auth, dataset_id, {"document_ids": [document_id]})
    condition(get_http_api_auth, dataset_id)
    chunk_ids = []
    for i in range(4):
        res = add_chunk(get_http_api_auth, dataset_id, document_id, {"content": f"chunk test {i}"})
        chunk_ids.append(res["data"]["chunk"]["id"])
    from time import sleep
    sleep(1)
    def cleanup():
        delete_chunks(get_http_api_auth, dataset_id, document_id, {"chunk_ids": chunk_ids})
    request.addfinalizer(cleanup)
    return dataset_id, document_id, chunk_ids
Purpose: Pytest fixture that sets up a test environment with a document having multiple chunks added to it. It also ensures cleanup after tests run.
Scope: Function-level (runs for each test function that uses this fixture).
Parameters:
request (pytest fixture): Provides access to the requesting test context, used here to register cleanup finalizers.
get_http_api_auth (pytest fixture): Supplies authentication credentials for API calls.
add_document (pytest fixture): Creates and returns a (dataset_id, document_id) tuple representing a newly added document.
Functionality:
Retrieves dataset and document IDs from add_document.
Initiates parsing of the document via parse_documnets.
Waits until parsing is complete by calling condition (which uses wait_for).
Adds 4 test chunks to the document using add_chunk.
Collects the chunk IDs in a list.
Sleeps for 1 second to address timing issues (see issue #6487).
Registers a cleanup function to delete the added chunks after the test completes.
Returns:
Tuple (dataset_id, document_id, chunk_ids) for use in tests.
Usage Example:
def test_chunk_addition(add_chunks_func):
    dataset_id, document_id, chunk_ids = add_chunks_func
    assert len(chunk_ids) == 4
    # Further assertions and test logic here
Implementation Notes:
The sleep call after chunk addition is a workaround for a known timing issue (issue #6487).
Cleanup is deferred to test teardown to ensure test isolation and no leftover data.
Relies on the correctness of the add_document fixture and the API wrappers.
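The setup-then-finalizer pattern can be tried without a running server. The sketch below is an illustration only: FakeRequest and FakeChunkStore are hypothetical in-memory stand-ins for pytest's request object and the chunk API, not part of the project, and add_chunks mirrors only the chunk-creation and cleanup steps of the fixture.

```python
class FakeRequest:
    """Hypothetical stand-in for pytest's `request` (finalizers only)."""

    def __init__(self):
        self._finalizers = []

    def addfinalizer(self, func):
        self._finalizers.append(func)

    def run_finalizers(self):
        # pytest runs finalizers in reverse registration order (last in, first out).
        while self._finalizers:
            self._finalizers.pop()()


class FakeChunkStore:
    """Hypothetical in-memory replacement for the add_chunk/delete_chunks wrappers."""

    def __init__(self):
        self.chunks = {}
        self._next_id = 0

    def add_chunk(self, content):
        chunk_id = f"chunk-{self._next_id}"
        self._next_id += 1
        self.chunks[chunk_id] = content
        # Same response shape the fixture reads: res["data"]["chunk"]["id"].
        return {"data": {"chunk": {"id": chunk_id}}}

    def delete_chunks(self, chunk_ids):
        for chunk_id in chunk_ids:
            self.chunks.pop(chunk_id, None)


def add_chunks(request, store):
    """Create four chunks and register their deletion as a finalizer."""
    chunk_ids = []
    for i in range(4):
        res = store.add_chunk(f"chunk test {i}")
        chunk_ids.append(res["data"]["chunk"]["id"])

    def cleanup():
        store.delete_chunks(chunk_ids)

    request.addfinalizer(cleanup)
    return chunk_ids


request = FakeRequest()
store = FakeChunkStore()
ids = add_chunks(request, store)
remaining_during_test = len(store.chunks)     # chunks exist while the test runs
request.run_finalizers()                      # simulated teardown
remaining_after_teardown = len(store.chunks)  # chunks are gone afterwards
```

Registering cleanup only after the chunks have been created means the finalizer never has to handle a partially built chunk_ids list, at the cost of leaving data behind if setup fails midway.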
Implementation Details and Algorithms
The file leverages the wait_for utility to implement polling behavior with a timeout. This ensures that asynchronous operations (document parsing) complete before tests proceed, preventing flaky tests due to timing issues.
The condition function queries the document statuses and returns a boolean to drive the polling.
The fixture add_chunks_func orchestrates a sequence of API calls to set up test data and registers a cleanup step to maintain test environment hygiene.
The design is modular and uses pytest’s fixture mechanism to provide reusable setup logic that can be injected into multiple test functions.
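The source of libs.utils.wait_for is not shown here, so the following is only a sketch of how a polling decorator with this call shape (timeout, interval, error message) could be written. Raising TimeoutError on expiry is an assumption; the real utility may signal failure differently.

```python
import functools
import time


def wait_for(timeout, interval, error_msg):
    """Hypothetical polling decorator: retry until truthy or until timeout seconds pass."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = time.monotonic() + timeout
            while True:
                if func(*args, **kwargs):
                    return True
                if time.monotonic() >= deadline:
                    # Assumption: failure is reported as an exception.
                    raise TimeoutError(error_msg)
                time.sleep(interval)

        return wrapper

    return decorator


attempts = {"count": 0}


# Short timeout and interval so the demonstration finishes quickly.
@wait_for(1, 0.01, "Demo condition timed out")
def ready_on_third_call():
    attempts["count"] += 1
    return attempts["count"] >= 3


result = ready_on_third_call()
```

In this sketch the condition is always evaluated at least once, and the deadline is checked only after a failed attempt, so a condition that is already satisfied returns immediately without sleeping.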
Interaction with Other Parts of the System
common module: Provides API interaction functions for document and chunk management. This file depends on these functions to perform core operations (add, list, parse, delete).
libs.utils.wait_for: Provides polling functionality used to wait for asynchronous document parsing to complete.
pytest fixtures: Integrates with other fixtures like get_http_api_auth and add_document to obtain authentication and initial test data.
Test suites: Fixtures defined in a conftest.py do not need to be imported; pytest discovers them automatically and makes them available to test modules in the same directory and its subdirectories. This gives the InfiniFlow test suite a standardized way to manage document chunks in tests.
Mermaid Diagram
The following flowchart represents the workflow and relationships between the main functions and fixtures in conftest.py:
flowchart TD
    A[get_http_api_auth, add_document fixtures] --> B[add_chunks_func fixture]
    B --> C[parse_documnets API call]
    C --> D[condition function with wait_for]
    D -->|polls| E[list_documnets API call]
    B --> F["add_chunk API calls (4 times)"]
    B --> G["sleep(1) for timing fix"]
    B --> H[register cleanup finalizer]
    H --> I[delete_chunks API call]
Summary
conftest.py facilitates testing of document parsing and chunk management by providing fixtures and utility functions.
It implements a polling condition to wait for document parsing completion.
The main fixture add_chunks_func sets up a document with multiple chunks and ensures cleanup.
The file depends on common API wrappers and utilities for asynchronous waiting.
It is crucial for maintaining reliable and isolated tests around document chunk functionality in InfiniFlow.