test_list_documents.py
Overview
The test_list_documents.py file is a comprehensive test suite designed to validate the functionality, robustness, and correctness of the list_documents API endpoint in the InfiniFlow project. This endpoint retrieves the documents belonging to a specified dataset. It requires authorization and supports query parameters for pagination, sorting, and filtering by document properties (such as name or id).
The tests are implemented using the pytest framework and cover a wide range of scenarios including authorization handling, parameter validation, sorting and filtering logic, concurrency, and error handling. The file ensures that the document listing API behaves as expected under normal and edge cases, helping maintain API reliability and correctness.
Detailed Explanation of Components
Imports
concurrent.futures.ThreadPoolExecutor, as_completed: Used for executing concurrent requests to test thread safety and concurrency handling.
pytest: The testing framework used for writing and running tests.
list_documents (from common): The API function under test, responsible for listing documents.
INVALID_API_TOKEN (from configs): A constant representing an invalid API token, used to test authorization failure.
RAGFlowHttpApiAuth (from libs.auth): Authentication class used to create API auth objects.
is_sorted (from utils): Utility function to verify whether a list of documents is sorted by a specific attribute.
TestAuthorization Class
This class contains tests specifically focused on authorization aspects of the list_documents API.
test_invalid_auth(self, invalid_auth, expected_code, expected_message)
Parameters:
invalid_auth: An authorization object or None. Tests with no auth or invalid auth.
expected_code (int): Expected error code returned by the API.
expected_message (str): Expected error message.
Functionality:
Tests the behavior of list_documents when called with invalid or empty authorization credentials.
Usage Example:
test_auth = TestAuthorization()
test_auth.test_invalid_auth(None, 0, "`Authorization` can't be empty")
Expected Behavior:
If no authorization is provided (None), the API should return code 0 with an error message indicating authorization can't be empty.
If the API token is invalid, it should return error code 109, indicating authentication failure.
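The two cases above can be written as a small table of inputs and expected codes. The sketch below is only an illustration and is not taken from the test file: fake_list_documents is a hypothetical in-memory stand-in for the real list_documents, VALID_TOKEN is a placeholder, and the message returned for an invalid token is a dummy string because this document does not quote the real one. The plain loop plays the role that pytest parametrization plays in the actual suite.

```python
# Hypothetical stand-in for the real list_documents; it models only the two
# authorization failures described above.
VALID_TOKEN = "valid-token"  # placeholder value, not a real credential


def fake_list_documents(auth, dataset_id):
    if auth is None:
        return {"code": 0, "message": "`Authorization` can't be empty"}
    if auth != VALID_TOKEN:
        # The real message text is not given in this document.
        return {"code": 109, "message": "<authentication failure message>"}
    return {"code": 0, "data": {"docs": [], "total": 0}}


# (auth, expected_code) pairs mirroring the parametrized cases.
AUTH_CASES = [
    (None, 0),
    ("invalid-token", 109),
]


def check_invalid_auth():
    """Run every case and return the codes that were observed."""
    observed = []
    for auth, expected_code in AUTH_CASES:
        res = fake_list_documents(auth, "dataset_id")
        assert res["code"] == expected_code
        observed.append(res["code"])
    return observed
```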
TestDocumentsList Class
This class contains multiple test methods covering various parameters and conditions for the document listing functionality.
test_default(self, HttpApiAuth, add_documents)
Purpose: Verify that listing documents with valid authorization and dataset ID returns all documents correctly.
Parameters:
HttpApiAuth: Valid authorization object fixture.
add_documents: Fixture that adds documents and returns (dataset_id, document_ids).
Assertions:
API returns code 0.
Number of documents returned is 5.
Total count matches number of documents.
test_invalid_dataset_id(self, HttpApiAuth, dataset_id, expected_code, expected_message)
Purpose: Test API response when invalid or empty dataset IDs are provided.
Parameters:
dataset_id (str): Dataset ID string, e.g., empty or invalid.
expected_code (int): Expected response code.
expected_message (str): Expected error message.
Behavior:
Empty dataset ID returns HTTP 405 method not allowed.
Invalid dataset ownership returns error code 102.
Pagination Tests
test_page(...) and test_page_size(...)
Purpose: Validate pagination behavior via the page and page_size parameters.
Parameters:
params: Dictionary with page and/or page_size.
expected_code: Expected API response code.
expected_page_size: Number of documents expected in the response.
expected_message: Expected error message on failure.
Notes:
Tests valid, boundary, and invalid values (including negative, non-integer) for pagination.
Some invalid cases are skipped due to known issues (issues/5851).
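To see how a page and page_size pair maps to an expected_page_size value, the arithmetic can be sketched as below. This is an illustration under stated assumptions, not the endpoint's implementation: it assumes 1-based page numbers, uses the 5 documents from test_default as the total, and rejects non-positive values outright, whereas the real API's handling of such values is not specified in this document. The name expected_page_len is hypothetical.

```python
def expected_page_len(total, page, page_size):
    """Number of items on a 1-based page when `total` items are paginated."""
    if page < 1 or page_size < 1:
        # Assumption for this sketch only; the real API may behave differently.
        raise ValueError("page and page_size must be positive in this sketch")
    start = (page - 1) * page_size
    # Clamp to the remaining items, and never go below zero past the last page.
    return max(0, min(page_size, total - start))
```

With 5 documents and page_size 2, pages 1 and 2 hold 2 documents each, page 3 holds the remaining 1, and any later page is empty.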
Sorting Tests
test_orderby(...) and test_desc(...)
Purpose: Verify sorting functionality via the orderby and desc parameters.
Parameters:
params: Parameters dict specifying sorting fields and direction.
expected_code: API response code.
assertions: Callable to assert if results are sorted as expected.
expected_message: Expected error message if any.
Details:
Supports sorting by create_time and update_time.
The desc param controls ascending/descending order.
Invalid sorting parameters trigger errors.
Some tests are skipped due to current issues.
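The ordering check performed by the assertions callable can be pictured with a small helper. The version below is a hypothetical sketch of what a utility like is_sorted might look like; the actual signature and defaults in utils are not shown in this document, so the descending=True default and the sample create_time values are assumptions.

```python
def is_sorted(docs, field, descending=True):
    """Return True if `docs` is ordered by `field` in the requested direction."""
    values = [doc[field] for doc in docs]
    # Compare each value with its successor.
    pairs = list(zip(values, values[1:]))
    if descending:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)


# Sample data with made-up timestamps, newest first.
sample_docs = [
    {"name": "c.txt", "create_time": 300},
    {"name": "b.txt", "create_time": 200},
    {"name": "a.txt", "create_time": 100},
]
```

A test case could then pass something like `lambda r: is_sorted(r["data"]["docs"], "create_time", True)` as its assertions argument.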
Filtering Tests
Tests cover filtering by:
keywords (text search within documents)
name (specific document name)
id (document identifier)
Each test verifies correct filtering behavior, including handling of non-existent documents or unauthorized access.
Combined Filters
test_name_and_id(...)
Tests the API behavior when both name and id filters are provided simultaneously.
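The filters can be pictured as predicates applied to a list of documents. The sketch below is an in-memory illustration only and makes several assumptions that this document does not confirm: keywords is treated as a substring match on the document name, name and id are exact matches, and when several filters are given a document must satisfy all of them. The real endpoint may also return an error rather than an empty list for a non-existent name or id. filter_docs and DOCS are hypothetical names.

```python
DOCS = [
    {"id": "doc-1", "name": "alpha.txt"},
    {"id": "doc-2", "name": "beta.txt"},
    {"id": "doc-3", "name": "alphabet.txt"},
]


def filter_docs(docs, name=None, doc_id=None, keywords=None):
    """Return the documents that satisfy every filter that was supplied."""
    result = []
    for doc in docs:
        if name is not None and doc["name"] != name:
            continue
        if doc_id is not None and doc["id"] != doc_id:
            continue
        # Assumption: keywords matches as a substring of the name.
        if keywords is not None and keywords not in doc["name"]:
            continue
        result.append(doc)
    return result
```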
Concurrency Test
test_concurrent_list(self, HttpApiAuth, add_documents)
Purpose: Stress test the API with 100 requests issued from multiple threads.
Implementation:
Uses ThreadPoolExecutor with 5 workers.
Submits 100 calls to list_documents, so at most 5 requests are in flight at any moment.
Assertions:
All calls complete successfully.
All return code 0.
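The pattern described above, 100 submissions to a pool of 5 workers collected with as_completed, can be sketched as follows. fake_list_documents is a hypothetical stand-in that always succeeds, so the example runs without a server; the real test calls list_documents with the fixtures' auth object and dataset ID.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_list_documents(auth, dataset_id):
    # Hypothetical stand-in for the real API call.
    return {"code": 0, "data": {"docs": [], "total": 0}}


def run_concurrent_list(count=100, max_workers=5):
    """Submit `count` calls to a pool of `max_workers` threads and collect results."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(fake_list_documents, "auth", "dataset_id")
            for _ in range(count)
        ]
        # as_completed yields each future as soon as it finishes.
        responses = [future.result() for future in as_completed(futures)]
    return responses


responses = run_concurrent_list()
assert len(responses) == 100
assert all(r["code"] == 0 for r in responses)
```

Calling future.result() re-raises any exception thrown inside a worker, so a failed request surfaces in the test instead of being silently dropped.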
Invalid Parameters Test
test_invalid_params(self, HttpApiAuth, add_documents)
Purpose: Verify that unknown query parameters do not break the API and default behavior is maintained.
Details:
Passes a parameter {"a": "b"} that is not recognized.
Expects the API to ignore it and return all documents.
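One way to picture ignoring unknown parameters is to separate recognized keys from everything else before a query is built. The sketch below is a guess at the idea, not the server's code: split_params is a hypothetical helper, and KNOWN_PARAMS simply lists the query parameters named in this document.

```python
# Query parameters mentioned in this document.
KNOWN_PARAMS = {"page", "page_size", "orderby", "desc", "keywords", "name", "id"}


def split_params(params):
    """Separate recognized query parameters from unrecognized ones."""
    known = {k: v for k, v in params.items() if k in KNOWN_PARAMS}
    unknown = {k: v for k, v in params.items() if k not in KNOWN_PARAMS}
    return known, unknown
```

For the input used by the test, {"a": "b"}, the known part is empty, which corresponds to a request with default parameters.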
Important Implementation Details
Tests rely heavily on fixtures (HttpApiAuth, add_documents) for setup:
HttpApiAuth provides authenticated access.
add_documents prepopulates the dataset with documents for testing.
Parametrization in pytest allows testing multiple input scenarios efficiently.
Some tests are marked with different priority levels (p1, p2, p3) to indicate importance or execution order.
Several tests are skipped due to known issues (issues/5851), indicating ongoing development or bug tracking.
The is_sorted utility function verifies correct ordering without manually inspecting lists.
Concurrency tests ensure thread safety and scalability of the API under load.
Interaction with Other System Components
list_documents function: The core API call tested, likely interacts with backend data storage, authentication module, and query processors.
Authentication (RAGFlowHttpApiAuth): Used for generating auth tokens; invalid tokens are tested to ensure security.
Configuration (INVALID_API_TOKEN): Used to simulate invalid credentials.
Utility (is_sorted): Assists in verifying sorting correctness.
Fixtures (add_documents, HttpApiAuth): Not defined here but part of test infrastructure, responsible for creating dataset and documents.
Usage Examples
# Example: List all documents in a dataset with valid auth
from common import list_documents
from libs.auth import RAGFlowHttpApiAuth

# valid_api_token is a placeholder; supply a real API token here
auth = RAGFlowHttpApiAuth(valid_api_token)
dataset_id = "dataset_123"
response = list_documents(auth, dataset_id)
if response["code"] == 0:
    for doc in response["data"]["docs"]:
        print(doc["name"])
else:
    print(f"Error: {response['message']}")
Mermaid Diagram: Class and Method Structure
classDiagram
    class TestAuthorization {
        +test_invalid_auth(invalid_auth, expected_code, expected_message)
    }
    class TestDocumentsList {
        +test_default(HttpApiAuth, add_documents)
        +test_invalid_dataset_id(HttpApiAuth, dataset_id, expected_code, expected_message)
        +test_page(HttpApiAuth, add_documents, params, expected_code, expected_page_size, expected_message)
        +test_page_size(HttpApiAuth, add_documents, params, expected_code, expected_page_size, expected_message)
        +test_orderby(HttpApiAuth, add_documents, params, expected_code, assertions, expected_message)
        +test_desc(HttpApiAuth, add_documents, params, expected_code, assertions, expected_message)
        +test_keywords(HttpApiAuth, add_documents, params, expected_num)
        +test_name(HttpApiAuth, add_documents, params, expected_code, expected_num, expected_message)
        +test_id(HttpApiAuth, add_documents, document_id, expected_code, expected_num, expected_message)
        +test_name_and_id(HttpApiAuth, add_documents, document_id, name, expected_code, expected_num, expected_message)
        +test_concurrent_list(HttpApiAuth, add_documents)
        +test_invalid_params(HttpApiAuth, add_documents)
    }
Summary
The test_list_documents.py file is a critical part of the InfiniFlow testing framework, ensuring the list_documents API endpoint works correctly across a wide spectrum of scenarios. It tests authorization, parameter validation, sorting, filtering, and concurrency, providing confidence in the robustness and correctness of the document listing functionality. This test suite interacts primarily with the API function list_documents and relies on authentication and dataset setup fixtures to simulate real-world usage.