test_list_documents.py
Overview
test_list_documents.py is a comprehensive test suite designed to validate the functionality, robustness, and security of the list_documents API endpoint within the InfiniFlow system. This endpoint retrieves documents from a specified knowledge base (KB) and supports features such as pagination, sorting, filtering by keywords, and authorization checks.
The file uses the pytest testing framework to organize and run tests. It covers positive test cases (valid inputs and expected results) as well as negative test cases (invalid inputs, unauthorized access) to ensure the endpoint behaves correctly under various scenarios.
Key functionalities tested include:
Authentication and authorization enforcement.
Pagination behavior with different page and page size parameters.
Sorting documents by creation or update time, with ascending or descending order.
Filtering documents using keyword searches.
Concurrent access to the endpoint to check thread safety and performance under load.
Detailed Explanation of Classes and Methods
Imports and Dependencies
ThreadPoolExecutor, as_completed (from concurrent.futures): For running concurrent tests.
pytest: Testing framework used to define and run tests.
list_documents (from common): The API function under test that fetches documents.
INVALID_API_TOKEN (from configs): A constant representing an invalid API token for negative authentication tests.
RAGFlowWebApiAuth (from libs.auth): Authentication class wrapping API tokens.
is_sorted (from utils): Utility function to verify sorting order of document lists.
Class: TestAuthorization
Tests authorization and authentication for the list_documents API.
Method: test_invalid_auth
Parameters (pytest parametrize):
invalid_auth: Authentication object or None.
expected_code: Expected HTTP or API error code.
expected_message: Expected error message string.
Description:
Tests the behavior of list_documents when no authentication or invalid authentication is provided. The endpoint is expected to respond with an HTTP 401 Unauthorized error.
Usage:
res = list_documents(None, {"kb_id": "dataset_id"})
assert res["code"] == 401
assert res["message"] == "<Unauthorized '401: Unauthorized'>"
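The parametrized shape described above could look roughly like the following sketch. It is self-contained for illustration only: fake_list_documents is a hypothetical stand-in for list_documents, the token strings are placeholders rather than the real INVALID_API_TOKEN, and the message for the invalid-token case is assumed to match the missing-auth message shown above.

```python
import pytest

# Placeholder credential for the sketch; not a real token.
VALID_TOKEN = "valid-token"
UNAUTHORIZED = "<Unauthorized '401: Unauthorized'>"


def fake_list_documents(auth, params):
    """Hypothetical stand-in for list_documents: rejects anything but VALID_TOKEN."""
    if auth != VALID_TOKEN:
        return {"code": 401, "message": UNAUTHORIZED}
    return {"code": 0, "data": {"docs": [], "total": 0}}


@pytest.mark.parametrize(
    "invalid_auth, expected_code, expected_message",
    [
        (None, 401, UNAUTHORIZED),                # no authentication at all
        ("not-a-real-token", 401, UNAUTHORIZED),  # assumed: same message as above
    ],
)
def test_invalid_auth(invalid_auth, expected_code, expected_message):
    res = fake_list_documents(invalid_auth, {"kb_id": "dataset_id"})
    assert res["code"] == expected_code
    assert res["message"] == expected_message
```

Each tuple in the parameter list becomes its own test case, so adding another invalid credential only requires one more row.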
Class: TestDocumentsList
Contains tests validating the document listing functionality with various parameters and edge cases.
Method: test_default
Parameters:
WebApiAuth (valid auth fixture)
add_documents (fixture that adds 5 documents to a KB and returns the KB ID)
Description:
Verifies that the default document listing returns all 5 documents with code 0 (success).
Example:
kb_id, _ = add_documents
res = list_documents(WebApiAuth, {"kb_id": kb_id})
assert res["code"] == 0
assert len(res["data"]["docs"]) == 5
assert res["data"]["total"] == 5
Method: test_invalid_dataset_id
Parameters (pytest parametrize):
kb_id: KB identifier string (empty or invalid).
expected_code: Expected error code.
expected_message: Expected error message.
Description:
Tests behavior when KB ID is invalid or missing. Expected error codes include 101 (missing KB ID) and 103 (unauthorized access).
Method: test_page
Parameters (pytest parametrize):
params: Dictionary with page and page_size values.
expected_code: Expected response code.
expected_page_size: Expected number of documents returned.
expected_message: Expected error message if any.
Description:
Validates pagination logic:
Page numbers are 1-indexed; a page value of 0 is treated as the first page.
Page numbers passed as strings are converted to integers.
Cases with invalid values that hit known issues are skipped.
Example:
res = list_documents(WebApiAuth, {"kb_id": kb_id, "page": 2, "page_size": 2})
assert len(res["data"]["docs"]) == 2
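The page semantics listed above can be sketched as plain list slicing. The paginate helper below is hypothetical and is not the endpoint's implementation; the default page_size of 30 is an assumption, and negative or non-numeric values are deliberately left out of scope.

```python
def paginate(docs, page=1, page_size=30):
    """Return one page of docs. Pages are 1-indexed; page 0 maps to page 1."""
    page = int(page)            # accepts "2" as well as 2
    page_size = int(page_size)
    if page == 0:               # treat 0 as the first page
        page = 1
    start = (page - 1) * page_size
    return docs[start:start + page_size]


docs = [f"doc_{i}" for i in range(5)]

first = paginate(docs, page=0, page_size=2)     # same result as page=1
second = paginate(docs, page="2", page_size=2)  # string page number
last = paginate(docs, page=3, page_size=2)      # partial final page
```

With 5 documents and a page size of 2, the third page holds the single remaining document and any later page is empty.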
Method: test_page_size
Parameters (pytest parametrize):
params: Dictionary with page_size.
expected_code, expected_page_size, expected_message: Expected outcomes.
Description:
Tests boundary and invalid values for the page_size parameter, including zero, negative, and non-integer inputs.
Method: test_orderby
Parameters (pytest parametrize):
params: Dict with orderby and optionally desc.
expected_code: API response code.
assertions: A lambda to validate document sorting.
expected_message: Error message for invalid inputs.
Description:
Tests sorting by create_time or update_time. Skips tests for unsupported order fields.
Usage Example:
res = list_documents(WebApiAuth, {"kb_id": kb_id, "orderby": "create_time"})
assert is_sorted(res["data"]["docs"], "create_time", True)
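For readers without access to utils, a sort check of this kind might be written as below. This is a hypothetical sketch named is_sorted_sketch, not the project's is_sorted; it assumes the third argument means "descending" and that ties between adjacent documents are allowed.

```python
def is_sorted_sketch(docs, field, descending=True):
    """Check that docs are ordered by field; equal neighbours count as sorted."""
    values = [doc[field] for doc in docs]
    pairs = list(zip(values, values[1:]))  # adjacent (previous, next) pairs
    if descending:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)


# Illustrative data only; real responses carry full document metadata.
docs = [
    {"create_time": 300},
    {"create_time": 200},
    {"create_time": 200},
    {"create_time": 100},
]

newest_first = is_sorted_sketch(docs, "create_time", True)
oldest_first = is_sorted_sketch(docs, "create_time", False)
```

Comparing adjacent pairs keeps the check linear in the number of documents and avoids sorting a copy of the list.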
Method: test_desc
Parameters (pytest parametrize):
params: Dictionary with desc (boolean or string).
expected_code, assertions, expected_message.
Description:
Verifies ascending or descending sort order as controlled by the desc parameter. Tests various representations of true/false.
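A case table that pairs request parameters with a lambda assertion, as the parameter list above describes, might look like this sketch. Both parse_desc and list_sorted are hypothetical stand-ins; how the real endpoint normalizes desc, and which spellings it accepts, is an assumption here.

```python
def parse_desc(value):
    """Normalize desc to a bool; accepts bools and 'true'/'false' in any case."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "false"):
        return text == "true"
    raise ValueError(f"unsupported desc value: {value!r}")


def list_sorted(docs, desc=True):
    """Stand-in listing call: sort by create_time in the requested direction."""
    return sorted(docs, key=lambda d: d["create_time"], reverse=parse_desc(desc))


def times(docs):
    return [d["create_time"] for d in docs]


docs = [{"create_time": t} for t in (2, 3, 1)]

# Each case pairs request parameters with a lambda that checks the ordering.
cases = [
    ({"desc": True}, lambda r: times(r) == [3, 2, 1]),
    ({"desc": "true"}, lambda r: times(r) == [3, 2, 1]),
    ({"desc": False}, lambda r: times(r) == [1, 2, 3]),
    ({"desc": "False"}, lambda r: times(r) == [1, 2, 3]),
]
results = [check(list_sorted(docs, **params)) for params, check in cases]
```

Keeping the assertion as a lambda lets one test body serve cases whose expected ordering differs.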
Method: test_keywords
Parameters (pytest parametrize):
params: Dict with keywords string.
expected_num: Expected number of matching documents.
Description:
Tests keyword filtering with full matches, partial matches, and no matches.
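The three match categories can be illustrated with a simple substring filter. This is a sketch under assumptions: filter_by_keywords is hypothetical, matching is assumed to be a case-sensitive substring test on a name field, an empty keyword is assumed to return everything, and the file names are invented for the example.

```python
def filter_by_keywords(docs, keywords=""):
    """Keep documents whose name contains keywords; empty keywords keep all."""
    if not keywords:
        return list(docs)
    return [doc for doc in docs if keywords in doc["name"]]


# Invented names for illustration; the real fixture defines its own files.
docs = [{"name": f"report_{i}.txt"} for i in range(5)]

full_match = filter_by_keywords(docs, "report_0.txt")  # exactly one document
partial_match = filter_by_keywords(docs, "report")     # shared prefix, all five
no_match = filter_by_keywords(docs, "unknown")         # nothing matches
```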
Method: test_concurrent_list
Parameters:
WebApiAuth: Valid authentication.
add_documents: Fixture adding documents.
Description:
Tests the thread safety and performance of list_documents under concurrent access by submitting 100 requests.
Implementation:
Uses a ThreadPoolExecutor with 5 workers to issue the calls, so at most 5 requests are in flight at any moment, and checks that all 100 succeed.
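The pattern described here, many submissions through a small worker pool, can be sketched with the standard library alone. fake_list_documents is a hypothetical stand-in that always succeeds, and the token and KB ID strings are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_list_documents(auth, params):
    """Hypothetical stand-in for the API call; always reports success."""
    return {"code": 0, "data": {"docs": [], "total": 0}}


def run_concurrent(count=100, workers=5):
    """Submit count calls through a pool of workers and collect every response."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [
            executor.submit(fake_list_documents, "token", {"kb_id": "kb"})
            for _ in range(count)
        ]
        # as_completed yields futures in completion order, not submission order.
        return [future.result() for future in as_completed(futures)]


responses = run_concurrent()
all_ok = all(res["code"] == 0 for res in responses)
```

Calling result() on each future re-raises any exception from the worker thread, so a failed call surfaces in the test instead of being silently dropped.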
Important Implementation Details
The tests rely heavily on pytest fixtures such as WebApiAuth (valid authentication token) and add_documents (documents already added to a KB) to isolate test setup.
Parametrization allows testing multiple input scenarios cleanly.
Some tests are skipped due to known issues (marked with pytest.mark.skip referencing the issue number).
Sorting validation is done using the utility function is_sorted, which checks whether the list of documents is sorted by a specified attribute and order.
The concurrent test checks that the API can handle multiple simultaneous calls without failure.
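Skipping a single parametrized case while keeping the rest active is commonly done with pytest.param and a skip mark, roughly as in the sketch below. The helper, the field set, and the skip reason are illustrative assumptions; the real tests cite a specific issue number that is not reproduced here.

```python
import pytest

# Assumed from the sorting section: only these two fields are exercised.
SUPPORTED_ORDER_FIELDS = {"create_time", "update_time"}


def is_supported_orderby(field):
    """Hypothetical helper: report whether a field is a supported sort key."""
    return field in SUPPORTED_ORDER_FIELDS


@pytest.mark.parametrize(
    "field, expected",
    [
        ("create_time", True),
        ("update_time", True),
        # Only this case is skipped; the two above still run.
        pytest.param(
            "name",
            False,
            marks=pytest.mark.skip(reason="known issue (placeholder reason)"),
        ),
    ],
)
def test_orderby_field(field, expected):
    assert is_supported_orderby(field) is expected
```

Attaching the mark to one pytest.param keeps the skipped case visible in the test report, which makes it easier to re-enable once the underlying issue is resolved.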
Interaction with Other System Components
list_documents (from common): This is the core API call tested here. It interacts with the backend service or database to fetch document metadata.
RAGFlowWebApiAuth (from libs.auth): Provides authentication tokens or credentials.
INVALID_API_TOKEN (from configs): Used to test invalid authentication scenarios.
is_sorted (from utils): Used to verify sorting correctness in the response data.
Fixtures like WebApiAuth and add_documents are assumed to be defined elsewhere in the test suite to provide reusable setup for authentication and test data.
This file primarily tests the interface and contract of the list_documents API endpoint, so that changes in backend logic or parameter handling that break this contract are caught by the automated tests before deployment.
Visual Diagram
classDiagram
class TestAuthorization {
+test_invalid_auth(invalid_auth, expected_code, expected_message)
}
class TestDocumentsList {
+test_default(WebApiAuth, add_documents)
+test_invalid_dataset_id(WebApiAuth, kb_id, expected_code, expected_message)
+test_page(WebApiAuth, add_documents, params, expected_code, expected_page_size, expected_message)
+test_page_size(WebApiAuth, add_documents, params, expected_code, expected_page_size, expected_message)
+test_orderby(WebApiAuth, add_documents, params, expected_code, assertions, expected_message)
+test_desc(WebApiAuth, add_documents, params, expected_code, assertions, expected_message)
+test_keywords(WebApiAuth, add_documents, params, expected_num)
+test_concurrent_list(WebApiAuth, add_documents)
}
TestAuthorization ..> list_documents : calls
TestDocumentsList ..> list_documents : calls
TestAuthorization ..> RAGFlowWebApiAuth : uses
TestDocumentsList ..> is_sorted : uses
Summary
test_list_documents.py is a critical testing module for validating the document listing API in InfiniFlow, covering authorization, input validation, pagination, sorting, keyword filtering, and concurrency. It uses parametrized pytest tests and utility functions to guard against regressions in document retrieval. The tests interact primarily with the list_documents API and the authentication utilities, and depend on external fixtures for setup.