test_delete_documents.py
Overview
The test_delete_documents.py file contains a comprehensive suite of automated tests designed to validate the functionality, robustness, and correctness of the document deletion API within the InfiniFlow system. This test module uses the pytest framework and covers various scenarios including authorization checks, input validation, edge cases, concurrency, and performance for bulk document deletions.
The primary focus is to ensure that the delete_documents API function behaves as expected under different conditions such as invalid authentication tokens, invalid dataset IDs, malformed payloads, duplicate or non-existent document IDs, and high-volume concurrent requests.
Detailed Descriptions
Imports and Setup
ThreadPoolExecutor and as_completed from concurrent.futures: Used for concurrent execution of deletion requests in tests that check for race conditions and concurrency handling.
pytest: Testing framework used to create and run the test cases.
Utility functions imported from common:
    bulk_upload_documents: uploads multiple documents to a dataset.
    delete_documents: API call to delete documents from a dataset.
    list_documents: fetches the list of documents in a dataset.
Config value INVALID_API_TOKEN from configs: Used to test authentication failure.
RAGFlowHttpApiAuth from libs.auth: Represents the HTTP API authentication object.
Classes and Test Suites
1. TestAuthorization
Tests authorization behavior of the delete_documents API.
test_invalid_auth(self, invalid_auth, expected_code, expected_message)
Parameterized test to verify the API response when authorization is missing or invalid.
Parameters:
    invalid_auth: Authentication object or None.
    expected_code: Expected error code returned.
    expected_message: Expected error message.
Description:
Calls delete_documents with invalid or missing authorization and asserts that the API returns the correct error code and message.
Example Usage:
    test = TestAuthorization()
    test.test_invalid_auth(None, 0, "`Authorization` can't be empty")
    test.test_invalid_auth(RAGFlowHttpApiAuth(INVALID_API_TOKEN), 109, "Authentication error: API key is invalid!")
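The two authorization cases above can be sketched without the real backend. This is a minimal illustration, not the module's actual code: `fake_delete_documents`, `VALID_TOKEN`, and the token strings are hypothetical stand-ins, and only the two (code, message) pairs are taken from the example usage. In the real module the case table would normally be supplied through pytest's parametrization rather than a plain loop.

```python
from typing import Optional

# Hypothetical token values, used only for this illustration.
VALID_TOKEN = "valid-token"
INVALID_API_TOKEN = "invalid-token"


def fake_delete_documents(token: Optional[str], dataset_id: str, payload: dict) -> dict:
    """Hypothetical stand-in for delete_documents.

    It models only the two authorization failures described in this section.
    """
    if token is None:
        return {"code": 0, "message": "`Authorization` can't be empty"}
    if token != VALID_TOKEN:
        return {"code": 109, "message": "Authentication error: API key is invalid!"}
    return {"code": 0}


# (auth, expected_code, expected_message): same shape as the parameterized test.
AUTH_CASES = [
    (None, 0, "`Authorization` can't be empty"),
    (INVALID_API_TOKEN, 109, "Authentication error: API key is invalid!"),
]


def check_invalid_auth(auth, expected_code, expected_message) -> None:
    """Call the stub with bad credentials and compare code and message."""
    res = fake_delete_documents(auth, "dataset_id", {"ids": []})
    assert res["code"] == expected_code
    assert res["message"] == expected_message


for case in AUTH_CASES:
    check_invalid_auth(*case)
```

Keeping the cases in one table makes it cheap to add another failure mode: a new tuple is enough, and the checking logic stays untouched.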
2. TestDocumentsDeletion
A rich test suite covering various scenarios related to document deletion.
test_basic_scenarios(self, HttpApiAuth, add_documents_func, payload, expected_code, expected_message, remaining)
Parameterized test that validates deletion with various payloads (valid, empty, invalid ids, malformed JSON, partial deletions).
Parameters:
    HttpApiAuth: Valid authentication object fixture.
    add_documents_func: Fixture that returns a tuple (dataset_id, document_ids) after adding documents.
    payload: The payload sent to delete_documents; can be a dict or a callable returning a dict.
    expected_code: Expected response code.
    expected_message: Expected response message (for errors).
    remaining: Number of documents expected to remain after deletion.
Implementation Details:
    Handles callable payloads to dynamically generate deletion payloads from existing document IDs.
    Verifies the deletion response and then asserts the remaining document count by calling list_documents.
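The callable-payload handling described above might look roughly like the following. This is a sketch under assumptions: the helper name `resolve_payload`, the `"ids"` payload key, and the sample document IDs are all hypothetical and are not taken from the test module.

```python
from typing import Callable, Union

Payload = Union[dict, Callable[[list], dict], None]


def resolve_payload(payload: Payload, document_ids: list) -> Payload:
    """Return a concrete payload, calling it with the IDs if it is callable."""
    if callable(payload):
        return payload(document_ids)
    return payload


# Hypothetical IDs standing in for what a fixture such as
# add_documents_func would return alongside the dataset ID.
document_ids = ["doc_a", "doc_b", "doc_c"]

# A static payload passes through unchanged.
static = resolve_payload({"ids": []}, document_ids)

# A callable payload is built from the existing IDs,
# here "delete only the first document".
partial = resolve_payload(lambda ids: {"ids": ids[:1]}, document_ids)

# The count expected to remain is the total minus the IDs targeted.
remaining = len(document_ids) - len(partial["ids"])
```

Deferring payload construction to a callable is what lets a parameter table refer to document IDs that do not exist until the fixture has run.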
test_invalid_dataset_id(self, HttpApiAuth, add_documents_func, dataset_id, expected_code, expected_message)
Tests behavior when an invalid or unauthorized dataset ID is used.
test_delete_partial_invalid_id(self, HttpApiAuth, add_documents_func, payload)
Tests deletion attempts where the payload contains a mix of valid and invalid document IDs. Expects an error code and a message about documents not found, but all valid documents should still be deleted.
test_repeated_deletion(self, HttpApiAuth, add_documents_func)
Tests the effect of attempting to delete documents that have already been deleted. Expects an error indicating that the documents were not found on the second deletion.
test_duplicate_deletion(self, HttpApiAuth, add_documents_func)
Tests deletion payloads that contain duplicate document IDs. Expects successful deletion with a warning about duplicates, and confirms all documents are deleted.
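The three behaviours above (partial invalid IDs, repeated deletion, duplicate IDs) can be modelled with a small in-memory fake. This is only an illustration of the expected semantics: the class `FakeDocumentStore`, the `NOT_FOUND_CODE` value, and the message wording are placeholders, not the real API's responses.

```python
# Placeholder error code for "documents not found"; the real value may differ.
NOT_FOUND_CODE = 102


class FakeDocumentStore:
    """Hypothetical in-memory model of the deletion semantics described above."""

    def __init__(self, document_ids):
        self.documents = set(document_ids)

    def delete(self, ids):
        unique_ids = list(dict.fromkeys(ids))  # drop duplicates, keep order
        duplicates = sorted({i for i in ids if ids.count(i) > 1})
        missing = [i for i in unique_ids if i not in self.documents]

        # Valid IDs are removed even when other IDs in the payload are unknown.
        self.documents -= set(unique_ids)

        if missing:
            return {"code": NOT_FOUND_CODE, "message": f"Documents not found: {missing}"}
        if duplicates:
            # Success, but the caller is told about the repeated IDs.
            return {"code": 0, "message": f"Duplicate document ids: {duplicates}"}
        return {"code": 0}


store = FakeDocumentStore(["a", "b", "c"])

# Mixed valid and invalid IDs: an error is reported, yet "a" is still deleted.
partial = store.delete(["a", "invalid_id"])
after_partial = set(store.documents)

# Repeated deletion: "a" is already gone, so it is reported as not found.
repeat = store.delete(["a"])

# Duplicate IDs: deletion succeeds with a warning and the store ends up empty.
dup = store.delete(["b", "b", "c"])
```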
Standalone Tests
test_concurrent_deletion(HttpApiAuth, add_dataset, tmp_path)
Tests concurrency by uploading 100 documents and then deleting each document individually using multiple threads (max_workers=5). Verifies that all deletions succeed without race conditions.
test_delete_1k(HttpApiAuth, add_dataset, tmp_path)
A performance and scalability test that uploads 1,000 documents and deletes them all in a single API call. Validates that the document counts before and after deletion match expectations.
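The concurrent-deletion pattern can be sketched with the standard library alone. The example below assumes a hypothetical `ThreadSafeStore` in place of the real dataset and API call, and a placeholder error code; only the figures (100 documents, max_workers=5) come from the description above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


class ThreadSafeStore:
    """Hypothetical in-memory stand-in for the dataset behind the API."""

    def __init__(self, document_ids):
        self._documents = set(document_ids)
        self._lock = threading.Lock()

    def delete(self, document_id):
        # The lock makes the check-and-remove step atomic across threads.
        with self._lock:
            if document_id not in self._documents:
                return {"code": 102, "message": "not found"}  # placeholder code
            self._documents.remove(document_id)
            return {"code": 0}

    def count(self):
        with self._lock:
            return len(self._documents)


document_ids = [f"doc_{i}" for i in range(100)]
store = ThreadSafeStore(document_ids)

# Submit one deletion per document and collect results as they finish.
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(store.delete, doc_id) for doc_id in document_ids]
    responses = [future.result() for future in as_completed(futures)]

# Every deletion should succeed and nothing should be left behind.
assert all(r["code"] == 0 for r in responses)
assert store.count() == 0
```

Collecting results with `as_completed` means a failed request surfaces as soon as it finishes rather than after all earlier submissions have been awaited in order.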
Important Implementation Details
The tests rely heavily on fixtures (e.g., HttpApiAuth, add_documents_func, add_dataset, tmp_path). Apart from tmp_path, which pytest provides, these are expected to be defined elsewhere in the test suite. They provide authenticated API clients, dataset creation, and document upload capabilities.
The delete_documents API function is tested with various malformed or edge-case inputs to ensure robust input validation.
The concurrency test uses Python’s ThreadPoolExecutor to simulate multiple simultaneous deletion requests, which helps verify thread safety and proper handling of concurrent modifications.
Duplicate document IDs in a deletion payload are allowed but generate a warning: the deletion still succeeds, and the caller is told about the duplicates.
Error codes and messages are asserted precisely to guarantee correct error handling and informative feedback.
Interaction with Other Parts of the System
The test file interacts with:
    Authentication module (libs.auth) for providing authorization tokens.
    Common utility functions (common) for document management (upload, list, delete).
    Configurations (configs) for test constants like invalid API tokens.
It indirectly tests the backend API endpoints for document deletion, authorization, and dataset management.
It depends on fixtures that manage dataset creation and document uploading, ensuring the test environment is correctly set up.
Usage Example
A typical test run invokes pytest from the root directory of the project. Running pytest with no arguments discovers and runs test_delete_documents.py along with the other test modules; passing the file path runs this module on its own:
pytest tests/test_delete_documents.py
Mermaid Class Diagram
This diagram summarizes the test classes and their key methods for clarity:
classDiagram
class TestAuthorization {
+test_invalid_auth(invalid_auth, expected_code, expected_message)
}
class TestDocumentsDeletion {
+test_basic_scenarios(HttpApiAuth, add_documents_func, payload, expected_code, expected_message, remaining)
+test_invalid_dataset_id(HttpApiAuth, add_documents_func, dataset_id, expected_code, expected_message)
+test_delete_partial_invalid_id(HttpApiAuth, add_documents_func, payload)
+test_repeated_deletion(HttpApiAuth, add_documents_func)
+test_duplicate_deletion(HttpApiAuth, add_documents_func)
}
class StandaloneTests {
+test_concurrent_deletion(HttpApiAuth, add_dataset, tmp_path)
+test_delete_1k(HttpApiAuth, add_dataset, tmp_path)
}
Summary
test_delete_documents.py is a critical test module ensuring the reliability, security, and correctness of the document deletion API for the InfiniFlow platform. It thoroughly covers authorization, payload validation, concurrency, and large-scale deletion scenarios, using parameterized tests and parallel execution to simulate real-world usage patterns and edge cases. The file integrates tightly with authentication services, document management utilities, and dataset provisioning fixtures, forming a robust safeguard for the document deletion functionality.