test_delete_datasets.py
Overview
test_delete_datasets.py is a test suite designed to verify the correctness, robustness, and security of the dataset deletion functionality within the InfiniFlow system. The file primarily tests the delete_datasets API endpoint, ensuring it behaves as expected under various scenarios, including authorization checks, payload validation, concurrency handling, and edge cases related to dataset IDs.
The tests are implemented using the pytest framework and include parameterized test cases for thorough coverage. This file interacts heavily with helper functions and fixtures such as batch_create_datasets, list_datasets, and authentication utilities to simulate realistic API usage patterns.
Classes and Their Responsibilities
1. TestAuthorization
Purpose:
To validate the authorization mechanism of the delete_datasets API, ensuring that invalid or missing credentials are correctly rejected.
Methods:
test_auth_invalid(self, auth, expected_code, expected_message)
Parameters:
auth: An authentication object or None, used to simulate various authorization states.
expected_code: Expected error code returned by the API.
expected_message: Expected error message string.
Returns: None (assertions are used for validation)
Description:
Tests that requests without authorization or with invalid tokens receive appropriate error responses.
Usage example:
test = TestAuthorization()
test.test_auth_invalid(None, 0, "`Authorization` can't be empty")
2. TestRquest
Note: The class name appears to contain a typo; it was likely meant to be TestRequest.
Purpose:
To test how the delete_datasets endpoint handles invalid HTTP request payloads and headers.
Methods:
test_content_type_bad(self, get_http_api_auth)
Tests rejection of unsupported content types in the request header. Expects error code 101 with a message indicating the unsupported content type.
test_payload_bad(self, get_http_api_auth, payload, expected_message)
Parameterized test for malformed or invalid JSON payloads. Checks that the API returns error code 101 and an appropriate error message.
test_payload_unset(self, get_http_api_auth)
Tests the API behavior when no data payload is provided (None). Expects error code 101 with a message about malformed JSON syntax.
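The request-level checks above can be pictured with a minimal sketch. The function `check_request` and its message strings are hypothetical stand-ins, not the endpoint's actual implementation; only the error code 101 and the three failure modes (unsupported content type, malformed JSON, missing payload) come from the tests described here.

```python
import json

ARGUMENT_ERROR = 101  # code the tests expect for bad requests


def check_request(content_type, body):
    """Hypothetical validator; message texts are illustrative only."""
    if content_type != "application/json":
        return {"code": ARGUMENT_ERROR,
                "message": f"Unsupported content type: {content_type}"}
    if body is None:
        # No payload at all is treated like malformed JSON.
        return {"code": ARGUMENT_ERROR, "message": "Malformed JSON syntax"}
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return {"code": ARGUMENT_ERROR, "message": "Malformed JSON syntax"}
    if not isinstance(payload, dict):
        # Valid JSON, but not an object such as {"ids": [...]}.
        return {"code": ARGUMENT_ERROR, "message": "Invalid request payload"}
    return {"code": 0, "payload": payload}
```

Calling `check_request("text/xml", "{}")` or `check_request("application/json", "a")` in this sketch yields code 101, while a well-formed body such as `'{"ids": []}'` passes through with code 0.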
3. TestCapability
Purpose:
To verify the functional capabilities of dataset deletion at scale and under concurrency.
Methods:
test_delete_dataset_1k(self, get_http_api_auth)
Creates 1000 datasets and attempts to delete them all. Validates that deletion succeeds and no datasets remain afterward.
test_concurrent_deletion(self, get_http_api_auth)
Tests concurrent deletion of datasets using ThreadPoolExecutor with 5 workers. Ensures all concurrent delete operations succeed without race conditions or errors.
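The concurrent pattern described above can be sketched as follows. `InMemoryDatasets` is a hypothetical in-memory stand-in for the real service (the actual test talks to the HTTP API through the helpers in common); only the use of ThreadPoolExecutor with 5 workers is taken from the description.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


class InMemoryDatasets:
    """Hypothetical stand-in for the dataset service (assumption)."""

    def __init__(self, count):
        self._lock = threading.Lock()
        self._ids = {f"dataset-{i}" for i in range(count)}

    def list_ids(self):
        with self._lock:
            return sorted(self._ids)

    def delete(self, dataset_id):
        # The lock makes check-and-remove atomic across threads.
        with self._lock:
            if dataset_id not in self._ids:
                return {"code": 108}
            self._ids.remove(dataset_id)
            return {"code": 0}


def delete_concurrently(store, workers=5):
    """Submit one delete per dataset and collect every response."""
    ids = store.list_ids()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(store.delete, i) for i in ids]
        return [f.result() for f in as_completed(futures)]
```

A check in the spirit of the test would then assert that every response carries code 0 and that `store.list_ids()` is empty afterward.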
4. TestDatasetsDelete
Purpose:
To extensively test various edge cases and input validation scenarios specifically related to the dataset IDs being deleted.
Methods:
test_ids(self, get_http_api_auth, add_datasets_func, func, expected_code, expected_message, remaining)
Parameterized test that deletes either a single dataset or multiple datasets, then verifies the remaining count.
test_ids_empty(self, get_http_api_auth)
Tests a deletion request with an empty list of IDs. Expects no deletion and code 0 (success).
test_ids_none(self, get_http_api_auth)
Tests a deletion request with the ids field set to None. Expects all datasets to be deleted.
test_id_not_uuid(self, get_http_api_auth)
Sends an invalid UUID string, expecting error code 101 with an invalid UUID format message.
test_id_not_uuid1(self, get_http_api_auth)
Sends a valid UUID4 (not UUID1), expecting error code 101 with an invalid UUID1 format message.
test_id_wrong_uuid(self, get_http_api_auth)
Sends a UUID that the user lacks permission to delete. Expects error code 108 with a permission error message.
test_ids_partial_invalid(self, get_http_api_auth, add_datasets_func, func)
Tests payloads mixing valid and invalid UUIDs, expecting partial failure with code 108.
test_ids_duplicate(self, get_http_api_auth, add_datasets_func)
Tests a deletion request with duplicate IDs. Expects error code 101 and rejection due to duplicates.
test_repeated_delete(self, get_http_api_auth, add_datasets_func)
Tests deleting the same datasets twice in sequence. The second deletion should fail with code 108 due to lack of permission (the dataset has already been deleted).
test_field_unsupported(self, get_http_api_auth)
Tests payloads with unsupported extra fields. Expects rejection with error code 101 and a message about extra inputs not being permitted.
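To make the ID validation cases concrete, here is a minimal sketch using Python's standard uuid module. The function `validate_ids`, the order of its checks, and its message strings are assumptions for illustration; the tests only establish that malformed UUIDs, non-UUID1 values, and duplicate IDs are all rejected with code 101.

```python
import uuid


def validate_ids(ids):
    """Hypothetical validator returning (code, message)."""
    if len(ids) != len(set(ids)):
        return 101, "Duplicate ids"
    for raw in ids:
        try:
            parsed = uuid.UUID(raw)
        except (ValueError, AttributeError, TypeError):
            # Not parseable as a UUID at all.
            return 101, "Invalid UUID format"
        if parsed.version != 1:
            # Well-formed UUID, but not version 1 (e.g. a UUID4).
            return 101, "Invalid UUID1 format"
    return 0, ""
```

In this sketch `validate_ids([uuid.uuid4().hex])` is rejected even though the value is a well-formed UUID, which mirrors the distinction test_id_not_uuid1 draws between "any UUID" and "UUID1".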
Important Implementation Details and Algorithms
Authorization Testing:
Uses the RAGFlowHttpApiAuth class to simulate various API key states, including invalid tokens.
Concurrent Deletion:
Uses Python's ThreadPoolExecutor to exercise dataset deletion from several threads at once. This checks that the backend can handle multiple simultaneous delete requests without data corruption or race conditions.
UUID Validation:
Tests validate that dataset IDs conform to the UUID1 format specifically, not just any UUID. This matters for consistent dataset identification and permissions enforcement.
Payload Validation:
The tests check for malformed JSON, invalid payload types, missing payloads, duplicate IDs, and unsupported fields to confirm that the API adheres strictly to its expected contract.
Permission Checks:
The tests verify that users cannot delete datasets they do not have permission for, and that repeated deletion attempts for the same dataset are properly rejected.
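The deletion semantics these tests pin down (None deletes everything, an empty list deletes nothing, unknown or already-deleted IDs yield code 108) can be summarized in a small sketch. `fake_delete_datasets` is a hypothetical in-memory model, not the helper imported from common, and the choice to delete nothing when any ID in the request is rejected is an assumption that the description above does not confirm.

```python
def fake_delete_datasets(owned, ids):
    """Hypothetical model; `owned` is the caller's set of dataset IDs."""
    if ids is None:
        # ids=None means "delete every dataset the caller owns".
        owned.clear()
        return {"code": 0}
    missing = [i for i in ids if i not in owned]
    if missing:
        # Unknown, foreign, or already-deleted IDs all look the same
        # to the caller: a permission error. Assumption: nothing is
        # deleted when any ID in the request is rejected.
        return {"code": 108,
                "message": f"No permission for dataset '{missing[0]}'"}
    owned.difference_update(ids)
    return {"code": 0}
```

Under this model, deleting the same ID twice returns 0 and then 108, which is the behavior test_repeated_delete checks for.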
Interaction With Other Parts of the System
Common Module:
Functions like batch_create_datasets, delete_datasets, and list_datasets are imported from common. These provide the necessary API interactions for creating, deleting, and listing datasets.
Authentication:
The RAGFlowHttpApiAuth class from libs.auth is used to simulate HTTP API authentication with tokens.
Test Fixtures:
Fixtures such as get_http_api_auth, add_datasets_func, and add_dataset_func (not defined in this file) are used to prepare the test environment, providing authenticated sessions and pre-created datasets.
Pytest Markers:
Tests are marked with priorities (p1, p2, p3) to indicate their relative importance or scope of testing.
This file is a critical part of the test suite ensuring the stability and security of dataset deletion functionality in the InfiniFlow application.
Visual Diagram: Class Structure
classDiagram
class TestAuthorization {
+test_auth_invalid(auth, expected_code, expected_message)
}
class TestRquest {
+test_content_type_bad(get_http_api_auth)
+test_payload_bad(get_http_api_auth, payload, expected_message)
+test_payload_unset(get_http_api_auth)
}
class TestCapability {
+test_delete_dataset_1k(get_http_api_auth)
+test_concurrent_deletion(get_http_api_auth)
}
class TestDatasetsDelete {
+test_ids(get_http_api_auth, add_datasets_func, func, expected_code, expected_message, remaining)
+test_ids_empty(get_http_api_auth)
+test_ids_none(get_http_api_auth)
+test_id_not_uuid(get_http_api_auth)
+test_id_not_uuid1(get_http_api_auth)
+test_id_wrong_uuid(get_http_api_auth)
+test_ids_partial_invalid(get_http_api_auth, add_datasets_func, func)
+test_ids_duplicate(get_http_api_auth, add_datasets_func)
+test_repeated_delete(get_http_api_auth, add_datasets_func)
+test_field_unsupported(get_http_api_auth)
}
Summary
test_delete_datasets.py is a comprehensive test suite for the dataset deletion API in InfiniFlow. It covers:
Authorization and authentication validation.
Input payload validation including content types and JSON formats.
Functional tests for large-scale and concurrent dataset deletions.
Extensive checks on dataset ID validation, permission enforcement, and edge cases.
Integration with common utility functions and authentication helpers.
By running this suite, developers ensure that the deletion endpoint is secure, robust, and behaves consistently under various scenarios, protecting data integrity and system stability.