test_stop_parse_documents.py

Overview

This file contains automated test cases for the "stop parse documents" functionality of the InfiniFlow system's document processing API. Its main purpose is to validate that the API endpoint responsible for stopping document parsing behaves correctly across a range of scenarios: authorization failures, invalid inputs, partial successes, concurrent requests, and large-scale document operations.

The tests check that the system maintains data integrity, enforces access controls, and updates document states correctly when parse operations are interrupted. The file uses the pytest framework for test management and assertions, and relies on helper functions from the common test utilities and on the system's HTTP API authentication.


Detailed Explanation of Components

Helper Functions

validate_document_parse_done(auth, dataset_id, document_ids)


validate_document_parse_cancel(auth, dataset_id, document_ids)
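A minimal sketch of how a state check of this shape could work, shown for illustration only. It assumes that listing a single document returns a response laid out as `{"data": {"docs": [...]}}` and that each document carries a `run` field holding a state such as `"CANCEL"` or `"DONE"`; both are assumptions, not confirmed by this file. The names `check_documents_state` and `fake_list_documents` are hypothetical, and the fake stands in for the real listing call so the example runs without a server.

```python
def check_documents_state(list_fn, auth, dataset_id, document_ids, expected_run):
    """Assert that every requested document reports the expected run state.

    list_fn is any callable taking (auth, dataset_id, params) and returning
    a listing response; the response layout used below is an assumption.
    """
    for document_id in document_ids:
        res = list_fn(auth, dataset_id, {"id": document_id})
        docs = res["data"]["docs"]  # assumed response layout
        assert len(docs) == 1, f"document {document_id} not found"
        actual = docs[0]["run"]  # assumed field name
        assert actual == expected_run, (
            f"{document_id}: expected {expected_run}, got {actual}"
        )


def fake_list_documents(auth, dataset_id, params):
    """In-memory stand-in for the real listing call (illustration only)."""
    store = {"doc-1": "CANCEL", "doc-2": "CANCEL", "doc-3": "DONE"}
    doc_id = params["id"]
    docs = [{"id": doc_id, "run": store[doc_id]}] if doc_id in store else []
    return {"code": 0, "data": {"docs": docs}}


# Usage: both calls pass silently because the states match.
check_documents_state(fake_list_documents, None, "ds-1", ["doc-1", "doc-2"], "CANCEL")
check_documents_state(fake_list_documents, None, "ds-1", ["doc-3"], "DONE")
```

Passing the listing function as a parameter keeps the check independent of the HTTP layer, so the same assertion logic can serve both a "done" and a "cancel" variant.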


Test Classes

class TestAuthorization


class TestDocumentsParseStop


Individual Test Functions Outside Classes

test_stop_parse_100_files(get_http_api_auth, add_dataset_func, tmp_path)


test_concurrent_parse(get_http_api_auth, add_dataset_func, tmp_path)
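A hedged sketch of the general pattern a concurrent stop test might follow: submit one stop request per document from a thread pool, then check every response and every final state. `FakeParseService`, `stop_concurrently`, the `"RUNNING"` and `"CANCEL"` state names, and the response codes are all hypothetical stand-ins chosen for illustration; the real test would call the HTTP helper instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeParseService:
    """In-memory stand-in for the server (illustration only)."""

    def __init__(self, document_ids):
        self._lock = threading.Lock()
        self._state = {doc_id: "RUNNING" for doc_id in document_ids}

    def stop_parse(self, document_ids):
        # Validate first, then update, so a bad request changes nothing.
        with self._lock:
            for doc_id in document_ids:
                if doc_id not in self._state:
                    # Nonzero code chosen arbitrarily for this sketch.
                    return {"code": 1, "message": f"unknown document {doc_id}"}
            for doc_id in document_ids:
                self._state[doc_id] = "CANCEL"
        return {"code": 0}

    def state_of(self, doc_id):
        with self._lock:
            return self._state[doc_id]


def stop_concurrently(service, document_ids, max_workers=5):
    """Send one stop request per document in parallel and collect responses."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(service.stop_parse, [doc_id])
            for doc_id in document_ids
        ]
        return [future.result() for future in futures]


ids = [f"doc-{i}" for i in range(20)]
service = FakeParseService(ids)
responses = stop_concurrently(service, ids)
assert all(r["code"] == 0 for r in responses)
assert all(service.state_of(doc_id) == "CANCEL" for doc_id in ids)
```

Collecting `future.result()` for every submission matters here: an exception raised inside a worker thread only surfaces when its result is read, so skipping that step could hide failures.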


Important Implementation Details and Algorithms
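Tests of this kind usually need to wait for server-side state to settle before asserting on it. The following is a minimal sketch of a polling wait, assuming a condition callable that returns True once the expected state is reached. The name `wait_for` and its `timeout` and `interval` parameters are hypothetical and are not taken from this file's utilities.

```python
import time


def wait_for(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns True or the timeout elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage: the condition becomes true on its third call.
calls = {"n": 0}


def ready():
    calls["n"] += 1
    return calls["n"] >= 3


assert wait_for(ready, timeout=2.0, interval=0.01)
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted during the test run.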


Interaction With Other System Parts


Visual Diagram

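The following Mermaid class diagram summarizes the test classes, standalone test functions, and helper functions in this file. Functions are drawn as classes with their parameters listed as members, and dashed arrows mark calls into shared API helpers.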

classDiagram
    class validate_document_parse_done {
        +auth
        +dataset_id: str
        +document_ids: list[str]
    }
    class validate_document_parse_cancel {
        +auth
        +dataset_id: str
        +document_ids: list[str]
    }
    class TestAuthorization {
        +test_invalid_auth(auth, expected_code, expected_message)
    }
    class TestDocumentsParseStop {
        +test_basic_scenarios(get_http_api_auth, add_documents_func, payload, expected_code, expected_message)
        +test_invalid_dataset_id(get_http_api_auth, add_documents_func, invalid_dataset_id, expected_code, expected_message)
        +test_stop_parse_partial_invalid_document_id(get_http_api_auth, add_documents_func, payload)
        +test_repeated_stop_parse(get_http_api_auth, add_documents_func)
        +test_duplicate_stop_parse(get_http_api_auth, add_documents_func)
    }
    class test_stop_parse_100_files {
        +get_http_api_auth
        +add_dataset_func
        +tmp_path
    }
    class test_concurrent_parse {
        +get_http_api_auth
        +add_dataset_func
        +tmp_path
    }
    
    TestAuthorization ..> stop_parse_documnets : calls
    TestDocumentsParseStop ..> stop_parse_documnets : calls
    test_stop_parse_100_files ..> stop_parse_documnets : calls
    test_concurrent_parse ..> stop_parse_documnets : calls
    validate_document_parse_done ..> list_documnets : calls
    validate_document_parse_cancel ..> list_documnets : calls

Summary

The test_stop_parse_documents.py file is a pytest suite that verifies the correctness, robustness, and access control of the "stop parse documents" API feature in InfiniFlow. It covers authorization failures, invalid and partially invalid inputs, repeated and duplicate stop requests, concurrent requests, and a 100-file case, with helper functions handling state validation and synchronization. Together these tests check that stopping document parsing behaves predictably and leaves documents in the expected state.