test_update_document.py
Overview
The test_update_document.py file contains automated tests for verifying the behavior and robustness of the document update functionality within the InfiniFlow system. It uses the pytest framework to define and run parameterized test cases that cover a wide range of scenarios, including authorization handling, validation of document attributes such as name and metadata, chunking methods, and parser configuration options.
This test suite ensures that the API endpoint responsible for updating documents enforces business rules, validates input parameters properly, and handles error conditions gracefully. The tests help maintain the integrity and reliability of the document update feature by catching regressions and inconsistencies early during development.
Detailed Explanation of Classes and Tests
Imports
pytest: Testing framework used for defining and running tests.
DOCUMENT_NAME_LIMIT, INVALID_API_TOKEN, list_documnets, update_documnet from common: Constants and API interaction functions.
RAGFlowHttpApiAuth from libs.auth: Authentication helper class for API requests.
1. TestAuthorization
Tests the authorization mechanism for updating documents.
Method: test_invalid_auth
Parameters:
auth: The authentication object or None.
expected_code: Expected status code returned by the API.
expected_message: Expected error message string.
Description:
This test verifies that the document update endpoint rejects requests with missing or invalid authorization tokens.
Usage example:
    auth = None
    expected_code = 0
    expected_message = "`Authorization` can't be empty"
    res = update_documnet(auth, "dataset_id", "document_id")
    assert res["code"] == expected_code
    assert res["message"] == expected_message
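The parameterized shape of this test can be sketched as a case table driven through a loop. This is a minimal offline sketch, not the suite's actual code: fake_update_documnet, VALID_TOKEN, the "bad-token" case, and the 109 code with its message are hypothetical stand-ins. Only the first case (None, 0, and its message) comes from the usage example above.

```python
from typing import Optional

VALID_TOKEN = "valid-token"  # hypothetical token, for illustration only


def fake_update_documnet(token: Optional[str], dataset_id: str, document_id: str) -> dict:
    """Hypothetical stand-in for update_documnet that only models auth failures."""
    if token is None:
        return {"code": 0, "message": "`Authorization` can't be empty"}
    if token != VALID_TOKEN:
        # Assumed placeholder code and message for an invalid token.
        return {"code": 109, "message": "Authentication error: API key is invalid!"}
    return {"code": 0}


# Each tuple mirrors the (auth, expected_code, expected_message) parameters.
CASES = [
    (None, 0, "`Authorization` can't be empty"),
    ("bad-token", 109, "Authentication error: API key is invalid!"),
]


def run_cases() -> list:
    """Run every case and record whether code and message both matched."""
    results = []
    for token, expected_code, expected_message in CASES:
        res = fake_update_documnet(token, "dataset_id", "document_id")
        results.append(res["code"] == expected_code and res["message"] == expected_message)
    return results
```

In a pytest suite the same table would typically be passed to pytest.mark.parametrize instead of being looped over by hand.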
2. TestDocumentsUpdated
Contains multiple tests validating document update behavior with various payload inputs.
Method: test_name
Parameters:
get_http_api_auth: Fixture providing valid API authentication.
add_documents: Fixture to add documents and return dataset and document IDs.
name: The new name to assign to the document.
expected_code: Expected response code.
expected_message: Expected error message.
Description:
Tests validation around the document name, including length restrictions, extension immutability, and duplication handling.
Important Cases:
Names exceeding byte limit.
Non-string name types (int or None).
Attempting to change file extensions.
Duplicate document names in the same dataset.
Usage example:
    res = update_documnet(auth, dataset_id, document_id, {"name": "new_name.txt"})
    assert res["code"] == 0
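The name rules listed for test_name could be expressed roughly as the checks below. This is a sketch under stated assumptions, not the server's implementation: check_new_name, its error strings, and the value 128 for DOCUMENT_NAME_LIMIT are hypothetical. The real constant is imported from the common module.

```python
import os

DOCUMENT_NAME_LIMIT = 128  # assumed value; the real constant comes from the common module


def check_new_name(old_name, new_name, existing_names):
    """Return a (hypothetical) error string, or None when the rename is acceptable."""
    # Non-string names such as int or None are rejected first.
    if not isinstance(new_name, str):
        return "name must be a string"
    # The limit is measured in encoded bytes, not in characters.
    if len(new_name.encode("utf-8")) > DOCUMENT_NAME_LIMIT:
        return "name exceeds the byte limit"
    # The extension must stay the same as the original file's.
    old_ext = os.path.splitext(old_name)[1].lower()
    new_ext = os.path.splitext(new_name)[1].lower()
    if new_ext != old_ext:
        return "the file extension cannot be changed"
    # Another document in the same dataset already uses this name.
    if new_name != old_name and new_name in existing_names:
        return "duplicate name in the same dataset"
    return None
```

Counting bytes rather than characters matters for multi-byte names: a name of 74 characters can still exceed a 128-byte limit once encoded as UTF-8.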
Method: test_invalid_document_id
Tests invalid or empty document IDs, verifying proper error codes and messages.
Method: test_invalid_dataset_id
Tests invalid or empty dataset IDs, ensuring correct authorization and ownership checks.
Method: test_meta_fields
Parameters:
meta_fields: Metadata dictionary or invalid types.
Description:
Verifies that the meta_fields parameter accepts only dictionaries and rejects invalid types.
Method: test_chunk_method
Parameters:
chunk_method: The chunking method to apply (e.g., "naive", "manual", "qa").
Description:
Validates the chunking method field. Accepts predefined chunk methods or defaults to "naive" if empty. Rejects unknown methods.
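The accept, default, and reject behavior described here might look like the following. It is only a sketch: resolve_chunk_method and KNOWN_CHUNK_METHODS are hypothetical names, and the set contains just the three methods named in this document, not the full list the API accepts.

```python
# Assumed subset: only the methods named in this document.
KNOWN_CHUNK_METHODS = {"naive", "manual", "qa"}


def resolve_chunk_method(chunk_method: str) -> str:
    """Return the effective chunk method, or raise ValueError for unknown ones."""
    if chunk_method == "":
        # An empty value falls back to the default method.
        return "naive"
    if chunk_method not in KNOWN_CHUNK_METHODS:
        raise ValueError(f"unknown chunk_method: {chunk_method!r}")
    return chunk_method
```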
Method: test_invalid_field
Tests for forbidden or immutable fields that should not be changed by update requests, such as chunk_count, create_date, progress, token_count, etc.
Some cases are marked to be skipped due to known issues (referenced by issue numbers).
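One way to picture the immutable-field rule is a filter over the payload keys. The sketch below is illustrative only: find_forbidden_fields and IMMUTABLE_FIELDS are hypothetical names, and the set holds just the four fields named above, while the real endpoint guards more.

```python
# Only the fields named in this document; the real list is longer.
IMMUTABLE_FIELDS = {"chunk_count", "create_date", "progress", "token_count"}


def find_forbidden_fields(payload: dict) -> list:
    """Return the payload keys that an update request must not change, sorted."""
    return sorted(key for key in payload if key in IMMUTABLE_FIELDS)
```

A payload such as {"name": "a.txt", "progress": 1.0} would be flagged because of progress, while {"name": "a.txt"} alone would pass this particular check.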
3. TestUpdateDocumentParserConfig
Focuses on testing the parser_config options in combination with chunk_method.
Method: test_parser_config
Parameters:
chunk_method: Chunking method used.
parser_config: Dictionary of parser configuration options.
expected_code: Expected response code.
expected_message: Expected error message.
Description:
This test validates the structure and value constraints of the parser_config dictionary, including keys like chunk_token_num, layout_recognize, html4excel, delimiter, task_page_size, and nested configs like raptor.
Validation rules include:
Numeric ranges for chunk_token_num, task_page_size, auto_keywords, auto_questions, and topn_tags.
Boolean values for html4excel.
Allowed string values for layout_recognize.
No unknown keys allowed.
Many tests are skipped due to ongoing issues.
On success, the updated parser_config is verified to be correctly saved.
Usage example:
    parser_config = {
        "chunk_token_num": 128,
        "layout_recognize": "DeepDOC",
        "html4excel": False,
        "delimiter": r"\n",
        "task_page_size": 12,
        "raptor": {"use_raptor": False},
    }
    res = update_documnet(auth, dataset_id, document_id, {"chunk_method": "naive", "parser_config": parser_config})
    assert res["code"] == 0
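The validation rules for parser_config can be illustrated with a small checker. Everything specific in it is an assumption made for the sketch: the function name validate_parser_config, the numeric bounds, the allowed layout_recognize values, and the error strings are hypothetical and are not taken from the API. Only the key names and the kinds of rule (numeric range, boolean, allowed string, no unknown keys) come from the description above.

```python
from typing import Optional

# Assumed bounds, for illustration only.
INT_RANGES = {
    "chunk_token_num": (1, 100_000_000),
    "task_page_size": (1, 100_000_000),
    "auto_keywords": (0, 32),
    "auto_questions": (0, 10),
    "topn_tags": (1, 10),
}
LAYOUT_VALUES = {"DeepDOC", "Plain Text"}  # assumed allowed values
UNCHECKED_KEYS = {"delimiter", "raptor"}   # accepted here without further checks


def validate_parser_config(cfg: dict) -> Optional[str]:
    """Return a (hypothetical) error string, or None when the config passes."""
    for key, value in cfg.items():
        if key in INT_RANGES:
            low, high = INT_RANGES[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                return f"{key} must be an integer"
            if not low <= value <= high:
                return f"{key} must be in [{low}, {high}]"
        elif key == "html4excel":
            if not isinstance(value, bool):
                return "html4excel must be a boolean"
        elif key == "layout_recognize":
            if not isinstance(value, str) or value not in LAYOUT_VALUES:
                return "layout_recognize has an unsupported value"
        elif key not in UNCHECKED_KEYS:
            return f"unknown key: {key}"
    return None
```

The explicit bool exclusion is a deliberate choice: without it, True would pass as the integer 1 for fields such as chunk_token_num.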
Important Implementation Details
The tests rely heavily on parameterization to cover a broad range of input scenarios.
Skipped tests are annotated with issue references, indicating known bugs or work-in-progress features.
The tests interact with helper functions update_documnet and list_documnets imported from a common module to perform API actions.
Authentication is handled through the RAGFlowHttpApiAuth class, simulating real API key usage.
Error codes and messages are validated against expected outputs to ensure API contract consistency.
The document's name validation enforces byte size limits and prevents file extension changes to maintain file integrity.
Parser configuration tests verify strict type and value constraints to prevent misconfiguration.
Integration with Other Parts of the System
This file is a critical part of the testing suite for the InfiniFlow document management API.
It depends on the common module which provides constants and API helper functions.
Utilizes authentication classes from libs.auth.
Works with fixtures such as get_http_api_auth and add_documents that are presumably defined elsewhere to set up test contexts.
Ensures the API endpoint for updating documents adheres to the expected behavior, impacting the reliability of document updates across the system.
Visual Diagram
classDiagram
class TestAuthorization {
+test_invalid_auth(auth, expected_code, expected_message)
}
class TestDocumentsUpdated {
+test_name(get_http_api_auth, add_documents, name, expected_code, expected_message)
+test_invalid_document_id(get_http_api_auth, add_documents, document_id, expected_code, expected_message)
+test_invalid_dataset_id(get_http_api_auth, add_documents, dataset_id, expected_code, expected_message)
+test_meta_fields(get_http_api_auth, add_documents, meta_fields, expected_code, expected_message)
+test_chunk_method(get_http_api_auth, add_documents, chunk_method, expected_code, expected_message)
+test_invalid_field(get_http_api_auth, add_documents, payload, expected_code, expected_message)
}
class TestUpdateDocumentParserConfig {
+test_parser_config(get_http_api_auth, add_documents, chunk_method, parser_config, expected_code, expected_message)
}
TestAuthorization --> update_documnet : calls
TestDocumentsUpdated --> update_documnet : calls
TestDocumentsUpdated --> list_documnets : calls for verification
TestUpdateDocumentParserConfig --> update_documnet : calls
TestUpdateDocumentParserConfig --> list_documnets : calls for verification
Summary
test_update_document.py is a comprehensive test suite designed to validate the document update API functionalities of the InfiniFlow project. It covers authentication, input validation for document attributes, chunking and parser configuration, and ensures that invalid inputs are rejected appropriately. The tests are structured with pytest and utilize extensive parameterization to ensure robustness and coverage. This file plays a vital role in maintaining API integrity and preventing regressions in document update workflows.