test_create_document.py
Overview
test_create_document.py is a test suite designed to validate the functionality and robustness of the document creation feature in the InfiniFlow system. It primarily focuses on testing the create_document API endpoint, ensuring that it handles authorization, input validation, concurrency, and integration with knowledge bases (KBs) correctly.
The file uses the pytest framework for structuring tests, parametrization, and fixtures. It covers both negative and positive test scenarios, including edge cases like empty file names, maximum name length, special characters, invalid KB IDs, and concurrent document uploads.
Detailed Explanation
Imports and Dependencies
string: Used for string operations, particularly for sanitizing filenames with special characters.
concurrent.futures (ThreadPoolExecutor, as_completed): Supports concurrent execution of document creation requests to test parallel uploads.
pytest: Testing framework used for test case declarations, parametrization, and fixtures.
create_document, list_kbs (from common): Core API functions to create documents and list knowledge bases.
DOCUMENT_NAME_LIMIT, INVALID_API_TOKEN (from configs): Configuration constants used for validation.
RAGFlowWebApiAuth (from libs.auth): Authentication handler for API requests.
create_txt_file (from utils.file_utils): Utility to create temporary text files for testing.
Test Classes
1. TestAuthorization
Tests related to authorization checks when creating documents.
Method: test_invalid_auth
Purpose: To verify that the system correctly rejects document creation requests with invalid or missing authorization tokens.
Parameters (via pytest.mark.parametrize):
invalid_auth: Authentication object or None to simulate no auth.
expected_code: Expected HTTP or internal error code (401 for unauthorized).
expected_message: Expected error message indicating unauthorized access.
Returns: None. Uses assertions to validate response.
Usage Example:
res = create_document(None) # No auth
assert res["code"] == 401
assert res["message"] == "<Unauthorized '401: Unauthorized'>"
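The same check can be written as a table of cases. The sketch below is a self-contained illustration, not the suite's own code: fake_create_document is a hypothetical stand-in for create_document, the token strings are placeholders, and the real suite would feed such a table through pytest.mark.parametrize rather than a plain loop.

```python
# Hypothetical stand-in for create_document; it mimics only the
# unauthorized response described above so the example runs offline.
VALID_TOKEN = "valid-token"  # placeholder, not a real credential


def fake_create_document(auth, payload=None):
    if auth != VALID_TOKEN:
        return {"code": 401, "message": "<Unauthorized '401: Unauthorized'>"}
    return {"code": 0, "message": "success", "data": payload or {}}


# (auth, expected_code, expected_message), mirroring the parametrized inputs.
CASES = [
    (None, 401, "<Unauthorized '401: Unauthorized'>"),
    ("invalid-token", 401, "<Unauthorized '401: Unauthorized'>"),
]


def run_auth_cases():
    """Run every case and return the codes that were observed."""
    codes = []
    for auth, expected_code, expected_message in CASES:
        res = fake_create_document(auth)
        assert res["code"] == expected_code, res
        assert res["message"] == expected_message, res
        codes.append(res["code"])
    return codes


auth_codes = run_auth_cases()
```

Passing the whole response as the assertion message (the trailing `, res`) is a small design choice that makes a failing case show the full payload in the test report.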
2. TestDocumentCreate
Tests the document creation functionality under various input conditions.
Method: test_filename_empty
Purpose: Ensures the API rejects documents with empty file names.
Parameters:
WebApiAuth: Valid authentication fixture.
add_dataset_func: Fixture returning a valid knowledge base ID.
Payload: {"name": "", "kb_id": kb_id}
Expected Result: Error code 101 and message "File name can't be empty."
Method: test_filename_max_length
Purpose: Validates that document creation succeeds for file names at the maximum allowed length (DOCUMENT_NAME_LIMIT).
Parameters:
WebApiAuth: Valid auth.
add_dataset_func: Valid KB.
tmp_path: Temporary directory fixture.
Implementation: Creates a text file with a name length close to the limit, then attempts creation.
Expected Result: Success code 0 and returned document name matches the input file name.
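One way to build such a file name is to pad a stem so that stem plus extension reaches the limit. The sketch below makes several assumptions: NAME_LIMIT is a placeholder value (the suite reads DOCUMENT_NAME_LIMIT from configs), make_txt_file is a hypothetical stand-in for create_txt_file, and a standard-library temporary directory replaces the tmp_path fixture.

```python
import tempfile
from pathlib import Path

NAME_LIMIT = 128  # placeholder; the suite reads DOCUMENT_NAME_LIMIT from configs


def name_at_limit(limit, suffix=".txt"):
    """Return a file name whose total length, suffix included, equals limit."""
    if limit <= len(suffix):
        raise ValueError("limit must exceed the suffix length")
    return "a" * (limit - len(suffix)) + suffix


def make_txt_file(path):
    """Hypothetical stand-in for create_txt_file: write a small text file."""
    path.write_text("sample content\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    fp = make_txt_file(Path(tmp) / name_at_limit(NAME_LIMIT))
    created_name = fp.name       # this is what would be sent as "name"
    file_existed = fp.exists()   # captured before the directory is removed
```

Subtracting the suffix length keeps the extension inside the budget, so the name sent to the API is exactly as long as the limit rather than four characters over it.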
Method: test_invalid_kb_id
Purpose: Tests handling of invalid knowledge base identifiers.
Parameters:
WebApiAuth: Valid auth.
Payload: {"name": "ragflow_test.txt", "kb_id": "invalid_kb_id"}
Expected Result: Error code 102 and message "Can't find this knowledgebase!"
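The two rejections documented so far (empty file name, unknown KB) can be mimicked offline with a small stub. Everything in this sketch is illustrative: stub_create_document and KNOWN_KB_IDS are hypothetical names, and the order in which the stub checks the name and the KB ID is an assumption, not something the tests confirm about the server.

```python
KNOWN_KB_IDS = {"kb-1"}  # hypothetical IDs; real ones come from add_dataset_func


def stub_create_document(payload):
    """Hypothetical stand-in that mimics the two documented rejections."""
    if not payload.get("name"):
        return {"code": 101, "message": "File name can't be empty."}
    if payload.get("kb_id") not in KNOWN_KB_IDS:
        return {"code": 102, "message": "Can't find this knowledgebase!"}
    return {"code": 0, "data": {"name": payload["name"], "kb_id": payload["kb_id"]}}


empty_name = stub_create_document({"name": "", "kb_id": "kb-1"})
bad_kb = stub_create_document({"name": "ragflow_test.txt", "kb_id": "invalid_kb_id"})
accepted = stub_create_document({"name": "ragflow_test.txt", "kb_id": "kb-1"})
```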
Method: test_filename_special_characters
Purpose: Confirms that filenames with special characters are sanitized or accepted properly.
Parameters:
WebApiAuth: Valid auth.
add_dataset_func: Valid KB.
Implementation Details:
Defines illegal characters < > : " / \ | ? *.
Uses str.maketrans to replace illegal chars with underscores.
Constructs a safe filename and attempts document creation.
Expected Result: Success code 0, KB ID matches, and filename matches the sanitized name.
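The replacement step can be sketched with a translation table. The character set is the one listed above; using string.punctuation as the input is an assumption about how the sanitizer might be exercised, and sanitize is a hypothetical helper name.

```python
import string

ILLEGAL_CHARS = '<>:"/\\|?*'

# Map every illegal character to an underscore.
TRANSLATION = str.maketrans({ch: "_" for ch in ILLEGAL_CHARS})


def sanitize(name):
    """Replace each illegal character with "_"; other characters pass through."""
    return name.translate(TRANSLATION)


# string.punctuation contains all nine illegal characters, so it makes a
# compact worst-case input.
safe_stem = sanitize(string.punctuation)
filename = f"{safe_stem}.txt"
```

Because str.translate substitutes one character for one character, the sanitized name keeps the length of the original, which makes the result easy to compare against the name the API returns.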
Method: test_concurrent_upload
Purpose: Validates the system's ability to handle concurrent document uploads without errors or data corruption.
Parameters:
WebApiAuth: Valid auth.
add_dataset_func: Valid KB.
Implementation Details:
Creates 20 filenames.
Uses ThreadPoolExecutor with 5 workers to upload files concurrently.
Collects all results and verifies success.
Checks that the knowledge base document count matches the number of uploaded files.
Expected Result: All uploads succeed (code == 0), and KB's document count equals 20.
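The upload pattern described above can be sketched without a server. In this illustration FakeKnowledgeBase is a hypothetical in-memory stand-in for the KB and the create_document call, and the file name pattern is invented for the example; only the 20 files and 5 workers come from the description.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


class FakeKnowledgeBase:
    """Hypothetical in-memory stand-in for a KB; guards its list with a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.documents = []

    def create_document(self, name):
        with self._lock:
            self.documents.append(name)
        return {"code": 0, "data": {"name": name}}


def upload_concurrently(kb, names, max_workers=5):
    """Submit one creation request per name and gather every response."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(kb.create_document, n) for n in names]
        # as_completed yields futures in completion order, not submission order.
        return [f.result() for f in as_completed(futures)]


kb = FakeKnowledgeBase()
names = [f"ragflow_test_{i}.txt" for i in range(20)]
responses = upload_concurrently(kb, names)
```

Calling result() on each future re-raises any exception from a worker thread, so a failed upload surfaces in the test instead of being silently dropped. The final count check then compares the KB's document total with len(names).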
Important Implementation Details
Authorization Testing: Uses RAGFlowWebApiAuth with invalid tokens and None to simulate unauthorized requests.
Parameterization: pytest.mark.parametrize allows running the same test with multiple input values, improving coverage.
File Name Sanitization: Special characters are replaced with underscores to avoid invalid file names.
Concurrency Handling: Uses Python's ThreadPoolExecutor to simulate multiple simultaneous uploads, testing thread safety and atomicity of document creation.
Use of Fixtures: WebApiAuth, add_dataset_func, and tmp_path are pytest fixtures that provide reusable setup code (e.g., authentication, dataset creation, temporary file paths).
Interactions with Other System Components
create_document API: Central function under test, likely interfaces with backend services to store document metadata and content.
list_kbs API: Used to verify the state of knowledge bases after document creation.
RAGFlowWebApiAuth: Provides authentication tokens for API requests.
create_txt_file Utility: Generates temporary files for upload testing.
Knowledge Base Management: Tests rely on adding or referencing knowledge bases (kb_id), indicating that documents are linked to KB entities.
Mermaid Diagram
The diagram below illustrates the test classes, their key methods, and relationships to external functions and fixtures.
classDiagram
class TestAuthorization {
+test_invalid_auth(invalid_auth, expected_code, expected_message)
}
class TestDocumentCreate {
+test_filename_empty(WebApiAuth, add_dataset_func)
+test_filename_max_length(WebApiAuth, add_dataset_func, tmp_path)
+test_invalid_kb_id(WebApiAuth)
+test_filename_special_characters(WebApiAuth, add_dataset_func)
+test_concurrent_upload(WebApiAuth, add_dataset_func)
}
class create_document {
<<function>>
}
class list_kbs {
<<function>>
}
class RAGFlowWebApiAuth {
<<class>>
}
class create_txt_file {
<<function>>
}
class Fixtures {
<<package>>
WebApiAuth
add_dataset_func
tmp_path
}
TestAuthorization --> create_document : calls
TestAuthorization --> RAGFlowWebApiAuth : uses (invalid token)
TestDocumentCreate --> create_document : calls
TestDocumentCreate --> list_kbs : calls
TestDocumentCreate --> create_txt_file : calls
TestDocumentCreate --> Fixtures : uses
Summary
test_create_document.py is the quality assurance module for document creation in the InfiniFlow system. It tests authorization, input validation, edge cases, and concurrency to check that the create_document API behaves reliably and securely. The file uses pytest parametrization and fixtures, and includes a concurrent-upload test to mirror real-world usage.
By validating interactions with knowledge bases and ensuring proper error handling, this test suite helps maintain the integrity and usability of the document creation workflow within the InfiniFlow platform.