test_fragment.py
Overview
The [test_fragment.py](/projects/287/67788) file is a suite of unit tests for the `orjson.Fragment` feature of the `orjson` JSON serialization library. A fragment is a pre-serialized JSON snippet that is embedded directly into a larger serialized output without being re-serialized.
This test file checks `orjson.Fragment` in several scenarios: construction, equality comparison, immutability, serialization of `bytes` and `str` contents, integration with pandas through a `default` hook, and a large set of fixture-driven tests that wrap JSON test files in fragments. It covers both inputs that must serialize and inputs that must raise, which helps keep the library's fragment handling correct and stable.
Classes and Their Responsibilities
1. TestFragment
This class contains tests that cover the basic behavior and usage of the `orjson.Fragment` class. It tests equality, immutability, serialization from bytes and strings, error conditions on invalid input, and argument handling during construction.
Key Methods:
- `test_fragment_fragment_eq(self)`: Tests that two `orjson.Fragment` instances with identical content are not considered equal (i.e., equality compares object identity, not content).
- `test_fragment_fragment_not_mut(self)`: Verifies that the `.contents` attribute of a fragment is immutable and raises `AttributeError` if modified.
- `test_fragment_repr(self)`: Checks that the string representation of a fragment instance starts as expected.
- `test_fragment_fragment_bytes(self)`: Tests serialization of fragments constructed from byte strings, including when used inside lists.
- `test_fragment_fragment_str(self)`: Tests serialization of fragments constructed from Python string objects.
- `test_fragment_fragment_str_empty(self)`: Tests serialization of an empty string fragment.
- `test_fragment_fragment_str_str(self)`: Tests serialization of a fragment containing a simple JSON string.
- `test_fragment_fragment_str_emoji(self)`: Tests serialization of a fragment containing Unicode emoji characters.
- `test_fragment_fragment_str_array(self)`: Tests serialization of a large list of identical emoji fragments.
- `test_fragment_fragment_str_invalid(self)`: Confirms that serialization of a string fragment with invalid Unicode raises `orjson.JSONEncodeError`.
- `test_fragment_fragment_bytes_invalid(self)`: Tests serialization of a fragment containing invalid UTF-8 bytes (no error expected).
- `test_fragment_fragment_none(self)`: Ensures that a fragment built from `None` is rejected with `orjson.JSONEncodeError`.
- `test_fragment_fragment_args_zero(self)`: Ensures that constructing a fragment with zero arguments raises `TypeError`.
- `test_fragment_fragment_args_two(self)`: Ensures that constructing a fragment with two positional arguments raises `TypeError`.
- `test_fragment_fragment_keywords(self)`: Ensures that using keyword arguments in fragment construction raises `TypeError`.
- `test_fragment_fragment_arg_and_keywords(self)`: Ensures that mixing positional and keyword arguments in fragment construction raises `TypeError`.
2. TestFragmentPandas
This class contains tests that integrate `orjson.Fragment` with the `pandas` library. It is conditionally skipped if `pandas` is not installed.
Key Methods:
- `test_fragment_pandas(self)`: Demonstrates how a pandas `DataFrame` can be serialized using `orjson.Fragment` by converting the DataFrame to JSON and embedding it as a fragment. It uses a `default` fallback function to detect `pd.DataFrame` instances and convert them appropriately.

Usage example:
```python
import pandas as pd
import orjson


def default(value):
    # Let pandas produce the JSON text, then embed it without re-serializing.
    if isinstance(value, pd.DataFrame):
        return orjson.Fragment(value.to_json(orient="records"))
    # Any other unsupported type is reported back to orjson.
    raise TypeError


df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
result = orjson.dumps({"data": df}, default=default)
# result == b'{"data":[{"foo":1,"bar":4},{"foo":2,"bar":5},{"foo":3,"bar":6}]}'
```
3. TestFragmentParsing
This class contains an extensive set of tests focused on verifying the fragment parsing abilities using a large number of JSON fixture files. It uses a helper method `_run_test` to read JSON fixture files as bytes and attempts to serialize them as `orjson.Fragment` objects.
Key Decorators:
- `@needs_data`: Indicates that these tests require external fixture data files for execution.
Key Methods:
- `_run_test(self, filename: str)`: Reads a fixture file from a "parsing" directory and attempts to serialize its content wrapped in a `Fragment`.
- Many individual test methods named according to the fixture file they test, e.g., `test_fragment_y_array_empty_string(self)`, `test_fragment_n_array_just_comma(self)`, `test_fragment_i_string_utf16LE_no_BOM(self)`, and many others.
Each method calls `_run_test` with the corresponding JSON fixture filename.
**Implementation detail:**
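The tests cover a wide spectrum of valid and invalid JSON inputs, including arrays, objects, strings, numbers, Unicode characters, escape sequences, incomplete data, and edge cases. Because `_run_test` only wraps each file's bytes in a `Fragment` and serializes them, these tests exercise how `orjson.Fragment` handles arbitrary input, well-formed or not, rather than checking a parsed result.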
Important Implementation Details and Algorithms
Fragment Equality
The test `test_fragment_fragment_eq` shows that `orjson.Fragment` objects are compared by identity, not by content, i.e., two fragments holding identical bytes are not equal unless they are the same object.
Immutability
The `contents` attribute of `orjson.Fragment` is immutable after creation. Attempts to mutate it cause an `AttributeError`.
Serialization Behavior
`orjson.dumps` serializes `Fragment` objects by embedding their raw JSON content directly (bypassing encoding). This allows for optimized serialization and composability of pre-encoded JSON snippets.
Input Validation
The tests verify that invalid input types or improper argument usage to `orjson.Fragment` raise appropriate exceptions (`TypeError`, `JSONEncodeError`).
Pandas Integration
The tests demonstrate a recommended pattern for integrating `pandas.DataFrame` serialization with `orjson` by converting DataFrames to JSON strings and wrapping them in `Fragment` to avoid double encoding.
Test Fixture Driven Parsing
The parsing tests rely on many external JSON fixture files organized by categories: valid (`y_`), invalid (`n_`), and interesting edge cases (`i_`). This strategy provides exhaustive coverage of JSON syntax and error handling.
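The prefix convention for fixture files lends itself to a small lookup. The following is an illustrative sketch only: `FixtureCategory` and `classify_fixture` are hypothetical names that do not appear in the test file, and the mapping simply restates the three categories described above.

```python
from enum import Enum


class FixtureCategory(Enum):
    """Fixture categories keyed by filename prefix (hypothetical helper type)."""

    VALID = "y"
    INVALID = "n"
    EDGE_CASE = "i"


def classify_fixture(filename: str) -> FixtureCategory:
    """Map a fixture filename such as 'n_array_just_comma.json' to its category."""
    prefix, separator, _rest = filename.partition("_")
    if not separator:
        raise ValueError(f"fixture name has no category prefix: {filename!r}")
    # Enum lookup by value raises ValueError for an unknown prefix.
    return FixtureCategory(prefix)


categories = {
    name: classify_fixture(name)
    for name in (
        "y_array_empty.json",
        "n_array_just_comma.json",
        "i_string_utf16LE_no_BOM.json",
    )
}
```

A helper like this could be used to group or report results per category, though the test file itself simply encodes the category in each test method's name.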
Interaction with Other Parts of the System
Imports:
- `pytest`: Used as the testing framework for assertions and test skipping.
- `orjson`: The core JSON serialization library whose `Fragment` class is under test.
- `pandas` (optional): Used in integration tests to verify fragment serialization of DataFrames.
- `.util` module: Provides the helpers `needs_data` (a test marker for data-dependent tests) and `read_fixture_bytes` (to load JSON fixture files).
Test Fixtures:
The tests depend on a large set of JSON fixture files, presumably stored in a test data directory, to validate parsing robustness.
Test Execution:
These tests are intended to be run with a test runner like `pytest`. Passing all tests indicates `orjson.Fragment` behaves correctly under a wide range of scenarios.
File Structure Mermaid Diagram
classDiagram
class TestFragment {
+test_fragment_fragment_eq()
+test_fragment_fragment_not_mut()
+test_fragment_repr()
+test_fragment_fragment_bytes()
+test_fragment_fragment_str()
+test_fragment_fragment_str_empty()
+test_fragment_fragment_str_str()
+test_fragment_fragment_str_emoji()
+test_fragment_fragment_str_array()
+test_fragment_fragment_str_invalid()
+test_fragment_fragment_bytes_invalid()
+test_fragment_fragment_none()
+test_fragment_fragment_args_zero()
+test_fragment_fragment_args_two()
+test_fragment_fragment_keywords()
+test_fragment_fragment_arg_and_keywords()
}
class TestFragmentPandas {
+test_fragment_pandas()
}
class TestFragmentParsing {
-_run_test(filename: str)
+test_fragment_y_array_arraysWithSpace()
+test_fragment_y_array_empty_string()
+test_fragment_y_array_empty()
+test_fragment_y_array_ending_with_newline()
+test_fragment_y_array_false()
+test_fragment_y_array_heterogeneou()
+test_fragment_y_array_null()
+test_fragment_y_array_with_1_and_newline()
+test_fragment_y_array_with_leading_space()
+test_fragment_y_array_with_several_null()
+test_fragment_y_array_with_trailing_space()
+test_fragment_y_number()
+test_fragment_y_number_0e_1()
+test_fragment_y_number_0e1()
+test_fragment_y_number_after_space()
+test_fragment_y_number_double_close_to_zer()
+test_fragment_y_number_int_with_exp()
+test_fragment_y_number_minus_zer()
+test_fragment_y_number_negative_int()
+test_fragment_y_number_negative_one()
+test_fragment_y_number_negative_zer()
+test_fragment_y_number_real_capital_e()
+test_fragment_y_number_real_capital_e_neg_exp()
+test_fragment_y_number_real_capital_e_pos_exp()
+test_fragment_y_number_real_exponent()
+test_fragment_y_number_real_fraction_exponent()
+test_fragment_y_number_real_neg_exp()
+test_fragment_y_number_real_pos_exponent()
+test_fragment_y_number_simple_int()
+test_fragment_y_number_simple_real()
+test_fragment_y_object()
+test_fragment_y_object_basic()
+test_fragment_y_object_duplicated_key()
+test_fragment_y_object_duplicated_key_and_value()
+test_fragment_y_object_empty()
+test_fragment_y_object_empty_key()
+test_fragment_y_object_escaped_null_in_key()
+test_fragment_y_object_extreme_number()
+test_fragment_y_object_long_string()
+test_fragment_y_object_simple()
+test_fragment_y_object_string_unicode()
+test_fragment_y_object_with_newline()
+test_fragment_y_string_1_2_3_bytes_UTF_8_sequence()
+test_fragment_y_string_accepted_surrogate_pair()
+test_fragment_y_string_accepted_surrogate_pairs()
+test_fragment_y_string_allowed_escape()
+test_fragment_y_string_backslash_and_u_escaped_zer()
+test_fragment_y_string_backslash_double_escape_a()
+test_fragment_y_string_backslash_double_escape_n()
+test_fragment_y_string_backslash_doublequote()
+test_fragment_y_string_comment()
+test_fragment_y_string_double_escape_a()
+test_fragment_y_string_double_escape_()
+test_fragment_y_string_escaped_control_character()
+test_fragment_y_string_escaped_noncharacter()
+test_fragment_y_string_in_array()
+test_fragment_y_string_in_array_with_leading_space()
+test_fragment_y_string_last_surrogates_1_and_2()
+test_fragment_y_string_nbsp_uescaped()
+test_fragment_y_string_nonCharacterInUTF_8_U_10FFFF()
+test_fragment_y_string_nonCharacterInUTF_8_U_FFFF()
+test_fragment_y_string_null_escape()
+test_fragment_y_string_one_byte_utf_8()
+test_fragment_y_string_pi()
+test_fragment_y_string_reservedCharacterInUTF_8_U_1BFFF()
+test_fragment_y_string_simple_ascii()
+test_fragment_y_string_space()
+test_fragment_y_string_surrogates_U_1D11E_MUSICAL_SYMBOL_G_CLEF()
+test_fragment_y_string_three_byte_utf_8()
+test_fragment_y_string_two_byte_utf_8()
+test_fragment_y_string_u_2028_line_sep()
+test_fragment_y_string_u_2029_par_sep()
+test_fragment_y_string_uEscape()
+test_fragment_y_string_uescaped_newline()
+test_fragment_y_string_unescaped_char_delete()
+test_fragment_y_string_unicode()
+test_fragment_y_string_unicodeEscapedBackslash()
+test_fragment_y_string_unicode_2()
+test_fragment_y_string_unicode_U_10FFFE_nonchar()
+test_fragment_y_string_unicode_U_1FFFE_nonchar()
+test_fragment_y_string_unicode_U_200B_ZERO_WIDTH_SPACE()
+test_fragment_y_string_unicode_U_2064_invisible_plu()
+test_fragment_y_string_unicode_U_FDD0_nonchar()
+test_fragment_y_string_unicode_U_FFFE_nonchar()
+test_fragment_y_string_unicode_escaped_double_quote()
+test_fragment_y_string_utf8()
+test_fragment_y_string_with_del_character()
+test_fragment_y_structure_lonely_false()
+test_fragment_y_structure_lonely_int()
+test_fragment_y_structure_lonely_negative_real()
+test_fragment_y_structure_lonely_null()
+test_fragment_y_structure_lonely_string()
+test_fragment_y_structure_lonely_true()
+test_fragment_y_structure_string_empty()
+test_fragment_y_structure_trailing_newline()
+test_fragment_y_structure_true_in_array()
+test_fragment_y_structure_whitespace_array()
+test_fragment_n_array_1_true_without_comma()
+test_fragment_n_array_a_invalid_utf8()
+test_fragment_n_array_colon_instead_of_comma()
+test_fragment_n_array_comma_after_close()
+test_fragment_n_array_comma_and_number()
+test_fragment_n_array_double_comma()
+test_fragment_n_array_double_extra_comma()
+test_fragment_n_array_extra_close()
+test_fragment_n_array_extra_comma()
+test_fragment_n_array_incomplete()
+test_fragment_n_array_incomplete_invalid_value()
+test_fragment_n_array_inner_array_no_comma()
+test_fragment_n_array_invalid_utf8()
+test_fragment_n_array_items_separated_by_semicol()
+test_fragment_n_array_just_comma()
+test_fragment_n_array_just_minu()
+test_fragment_n_array_missing_value()
+test_fragment_n_array_newlines_unclosed()
+test_fragment_n_array_number_and_comma()
+test_fragment_n_array_number_and_several_comma()
+test_fragment_n_array_spaces_vertical_tab_formfeed()
+test_fragment_n_array_star_inside()
+test_fragment_n_array_unclosed()
+test_fragment_n_array_unclosed_trailing_comma()
+test_fragment_n_array_unclosed_with_new_line()
+test_fragment_n_array_unclosed_with_object_inside()
+test_fragment_n_incomplete_false()
+test_fragment_n_incomplete_null()
+test_fragment_n_incomplete_true()
+test_fragment_n_multidigit_number_then_00()
+test_fragment_n_number__()
+test_fragment_n_number_1()
+test_fragment_n_number_Inf()
+test_fragment_n_number_01()
+test_fragment_n_number_1_0()
+test_fragment_n_number_2()
+test_fragment_n_number_negative_NaN()
+test_fragment_n_number_negative_1()
+test_fragment_n_number_2e_3()
+test_fragment_n_number_0_1_2()
+test_fragment_n_number_0_3e_()
+test_fragment_n_number_0_3e()
+test_fragment_n_number_0_e1()
+test_fragment_n_number_0_capital_E_()
+test_fragment_n_number_0_capital_E()
+test_fragment_n_number_0e_()
+test_fragment_n_number_0e()
+test_fragment_n_number_1_0e_()
+test_fragment_n_number_1_0e_2()
+test_fragment_n_number_1_0e()
+test_fragment_n_number_1_000()
+test_fragment_n_number_1eE2()
+test_fragment_n_number_2_e_3()
+test_fragment_n_number_2_e_3_2()
+test_fragment_n_number_2_e3_3()
+test_fragment_n_number_9_e_()
+test_fragment_n_number_negative_Inf()
+test_fragment_n_number_NaN()
+test_fragment_n_number_U_FF11_fullwidth_digit_one()
+test_fragment_n_number_expressi()
+test_fragment_n_number_hex_1_digit()
+test_fragment_n_number_hex_2_digit()
+test_fragment_n_number_infinity()
+test_fragment_n_number_invalid_()
+test_fragment_n_number_invalid_negative_real()
+test_fragment_n_number_invalid_utf_8_in_bigger_int()
+test_fragment_n_number_invalid_utf_8_in_exponent()
+test_fragment_n_number_invalid_utf_8_in_int()
+test_fragment_n_number_minus_infinity()
+test_fragment_n_number_minus_sign_with_trailing_garbage()
+test_fragment_n_number_minus_space_1()
+test_fragment_n_number_neg_int_starting_with_zer()
+test_fragment_n_number_neg_real_without_int_part()
+test_fragment_n_number_neg_with_garbage_at_end()
+test_fragment_n_number_real_garbage_after_e()
+test_fragment_n_number_real_with_invalid_utf8_after_e()
+test_fragment_n_number_real_without_fractional_part()
+test_fragment_n_number_starting_with_dot()
+test_fragment_n_number_with_alpha()
+test_fragment_n_number_with_alpha_char()
+test_fragment_n_number_with_leading_zer()
+test_fragment_n_object_bad_value()
+test_fragment_n_object_bracket_key()
+test_fragment_n_object_comma_instead_of_col()
+test_fragment_n_object_double_col()
+test_fragment_n_object_emoji()
+test_fragment_n_object_garbage_at_end()
+test_fragment_n_object_key_with_single_quote()
+test_fragment_n_object_lone_continuation_byte_in_key_and_trailing_comma()
+test_fragment_n_object_missing_col()
+test_fragment_n_object_missing_key()
+test_fragment_n_object_missing_semicol()
+test_fragment_n_object_missing_value()
+test_fragment_n_object_no_col()
+test_fragment_n_object_non_string_key()
+test_fragment_n_object_non_string_key_but_huge_number_instead()
+test_fragment_n_object_repeated_null_null()
+test_fragment_n_object_several_trailing_comma()
+test_fragment_n_object_single_quote()
+test_fragment_n_object_trailing_comma()
+test_fragment_n