test_canonical.py
Overview
The [test_canonical.py](/projects/287/67683) file contains a set of automated unit tests designed to verify the correctness of JSON serialization behavior using the `orjson` library. Specifically, the tests focus on ensuring that the `orjson.dumps()` function correctly escapes special and control characters in strings, including control characters, quotes, backslashes, and Unicode line separator characters.
This file is part of a test suite that validates the canonical JSON output format produced by `orjson`, which is a fast, correct JSON library for Python. The tests help maintain the correctness and stability of JSON serialization for edge cases involving character escaping.
Classes and Methods
Class: TestCanonicalTests
This class groups together test methods that verify `orjson.dumps()` behavior with respect to escaping certain characters in strings.
Methods
test_dumps_ctrl_escape(self)
Purpose: Tests that control characters (non-printable ASCII characters) in strings are properly escaped during JSON serialization.
Parameters: None (other than
self)Returns: None (assertion based test)
Details:
The test serializes the string
"text\u0003\r\n"where\u0003is a control character (End of Text),\ris carriage return, and\nis newline.It asserts that the output is the JSON-encoded byte string with the control characters escaped as Unicode escape sequences.
Example usage:
assert orjson.dumps("text\u0003\r\n") == b'"text\\u0003\\r\\n"'Notes:
Ensures control characters are not output as raw bytes but as escaped sequences adhering to JSON standards.
test_dumps_escape_quote_backslash(self)
Purpose: Ensures that quotes (
") and backslashes (\) in strings are properly escaped in JSON output.Parameters: None (other than
self)Returns: None (assertion based test)
Details:
The test serializes a raw string containing a quote and a backslash:
r'"\ test'.It verifies that the resulting JSON byte string correctly escapes the quote as
\"and the backslash as\\.
Example usage:
assert orjson.dumps(r'"\ test') == b'"\\"\\\\ test"'Notes:
This prevents JSON syntax errors and ensures strings are safely encoded.
test_dumps_escape_line_separator(self)
Purpose: Tests the escaping of Unicode line separator characters U+2028 and U+2029 in JSON serialization.
Parameters: None (other than
self)Returns: None (assertion based test)
Details:
Serializes a dictionary containing the string
"\u2028 \u2029".Checks that the output contains the UTF-8 encoded byte sequences for these characters.
Example usage:
assert orjson.dumps({"spaces": "\u2028 \u2029"}) == b'{"spaces":"\xe2\x80\xa8 \xe2\x80\xa9"}'Notes:
U+2028 and U+2029 are line separator characters that can cause issues in JavaScript string literals if not properly encoded.
orjsonoutputs them as UTF-8 bytes, not as escaped Unicode sequences, which is valid JSON but might differ from some serializers.
Implementation Details and Algorithms
The tests leverage Python's
assertstatements to validate that the output oforjson.dumps()matches expected byte strings.orjson.dumps()serializes Python objects to JSON in bytes.The focus is on verifying how special characters are escaped or encoded in the JSON output.
No explicit algorithms are implemented here; the file functions purely as a test harness.
The file uses docstrings for each test method to briefly describe the test purpose.
Interaction with Other System Components
This file depends directly on the
orjsonlibrary, which must be installed in the environment.It is likely part of a larger test suite for the
orjsonproject or for any software that usesorjsonfor JSON serialization.These tests help ensure that any changes to
orjson.dumps()do not break canonical escaping behavior.The file would be run using a Python test runner such as
pytestorunittest.It does not interact with other application components but serves as a developer tool for maintaining serialization correctness.
Visual Diagram
classDiagram
class TestCanonicalTests {
+test_dumps_ctrl_escape()
+test_dumps_escape_quote_backslash()
+test_dumps_escape_line_separator()
}
TestCanonicalTests ..> orjson : uses
Summary
[test_canonical.py](/projects/287/67683) is a concise test file verifying that the `orjson` JSON serializer correctly escapes control characters, quotes, backslashes, and Unicode line separator characters in JSON strings. It ensures compliance with JSON standards and prevents serialization bugs related to string encoding. This file is a critical part of the quality assurance process for projects relying on `orjson` for JSON handling.