test_escape.py
Overview
`test_escape.py` is a test suite focused on verifying the correct serialization (JSON encoding) of various special and control Unicode characters by the `orjson` library. `orjson` is a fast, correct JSON library for Python that produces UTF-8 encoded JSON output.
This file ensures that specific whitespace, control characters, and escape sequences are correctly escaped according to JSON standards when serialized by `orjson.dumps()`. The tests primarily check that characters like newline, carriage return, backspace, tab, and others are transformed into their proper escaped representations, ensuring compatibility and correctness of JSON output.
Detailed Explanation
Imports
import orjson
Imports the `orjson` library, which is used for JSON serialization.
Test Functions
Each test function asserts that serializing a single character (or, in `test_issue565`, a short sequence of characters) produces the expected escaped byte string.
General Pattern
def test_xxx():
    assert orjson.dumps("<unicode_char>") == b'"<expected_escaped_string>"'
Purpose: Validate that `orjson.dumps()` correctly escapes the input Unicode character.
Parameters: None (each test is a standalone function).
Return Value: None. The function raises an `AssertionError` if the serialized output does not match the expected escaped string.
Usage: These are unit tests, typically run via a test runner like `pytest`.
Examples
1. test_issue565
def test_issue565():
    assert (
        orjson.dumps("\n\r\u000b\f\u001c\u001d\u001e")
        == b'"\\n\\r\\u000b\\f\\u001c\\u001d\\u001e"'
    )
Purpose: Tests multiple special characters in one string to ensure they are escaped correctly.
Input Characters:
`\n` (newline)
`\r` (carriage return)
`\u000b` (vertical tab)
`\f` (form feed)
`\u001c`, `\u001d`, `\u001e` (control characters)
Expected Output: JSON string with properly escaped sequences.
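One way to read the expected bytes in `test_issue565` is to decode them again and confirm that each escape maps back to the original character. The sketch below does this with the standard library's `json.loads` as a stand-in decoder (an assumption made so the example runs without `orjson` installed); the byte string and input string are copied from the test.

```python
import json

# Expected output and input, copied from test_issue565.
expected = b'"\\n\\r\\u000b\\f\\u001c\\u001d\\u001e"'
original = "\n\r\u000b\f\u001c\u001d\u001e"

# Decoding the escaped JSON should give back the seven input characters.
decoded = json.loads(expected)

# List the code points to make the invisible control characters readable.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
```

Here `\n`, `\r`, and `\f` use short escapes, while the vertical tab and the three characters U+001C to U+001E have no short form in JSON and appear as `\uXXXX` escapes.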
2. test_0x00 through test_0x1a
Each function tests a specific control character in the Unicode range U+0000 to U+001A, verifying the escaping behavior.
Example:
def test_0x00():
    assert orjson.dumps("\u0000") == b'"\\u0000"'
Character: Null byte (U+0000)
Expected Output: JSON escaped string `"\u0000"`
Some characters have dedicated short escape sequences, e.g.:
def test_0x08():
    assert orjson.dumps("\u0008") == b'"\\b"'
Character: Backspace (U+0008)
Expected Output: Escaped as `"\b"` instead of `"\u0008"`
3. test_backslash
def test_backslash():
    assert orjson.dumps("\\") == b'"\\\\"'
Purpose: Verify that the backslash character (`\`) is escaped as `\\` in JSON.
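The assertion in `test_backslash` stacks two layers of escaping: Python's string-literal escaping and JSON's own. The sketch below unpacks the expected value byte by byte. It uses the standard library's `json.dumps` as a stand-in encoder (an assumption, so the example runs without `orjson`); the variable names are illustrative only.

```python
import json

# In Python source, "\\" is a string holding ONE backslash character.
source = "\\"

# The literal b'"\\\\"' holds FOUR bytes: quote, backslash, backslash, quote.
expected = b'"\\\\"'
byte_values = list(expected)

# A raw bytes literal spells the same value without the Python-level doubling.
raw_expected = rb'"\\"'

# Stand-in encoder: the JSON text escapes the single backslash as two.
encoded = json.dumps(source).encode("utf-8")
```

The same reasoning applies to the double quote test that follows: the expected value `b'"\\""'` is four bytes, with a single JSON-level backslash in front of the quote.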
4. test_quote
def test_quote():
    assert orjson.dumps('"') == b'"\\""'
Purpose: Verify that the double quote character (`"`) is escaped as `\"` in JSON.
Important Implementation Details
Character Escaping:
`orjson.dumps()` returns a UTF-8 encoded `bytes` object representing the JSON string. Special characters in strings must be escaped according to the JSON specification:
Control characters (U+0000 to U+001F) are escaped either as a short escape sequence (e.g., `\n`, `\r`, `\t`) or as a Unicode escape (e.g., `\u000b`).
The backslash and double quote characters are escaped with a preceding backslash.
Testing Approach:
Each unit test is minimal and atomic, testing exactly one character or a small sequence. This makes it easy to pinpoint escaping issues in `orjson`.
Use of Byte Strings:
The expected output is a byte string (`b"..."`) because `orjson.dumps()` returns bytes, not a string.
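The escaping rule for control characters can be written down as a small reference table. The sketch below is not part of `test_escape.py`: `SHORT_ESCAPES`, `expected_escape`, and `check_all_control_characters` are hypothetical names, and the cross-check uses the standard library's `json.dumps` as a stand-in so the example runs without `orjson` installed. To exercise `orjson` instead, the comparison could presumably be changed to `orjson.dumps(ch) == expected_escape(ch)`.

```python
import json

# Control characters that have a dedicated short escape in the JSON grammar.
SHORT_ESCAPES = {
    "\b": "\\b",
    "\t": "\\t",
    "\n": "\\n",
    "\f": "\\f",
    "\r": "\\r",
}


def expected_escape(ch: str) -> bytes:
    """Return the expected JSON bytes for one control character (hypothetical helper)."""
    if len(ch) != 1 or ord(ch) > 0x1F:
        raise ValueError("expected a single character in U+0000..U+001F")
    # Fall back to a lowercase \uXXXX escape when no short form exists.
    body = SHORT_ESCAPES.get(ch, f"\\u{ord(ch):04x}")
    return f'"{body}"'.encode("utf-8")


def check_all_control_characters() -> int:
    """Compare the table against the stand-in encoder; return the number checked."""
    checked = 0
    for code_point in range(0x20):
        ch = chr(code_point)
        assert json.dumps(ch).encode("utf-8") == expected_escape(ch)
        checked += 1
    return checked
```

A table-driven check like this trades the one-function-per-character layout of the test file for brevity; the per-character layout has the advantage that a failure names the exact code point in the test report.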
Interaction with Other Parts of the System
This file is part of the test suite for the `orjson` library or an application that depends on it.
It ensures consistent and standard-compliant JSON serialization of special Unicode characters.
Typically, this file is run in a test environment alongside other test files to validate the integrity and correctness of the JSON serialization functionality.
No other modules are imported or invoked here besides `orjson`, and no classes are defined.
Summary
File Type: Unit Test File
Focus: JSON serialization escaping correctness of special characters using `orjson.dumps()`.
Usage: Executed during testing to validate `orjson` behavior.
Contains: Multiple small test functions, each asserting correct output for a specific Unicode character.
Visual Diagram: Flowchart of Test Functions and Relationships
flowchart TD
A[Start: Run test_escape.py] --> B{Test Functions}
B --> C[test_issue565]
B --> D[test_0x00_to_0x1A]
B --> E[test_backslash]
B --> F[test_quote]
D --> D1[test_0x00]
D --> D2[test_0x01]
D --> D3[test_0x02]
D --> D4[test_0x03]
D --> D5[test_0x04]
D --> D6[test_0x05]
D --> D7[test_0x06]
D --> D8[test_0x07]
D --> D9[test_0x08]
D --> D10[test_0x09]
D --> D11[test_0x0a]
D --> D12[test_0x0b]
D --> D13[test_0x0c]
D --> D14[test_0x0d]
D --> D15[test_0x0e]
D --> D16[test_0x0f]
D --> D17[test_0x10]
D --> D18[test_0x11]
D --> D19[test_0x12]
D --> D20[test_0x13]
D --> D21[test_0x14]
D --> D22[test_0x15]
D --> D23[test_0x16]
D --> D24[test_0x17]
D --> D25[test_0x18]
D --> D26[test_0x19]
D --> D27[test_0x1a]
style B fill:#f9f,stroke:#333,stroke-width:2px
style D fill:#bbf,stroke:#333,stroke-width:1px