data.py


Overview

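The `data.py` file is a small configuration and utility module in the benchmarking suite for JSON serialization and deserialization libraries. It wraps the standard library's `json.dumps` so that it returns UTF-8 encoded bytes, maps each library name to its serialize and deserialize functions in `libraries`, and lists the fixture files used as benchmark input in `fixtures`.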

This setup facilitates easy parametrization of benchmarks over multiple libraries and datasets, enabling consistent comparisons between different JSON processing implementations.


Detailed Explanation of Components

Imports

from json import dumps as _json_dumps
from json import loads as json_loads

from orjson import dumps as orjson_dumps
from orjson import loads as orjson_loads

Function: json_dumps

def json_dumps(obj):
    # json.dumps returns str; encode it so the result is bytes, like orjson.dumps
    return _json_dumps(obj).encode("utf-8")

# Example usage
data = {"key": "value"}
serialized_bytes = json_dumps(data)
print(type(serialized_bytes))  # <class 'bytes'>
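
The reason for the wrapper can be seen with the standard library alone: `json.dumps` returns `str`, whereas `orjson.dumps` returns `bytes`, so encoding the result puts both serializers on the same output type. A minimal sketch follows; the `sample` object is illustrative and not taken from the benchmark data.

```python
import json


def json_dumps(obj):
    # Same shape as the wrapper above: serialize to str, then encode to UTF-8 bytes.
    return json.dumps(obj).encode("utf-8")


sample = {"key": "value", "n": [1, 2, 3]}

as_str = json.dumps(sample)
as_bytes = json_dumps(sample)

print(type(as_str).__name__)    # str
print(type(as_bytes).__name__)  # bytes

# json.loads accepts str, bytes, or bytearray, so the bytes round-trip directly.
assert json.loads(as_bytes) == sample
```

Because `json.loads` accepts bytes as well as text, the unwrapped loader can be paired with the wrapped dumper without any decoding step.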

Variable: libraries

libraries = {
    "orjson": (orjson_dumps, orjson_loads),
    "json": (json_dumps, json_loads),
}
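
# Example usage: round-trip `data` (defined above) through each library
for lib_name, (dumper, loader) in libraries.items():
    serialized = dumper(data)          # bytes for both libraries
    deserialized = loader(serialized)  # back to a Python object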
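
The registry pattern can also be sketched so that it still runs where orjson is not installed. The `try`/`except ImportError` guard and the `round_trip` helper below are assumptions added for illustration; neither is part of `data.py`.

```python
import json


def json_dumps(obj):
    # Normalize the standard library's str output to bytes.
    return json.dumps(obj).encode("utf-8")


# Start with the standard library entry, which is always available.
libraries = {"json": (json_dumps, json.loads)}

try:
    import orjson
except ImportError:
    pass  # orjson is optional in this sketch
else:
    libraries["orjson"] = (orjson.dumps, orjson.loads)


def round_trip(obj):
    """Hypothetical helper: map each library name to its deserialized copy of obj."""
    return {name: loader(dumper(obj)) for name, (dumper, loader) in libraries.items()}


results = round_trip({"key": "value"})
for name, value in results.items():
    print(name, value == {"key": "value"})
```

Keeping each entry as a `(dumper, loader)` tuple means calling code never branches on the library name; adding another library is a one-line change to the dictionary.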

Variable: fixtures

fixtures = [
    "canada.json",
    "citm_catalog.json",
    "github.json",
    "twitter.json",
]
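
# Example usage (read_fixture_obj is a helper defined elsewhere in the suite, not in data.py)
for fixture in fixtures:
    data = read_fixture_obj(fixture + ".xz")  # fixture files carry an .xz suffix on disk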
    for lib_name, (dumper, loader) in libraries.items():
        serialized = dumper(data)
        deserialized = loader(serialized)
        # Perform benchmarking or correctness checks
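
`read_fixture_obj` is not defined in `data.py`, so its behavior is not shown here. A minimal sketch of what such a helper might look like, assuming the fixtures are xz-compressed JSON files in a single directory, is given below. The `data_dir` parameter, the function body, and the throwaway `demo.json.xz` file are all assumptions for illustration, not the suite's actual implementation.

```python
import json
import lzma
import tempfile
from pathlib import Path


def read_fixture_obj(filename, data_dir):
    """Hypothetical loader: decompress an .xz fixture and parse it as JSON."""
    raw = lzma.decompress((Path(data_dir) / filename).read_bytes())
    return json.loads(raw)


# Demonstrate with a temporary fixture instead of the real benchmark files.
with tempfile.TemporaryDirectory() as tmp:
    payload = {"type": "FeatureCollection", "features": []}
    compressed = lzma.compress(json.dumps(payload).encode("utf-8"))
    (Path(tmp) / "demo.json.xz").write_bytes(compressed)
    loaded = read_fixture_obj("demo.json.xz", tmp)

print(loaded == payload)
```

Loading and parsing the fixture once, outside the per-library loop, keeps decompression cost out of whatever is being measured.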

Implementation Details and Algorithms


Interaction with Other System Components


Mermaid Flowchart

The file defines no classes, only functions and variables, so a **flowchart** of their relationships is used here instead of a class diagram.

flowchart TD
    A["json_dumps(obj)"] -->|wraps| B["json.dumps(obj) -> str"]
    B --> C["encode('utf-8') -> bytes"]

    subgraph Libraries
        D["orjson_dumps(obj) -> bytes"]
        E["orjson_loads(bytes) -> obj"]
        F["json_dumps(obj) -> bytes"]
        G["json_loads(str or bytes) -> obj"]
    end

    H[libraries dict]
    H --> D
    H --> E
    H --> F
    H --> G

    I[fixtures list]
    I --> J["Fixture filenames: canada.json, citm_catalog.json, ..."]

Summary


Usage Example in Benchmark Code

import data

for fixture_name in data.fixtures:
    # Load the fixture object; read_fixture_obj is a helper assumed to be
    # provided elsewhere in the benchmark suite (it is not defined in data.py)
    obj = read_fixture_obj(fixture_name + ".xz")

    for lib_name, (dumper, loader) in data.libraries.items():
        serialized = dumper(obj)
        deserialized = loader(serialized)
        assert deserialized == obj  # correctness check

This example shows how the `data.py` definitions enable concise and uniform benchmark implementations.
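
For timing rather than correctness, the same loop can wrap each call in a timer. The sketch below uses `time.perf_counter` and only the standard-library `json` entry so that it runs without extra dependencies. The `time_call` helper, the repeat count, and the synthetic object are assumptions; the actual suite may measure differently.

```python
import json
import time


def json_dumps(obj):
    return json.dumps(obj).encode("utf-8")


libraries = {"json": (json_dumps, json.loads)}


def time_call(func, arg, repeat=5):
    """Hypothetical helper: best wall-clock time of func(arg) over `repeat` runs, in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(arg)
        best = min(best, time.perf_counter() - start)
    return best


# Synthetic input standing in for a loaded fixture.
obj = {"items": [{"id": i, "name": f"item-{i}"} for i in range(1000)]}

timings = {}
for lib_name, (dumper, loader) in libraries.items():
    serialized = dumper(obj)
    timings[lib_name] = (time_call(dumper, obj), time_call(loader, serialized))
    dumps_s, loads_s = timings[lib_name]
    print(f"{lib_name}: dumps {dumps_s:.6f}s, loads {loads_s:.6f}s")
```

Taking the minimum over several runs is one common way to reduce noise from other processes; a dedicated benchmarking tool would give more robust statistics.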

