Custom Serialization Support

Overview

The **Custom Serialization Support** module handles Python objects that the JSON encoder cannot serialize natively, such as instances of user-defined classes or other complex types. It lets callers supply a fallback serialization function (`default`) that converts unsupported objects into JSON-compatible values, so encoding can proceed instead of failing.

Additionally, this module extends serialization capabilities to specialized data types such as NumPy arrays, enabling their efficient conversion to JSON. This broadens the library’s applicability in scientific and data-intensive Python applications where NumPy is prevalent.


Core Concepts and Purpose

Why Custom Serialization?

Standard JSON serializers handle basic Python types (e.g., `dict`, `list`, `str`, `int`) but fail on custom classes and other complex objects: without customization, an attempt to serialize them raises an error. The Custom Serialization Support module solves this by letting the caller supply a fallback function that converts such objects into JSON-compatible values.

Problems Addressed


How the Module Works

Fallback Serialization Function

The module accepts a user-defined fallback function, commonly named `default`, which the serializer calls whenever it encounters an object that is not directly serializable. The function should return a JSON-compatible value (e.g., a string, number, list, or dictionary). Returning `None` is treated as a legitimate value and serialized as `null`; to signal that an object cannot be converted, the function should raise an exception such as `TypeError`.

Example from `bench/run_default`:

from orjson import dumps, OPT_SERIALIZE_NUMPY

class Custom:
    pass

def default(_):
    return None

obj = [[Custom()] * 1000] * 10
dumps(obj, default, OPT_SERIALIZE_NUMPY)

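Here, `default` returns `None` for every object it receives, so each `Custom` instance appears as `null` in the JSON output. This avoids a serialization error, at the cost of dropping the object's contents.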
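A fallback more commonly converts the types it knows about and raises `TypeError` for everything else, so that unhandled objects surface as errors rather than as silent `null` values. The following is a minimal sketch of that pattern. The `Point` class is hypothetical, and the standard library's `json.dumps` stands in for the encoder so the block runs without third-party packages; its `default` hook follows the same convert-or-raise convention. With orjson installed, the same callable would be passed as `dumps(obj, default=default)`.

```python
import json


class Point:
    """Hypothetical user-defined type used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def default(obj):
    # Convert the types we know about into JSON-compatible values.
    if isinstance(obj, Point):
        return {"x": obj.x, "y": obj.y}
    # Signal "not handled" by raising. A bare return would yield None,
    # which the encoder would emit as null.
    raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")


encoded = json.dumps([Point(1, 2), Point(3, 4)], default=default)
```

Raising for unknown types keeps mistakes visible: an object the fallback was never written to handle stops the call instead of quietly becoming `null`.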

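NumPy Serialization Option

To efficiently serialize NumPy arrays, the module offers an option flag (`OPT_SERIALIZE_NUMPY`) that activates specialized serialization paths for NumPy data types. When the flag is set, NumPy arrays are serialized directly by the optimized path instead of being treated as unsupported objects and handed to the fallback.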
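When the flag is not set, or for array-like objects that the optimized path does not accept, a fallback can convert the array to a plain list with `tolist()`. The sketch below illustrates that idea under two stated substitutions: the standard library's `array.array` stands in for `numpy.ndarray` (both expose `tolist()`), and `json.dumps` stands in for the encoder, so the block runs without NumPy or orjson installed.

```python
import json
from array import array


def default(obj):
    # Duck-typed check: anything exposing tolist() is converted to a plain list.
    tolist = getattr(obj, "tolist", None)
    if callable(tolist):
        return tolist()
    raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")


samples = array("d", [1.0, 2.5, 4.0])  # stand-in for a numpy.ndarray
encoded = json.dumps({"samples": samples}, default=default)
```

Converting through `tolist()` builds an intermediate Python list, which is the overhead the option flag is meant to avoid for arrays it supports.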

Workflow Summary

  1. Serialization starts with the top-level Python object.

  2. Type inspection is performed for each element.

  3. If an element is unsupported, the default fallback is invoked.

  4. The fallback either returns a JSON-serializable value (`None` is serialized as `null`) or raises an exception to signal that the object cannot be handled.

  5. Serialization continues, applying optimized paths for types like NumPy arrays if enabled.

  6. The final JSON bytes result is returned.
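The steps above can be modelled in pure Python. The sketch below is a toy, not the library's Rust implementation: `to_jsonable`, `toy_dumps`, and the `MAX_DEFAULT_DEPTH` cap are hypothetical names chosen for illustration, the NumPy path of step 5 is omitted, and the standard library's `json` module does the final encoding.

```python
import json

MAX_DEFAULT_DEPTH = 254  # assumed cap on chained fallback calls


def to_jsonable(obj, default=None, _depth=0):
    """Toy model of the workflow: returns a structure of native JSON types."""
    # Steps 1-2: inspect the type of the current element.
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item, default) for item in obj]
    if isinstance(obj, dict):
        return {key: to_jsonable(value, default) for key, value in obj.items()}
    # Step 3: unsupported element, so the fallback is consulted.
    if default is None or _depth >= MAX_DEFAULT_DEPTH:
        raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")
    # Step 4: the fallback's result is inspected again, since it may
    # itself be (or contain) an unsupported object.
    return to_jsonable(default(obj), default, _depth + 1)


def toy_dumps(obj, default=None):
    # Step 6: the final result is returned as bytes.
    return json.dumps(to_jsonable(obj, default)).encode("utf-8")


class Custom:
    pass


result = toy_dumps([Custom(), 1, "a"], default=lambda _: None)
```

The depth counter guards against a fallback that keeps returning unsupported objects; without it, such a fallback would recurse until the interpreter's own limit is hit.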


Interaction with Other System Components

The fallback mechanism acts as a bridge between Python's rich object ecosystem and the strict JSON format, coordinated between the Rust core serialization logic and the Python API layer.


Important Concepts and Design Patterns


Code Snippet Illustrating Key Interaction

from orjson import dumps, OPT_SERIALIZE_NUMPY

class Custom:
    pass

def default(obj):
    # Replace unsupported objects with null
    return None

obj = [[Custom()] * 1000] * 10

# Serialize with the fallback and the NumPy option enabled.
# The option has no effect on this payload, which contains no arrays.
json_bytes = dumps(obj, default=default, option=OPT_SERIALIZE_NUMPY)

This snippet exemplifies how the fallback function integrates with the core serialization call and how option flags influence behavior.


Visualization: Serialization Workflow with Fallback and NumPy Support

flowchart TD
    A[Start Serialization] --> B{Is object natively serializable?}
    B -- Yes --> C[Standard serialization process]
    B -- No --> I{Is object a NumPy array and OPT_SERIALIZE_NUMPY enabled?}
    I -- Yes --> J[Use optimized NumPy serialization]
    I -- No --> D{Is fallback function provided?}
    D -- No --> E[Raise Serialization Error]
    D -- Yes --> F[Invoke fallback function]
    F --> G{Fallback outcome?}
    G -- Returns a value --> B
    G -- Raises an exception --> E
    C --> L[Continue serialization]
    J --> L
    L --> M[Complete Serialization]

This flowchart clarifies the decision points during serialization when encountering unsupported objects and the role of the fallback function and NumPy serialization option.


Summary

The **Custom Serialization Support** module enhances the JSON serialization process by providing extensibility through a user-defined fallback function and optimized handling of specialized types like NumPy arrays. It ensures robustness and flexibility in serializing diverse Python objects, integrating seamlessly with the core serialization engine and Python API. The design leverages callback patterns and option flags to balance performance and usability in complex serialization scenarios.