Numpy Serialization
Purpose
Numpy Serialization covers how orjson converts numpy data types, such as arrays and scalars, into JSON efficiently. Standard JSON serializers do not natively understand numpy objects, so serializing them either raises an error or requires a slower fallback. This subtopic describes the specialized support that recognizes and serializes numpy types directly, improving both compatibility and performance in scientific and numerical Python codebases.
Functionality
The core feature is an opt-in serialization flag, `OPT_SERIALIZE_NUMPY`, that lets orjson detect numpy objects during serialization and convert them into JSON-compatible representations.
Key workflows include:
- Detection of Numpy Types: When the `OPT_SERIALIZE_NUMPY` option is enabled, orjson inspects Python objects to identify numpy arrays, scalars, and related types.
- Efficient Conversion: Instead of falling back to generic or user-defined serializers (which may be slow or incomplete), numpy arrays are converted into JSON arrays, and numpy scalars into the corresponding JSON numbers or strings.
- Option Passing: Users enable this feature by passing the `OPT_SERIALIZE_NUMPY` flag to the `dumps` function, which triggers the internal specialized serialization path.
- Fallback Interaction: Numpy serialization works alongside fallback serialization functions. If an object is neither a natively supported type nor a numpy type that orjson can handle, orjson calls the `default` function when one was provided and raises an error otherwise.
The following snippet from [bench/run_default](/projects/287/67673) illustrates usage:
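```python
from orjson import dumps, OPT_SERIALIZE_NUMPY

class Custom:  # assumed placeholder: a type orjson cannot serialize natively
    pass

def default(_):
    return None  # fallback for unsupported types

n = 10  # iteration count (assumed; not defined in this excerpt)
obj = [[Custom()] * 1000] * 10  # contains unsupported objects
for _ in range(n):
    dumps(obj, default, OPT_SERIALIZE_NUMPY)
```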
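Here, `OPT_SERIALIZE_NUMPY` enables the numpy serialization path, while the `default` function handles the unsupported `Custom` objects by returning `None`, so each one is serialized as `null`. Note that `obj` in this benchmark contains no numpy data; the snippet shows how the flag and the fallback are passed together.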
Relationship to Parent Topic and Other Subtopics
This subtopic builds upon the **Custom Serialization Support** main topic by providing a targeted, optimized path for numpy data types, which are a common source of serialization issues in scientific Python applications.
Unlike the generic Fallback Serialization Function subtopic, which lets users define how unknown types are serialized, numpy serialization handles numpy types automatically once the flag is set, with no user-written conversion code.
It complements fallback mechanisms by reducing the need for user-defined fallbacks specifically for numpy objects, improving performance and correctness.
By integrating numpy serialization into orjson’s core, this subtopic extends the parent topic’s goal of broader serialization support while maintaining speed and robustness.
Diagram
```mermaid
flowchart TD
    UserCalls[Dumps called with Python object]
    CheckNumpy{Is object a numpy type?}
    SerializeNumpy[Convert numpy to JSON array/scalar]
    FallbackCheck{Fallback serializer provided?}
    CallFallback[Invoke fallback function]
    RaiseError[Raise serialization error]
    OutputJSON[Return JSON bytes]
    UserCalls --> CheckNumpy
    CheckNumpy -- Yes --> SerializeNumpy --> OutputJSON
    CheckNumpy -- No --> FallbackCheck
    FallbackCheck -- Yes --> CallFallback --> OutputJSON
    FallbackCheck -- No --> RaiseError
```
This flowchart highlights how the serialization process prioritizes numpy serialization when enabled, then falls back to custom user handlers or errors otherwise.
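The decision order in the flowchart can be modeled in a few lines of plain Python. This is only a sketch of the control flow, not orjson's actual implementation: `FakeArray`, `dumps_sketch`, and the `serialize_numpy` parameter are hypothetical names, and the standard-library `json` module stands in for orjson's native encoder.

```python
import json

class FakeArray:
    """Hypothetical stand-in for a numpy array; exposes tolist()."""
    def __init__(self, values):
        self._values = values

    def tolist(self):
        return list(self._values)

def dumps_sketch(obj, default=None, serialize_numpy=False):
    """Model the flowchart's decision order and return JSON bytes."""
    def handle(o):
        # json calls this hook only for objects it cannot encode natively.
        if serialize_numpy and isinstance(o, FakeArray):
            return o.tolist()      # "Convert numpy to JSON array/scalar"
        if default is not None:
            return default(o)      # "Invoke fallback function"
        # "Raise serialization error"
        raise TypeError(f"Type is not JSON serializable: {type(o).__name__}")

    return json.dumps(obj, default=handle, separators=(",", ":")).encode()

# Array path taken because the flag is set.
print(dumps_sketch({"a": FakeArray([1, 2])}, serialize_numpy=True))  # b'{"a":[1,2]}'
# Flag not set, so the fallback function decides the output.
print(dumps_sketch(FakeArray([1, 2]), default=lambda o: None))       # b'null'
```

Calling `dumps_sketch(FakeArray([1, 2]))` with neither the flag nor a fallback raises `TypeError`, which corresponds to the error branch of the diagram.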