serializer.rs
Overview
[serializer.rs](/projects/287/67745) is a core serialization module designed to convert Python objects (represented as raw pointers to `pyo3_ffi::PyObject`) into a serializable Rust data structure using the `serde` framework. It provides a unified interface to serialize various Python object types—including primitives, collections, datetime objects, enums, numpy arrays, and dataclasses—into a format that can be written out (for example, JSON or other serde-supported formats).
At its heart, the file defines the `PyObjectSerializer` struct which implements `serde::Serialize`. It dynamically dispatches serialization logic based on the runtime Python object type, leveraging specialized serializers per type. The main entry point function is `serialize()`, which wraps this process and handles output buffering and formatting options such as pretty printing and newline appending.
This file acts as the bridge between raw Python FFI objects and Rust’s strongly typed serialization ecosystem, enabling seamless conversion of Python data structures for consumption or storage in Rust applications.
Detailed Documentation
Function: serialize
pub(crate) fn serialize(
    ptr: *mut pyo3_ffi::PyObject,
    default: Option<NonNull<pyo3_ffi::PyObject>>,
    opts: Opt,
) -> Result<NonNull<pyo3_ffi::PyObject>, String>
Purpose
Serializes the Python object behind a raw pointer into a byte buffer and returns that buffer as a new Python `bytes` object. Honors serialization options such as pretty printing and newline appending.
Parameters
- `ptr: *mut pyo3_ffi::PyObject`: Pointer to the Python object to serialize.
- `default: Option<NonNull<pyo3_ffi::PyObject>>`: Optional pointer to a default function/object used for fallback serialization of unsupported types.
- `opts: Opt`: Serialization option bitflags controlling behavior such as indentation and newline appending.
Returns
`Result<NonNull<pyo3_ffi::PyObject>, String>`
On success, returns a non-null pointer to a Python `bytes` object containing the serialized data. On failure, returns a string with an error message.
Behavior
- Creates a `BytesWriter` buffer to accumulate serialized output.
- Constructs a `PyObjectSerializer` with the object pointer, serialization state, and default handler.
- Chooses between compact and pretty-printed serialization based on the options.
- Writes the serialized output into the buffer via `to_writer` or `to_writer_pretty`.
- On success, finalizes and returns the buffer as a Python `bytes` object.
- On error, decrements the reference count of the buffer and returns the error string.
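The control flow described above can be sketched without any Python or serde dependency. Everything in this block is a hypothetical stand-in: `Value` replaces the Python object, a `Vec<u8>` replaces `BytesWriter`, `write_compact` and `write_pretty` replace `to_writer` and `to_writer_pretty`, and the bit positions of `INDENT_2` and `APPEND_NEWLINE` are assumed. It is meant only to show the shape of the function, not its actual implementation.

```rust
// Hypothetical stand-ins; the bit positions below are assumptions.
type Opt = u32;
const INDENT_2: Opt = 1 << 0;
const APPEND_NEWLINE: Opt = 1 << 1;

/// Toy value standing in for a Python object.
enum Value {
    Int(i64),
    List(Vec<Value>),
    Unsupported,
}

/// Compact output, playing the role of `to_writer`.
fn write_compact(buf: &mut Vec<u8>, v: &Value) -> Result<(), String> {
    match v {
        Value::Int(n) => buf.extend_from_slice(n.to_string().as_bytes()),
        Value::List(items) => {
            buf.push(b'[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    buf.push(b',');
                }
                write_compact(buf, item)?;
            }
            buf.push(b']');
        }
        Value::Unsupported => return Err("type is not serializable".to_string()),
    }
    Ok(())
}

/// Two-space indented output, playing the role of `to_writer_pretty`.
fn write_pretty(buf: &mut Vec<u8>, v: &Value, depth: usize) -> Result<(), String> {
    match v {
        Value::List(items) if !items.is_empty() => {
            buf.extend_from_slice(b"[\n");
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    buf.extend_from_slice(b",\n");
                }
                buf.extend_from_slice("  ".repeat(depth + 1).as_bytes());
                write_pretty(buf, item, depth + 1)?;
            }
            buf.push(b'\n');
            buf.extend_from_slice("  ".repeat(depth).as_bytes());
            buf.push(b']');
            Ok(())
        }
        // Scalars and empty lists look the same in both modes.
        other => write_compact(buf, other),
    }
}

/// Mirrors the documented steps: allocate a buffer, pick a writer from the
/// flags, then either finish the buffer or report the error as a `String`.
fn serialize_sketch(v: &Value, opts: Opt) -> Result<Vec<u8>, String> {
    let mut buf = Vec::new();
    let res = if opts & INDENT_2 == 0 {
        write_compact(&mut buf, v)
    } else {
        write_pretty(&mut buf, v, 0)
    };
    match res {
        Ok(()) => {
            if opts & APPEND_NEWLINE != 0 {
                buf.push(b'\n');
            }
            Ok(buf)
        }
        // The documented function releases the Python buffer at this point;
        // dropping `buf` is the closest std-only analogue.
        Err(e) => Err(e),
    }
}
```

Calling `serialize_sketch` with `INDENT_2 | APPEND_NEWLINE` on a two-element list yields the indented form followed by a trailing newline, while passing `0` yields the compact form.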
Usage Example
let py_obj_ptr: *mut pyo3_ffi::PyObject = ...; // obtained from Python FFI
let options = Opt::default();
match serialize(py_obj_ptr, None, options) {
    Ok(bytes_ptr) => {
        // Use serialized bytes, e.g., return to Python or write to file
    }
    Err(err) => {
        eprintln!("Serialization failed: {}", err);
    }
}
Struct: PyObjectSerializer
pub(crate) struct PyObjectSerializer {
    pub ptr: *mut pyo3_ffi::PyObject,
    pub state: SerializerState,
    pub default: Option<NonNull<pyo3_ffi::PyObject>>,
}
Purpose
Wraps a raw Python object pointer with serialization context and implements `serde::Serialize`. It serves as the polymorphic serializer that delegates to the appropriate per-type serializer based on the Python object's runtime type.
Fields
- `ptr`: Pointer to the Python object being serialized.
- `state`: Serialization state, encapsulating options and possibly other metadata.
- `default`: Optional fallback serializer object for handling unsupported or unknown types.
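The role of the `default` field can be pictured with a small std-only sketch. The names `Obj`, `DefaultFn`, `SketchSerializer`, and `custom_id_as_int` are all hypothetical, and the fallback is modeled as a plain Rust function pointer rather than a Python callable. The sketch assumes the fallback converts an unsupported object into a supported one, which is then serialized in its place; how the real module invokes and bounds the fallback is not described here.

```rust
/// Toy object model. `Custom` stands in for a type the serializer does not know.
enum Obj {
    Int(i64),
    Str(String),
    Custom { id: i64 },
}

/// Hypothetical fallback signature: map an unsupported object to a supported one.
type DefaultFn = fn(&Obj) -> Option<Obj>;

/// Carries the object together with the optional fallback, like the
/// `ptr` and `default` fields described above.
struct SketchSerializer<'a> {
    obj: &'a Obj,
    default: Option<DefaultFn>,
}

impl<'a> SketchSerializer<'a> {
    fn to_json(&self) -> Result<String, String> {
        match self.obj {
            Obj::Int(n) => Ok(n.to_string()),
            // No string escaping: this is a sketch, not a JSON encoder.
            Obj::Str(s) => Ok(format!("\"{}\"", s)),
            unsupported => {
                let fallback = self
                    .default
                    .ok_or_else(|| "unsupported type and no default given".to_string())?;
                let converted = fallback(unsupported)
                    .ok_or_else(|| "default could not convert the object".to_string())?;
                // Serialize the converted object without a fallback so that a
                // misbehaving `default` cannot recurse forever (a simplification).
                let json = SketchSerializer { obj: &converted, default: None }.to_json();
                json
            }
        }
    }
}

/// Example fallback: represent a `Custom` object by its numeric id.
fn custom_id_as_int(obj: &Obj) -> Option<Obj> {
    match obj {
        Obj::Custom { id } => Some(Obj::Int(*id)),
        _ => None,
    }
}
```

With `default: Some(custom_id_as_int)` a `Custom { id: 7 }` object serializes as `7`; with `default: None` the same object produces an error.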
Methods
new
pub fn new(
    ptr: *mut pyo3_ffi::PyObject,
    state: SerializerState,
    default: Option<NonNull<pyo3_ffi::PyObject>>,
) -> Self
Constructs a new `PyObjectSerializer` instance.
**Parameters:**
- `ptr`: Python object pointer.
- `state`: Current serialization state.
- `default`: Optional fallback default object.
**Returns:**
A new `PyObjectSerializer` instance.
**Example:**
let serializer = PyObjectSerializer::new(py_obj_ptr, state, None);
Trait Implementation: Serialize for PyObjectSerializer
This implementation is the core dispatcher that selects the correct serializer based on the Python object's type.
Method: serialize
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer,
- Uses the helper function `pyobject_to_obtype` to determine the object's type (`ObType` enum).
- Matches on the `ObType` and constructs the corresponding serializer struct (e.g., `StrSerializer`, `IntSerializer`, `DictGenericSerializer`).
- Delegates the actual serialization call to the selected serializer.
- Handles empty lists and tuples with a special `ZeroListSerializer`.
- Falls back to `DefaultSerializer` for unknown types.
- Supports a wide range of Python types including primitives, datetime, UUIDs, enums, numpy arrays/scalars, dataclasses, and fragments.
**Algorithmic Detail:**
The dynamic dispatch is done at runtime by matching the Python object's type enum.
This design isolates serialization logic per type in separate modules, enhancing modularity and maintainability.
Empty sequence optimization avoids unnecessary iteration.
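The dispatch pattern, including the empty-sequence shortcut, might look roughly like the following std-only sketch. `PyValue` is a hypothetical Rust enum whose tag plays the role of `ObType` and whose payload replaces the raw pointer; the helper functions are illustrative stand-ins for the per-type serializers, not the actual ones.

```rust
/// Hypothetical stand-in for a Python object: the enum tag plays the role of
/// `ObType`, the payload replaces the raw pointer.
enum PyValue {
    None,
    Bool(bool),
    Int(i64),
    Str(String),
    List(Vec<PyValue>),
    Opaque, // anything the dispatcher does not recognise
}

/// Stand-in for a string serializer (no escaping in this sketch).
fn serialize_str(s: &str, out: &mut String) {
    out.push('"');
    out.push_str(s);
    out.push('"');
}

/// Stand-in for the empty-sequence serializer: no iteration, fixed output.
fn serialize_zero_list(out: &mut String) {
    out.push_str("[]");
}

/// Stand-in for the list/tuple serializer; recurses through `dispatch`.
fn serialize_list(items: &[PyValue], out: &mut String) -> Result<(), String> {
    out.push('[');
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        dispatch(item, out)?;
    }
    out.push(']');
    Ok(())
}

/// One `match` selects the per-type routine at runtime.
fn dispatch(v: &PyValue, out: &mut String) -> Result<(), String> {
    match v {
        PyValue::None => out.push_str("null"),
        PyValue::Bool(b) => out.push_str(if *b { "true" } else { "false" }),
        PyValue::Int(n) => out.push_str(&n.to_string()),
        PyValue::Str(s) => serialize_str(s, out),
        // Empty sequences take the shortcut and never enter the loop.
        PyValue::List(items) if items.is_empty() => serialize_zero_list(out),
        PyValue::List(items) => serialize_list(items, out)?,
        // Unknown types would go to the default handler; here they just fail.
        PyValue::Opaque => return Err("type is not serializable".to_string()),
    }
    Ok(())
}
```

Keeping each arm a one-line call is what lets the per-type logic live in separate modules while the dispatcher stays a flat table of cases.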
**Example usage:**
When a `PyObjectSerializer` instance is serialized through serde, the correct underlying serializer is invoked automatically:
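let serializer = PyObjectSerializer::new(ptr, state, default);
// serde_json is used here only for illustration; serialize() itself goes
// through to_writer / to_writer_pretty from crate::serialize::writer.
serde_json::to_string(&serializer)?;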
Important Implementation Details
- Use of Unsafe Pointers: The file directly manipulates raw pointers to Python objects via `pyo3_ffi`. Safety is managed externally, and this module assumes valid pointers.
- Option for Default Serializer: Allows clients to provide a default serialization fallback, useful for extensibility or custom object support.
- Buffer Management: Uses a custom `BytesWriter` to accumulate output efficiently before converting to a Python `bytes` object.
- Serialization Options: The `Opt` flags control features such as indentation (`INDENT_2`) and appending a newline (`APPEND_NEWLINE`), allowing for flexible formatting.
- Integration with Serde: Implements `serde::Serialize`, so the serializer can in principle be driven by any `serde::Serializer` implementation (JSON, YAML, MessagePack, etc.). Within this file, `serialize()` only drives it through the writers from `crate::serialize::writer`.
- Modular Per-Type Serializers: Delegates serialization to specialized serializers imported from `crate::serialize::per_type`, promoting separation of concerns.
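Why implementing a serde-style trait decouples the value from the output format can be illustrated with a reduced, std-only model. The traits `Sink` and `Emit` below are hypothetical miniatures of `serde::Serializer` and `serde::Serialize`, and both output formats are invented for the example; none of this is serde's actual API.

```rust
/// Hypothetical, heavily reduced analogue of `serde::Serializer`:
/// one method per primitive the format must know how to write.
trait Sink {
    fn int(&mut self, v: i64);
    fn text(&mut self, v: &str);
}

/// Hypothetical analogue of `serde::Serialize`: the value describes itself
/// to whatever sink it is handed, without knowing the output format.
trait Emit {
    fn emit<S: Sink>(&self, sink: &mut S);
}

enum Scalar {
    Int(i64),
    Text(String),
}

impl Emit for Scalar {
    fn emit<S: Sink>(&self, sink: &mut S) {
        match self {
            Scalar::Int(n) => sink.int(*n),
            Scalar::Text(s) => sink.text(s),
        }
    }
}

/// JSON-like text output (no escaping in this sketch).
struct JsonSink(String);

impl Sink for JsonSink {
    fn int(&mut self, v: i64) {
        self.0.push_str(&v.to_string());
    }
    fn text(&mut self, v: &str) {
        self.0.push('"');
        self.0.push_str(v);
        self.0.push('"');
    }
}

/// Tag-prefixed binary output, invented for this example.
struct BinarySink(Vec<u8>);

impl Sink for BinarySink {
    fn int(&mut self, v: i64) {
        self.0.push(0x01); // tag byte for integers
        self.0.extend_from_slice(&v.to_be_bytes());
    }
    fn text(&mut self, v: &str) {
        self.0.push(0x02); // tag byte for text
        self.0.push(v.len() as u8); // single length byte; truncates past 255
        self.0.extend_from_slice(v.as_bytes());
    }
}
```

The same `Scalar` value can be emitted into either sink without any change to its `Emit` implementation, which is the property the `serde::Serialize` implementation gives `PyObjectSerializer`.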
Interaction with Other Modules
- `crate::opt`: Provides serialization options flags controlling the output format and behavior.
- `crate::serialize::obtype`: Contains the mapping logic (`pyobject_to_obtype`) to identify Python object types at runtime.
- `crate::serialize::per_type`: Contains the concrete serializers for each Python type (strings, ints, floats, datetime, dicts, lists, numpy arrays, etc.).
- `crate::serialize::state`: Holds the state and options for the serialization process.
- `crate::serialize::writer`: Provides utilities (`to_writer`, `to_writer_pretty`, `BytesWriter`) to write serialized data into buffers.
- `serde`: The serialization framework used for Rust serialization.
This file acts as the orchestrator, tying together type detection, per-type serializers, and output buffering.
Visual Diagram
classDiagram
class PyObjectSerializer {
+ptr: *mut PyObject
+state: SerializerState
+default: Option<NonNull<PyObject>>
+new(ptr, state, default) PyObjectSerializer
+serialize<S: Serializer>(serializer: S) -> Result<S::Ok, S::Error>
}
PyObjectSerializer ..> SerializerState : contains
PyObjectSerializer ..> pyo3_ffi::PyObject : contains pointer
PyObjectSerializer ..|> serde::Serialize
PyObjectSerializer "1" --> "1" StrSerializer : when ObType::Str
PyObjectSerializer "1" --> "1" StrSubclassSerializer : when ObType::StrSubclass
PyObjectSerializer "1" --> "1" IntSerializer : when ObType::Int
PyObjectSerializer "1" --> "1" NoneSerializer : when ObType::None
PyObjectSerializer "1" --> "1" FloatSerializer : when ObType::Float
PyObjectSerializer "1" --> "1" BoolSerializer : when ObType::Bool
PyObjectSerializer "1" --> "1" DateTime : when ObType::Datetime
PyObjectSerializer "1" --> "1" Date : when ObType::Date
PyObjectSerializer "1" --> "1" Time : when ObType::Time
PyObjectSerializer "1" --> "1" UUID : when ObType::Uuid
PyObjectSerializer "1" --> "1" DictGenericSerializer : when ObType::Dict
PyObjectSerializer "1" --> "1" ListTupleSerializer : when ObType::List or ObType::Tuple (if not empty)
PyObjectSerializer "1" --> "1" ZeroListSerializer : when ObType::List or ObType::Tuple (empty)
PyObjectSerializer "1" --> "1" DataclassGenericSerializer : when ObType::Dataclass
PyObjectSerializer "1" --> "1" EnumSerializer : when ObType::Enum
PyObjectSerializer "1" --> "1" NumpySerializer : when ObType::NumpyArray
PyObjectSerializer "1" --> "1" NumpyScalar : when ObType::NumpyScalar
PyObjectSerializer "1" --> "1" FragmentSerializer : when ObType::Fragment
PyObjectSerializer "1" --> "1" DefaultSerializer : when ObType::Unknown
Summary
The [serializer.rs](/projects/287/67745) file provides a critical serialization layer turning raw Python objects into serialized Rust representations using serde. It detects Python types dynamically and delegates to specialized serializers, supporting a broad range of Python data types. The main `serialize()` function exposes this functionality in a flexible way with options for pretty output and fallback handlers.
This module integrates deeply with other serialization components to form a robust, extensible serialization pipeline bridging Python FFI and Rust’s serde ecosystem.