dataclass.rs
Overview
This Rust source file implements serialization logic for Python dataclasses within a Rust-based Python serialization framework. Its primary purpose is to efficiently convert Python dataclass instances into a serializable map format (e.g., JSON) by leveraging the Rust `serde` serialization traits. The file defines multiple serializer structs that handle serialization in different scenarios:
DataclassGenericSerializer: The entry-point serializer that decides whether to use the fast serialization path or the fallback approach based on the Python dataclass attributes.
DataclassFastSerializer: A fast path that serializes the dataclass by directly iterating over its `__dict__` (attribute dictionary).
DataclassFallbackSerializer: A fallback path that serializes dataclasses by explicitly fetching the dataclass fields and serializing their values, used when the fast path is unavailable or unsuitable.
The implementation carefully handles Python C API calls, reference counting, recursion limits, and error handling to ensure safe and correct serialization.
Detailed Explanation of Components
Struct: DataclassGenericSerializer<'a>
Purpose
Acts as the generic serializer for dataclasses, deciding at runtime whether to use the fast dictionary-based serialization or fallback field-based serialization depending on the dataclass instance characteristics.
Fields
previous: &'a PyObjectSerializer
A reference to a previous `PyObjectSerializer` instance representing the Python object to serialize.
Methods
pub fn new(previous: &'a PyObjectSerializer) -> Self
Constructs a new `DataclassGenericSerializer` with a reference to a previous serializer.
Trait Implementations
impl Serialize for DataclassGenericSerializer<'_>
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
Implements the serialization logic:
Checks the recursion limit to prevent stack overflow.
Attempts to access the `__dict__` attribute of the dataclass instance.
If `__dict__` is missing or if the class defines `__slots__`, it uses the fallback serializer.
Otherwise, uses the fast serializer to serialize the dataclass based on its `__dict__`.
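The dispatch steps above can be sketched in plain Rust without touching the Python C API. This is a minimal model, not the file's actual code: `InstanceInfo`, `Path`, and `choose_path` are hypothetical names introduced for illustration, and the two booleans stand in for the `__dict__` lookup and the `__slots__` check.

```rust
/// Hypothetical, simplified view of a dataclass instance: only the two
/// facts the dispatcher is described as looking at.
#[derive(Debug, Clone, Copy)]
pub struct InstanceInfo {
    /// Whether the instance exposes a `__dict__` attribute.
    pub has_dict: bool,
    /// Whether the class defines `__slots__`.
    pub defines_slots: bool,
}

/// Which serialization path to take.
#[derive(Debug, PartialEq, Eq)]
pub enum Path {
    Fast,
    Fallback,
}

/// Fails first if the recursion limit has been reached, then picks a path:
/// fallback when `__dict__` is missing or `__slots__` is defined, fast otherwise.
pub fn choose_path(info: InstanceInfo, recursion_limit_reached: bool) -> Result<Path, String> {
    if recursion_limit_reached {
        return Err("recursion limit reached".to_string());
    }
    if !info.has_dict || info.defines_slots {
        Ok(Path::Fallback)
    } else {
        Ok(Path::Fast)
    }
}
```

In this model the recursion check runs before any attribute lookup, matching the order in which the steps are listed.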
Usage Example
let py_obj_serializer = PyObjectSerializer::new(py_dataclass_obj, state, default);
let dataclass_serializer = DataclassGenericSerializer::new(&py_obj_serializer);
let serialized = serde_json::to_string(&dataclass_serializer)?;
Struct: DataclassFastSerializer
Purpose
Implements a fast serialization path by serializing the dataclass instance directly from its `__dict__` attribute, which is a Python dictionary of attribute names to values.
Fields
ptr: *mut pyo3_ffi::PyObject
Raw pointer to the Python dictionary (`__dict__`).
state: SerializerState
Serialization state to track recursion and configuration.
default: Option<NonNull<pyo3_ffi::PyObject>>
Optional pointer to a default serializer object, passed through to `PyObjectSerializer` when serializing each value.
Methods
pub fn new(ptr: *mut pyo3_ffi::PyObject, state: SerializerState, default: Option<NonNull<pyo3_ffi::PyObject>>) -> Self
Creates a new instance, copying the serialization state for recursive calls.
Trait Implementations
impl Serialize for DataclassFastSerializer
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
Serializes the dictionary:
Obtains the size of the dictionary.
If empty, uses a zero-length dictionary serializer.
Iterates through all key-value pairs:
Ensures keys are strings (raises an error if not).
Skips keys starting with an underscore `_` (private attributes).
Serializes each key-value pair using `PyObjectSerializer`.
Ends the map serialization.
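The fast-path loop can be modelled as follows. This is a sketch under stated assumptions: `Key` and `collect_public_entries` are hypothetical, a slice of pairs stands in for the Python dictionary, and values are plain `i64` rather than nested Python objects. It only shows the filtering rules, not the Serde map calls.

```rust
/// Hypothetical stand-in for a Python dictionary key.
#[derive(Debug, Clone)]
pub enum Key {
    /// A string key, as attribute names normally are.
    Str(String),
    /// Any non-string key (for example an int).
    Other,
}

/// Walks `(key, value)` pairs following the rules described above:
/// reject non-string keys, skip names starting with `_`, keep the rest.
pub fn collect_public_entries(dict: &[(Key, i64)]) -> Result<Vec<(String, i64)>, String> {
    let mut out = Vec::with_capacity(dict.len());
    for (key, value) in dict {
        // A non-string key aborts the whole serialization with an error.
        let name = match key {
            Key::Str(s) => s,
            Key::Other => return Err("dict key must be str".to_string()),
        };
        // Private attributes are silently left out of the output.
        if name.starts_with('_') {
            continue;
        }
        out.push((name.clone(), *value));
    }
    Ok(out)
}
```

An empty input simply yields an empty result here; the described implementation instead short-circuits to a dedicated zero-length dictionary serializer.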
Important Details
Uses low-level Python C API macros/functions to iterate dictionary items safely.
Skips private fields prefixed with `_`, following the Python convention that a leading underscore marks an attribute as private.
Reports non-string keys and invalid UTF-8 strings as serialization errors.
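The UTF-8 check on keys can be illustrated with the standard library alone. `key_as_str` is a hypothetical helper, and the byte slice stands in for whatever buffer the Python string exposes; how the real code obtains those bytes is not shown in this document.

```rust
/// Hypothetical helper: view raw key bytes as a `&str`, or report an error
/// instead of panicking when the bytes are not valid UTF-8.
pub fn key_as_str(raw: &[u8]) -> Result<&str, String> {
    std::str::from_utf8(raw).map_err(|err| format!("key is not valid UTF-8: {}", err))
}
```

Returning a `Result` lets the caller turn the failure into a serialization error rather than aborting.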
Usage Example
Internally used by `DataclassGenericSerializer`, but can be used standalone if you have the `__dict__` pointer:
let fast_serializer = DataclassFastSerializer::new(dict_ptr, state, default);
let serialized = serde_json::to_string(&fast_serializer)?;
Struct: DataclassFallbackSerializer
Purpose
Fallback serializer that serializes dataclass instances by iterating over the dataclass's declared fields (from `__dataclass_fields__`), rather than relying on `__dict__`. Used when the fast path is not viable (e.g., if `__dict__` is missing or class uses `__slots__`).
Fields
ptr: *mut pyo3_ffi::PyObject
Pointer to the dataclass instance.
state: SerializerState
Serialization state for recursion control.
default: Option<NonNull<pyo3_ffi::PyObject>>
Optional pointer to a default serializer object.
Methods
pub fn new(ptr: *mut pyo3_ffi::PyObject, state: SerializerState, default: Option<NonNull<pyo3_ffi::PyObject>>) -> Self
Creates a new fallback serializer copying the state for recursion.
Trait Implementations
impl Serialize for DataclassFallbackSerializer
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
Serialization steps:
Retrieves the `__dataclass_fields__` attribute from the object.
If there are no fields, serializes an empty dictionary.
Iterates over each field:
Checks that the field type matches the expected dataclass Field type.
Skips private fields starting with `_`.
Retrieves the attribute value from the dataclass instance.
Serializes each key-value pair.
Ends the map serialization.
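The fallback steps can be sketched in the same style. Everything here is illustrative: `FieldKind` and `collect_declared_fields` are hypothetical, a `HashMap` stands in for attribute lookup on the instance, and the sketch assumes that entries whose type does not match are skipped (the text above does not say whether they are skipped or rejected).

```rust
use std::collections::HashMap;

/// Hypothetical marker for the kind of entry found in `__dataclass_fields__`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldKind {
    /// A regular dataclass field.
    Field,
    /// Anything else; assumed to be skipped in this sketch.
    Other,
}

/// Iterates over the declared fields rather than the instance dictionary,
/// looking each value up by name on the instance.
pub fn collect_declared_fields(
    fields: &[(&str, FieldKind)],
    instance: &HashMap<&str, i64>,
) -> Result<Vec<(String, i64)>, String> {
    let mut out = Vec::new();
    for (name, kind) in fields {
        // Only entries of the expected field type are serialized.
        if *kind != FieldKind::Field {
            continue;
        }
        // Private fields are skipped, as on the fast path.
        if name.starts_with('_') {
            continue;
        }
        // Explicit attribute access; a missing attribute is an error here.
        let value = instance
            .get(*name)
            .ok_or_else(|| format!("missing attribute: {}", name))?;
        out.push((name.to_string(), *value));
    }
    Ok(out)
}
```

Because values are fetched one attribute at a time, this path does not depend on the instance having a `__dict__`.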
Important Details
Uses explicit attribute access rather than dictionary iteration.
Verifies that each field is a valid dataclass field by comparing its type pointer.
Skips private attributes consistently.
This slower path is more robust for dataclasses that have no `__dict__` or that define `__slots__`.
Implementation Details & Algorithms
Recursion Limit Check:
The generic serializer checks the recursion limit through `self.previous.state.recursion_limit()` to avoid infinite recursion during serialization.
Fast vs Fallback Decision Logic:
The generic serializer first tries to get the `__dict__`.
If `__dict__` is missing or the object uses `__slots__`, the fallback serializer is used.
Otherwise, fast dictionary iteration is used.
Dictionary Iteration:
Uses a macro over the Python C API (`pydict_next!`) to efficiently iterate over dictionary entries, avoiding the overhead of high-level Python calls.
String Key Validation:
Keys are validated to be Python strings and convertible to Rust UTF-8 strings. Otherwise, serialization fails with appropriate errors.
Private Attribute Skipping:
Attributes starting with an underscore `_` are skipped, following the Python convention that a leading underscore marks an attribute as private.
Reference Counting:
Explicit `Py_DECREF` calls are used to manage Python object reference counts correctly after attribute accesses, preventing memory leaks.
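The recursion check can be sketched with a small copyable state that is incremented on each nested call. This is an assumption-laden model rather than the real `SerializerState`: the `State` type, the `MAX_DEPTH` value, the `copy_for_recursive_call` name, and the `Value` tree are all hypothetical; only the `recursion_limit()` name comes from the description above.

```rust
/// Hypothetical stand-in for `SerializerState`; it tracks depth only.
#[derive(Debug, Clone, Copy)]
pub struct State {
    depth: u8,
}

/// Assumed limit for this sketch; the real value is not given in the text.
pub const MAX_DEPTH: u8 = 4;

impl State {
    pub fn new() -> Self {
        State { depth: 0 }
    }

    /// True once the nesting depth has reached the limit.
    pub fn recursion_limit(&self) -> bool {
        self.depth >= MAX_DEPTH
    }

    /// Returns a copy one level deeper, for handing to a nested serializer.
    pub fn copy_for_recursive_call(&self) -> Self {
        State { depth: self.depth.saturating_add(1) }
    }
}

/// Toy nested value standing in for objects that contain other objects.
pub enum Value {
    Leaf,
    Nested(Box<Value>),
}

/// Builds a `Leaf` wrapped in `levels` layers of `Nested`.
pub fn nest(levels: usize) -> Value {
    (0..levels).fold(Value::Leaf, |inner, _| Value::Nested(Box::new(inner)))
}

/// Counts the levels of `value`, failing instead of recursing past the limit.
pub fn count_levels(value: &Value, state: State) -> Result<u32, String> {
    if state.recursion_limit() {
        return Err("recursion limit reached".to_string());
    }
    match value {
        Value::Leaf => Ok(1),
        Value::Nested(inner) => {
            count_levels(inner, state.copy_for_recursive_call()).map(|n| n + 1)
        }
    }
}
```

Passing the state by value means each nested call sees its own depth, so no counter has to be decremented when the call returns.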
Integration with Other Components
Dependency on `PyObjectSerializer`:
The file relies on `PyObjectSerializer` for serializing Python objects recursively. This means it integrates deeply with the overall Python object serialization infrastructure.
`SerializerState`:
The `SerializerState` tracks recursion depth and other serialization settings, ensuring consistent behavior across recursive serialization calls.
`ZeroDictSerializer`:
Used for efficiently serializing empty dictionaries, avoiding unnecessary overhead.
Python C API:
Uses raw pointers and FFI calls extensively to interact with Python objects directly for performance.
Type References:
Uses constants like `DICT_STR`, `SLOTS_STR`, `DATACLASS_FIELDS_STR` for attribute names, facilitating consistent access to Python dataclass internals.
Mermaid Diagram: Class Structure
classDiagram
class DataclassGenericSerializer {
- previous: &PyObjectSerializer
+ new(previous: &PyObjectSerializer) DataclassGenericSerializer
+ serialize<S: Serializer>(serializer: S) -> Result<S::Ok, S::Error>
}
class DataclassFastSerializer {
- ptr: *mut PyObject
- state: SerializerState
- default: Option<NonNull<PyObject>>
+ new(ptr: *mut PyObject, state: SerializerState, default: Option<NonNull<PyObject>>) DataclassFastSerializer
+ serialize<S: Serializer>(serializer: S) -> Result<S::Ok, S::Error>
}
class DataclassFallbackSerializer {
- ptr: *mut PyObject
- state: SerializerState
- default: Option<NonNull<PyObject>>
+ new(ptr: *mut PyObject, state: SerializerState, default: Option<NonNull<PyObject>>) DataclassFallbackSerializer
+ serialize<S: Serializer>(serializer: S) -> Result<S::Ok, S::Error>
}
DataclassGenericSerializer --> DataclassFastSerializer : uses
DataclassGenericSerializer --> DataclassFallbackSerializer : uses
Summary
The dataclass.rs file serializes Python dataclasses in Rust using Serde. It chooses between a fast dictionary-based path and a fallback field-based path so that it can handle different dataclass configurations, including those with `__slots__`. It relies on direct Python C API calls, explicit reference counting, and recursion-limit checks to keep serialization safe and correct.
Additional Notes
This file is likely part of a larger Python serialization framework implemented in Rust, which integrates tightly with Python's C API.
Error handling uses macros like `err!()` and `unlikely!()`, which are presumably defined elsewhere in the project to streamline error paths.
The serializers are designed for internal use (`pub(crate)`), indicating they are not public API but used within the crate only.