list.rs
Overview
The `list.rs` file is a core utility within the serialization module of the project, responsible for efficiently serializing Python list and tuple objects into a format compatible with Serde serializers. It provides specialized serializers that handle Python lists and tuples by directly accessing their underlying C-level data structures via the PyO3 FFI (Foreign Function Interface). This approach enables zero-copy or minimal-copy serialization, improving performance when converting Python sequences to serialized data formats (e.g., JSON).
Key functionalities include:
- Serializing empty lists efficiently using `ZeroListSerializer`.
- Serializing Python lists and tuples by iterating over their elements and delegating serialization based on element types via `ListTupleSerializer`.
- Handling recursive serialization with recursion depth checks to avoid stack overflows.
- Integrating tightly with other serializers for Python built-in types and custom types like dataclasses, enums, NumPy arrays, etc.
The file acts as a bridge between the Python runtime's internal object representation and Rust's Serde serialization framework, enabling seamless and performant serialization of Python sequence objects.
Detailed Explanation of Components
Struct: ZeroListSerializer
pub(crate) struct ZeroListSerializer;
**Purpose:** A lightweight serializer for empty Python lists or tuples. It serializes these sequences as the byte string `b"[]"`, representing an empty JSON array or equivalent.
**Key Methods:**
- `new() -> Self`: A `const fn` constructor returning an instance of `ZeroListSerializer`.
- `serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>`: Implements the Serde `Serialize` trait. Serializes an empty list as the raw bytes `b"[]"`.
**Usage Example:**
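// ZeroListSerializer hands b"[]" to serialize_bytes, so the literal text "[]" only
// appears when the target serializer writes byte strings verbatim.
// serde_json encodes byte strings as arrays of integers instead:
let zero_list = ZeroListSerializer::new();
let serialized = serde_json::to_string(&zero_list).unwrap();
assert_eq!(serialized, "[91,93]"); // 91 and 93 are the byte values of '[' and ']'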
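The dependence on how the target serializer treats byte strings can be sketched with the standard library alone. `ByteSink`, `RawWriter`, and `IntArrayWriter` below are hypothetical stand-ins introduced for illustration; they are not Serde or project types, and the verbatim-copy behavior of `RawWriter` is an assumption about the kind of writer this serializer is meant to be paired with.

```rust
/// Hypothetical stand-in for the single serializer method the empty-list path uses.
trait ByteSink {
    fn serialize_bytes(&mut self, value: &[u8]);
}

/// Copies byte strings to the output verbatim (assumed behavior of a raw JSON writer).
struct RawWriter {
    out: Vec<u8>,
}

impl ByteSink for RawWriter {
    fn serialize_bytes(&mut self, value: &[u8]) {
        self.out.extend_from_slice(value);
    }
}

/// Encodes byte strings as a JSON array of integers.
struct IntArrayWriter {
    out: String,
}

impl ByteSink for IntArrayWriter {
    fn serialize_bytes(&mut self, value: &[u8]) {
        let items: Vec<String> = value.iter().map(|b| b.to_string()).collect();
        self.out.push('[');
        self.out.push_str(&items.join(","));
        self.out.push(']');
    }
}

/// Mirrors the described behavior: pass the two bytes `[]` to the serializer.
fn serialize_empty_list<S: ByteSink>(sink: &mut S) {
    sink.serialize_bytes(b"[]");
}
```

With `RawWriter` the output is the two bytes `[]`; with `IntArrayWriter` the same call produces `[91,93]`.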
Struct: ListTupleSerializer
pub(crate) struct ListTupleSerializer {
data_ptr: *const *mut pyo3_ffi::PyObject,
state: SerializerState,
default: Option<NonNull<pyo3_ffi::PyObject>>,
len: usize,
}
**Purpose:** Serializes non-empty Python list or tuple objects by accessing their underlying array of PyObject pointers and serializing each element according to its type.
**Fields:**
- `data_ptr`: Raw pointer to the internal array of `PyObject` pointers (the elements of the list/tuple).
- `state`: `SerializerState` instance that maintains context, options, and recursion state during serialization.
- `default`: Optional pointer to a Python object used as a fallback during serialization (e.g., when an element has a type with no dedicated serializer).
- `len`: Number of elements in the list or tuple.
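As a rough illustration of how a `data_ptr`/`len` pair gives access to the elements, the sketch below views a pointer array as a slice. `FakeObject`, `collect_values`, and `pointer_walk_demo` are hypothetical names; the real elements are `pyo3_ffi::PyObject` values owned by the Python runtime, not Rust boxes.

```rust
/// Hypothetical stand-in for a Python object.
struct FakeObject {
    value: i64,
}

/// Reads `len` element pointers starting at `data_ptr`.
///
/// # Safety
/// `data_ptr` must point to at least `len` valid, non-null element pointers
/// that stay alive for the duration of the call.
unsafe fn collect_values(data_ptr: *const *mut FakeObject, len: usize) -> Vec<i64> {
    // View the pointer array as a slice of element pointers.
    let slots: &[*mut FakeObject] = unsafe { std::slice::from_raw_parts(data_ptr, len) };
    // Dereference each element pointer and copy out its payload.
    slots.iter().map(|&elem| unsafe { (*elem).value }).collect()
}

/// Builds a pointer array over boxed objects, walks it, then frees the boxes.
fn pointer_walk_demo() -> Vec<i64> {
    let ptrs: Vec<*mut FakeObject> = vec![10i64, 20, 30]
        .into_iter()
        .map(|value| Box::into_raw(Box::new(FakeObject { value })))
        .collect();
    let values = unsafe { collect_values(ptrs.as_ptr(), ptrs.len()) };
    for ptr in ptrs {
        // Reclaim ownership so the allocations are released.
        drop(unsafe { Box::from_raw(ptr) });
    }
    values
}
```

No element is copied into an intermediate Rust collection before it is read; only the pointer array is traversed.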
Constructors
- `from_list(ptr: *mut pyo3_ffi::PyObject, state: SerializerState, default: Option<NonNull<pyo3_ffi::PyObject>>) -> Self`: Creates a `ListTupleSerializer` from a Python list object pointer. Validates that the object type is a list or a subclass thereof, then extracts the internal pointer to the list items and the length.
- `from_tuple(ptr: *mut pyo3_ffi::PyObject, state: SerializerState, default: Option<NonNull<pyo3_ffi::PyObject>>) -> Self`: Similar to `from_list`, but for Python tuple objects.
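The two constructors exist because lists and tuples keep their items in different places. The sketch below assumes the usual CPython layout, where a list object holds a pointer to a separately allocated item array and a tuple stores its items inline. `FakeObject`, `FakeList`, `FakeTuple`, and `SeqView` are simplified, hypothetical mirrors written for illustration, not the `pyo3_ffi` definitions.

```rust
/// Hypothetical stand-in for a Python object.
struct FakeObject {
    value: i64,
}

/// Simplified list layout: items live in a separately allocated array.
struct FakeList {
    ob_item: *mut *mut FakeObject,
    ob_size: isize,
}

/// Simplified tuple layout: items are stored inline in the object.
struct FakeTuple {
    ob_size: isize,
    ob_item: [*mut FakeObject; 2],
}

/// The pointer/length pair both constructors reduce to.
struct SeqView {
    data_ptr: *const *mut FakeObject,
    len: usize,
}

impl SeqView {
    /// Safety: `ptr` must point to a live `FakeList` with `ob_size` valid items.
    unsafe fn from_list(ptr: *mut FakeList) -> Self {
        unsafe {
            SeqView {
                // Follow the pointer stored in the list object.
                data_ptr: (*ptr).ob_item as *const *mut FakeObject,
                len: (*ptr).ob_size as usize,
            }
        }
    }

    /// Safety: `ptr` must point to a live `FakeTuple` with `ob_size` valid items.
    unsafe fn from_tuple(ptr: *mut FakeTuple) -> Self {
        unsafe {
            SeqView {
                // Take the address of the inline item array.
                data_ptr: std::ptr::addr_of!((*ptr).ob_item) as *const *mut FakeObject,
                len: (*ptr).ob_size as usize,
            }
        }
    }

    /// Safety: the view must still point at live items.
    unsafe fn sum(&self) -> i64 {
        let mut total = 0;
        for idx in 0..self.len {
            total += unsafe { (**self.data_ptr.add(idx)).value };
        }
        total
    }
}

/// Builds one fake list and one fake tuple over the same two objects.
fn layout_demo() -> (i64, i64) {
    let a = Box::into_raw(Box::new(FakeObject { value: 1 }));
    let b = Box::into_raw(Box::new(FakeObject { value: 2 }));
    let mut items = vec![a, b];
    let mut list = FakeList { ob_item: items.as_mut_ptr(), ob_size: 2 };
    let mut tuple = FakeTuple { ob_size: 2, ob_item: [a, b] };
    let sums = unsafe {
        (SeqView::from_list(&mut list).sum(), SeqView::from_tuple(&mut tuple).sum())
    };
    unsafe {
        drop(Box::from_raw(a));
        drop(Box::from_raw(b));
    }
    sums
}
```

After construction both views look identical, which is why a single `serialize` implementation can serve lists and tuples alike.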
Method: serialize
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
Implements Serde's `Serialize` trait.
**Workflow:**
1. Checks recursion depth via `self.state.recursion_limit()` to prevent infinite recursion.
2. Asserts the length is at least 1 (empty sequences are handled by `ZeroListSerializer`).
3. Begins serializing a sequence with no fixed length (`serialize_seq(None)`).
4. Iterates over each element in the Python list/tuple:
   - Uses `pyobject_to_obtype` to determine the Python object's type.
   - Matches on the detected `ObType` and delegates to the appropriate serializer, such as:
     - `StrSerializer` for strings.
     - `IntSerializer` for integers.
     - `DictGenericSerializer` for dicts.
     - `ListTupleSerializer` recursively for nested lists/tuples.
     - Other serializers for dates, times, UUIDs, enums, numpy arrays, fragments, and unknown objects.
5. Ends the sequence serialization.
**Error Handling:**
- Returns `SerializeError::RecursionLimit` if the recursion limit is exceeded.
- Propagates Serde serialization errors from child serializers.
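The workflow and its error path can be modeled in a few lines of plain Rust. This is a simplified sketch, not the file's implementation: `Value` stands in for the element kinds that `ObType` distinguishes, `RECURSION_LIMIT` is an arbitrary illustrative value, output goes to a `String` rather than a Serde serializer, and strings are not escaped.

```rust
/// Hypothetical stand-in for the element kinds the dispatcher distinguishes.
enum Value {
    Str(String),
    Int(i64),
    List(Vec<Value>),
}

#[derive(Debug, PartialEq)]
enum SerializeError {
    RecursionLimit,
}

/// Illustrative limit only; the real limit is defined by the project.
const RECURSION_LIMIT: u8 = 3;

fn write_list(items: &[Value], depth: u8, out: &mut String) -> Result<(), SerializeError> {
    // Step 1: refuse to descend past the limit.
    if depth >= RECURSION_LIMIT {
        return Err(SerializeError::RecursionLimit);
    }
    // Step 2: empty sequences are expected to take the separate fast path.
    debug_assert!(!items.is_empty());
    // Step 3: open a sequence whose length is not announced up front.
    out.push('[');
    // Step 4: dispatch on the kind of each element.
    for (idx, item) in items.iter().enumerate() {
        if idx > 0 {
            out.push(',');
        }
        match item {
            Value::Str(s) => {
                out.push('"');
                out.push_str(s); // no escaping in this sketch
                out.push('"');
            }
            Value::Int(n) => out.push_str(&n.to_string()),
            // Empty nested sequence: emit the constant form directly.
            Value::List(inner) if inner.is_empty() => out.push_str("[]"),
            // Non-empty nested sequence: recurse one level deeper, propagating errors.
            Value::List(inner) => write_list(inner, depth + 1, out)?,
        }
    }
    // Step 5: close the sequence.
    out.push(']');
    Ok(())
}

fn to_json(items: &[Value]) -> Result<String, SerializeError> {
    let mut out = String::new();
    write_list(items, 0, &mut out)?;
    Ok(out)
}
```

The `?` on the recursive call is what lets a limit violation deep inside a nested sequence surface as the result of the outermost call.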
Important Implementation Details
- **Unsafe Pointer Manipulation:** The serializers directly manipulate raw pointers (`*const *mut PyObject`) to access Python objects in lists or tuples. This is done for performance reasons but requires careful validation and use of `unsafe` blocks.
- **Recursion Control:** Prevents stack overflow during recursive serialization by limiting recursion depth (`self.state.recursion_limit()`).
- **Type Dispatching:** Uses `pyobject_to_obtype` to identify Python object types and dispatch to specialized serializers, enabling correct and efficient serialization of heterogeneous sequences.
- **Handling Empty Sequences:** Empty lists and tuples are serialized using `ZeroListSerializer` to produce a minimal serialized representation.
- **Integration with Other Serializers:** Relies on other serializers from the `per_type` module (e.g., for strings, ints, dicts, numpy arrays), maintaining modularity and separation of concerns.
Interaction with Other System Components
- **`serialize::obtype` Module:** Provides `pyobject_to_obtype` and the `ObType` enum to identify Python object types.
- **`serialize::per_type` Module:** The file imports numerous serializers for specific Python types (e.g., `StrSerializer`, `IntSerializer`, `DictGenericSerializer`), which are used to serialize individual elements within lists/tuples.
- **`serialize::state::SerializerState`:** Maintains serialization options and recursion state, ensuring consistent behavior across nested serialization calls.
- **`serialize::serializer::PyObjectSerializer`:** Used as a generic fallback serializer for unknown types or complex nested objects such as dataclasses or enums.
- **PyO3 FFI Types:** Uses raw pointers and PyO3's FFI bindings (`pyo3_ffi::PyObject`, `PyListObject`, `PyTupleObject`) to access Python objects at the C level.
- **Serde Serialization Framework:** Implements the `Serialize` trait from Serde, enabling seamless integration with Serde-based serializers like JSON, CBOR, MessagePack, etc.
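One way a state value can carry options and recursion depth across nested calls is sketched below. `State`, `MAX_DEPTH`, and `for_child` are hypothetical names chosen for illustration; the actual `SerializerState` fields, limit, and method names may differ.

```rust
/// Hypothetical model of a serializer state: option flags plus the current depth.
#[derive(Clone, Copy, Debug, PartialEq)]
struct State {
    opts: u16,
    depth: u8,
}

/// Assumed limit, for illustration only.
const MAX_DEPTH: u8 = 255;

impl State {
    fn new(opts: u16) -> Self {
        State { opts, depth: 0 }
    }

    /// True once the nesting depth has reached the assumed limit.
    fn recursion_limit(&self) -> bool {
        self.depth >= MAX_DEPTH
    }

    /// State handed to a child serializer: same options, one level deeper.
    fn for_child(&self) -> Self {
        State {
            opts: self.opts,
            depth: self.depth.saturating_add(1),
        }
    }
}
```

Because the state is a small `Copy` value, each nested serializer owns its own depth while the parent's state stays unchanged, so sibling elements are all serialized at the same depth.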
Usage Example
use pyo3_ffi::PyObject;
use crate::serialize::state::SerializerState;
use crate::serialize::list::ListTupleSerializer;
// Assume py_list_ptr is a valid pointer to a non-empty Python list object obtained from the Python runtime
// (empty lists are handled by ZeroListSerializer instead)
let py_list_ptr: *mut PyObject = /* obtained from Python runtime */;
let state = SerializerState::new(/* options */);
let default = None;
let serializer = ListTupleSerializer::from_list(py_list_ptr, state, default);
let json = serde_json::to_string(&serializer).unwrap();
println!("Serialized Python list: {}", json);
Mermaid Class Diagram
classDiagram
class ZeroListSerializer {
+new() ZeroListSerializer
+serialize<S>(serializer: S) Result<S::Ok, S::Error>
}
class ListTupleSerializer {
-data_ptr: *const *mut PyObject
-state: SerializerState
-default: Option<NonNull<PyObject>>
-len: usize
+from_list(ptr: *mut PyObject, state: SerializerState, default: Option<NonNull<PyObject>>) ListTupleSerializer
+from_tuple(ptr: *mut PyObject, state: SerializerState, default: Option<NonNull<PyObject>>) ListTupleSerializer
+serialize<S>(serializer: S) Result<S::Ok, S::Error>
}
ListTupleSerializer ..|> Serialize
ZeroListSerializer ..|> Serialize
Summary
The `list.rs` file provides high-performance serializers for Python list and tuple objects by leveraging direct access to the Python C API's internal data structures. It handles complex nested sequences with recursion control and delegates element serialization to specialized serializers, ensuring accurate and efficient serialization of heterogeneous Python sequences. This component is essential for converting Python sequences into formats compatible with Rust's Serde ecosystem, bridging Python's dynamic typing with Rust's strongly-typed serialization framework.