scalar.rs
Overview
The [scalar.rs](/projects/287/67735) file provides low-level utilities for formatting and escaping byte strings in a scalar (non-SIMD) context within the system. Its primary function is to safely transform raw byte sequences into escaped, quoted string representations suitable for output or serialization, particularly when certain characters require escaping.
This file defines a macro and a core unsafe function that operate on raw pointers to bytes, applying character escaping rules and writing the output into a destination buffer. The function is compiled only when the target architecture is not x86_64 and the `generic_simd` feature is disabled, which makes this file the scalar fallback for string escaping.
Detailed Explanation
Macro: impl_format_scalar!
macro_rules! impl_format_scalar {
    ($dst:expr, $src:expr, $value_len:expr) => {
        unsafe {
            for _ in 0..$value_len {
                core::ptr::write($dst, *($src));
                $src = $src.add(1);
                $dst = $dst.add(1);
                if *super::escape::NEED_ESCAPED.get_unchecked(*($src.sub(1)) as usize) != 0 {
                    $dst = $dst.sub(1);
                    write_escape!(*($src.sub(1)), $dst);
                }
            }
        }
    };
}
Purpose
- Iterates over a sequence of bytes (`$src`) of length `$value_len`.
- Copies each byte to the destination pointer (`$dst`).
- Checks if the byte requires escaping by consulting a lookup table `NEED_ESCAPED`.
- If escaping is needed, rewinds the destination pointer by one and invokes `write_escape!` to write the escaped version of the byte.
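The steps above can be sketched as a standalone function. This is a minimal illustration, not the project's code: `needs_escape` and `write_escape` are hypothetical stand-ins for the `NEED_ESCAPED` lookup and the `write_escape!` macro, and the escaping rules (control bytes, double quote, backslash) are assumed to be JSON-style.

```rust
/// Hypothetical stand-in for the `NEED_ESCAPED` lookup (assumed JSON-style rules).
fn needs_escape(byte: u8) -> bool {
    byte < 0x20 || byte == b'"' || byte == b'\\'
}

/// Hypothetical stand-in for `write_escape!`: writes the escape sequence for
/// `byte` at `dst` and returns the pointer just past it.
///
/// Safety: `dst` must be valid for writes of at least 6 bytes.
unsafe fn write_escape(byte: u8, dst: *mut u8) -> *mut u8 {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let (seq, len): ([u8; 6], usize) = match byte {
        b'"' => ([b'\\', b'"', 0, 0, 0, 0], 2),
        b'\\' => ([b'\\', b'\\', 0, 0, 0, 0], 2),
        b'\n' => ([b'\\', b'n', 0, 0, 0, 0], 2),
        b'\r' => ([b'\\', b'r', 0, 0, 0, 0], 2),
        b'\t' => ([b'\\', b't', 0, 0, 0, 0], 2),
        // Everything else is written as \u00XX.
        _ => (
            [b'\\', b'u', b'0', b'0', HEX[(byte >> 4) as usize], HEX[(byte & 0x0f) as usize]],
            6,
        ),
    };
    unsafe {
        core::ptr::copy_nonoverlapping(seq.as_ptr(), dst, len);
        dst.add(len)
    }
}

/// Copy-then-rewind loop in the same shape as the macro body.
fn copy_escaped(input: &[u8]) -> Vec<u8> {
    // Worst case in this sketch: every input byte expands to 6 output bytes.
    let mut out = vec![0u8; input.len() * 6];
    let start = out.as_mut_ptr();
    let mut dst = start;
    let mut src = input.as_ptr();
    // SAFETY: `src` reads exactly `input.len()` bytes, and `dst` advances by at
    // most 6 bytes per input byte, so it stays inside `out`.
    let written = unsafe {
        for _ in 0..input.len() {
            let byte = *src;
            core::ptr::write(dst, byte); // optimistic copy
            src = src.add(1);
            dst = dst.add(1);
            if needs_escape(byte) {
                dst = dst.sub(1); // rewind over the byte just copied
                dst = write_escape(byte, dst);
            }
        }
        dst as usize - start as usize
    };
    out.truncate(written);
    out
}
```

Copying first and rewinding only on the rare escaped byte keeps the common path to one store and one table check per byte.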
Parameters
- `$dst`: A mutable destination pointer (`*mut u8`) where the output bytes are written.
- `$src`: A source pointer (`*const u8`) from which input bytes are read.
- `$value_len`: The number of bytes to process.
Implementation Details
- Uses unsafe pointer arithmetic and raw writes to maximize performance.
- Relies on an external constant lookup table `NEED_ESCAPED` for escape detection.
- Uses the macro `write_escape!` (defined elsewhere) to perform the actual byte escaping.
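To illustrate the table-driven escape detection described above, here is a minimal sketch of how a 256-entry table could be built at compile time. The real `NEED_ESCAPED` lives in `super::escape` and is not shown in this file, so the name `NEED_ESCAPED_SKETCH` and the set of flagged bytes (control bytes below 0x20, the double quote, and the backslash) are assumptions.

```rust
/// Builds a hypothetical escape table: a nonzero entry marks a byte that must be escaped.
const fn build_need_escaped() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0;
    // Assumed rule: all control bytes below 0x20 need escaping.
    while i < 0x20 {
        table[i] = 1;
        i += 1;
    }
    table[b'"' as usize] = 1;
    table[b'\\' as usize] = 1;
    table
}

/// Hypothetical stand-in for `super::escape::NEED_ESCAPED`.
static NEED_ESCAPED_SKETCH: [u8; 256] = build_need_escaped();

/// Looks up a byte the same way the macro does, with an unchecked index.
fn needs_escape(byte: u8) -> bool {
    // SAFETY: a `u8` cast to `usize` is always below 256, the table length.
    unsafe { *NEED_ESCAPED_SKETCH.get_unchecked(byte as usize) != 0 }
}
```

Because the table has one entry for every possible `u8` value, the unchecked index can never be out of bounds, which is what makes `get_unchecked` sound in this pattern.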
Usage Example
// `dst_ptr` and `src_ptr` must be `mut` bindings holding valid pointers, because the macro reassigns them.
// `len` is the number of source bytes to process.
impl_format_scalar!(dst_ptr, src_ptr, len);
Function: format_escaped_str_scalar
#[inline(never)]
#[cfg(all(not(target_arch = "x86_64"), not(feature = "generic_simd")))]
pub(crate) unsafe fn format_escaped_str_scalar(
    odst: *mut u8,
    value_ptr: *const u8,
    value_len: usize,
) -> usize {
    let mut dst = odst;
    let mut src = value_ptr;
    core::ptr::write(dst, b'"');
    dst = dst.add(1);
    impl_format_scalar!(dst, src, value_len);
    core::ptr::write(dst, b'"');
    dst = dst.add(1);
    dst as usize - odst as usize
}
Purpose
- Formats a raw byte string by surrounding it with double quotes (`"`) and escaping the characters inside that require it.
- Operates in a scalar manner (no SIMD optimizations) and is compiled only when the target architecture is not x86_64 and the `generic_simd` feature is disabled.
Parameters
- `odst: *mut u8` - Destination pointer where the formatted string will be written.
- `value_ptr: *const u8` - Source pointer to the raw byte string.
- `value_len: usize` - Length of the input byte string.
Returns
- `usize` - The total number of bytes written to the destination buffer.
Implementation Details
- Writes an opening quote `"` at the destination.
- Calls the `impl_format_scalar!` macro to copy and escape the input bytes.
- Writes a closing quote `"` at the end.
- Calculates and returns the total written length by pointer arithmetic.
Usage Example
// Unsafe context required due to raw pointer usage.
// The destination buffer must be large enough for both quotes and every escape sequence.
unsafe {
    let input = b"Hello\nWorld";
    let mut buffer = [0u8; 32];
    let written = format_escaped_str_scalar(buffer.as_mut_ptr(), input.as_ptr(), input.len());
    let result_str = std::str::from_utf8(&buffer[..written]).unwrap();
    // result_str == "\"Hello\\nWorld\""
}
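Callers of a function with this signature have to size the destination buffer themselves. The following sketch shows one way a safe wrapper could do that. Everything in it is hypothetical: `format_to_vec` and `quote_only` are illustrative names, the worst-case bound of 6 output bytes per input byte assumes `\u00XX`-style escapes, and the 8 bytes of slack are a guess in case the escape writer stores a fixed-width chunk. `quote_only` exists only to exercise the wrapper and performs no escaping.

```rust
/// Function pointer type matching the signature of `format_escaped_str_scalar`.
type Formatter = unsafe fn(*mut u8, *const u8, usize) -> usize;

/// Hypothetical safe wrapper: reserves a worst-case buffer, calls the formatter,
/// then trims the vector to the number of bytes the formatter reports.
fn format_to_vec(formatter: Formatter, value: &[u8]) -> Vec<u8> {
    // Assumed bound: 6 bytes per input byte, 2 quotes, 8 bytes of slack.
    let capacity = value.len() * 6 + 2 + 8;
    let mut buf: Vec<u8> = Vec::with_capacity(capacity);
    // SAFETY: `buf` has `capacity` writable bytes; the formatter is trusted to
    // stay within them and to return how many bytes it initialized.
    unsafe {
        let written = formatter(buf.as_mut_ptr(), value.as_ptr(), value.len());
        debug_assert!(written <= capacity);
        buf.set_len(written);
    }
    buf
}

/// Stand-in formatter for demonstration only: wraps the input in quotes
/// without escaping anything.
///
/// Safety: `odst` must be valid for `value_len + 2` bytes of writes and
/// `value_ptr` for `value_len` bytes of reads.
unsafe fn quote_only(odst: *mut u8, value_ptr: *const u8, value_len: usize) -> usize {
    unsafe {
        core::ptr::write(odst, b'"');
        core::ptr::copy_nonoverlapping(value_ptr, odst.add(1), value_len);
        core::ptr::write(odst.add(1 + value_len), b'"');
    }
    value_len + 2
}
```

Inside the crate, `format_escaped_str_scalar` could be passed in place of `quote_only`, provided the capacity bound really covers what `write_escape!` writes.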
Important Implementation Details
- Unsafe Code: Both the macro and the function use unsafe Rust code with raw pointers for performance and low-level memory manipulation.
- Escape Lookup: Uses a static lookup table `NEED_ESCAPED` (expected to be a byte slice/array) from a sibling module `escape` to determine if a character needs escaping without branching on character values.
- Escape Writing: Delegates actual escaped character writing to a macro `write_escape!`, which is not defined in this file but is assumed to write the appropriate escape sequence for the given byte.
- Platform-Specific Compilation: The function is conditionally compiled only when the architecture is not x86_64 and the `generic_simd` feature is disabled. This ensures optimized SIMD versions can exist in other parts of the codebase.
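The compilation condition is a conjunction: the scalar function exists only when both conditions hold. A minimal sketch of how such a gate pairs with its complement is shown below. The `selected_backend` function and the returned names are hypothetical, and the SIMD side is assumed rather than taken from the project.

```rust
// Scalar fallback: the target is not x86_64 AND `generic_simd` is off.
// This is the same condition used on `format_escaped_str_scalar`.
#[cfg(all(not(target_arch = "x86_64"), not(feature = "generic_simd")))]
fn selected_backend() -> &'static str {
    "scalar"
}

// Complement of the condition above (De Morgan's law):
// the target is x86_64 OR `generic_simd` is on.
#[cfg(any(target_arch = "x86_64", feature = "generic_simd"))]
fn selected_backend() -> &'static str {
    "simd"
}
```

Exactly one of the two definitions is compiled for any target and feature combination, so callers can use `selected_backend()` without their own `cfg` checks.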
Interactions with Other Parts of the System
- `super::escape` Module: This module provides the `NEED_ESCAPED` lookup table. The scalar formatter depends on this table to identify which characters require escaping.
- `write_escape!` Macro: Used within the macro to write escaped bytes, presumably defined elsewhere in the project.
- SIMD Optimized Counterparts: The conditional compilation hints that other files or modules provide SIMD-accelerated versions of string escaping, and this file offers a fallback scalar implementation.
- Serialization/Output Routines: This file likely integrates into larger string formatting or serialization pipelines that require safe, escaped string output (e.g., JSON serialization).
Visual Diagram
flowchart TD
    A[format_escaped_str_scalar] --> B[Write opening quote '"']
    B --> C[impl_format_scalar! macro]
    C --> D[For each byte in input]
    D --> E[Copy byte to dst and advance both pointers]
    E --> F{Needs escaping?}
    F -- Yes --> G[Rewind dst by one]
    G --> H[write_escape! macro]
    F -- No --> I[Next byte]
    H --> I
    I -- Bytes remain --> D
    I -- All bytes processed --> J[Write closing quote '"']
    J --> K[Return total bytes written]
Summary
The [scalar.rs](/projects/287/67735) file implements scalar (non-SIMD) string escaping utilities for raw byte strings, focusing on performance through unsafe pointer manipulation and lookup tables. It provides a macro for byte-wise copying with escape handling and a function to wrap and format strings with quotes and escapes. The file is a fallback implementation designed for platforms where SIMD acceleration is unavailable, integrating closely with other modules for escape data and escape writing logic.