avx512.rs
Overview
The [avx512.rs](/projects/287/67733) file provides a highly optimized implementation of a string escaping function built on x86 AVX-512 vector instructions (the AVX-512F, AVX-512BW, and AVX-512VL extensions) together with the BMI2 instruction set. The function processes input strings in 32-byte chunks to efficiently detect and escape the characters that must be escaped in string literals: backslashes (`\`), double quotes (`"`), and control characters below space (`0x20`).
This file is intended for use in performance-critical serialization or formatting tasks, where escaping strings correctly and quickly is necessary (e.g., JSON encoding or similar). The use of explicit SIMD intrinsics enables parallel comparisons and memory operations, significantly accelerating the escaping process compared to scalar implementations.
Detailed Explanation
Function: format_escaped_str_impl_512vl
#[inline(never)]
#[target_feature(enable = "avx512f,avx512bw,avx512vl,bmi2")]
pub(crate) unsafe fn format_escaped_str_impl_512vl(
    odst: *mut u8,
    value_ptr: *const u8,
    value_len: usize,
) -> usize
Purpose
This function formats and escapes a given input byte string, writing the escaped result to a destination buffer. It wraps the escaped content in double quotes (`"`), escaping backslash, quote, and control characters as needed.
Parameters
odst: *mut u8
A raw mutable pointer to the destination buffer where the escaped string will be written.
value_ptr: *const u8
A raw pointer to the input byte string to be escaped.
value_len: usize
Length of the input string in bytes.
Returns
usize
The total length in bytes of the escaped string written to the destination buffer, including the surrounding quotes.
Usage Example
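let input = b"Hello \"world\"\n";
let mut buffer = vec![0u8; 64]; // must be large enough for the escaped output
// Calling the function on a CPU without these features is undefined behavior.
assert!(
    is_x86_feature_detected!("avx512f")
        && is_x86_feature_detected!("avx512bw")
        && is_x86_feature_detected!("avx512vl")
        && is_x86_feature_detected!("bmi2")
);
let len = unsafe {
    format_escaped_str_impl_512vl(buffer.as_mut_ptr(), input.as_ptr(), input.len())
};
let escaped_str = std::str::from_utf8(&buffer[..len]).unwrap();
println!("{}", escaped_str); // Outputs: "Hello \"world\"\n"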
Implementation Details
SIMD Vector Length:
The function processes the input in chunks of 32 bytes (`STRIDE = 32`), matching the width of the 256-bit registers (`__m256i`) that AVX-512VL makes available to AVX-512 instructions.
Character Masks:
Three SIMD vectors are initialized to detect characters that need escaping:
`blash`: backslash (`\`, ASCII 0x5C)
`quote`: double quote (`"`, ASCII 0x22)
`x20`: space (0x20), used as a threshold to detect control characters (< 0x20)
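The combined effect of the three comparisons can be modeled in scalar Rust. This is only a sketch: `escape_mask` is a hypothetical helper, not a function from avx512.rs, and it produces one bit per byte in the way a mask-producing vector compare would.

```rust
/// Scalar model of the per-chunk detection mask: bit `i` is set when
/// `chunk[i]` is a backslash, a double quote, or a control byte below 0x20.
/// Hypothetical helper for illustration; not part of avx512.rs.
fn escape_mask(chunk: &[u8]) -> u32 {
    assert!(chunk.len() <= 32, "one mask bit per byte of a 32-byte chunk");
    let mut mask = 0u32;
    for (i, &byte) in chunk.iter().enumerate() {
        let is_blash = byte == b'\\'; // 0x5C
        let is_quote = byte == b'"'; // 0x22
        let is_control = byte < 0x20; // unsigned comparison against space
        if is_blash || is_quote || is_control {
            mask |= 1u32 << i;
        }
    }
    mask
}
```

The threshold comparison has to be unsigned. With a signed comparison, bytes in the range 0x80 to 0xFF (which occur in multi-byte UTF-8 sequences) would count as less than 0x20 and be flagged by mistake.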
Processing Loop:
Each 32-byte chunk of the input is loaded into a 256-bit register (`__m256i`) using `_mm256_loadu_epi8`. The function then:
Copies the chunk to the destination buffer.
Computes a mask of bytes equal to backslash or quote, or less than space.
If any such byte exists, it identifies the position (`cn`) of the first such byte using a `trailing_zeros!` macro (likely counting trailing zeros in the mask).
Writes an escape sequence for that byte using a `write_escape!` macro.
Advances pointers and counters accordingly.
Otherwise, advances by the full stride.
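The loop steps above can be sketched as a scalar model. Everything here is illustrative: `escape_chunked`, `escape_mask`, and `write_escape` are hypothetical names, a `Vec<u8>` replaces the raw destination pointer, and the JSON-style escape forms are an assumption. The real function is described as storing the whole chunk first and then writing the escape at position `cn`; the model copies only the clean prefix, which produces the same output bytes.

```rust
const STRIDE: usize = 32;

/// Bit `i` is set when `chunk[i]` needs escaping (scalar stand-in for the SIMD compares).
fn escape_mask(chunk: &[u8]) -> u32 {
    chunk.iter().enumerate().fold(0u32, |mask, (i, &b)| {
        if b == b'\\' || b == b'"' || b < 0x20 {
            mask | (1u32 << i)
        } else {
            mask
        }
    })
}

/// Hypothetical stand-in for `write_escape!`; the escape forms are assumed, not confirmed.
fn write_escape(byte: u8, out: &mut Vec<u8>) {
    match byte {
        b'"' => out.extend_from_slice(b"\\\""),
        b'\\' => out.extend_from_slice(b"\\\\"),
        b'\n' => out.extend_from_slice(b"\\n"),
        b'\r' => out.extend_from_slice(b"\\r"),
        b'\t' => out.extend_from_slice(b"\\t"),
        other => out.extend_from_slice(format!("\\u{:04x}", other).as_bytes()),
    }
}

/// Scalar model of the chunked loop, including the opening and closing quotes.
fn escape_chunked(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() + 2);
    out.push(b'"');
    let mut pos = 0;
    while pos < input.len() {
        let end = usize::min(pos + STRIDE, input.len());
        let chunk = &input[pos..end];
        let mask = escape_mask(chunk);
        if mask != 0 {
            let cn = mask.trailing_zeros() as usize; // index of the first escapable byte
            out.extend_from_slice(&chunk[..cn]); // clean prefix is kept verbatim
            write_escape(chunk[cn], &mut out);
            pos += cn + 1; // resume right after the escaped byte
        } else {
            out.extend_from_slice(chunk);
            pos = end; // whole chunk was clean
        }
    }
    out.push(b'"');
    out
}
```

Note that after an escape the next chunk starts at `cn + 1` rather than at the next 32-byte boundary, so a chunk with several escapable bytes is revisited once per escape.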
Tail Handling:
When fewer than 32 bytes remain, a masked load (`_mm256_maskz_loadu_epi8`) loads only the remaining bytes. The same detection and escape logic applies, respecting the valid byte count.
Output Formatting:
The escaped string is wrapped in double quotes by writing `b'"'` at the start and end.
Unsafe and Target Features:
The function is marked `unsafe` due to raw pointer manipulation and use of SIMD intrinsics. It requires CPU support for AVX-512F, AVX-512BW, AVX-512VL, and BMI2.
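One detail of the tail handling is worth illustrating: a zero-masking load fills the lanes past the end of the input with 0, and 0 is below 0x20, so those lanes would look like control characters unless they are excluded. The following scalar sketch shows one way to respect the valid byte count. The helper names are hypothetical, and whether the real code excludes the padding lanes this way or by another mechanism is an assumption.

```rust
const STRIDE: usize = 32;

/// Load mask with the low `remaining` bits set, one bit per valid tail byte.
fn tail_mask(remaining: usize) -> u32 {
    assert!(remaining < STRIDE, "the tail is always shorter than one stride");
    (1u32 << remaining) - 1
}

/// Scalar model of a zero-masking load: lanes outside `mask` read as 0.
fn masked_load(tail: &[u8], mask: u32) -> [u8; STRIDE] {
    let mut lanes = [0u8; STRIDE];
    for (i, lane) in lanes.iter_mut().enumerate() {
        if mask & (1u32 << i) != 0 {
            *lane = tail[i];
        }
    }
    lanes
}

/// Detection mask for the tail, restricted to the valid lanes.
fn tail_escape_mask(tail: &[u8]) -> u32 {
    let load_mask = tail_mask(tail.len());
    let lanes = masked_load(tail, load_mask);
    let mut detected = 0u32;
    for (i, &b) in lanes.iter().enumerate() {
        if b == b'\\' || b == b'"' || b < 0x20 {
            detected |= 1u32 << i;
        }
    }
    // Zeroed padding lanes look like control bytes, so mask them out.
    detected & load_mask
}
```

For a 3-byte tail such as `a"b`, the load mask is `0b111` and only bit 1 survives in the detection mask, even though all 29 padding lanes compare as less than 0x20.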
Macros and Helpers
`trailing_zeros!` and `write_escape!` are macros not defined in this file. Presumably:
`trailing_zeros!(mask)` returns the index of the lowest set bit in the mask.
`write_escape!(char, dst)` writes the escaped version of the given character to `dst`.
These macros implement the finer details of escaping, such as writing sequences like `\\`, `\"`, or `\n`.
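Since the real macro definitions are not part of this file, the following is only a guess at their shape. Both `macro_rules!` definitions and the `escape_first` helper are hypothetical; the slice-based destination, the returned byte count, and the JSON-style escape forms are all assumptions.

```rust
/// Hypothetical stand-in for `trailing_zeros!`: index of the lowest set bit.
macro_rules! trailing_zeros {
    ($mask:expr) => {
        ($mask).trailing_zeros() as usize
    };
}

/// Hypothetical stand-in for `write_escape!`: writes the escape for `$byte`
/// at the start of `$dst` and evaluates to the number of bytes written.
macro_rules! write_escape {
    ($byte:expr, $dst:expr) => {{
        const HEX: &[u8; 16] = b"0123456789abcdef";
        let byte: u8 = $byte;
        let dst: &mut [u8] = $dst;
        let short = match byte {
            b'"' => Some(b'"'),
            b'\\' => Some(b'\\'),
            b'\n' => Some(b'n'),
            b'\r' => Some(b'r'),
            b'\t' => Some(b't'),
            _ => None,
        };
        match short {
            Some(c) => {
                dst[..2].copy_from_slice(&[b'\\', c]);
                2usize
            }
            None => {
                // Assumed fallback for other control bytes: `\u00XX`.
                let hi = HEX[(byte >> 4) as usize];
                let lo = HEX[(byte & 0x0f) as usize];
                dst[..6].copy_from_slice(&[b'\\', b'u', b'0', b'0', hi, lo]);
                6usize
            }
        }
    }};
}

/// Escapes the first flagged byte of `chunk` into `dst`.
/// `mask` must be nonzero. Returns (input bytes consumed, output bytes written).
fn escape_first(chunk: &[u8], mask: u32, dst: &mut [u8]) -> (usize, usize) {
    debug_assert!(mask != 0, "caller checks the mask before escaping");
    let cn = trailing_zeros!(mask);
    dst[..cn].copy_from_slice(&chunk[..cn]);
    let written = write_escape!(chunk[cn], &mut dst[cn..]);
    (cn + 1, cn + written)
}
```

The two returned counts correspond to the pointer and counter updates in the processing loop: the input advances by `cn + 1`, while the output advances by the prefix length plus the escape length.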
Interaction with Other Parts of the System
This file likely belongs to a module responsible for serialization or formatting (e.g., JSON serialization).
It provides a low-level optimized core routine used by higher-level string escaping functions.
The macros `trailing_zeros!` and `write_escape!` indicate a dependency on shared utilities or helper macros elsewhere in the codebase.
The function is marked `pub(crate)`, indicating use internal to the crate, not exposed as a public API.
The AVX-512 targeted nature suggests this is an optional, highly optimized code path, potentially selected at runtime or compile-time based on CPU capabilities.
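If the path is selected at runtime, the check could look roughly like the sketch below. The function names and the returned labels are hypothetical; only `is_x86_feature_detected!` is a real standard-library macro, and how the crate actually performs the selection is not shown in this file.

```rust
/// True when the running CPU reports every feature the AVX-512 path needs.
#[cfg(target_arch = "x86_64")]
fn avx512_escape_supported() -> bool {
    is_x86_feature_detected!("avx512f")
        && is_x86_feature_detected!("avx512bw")
        && is_x86_feature_detected!("avx512vl")
        && is_x86_feature_detected!("bmi2")
}

/// On other architectures the AVX-512 path can never be taken.
#[cfg(not(target_arch = "x86_64"))]
fn avx512_escape_supported() -> bool {
    false
}

/// Hypothetical dispatcher: names the implementation a caller would pick.
fn select_escape_impl() -> &'static str {
    if avx512_escape_supported() {
        "avx512vl"
    } else {
        "fallback"
    }
}
```

A real dispatcher would call the `unsafe` AVX-512 function in the first branch; the feature check is what makes that call sound.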
Important Implementation Notes
The function is carefully designed for performance-critical escaping:
It uses vectorized loads and stores to minimize memory operations.
Masking and bitwise operations detect escapable characters in parallel.
Chunks that contain no escapable byte are passed over with a single mask test, avoiding per-byte work.
The function assumes the caller provides a sufficiently sized destination buffer.
The use of unsafe code and raw pointers requires the caller to uphold safety invariants.
The function adds surrounding quotes, so the caller should not add them again.
The code uses AVX-512VL intrinsics with 256-bit vectors (`__m256i`), leveraging VL extensions to use 256-bit vectors with AVX-512 instructions.
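The buffer-size requirement can be made concrete with a conservative bound. The sketch below rests on assumptions that this file does not confirm: that no input byte expands to more than 6 output bytes (the length of a `\u00XX` escape), and that one extra stride of slack is enough to absorb a full 32-byte store near the end of the output. `conservative_dst_capacity` is a hypothetical helper.

```rust
const STRIDE: usize = 32;
/// Assumed worst-case expansion of one input byte (`\u00XX`).
const MAX_ESCAPE_LEN: usize = 6;

/// Conservative destination capacity for an input of `value_len` bytes:
/// worst-case expansion, two quotes, and one stride of slack for full-width stores.
/// Returns `None` if the computation would overflow `usize`.
fn conservative_dst_capacity(value_len: usize) -> Option<usize> {
    value_len
        .checked_mul(MAX_ESCAPE_LEN)?
        .checked_add(2 + STRIDE)
}
```

A caller could allocate `vec![0u8; conservative_dst_capacity(input.len()).unwrap()]` before invoking the function, trading some memory for a guarantee that holds under the stated assumptions.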
Mermaid Diagram - Function Workflow Flowchart
flowchart TD
A[Start: Initialize pointers and constants] --> B{Input length >= 32 bytes?}
B -- Yes --> C[Load 32 bytes into SIMD register]
C --> D[Store bytes to destination]
D --> E[Compare bytes to '\\', '\"', and < 0x20]
E --> F{Mask != 0?}
F -- Yes --> G[Find first escapable byte position]
G --> H[Write escape sequence]
H --> I[Update pointers and counters]
I --> B
F -- No --> J[Advance pointers by 32]
J --> B
B -- No --> K[Handle tail bytes with masked load]
K --> L[Store tail bytes]
L --> M[Compare tail bytes as before]
M --> N{Mask != 0?}
N -- Yes --> O[Find first escapable tail byte position]
O --> P[Write escape sequence]
P --> Q[Update pointers and counters]
Q --> K
N -- No --> R[Advance pointers by remaining bytes]
R --> S[Write closing quote]
S --> T[Return total written length]
Summary
The [avx512.rs](/projects/287/67733) file implements an AVX-512 optimized string escaping function that processes input strings in 32-byte chunks to detect and escape special characters efficiently. It is a low-level, unsafe, crate-internal utility that relies on SIMD instructions and vector masks for high-performance serialization tasks, and it is most useful where fast JSON or string encoding is needed on x86-64 CPUs that support the required AVX-512 features.