fixed_size_hash_set.rs
Overview
The fixed_size_hash_set.rs file defines a generic data structure named FixedSizeHashSet<T>, which implements a hash set with a fixed capacity. This structure maintains a collection of unique elements (T) with a limited size and supports eviction of the oldest inserted elements when the capacity is reached. It is optimized for rapid membership checks and insertion, while automatically managing storage size by removing older entries in a round-robin fashion.
This file targets scenarios where a bounded set of unique items must be maintained efficiently, such as caching finalized blocks in a blockchain system or managing a limited memory footprint for sets of identifiers.
Detailed Descriptions
FixedSizeHashSet<T> Struct
Purpose
FixedSizeHashSet<T> is a fixed-capacity hash set that stores unique elements with efficient membership queries, and evicts the oldest elements in insertion order when the capacity is exceeded.
Fields
capacity: usize
The maximum number of elements this set can hold.blocks: HashSet<T>
The underlying hash set that stores the unique elements.eviction_order: Vec<T>
Vector tracking the order of inserted elements to determine which element to evict next.eviction_order_cursor: usize
An index pointer intoeviction_orderindicating the next element to be evicted.
Type Constraints
Tmust implement the traitsClone,Hash, and Eq to support hashing, equality comparisons, and cloning for eviction management.
Implementation Details
new(capacity: usize) -> FixedSizeHashSet<T>
Creates a new FixedSizeHashSet with the specified capacity.
Parameters:
capacity— The maximum number of elements the set can store.Returns:
A new instance ofFixedSizeHashSet<T>with pre-allocated internal storage according to the capacity.Usage Example:
let mut set = FixedSizeHashSet::<u32>::new(100);
contains(&self, id: &T) -> bool
Checks if the set contains the specified element.
Parameters:
id— A reference to the element to check for membership.Returns:
trueif the element is present; otherwisefalse.Usage Example:
if set.contains(&42) { println!("Element found"); }
insert(&mut self, id: T)
Inserts a new element into the set. If the element already exists, the method returns immediately without altering the set. If insertion causes the set to exceed its capacity, the oldest element (based on insertion order) is evicted to make room for the new element.
Parameters:
id— The element to insert.Returns:
Nothing. The method modifies the internal state of the set.Eviction Algorithm:
If the current number of elements is less than capacity, the element is simply added.
If the set is at capacity:
The element at the current
eviction_order_cursoris evicted.The new element replaces the evicted element's position in
eviction_order.The evicted element is removed from the hash set.
The new element is inserted.
The eviction cursor advances circularly (wraps around to zero when reaching capacity).
Usage Example:
set.insert(42);
Debug Trait Implementation
The file implements the std::fmt::Debug trait for FixedSizeHashSet<T>, which allows formatted debugging output.
Behavior:
Displays the current number of elements and the maximum capacity in the format:FixedSizeHashSet(current_size/capacity)Example Output:
FixedSizeHashSet(10/100)
Important Implementation Notes
The eviction strategy uses a circular buffer (
Vec<T>) indexed byeviction_order_cursorto track insertion order and determine which element to evict. This avoids expensive shifting operations and ensures O(1) eviction time.Elements are only evicted when the capacity is reached, making insertions efficient for sets smaller than capacity.
The set relies on
HashSetfor O(1) average membership checks.The comment in the file mentions a tradeoff: it does not clean up finalized blocks based on descendants but relies on the fixed capacity for safety and performance. This reflects a design decision prioritizing speed and simplicity over more complex dependency-based cleanup.
Interaction with Other System Components
This file is designed to serve as a utility for managing collections of unique items with bounded memory. It is likely used in contexts where:
A limited cache or window of identifiers (e.g., block hashes, transaction IDs) must be maintained.
Fast membership queries are needed without unbounded memory growth.
Items can be safely evicted after a certain time or number of insertions, assuming no further dependencies.
Interaction with storage or blockchain logic would involve:
Inserting finalized block IDs into the set.
Querying if a block is already recorded.
Relying on eviction to remove old blocks after descendants finalize, as mentioned in the file comment.
Mermaid Diagram: Structure of FixedSizeHashSet<T>
classDiagram
class FixedSizeHashSet~T~ {
-capacity: usize
-blocks: HashSet~T~
-eviction_order: Vec~T~
-eviction_order_cursor: usize
+new(capacity: usize): Self
+contains(id: &T): bool
+insert(id: T)
+fmt(f: &mut Formatter): fmt::Result
}
The diagram shows the internal fields and the main public methods of the FixedSizeHashSet<T> struct. The generic type T is constrained by traits for hashing, equality, and cloning.