mod.rs
Overview
This file implements a Load Balancing Service for managing and optimizing thread workloads in a multi-threaded environment. It monitors the load on each thread, decides whether a thread should continue as is, be split into smaller units, or be collapsed (merged) with others to maintain balanced resource utilization. The service uses statistical load metrics aggregated over a configurable window and employs heuristics based on load thresholds and load disproportions.
Core functionalities include:
Tracking and updating thread load metrics.
Making split or collapse proposals to balance load across threads.
Interacting with thread lifecycle events such as starting and stopping threads.
This file is part of the system managing thread distribution and load balancing, interfacing with the thread tracking service, metrics collection, thread tables, and block production components.
Constants
MAX_LOAD_DISPROPORTION: Load = 2
Defines the maximum allowed load disproportion between threads before triggering a split or collapse action. It must not be less than 2 to avoid infinite splitting/collapsing cycles. This constant constrains the heuristic balancing logic.
Data Structures
Proposal
pub struct Proposal {
pub proposed_threads_table: ThreadsTable,
}
Represents a proposal for changing the threads layout, containing the proposed new threads table after a split or collapse operation.
Fields:
proposed_threads_table: The new configuration of thread partitions after the proposed change.
ThreadAction
pub enum ThreadAction {
ContinueAsIs,
Split(Proposal),
Collapse(Proposal),
}
Enumerates possible actions the load balancer can decide for a thread:
ContinueAsIs: No change needed for this thread.Split(Proposal): Propose splitting the thread into smaller threads.Collapse(Proposal): Propose collapsing (merging) the thread with others.
CheckError
pub enum CheckError {
StatsAreNotReady,
ThreadIsNotInTheTable,
}
Represents possible error states when checking thread load for balancing:
StatsAreNotReady: Load metrics for the thread are not yet available.ThreadIsNotInTheTable: The thread is not found in the current threads table.
Main Struct: LoadBalancingService
pub struct LoadBalancingService {
thread_load_map: HashMap<ThreadIdentifier, AggregatedLoad>,
metrics: Option<BlockProductionMetrics>,
window_size: usize,
load_threshold: Load,
}
Manages load balancing logic by maintaining load statistics for each thread, deciding when to split or merge threads based on load metrics.
Fields:
thread_load_map: Maps each thread to its aggregated load statistics.metrics: Optional metrics collector for reporting thread load.window_size: Number of blocks used for load aggregation window.load_threshold: Load threshold above which splitting is considered.
Implementation Details and Methods
start
pub fn start(
metrics: Option<BlockProductionMetrics>,
window_size: usize,
load_threshold: usize,
) -> Self
Initializes a new LoadBalancingService instance.
Parameters:
metrics: Optional metrics object for reporting.window_size: Size of the load aggregation window.load_threshold: Load value threshold for splitting threads.
Returns: A new
LoadBalancingServiceinstance with an empty load map.
Usage Example:
let service = LoadBalancingService::start(Some(metrics), 50, 100);
check
pub fn check(
&mut self,
block_identifier: &BlockIdentifier,
thread_identifier: &ThreadIdentifier,
threads_table: &ThreadsTable,
max_table_size: usize,
) -> anyhow::Result<ThreadAction, CheckError>
Evaluates the current load state of a thread and decides what action to take: continue, split, or collapse.
Parameters:
block_identifier: Identifier of the current block being processed.thread_identifier: The thread to check.threads_table: Current threads table representing thread partitions.max_table_size: Maximum allowed number of threads in the table.
Returns:
ThreadActionspecifying the balancing decision or aCheckError.Behavior:
Reads the current load of the specified thread.
Scans all threads to find the minimum and maximum loads and their thread IDs.
Reports current load to metrics if enabled.
Depending on the current number of threads and load distribution:
If number of threads exceeds
max_table_size, attempts to merge threads by invokingthreads_merge::try_threads_merge.If maximum load exceeds
load_threshold, attempts to split the overloaded thread by invokingthreads_split::try_threads_split.Otherwise, continues operation without changes.
Error Handling:
Fails if thread load stats are not ready or thread is missing from the table.
handle_block_finalized
pub fn handle_block_finalized<TOptimisticState>(
&mut self,
block: &AckiNackiBlock,
block_state: Arc<TOptimisticState>,
) where
TOptimisticState: OptimisticState,
Updates thread load statistics upon block finalization.
Parameters:
block: The finalized block.block_state: Shared reference to the optimistic state of the block.
Behavior:
If the block causes a thread split (detected via threads table and spawning check), resets the load of the thread.
Otherwise, appends new load data from the block to the corresponding thread's aggregated load.
Panics if the thread is not found in the load map, indicating a serious error.
read_load
fn read_load(&self, thread_identifier: &ThreadIdentifier) -> anyhow::Result<Load, CheckError>
Retrieves the current load value of a thread from the aggregated load map.
Parameters:
thread_identifier: Thread to query.
Returns: The load value if ready; otherwise returns a
CheckError.Notes: This method ensures the load is ready before returning it.
Trait Implementation
Subscriber for LoadBalancingService
Implements subscription to thread lifecycle events.
handle_start_thread
fn handle_start_thread(
&mut self,
_parent_split_block: &BlockIdentifier,
thread_identifier: &ThreadIdentifier,
threads_table: Option<ThreadsTable>,
)
Handles the event of a new thread starting.
Resets the load statistics of the parent thread after splitting.
Inserts a new entry in
thread_load_mapfor the new thread with a fresh aggregation window.Parameters:
_parent_split_block: Block that caused the split (unused here).thread_identifier: Identifier of the newly started thread.threads_table: Optional threads table to find parent thread information.
Notes: This ensures load tracking is prepared immediately for new threads.
handle_stop_thread
fn handle_stop_thread(
&mut self,
_last_thread_block: &BlockIdentifier,
thread_identifier: &ThreadIdentifier,
)
Handles the event of a thread stopping.
Removes the thread's load entry from
thread_load_map.Parameters:
_last_thread_block: The last block for the stopped thread (unused here).thread_identifier: Identifier of the thread being stopped.
Interaction with Other Components
Threads Table (
ThreadsTable): Used as the authoritative source of current thread partitions and their bitmasks.AggregatedLoad (
aggregated_thread_load): Encapsulates load metrics aggregation and readiness checking.Thread Splitting and Merging (
threads_split,threads_merge): Submodules invoked to generate proposals for splitting or collapsing threads based on load.Block Production Metrics (
BlockProductionMetrics): Optional metrics collector for reporting load statistics.Optimistic State (
OptimisticState): Provides block state information to update load metrics.Thread Tracking Service (
threads_tracking_service::Subscriber): The load balancer subscribes to thread lifecycle events to keep its internal state consistent.
Important Implementation Details
Load Aggregation Window: Load is aggregated over a configurable number of blocks (
window_size) to smooth transient load spikes.Load Threshold: Only threads exceeding
load_thresholdare considered for splitting.Load Disproportion Limit: The constant
MAX_LOAD_DISPROPORTIONprevents oscillations by limiting how much load difference triggers splitting or merging.Thread Identification: Thread IDs and bitmasks are used extensively to map and identify thread partitions.
Metrics Reporting: Optional metrics integration allows external monitoring of thread loads.
Error Handling: Robust error handling ensures only threads present in the table with ready statistics are processed.
Resetting Load on Split: When a thread spawns new threads (split), its load statistics are reset to avoid skewing balancing decisions.
Visual Diagram
classDiagram
class LoadBalancingService {
-thread_load_map: HashMap
-metrics: Option<BlockProductionMetrics>
-window_size: usize
-load_threshold: Load
+start()
+check()
+handle_block_finalized()
-read_load()
}
class Proposal {
+proposed_threads_table: ThreadsTable
}
class ThreadAction {
<<enum>>
+ContinueAsIs
+Split
+Collapse
}
class CheckError {
<<enum>>
+StatsAreNotReady
+ThreadIsNotInTheTable
}
LoadBalancingService ..> Proposal : uses
LoadBalancingService ..> ThreadAction : returns
LoadBalancingService ..> CheckError : returns
LoadBalancingService ..> AggregatedLoad : manages
LoadBalancingService ..> BlockProductionMetrics : optionally reports
LoadBalancingService ..> ThreadsTable : reads
LoadBalancingService ..> OptimisticState : updates load from block state
LoadBalancingService ..|> Subscriber : implements
This class diagram illustrates the main data structures and relationships within the file, highlighting the primary class LoadBalancingService, its dependencies, and key enums used for control flow.