profile
Overview
The `profile` file is a shell script that profiles a specific executable within the project. It uses the Linux `perf` tool to record and analyze the performance of `./bench/run_func` when run with user-supplied arguments.
The script automates the process, making it easier for developers to gather profiling data, specifically call graphs, and subsequently generate a detailed performance report focused on the most significant functions.
Detailed Explanation
Script Purpose
- Run the executable `./bench/run_func` with the supplied command-line arguments.
- Collect profiling data using `perf record` with call graph information.
- Generate a performance report with a filter to show only functions contributing at least 0.1% of the total CPU cycles.
This automates profiling runs, especially useful during benchmarking or optimization phases.
Usage
./profile data/citm_catalog.json.xz loads
The example above profiles the executable with the arguments `data/citm_catalog.json.xz loads`.
Script Content Breakdown
#!/bin/sh -e
- Shebang line specifying that the script is executed with `sh`.
- `-e` option: the script exits immediately if any command returns a non-zero status.
perf record -g --delay 250 ./bench/run_func "$@"
- Runs `perf record` with the following options:
  - `-g`: Enables call graph (stack trace) recording.
  - `--delay 250`: Waits 250 milliseconds after the program starts before recording, so that process initialization is left out.
- `./bench/run_func "$@"` runs the target executable, passing along all arguments given to the script.
perf report --percent-limit 0.1
Generates a performance report from the recorded data.
`--percent-limit 0.1` filters the report to show only functions that consume at least 0.1% of CPU cycles, focusing on the most impactful functions.
Parameters
- The script forwards all command-line arguments (`"$@"`) to `./bench/run_func`.
- Example arguments:
  - `data/citm_catalog.json.xz`: likely an input data file.
  - `loads`: possibly a command or mode for the executable.
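The quoting in `"$@"` matters when an argument contains whitespace. The sketch below illustrates the difference with stand-in functions; `count_args`, `forward_quoted`, `forward_unquoted`, and the file name `data/my file.json` are hypothetical names made up for this example and are not part of the project.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for ./bench/run_func:
# it only reports how many arguments it received.
count_args() { echo "$#"; }

# Forwarding with "$@" keeps each original argument intact.
forward_quoted() { count_args "$@"; }

# Forwarding with an unquoted $* re-splits arguments on whitespace.
forward_unquoted() { count_args $*; }

# "data/my file.json" is a made-up path containing a space.
quoted=$(forward_quoted "data/my file.json" loads)
unquoted=$(forward_unquoted "data/my file.json" loads)

echo "quoted forwarding:   $quoted arguments"
echo "unquoted forwarding: $unquoted arguments"
```

With the quoted form, the target sees the same two arguments the user typed; the unquoted form would split the path into two words.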
Return Values
- The script has no explicit return values.
- It exits with status `0` on success, or with a non-zero status as soon as any command fails (because of the `-e` option on the shebang line).
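The effect of `-e` can be seen in isolation with a small experiment. This is only a sketch: `false` stands in for a failing `perf record`, the `echo` stands in for `perf report`, and no perf command is actually run.

```shell
#!/bin/sh
# Run a two-step command list in a child shell started with -e.
# `false` plays the role of a failing `perf record`; the echo plays
# the role of the `perf report` step that should then be skipped.
status=0
output=$(sh -e -c 'false; echo "report step ran"') || status=$?

# The second step never ran, so nothing was captured, and the
# child shell's non-zero exit status was propagated.
echo "captured output: '${output}'"
echo "exit status: ${status}"
```

In the real script this means that if recording fails, no report is generated from stale or missing data.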
Usage Example
./profile data/citm_catalog.json.xz loads
This command profiles the `run_func` executable with the specified data file and the `loads` argument, producing a performance report focused on the most significant CPU consumers.
Implementation Details and Algorithms
- The script relies on the Linux `perf` tool, which uses hardware performance counters to capture detailed CPU profiling data.
- The `-g` option collects call graphs, making it possible to identify the call stacks responsible for CPU usage.
- The 250 ms delay keeps process startup out of the recorded samples.
- The report filters out functions with negligible CPU usage to improve readability.
Interaction with Other Parts of the System
This script interacts primarily with:
- `./bench/run_func`: the executable being profiled.
- Linux `perf` tool: used for recording and reporting.
It serves as a utility to assist developers and performance engineers in profiling the benchmark or workload represented by `run_func`.
Input data files (e.g., `data/citm_catalog.json.xz`) and commands (`loads`) are passed through to the executable.
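Because the script depends on both of these, a preflight check can give a clearer message than a raw failure. The following is a minimal sketch, not part of the original script; `have_tool` and `have_target` are hypothetical helper names introduced here.

```shell
#!/bin/sh
# Preflight sketch: check both dependencies before profiling.
# have_tool and have_target are hypothetical helpers for this example.

# Succeeds if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the given path is an executable regular file.
have_target() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Report problems on stderr without aborting, so both checks always run.
have_tool perf || echo "perf not found on PATH" >&2
have_target ./bench/run_func || echo "./bench/run_func is missing or not executable" >&2
```

The checks only print warnings; whether to abort on a missing dependency is left to the caller.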
Diagram: Workflow of the profile Script
flowchart TD
A[Start: Run profile script] --> B[Forward command-line arguments unchanged]
B --> C[Execute perf record -g --delay 250 ./bench/run_func with args]
C --> D[perf collects call graph data during execution]
D --> E[perf record ends after run_func completes]
E --> F[Generate perf report --percent-limit 0.1]
F --> G[Display performance report to user]
G --> H[End]
Summary
- The `profile` script automates running performance profiling on the `run_func` executable.
- It captures detailed CPU usage with call graphs and produces filtered reports highlighting hotspots.
- It is a simple yet powerful utility facilitating performance optimization tasks within the project.
If you need to profile other executables or add further customization (e.g., different perf options), this script can be easily modified to accommodate those needs.
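One possible shape for such a customization is sketched below. It is an assumption-laden variant, not the project's script: `profile_target` and the environment variables `TARGET`, `DELAY_MS`, `PERCENT_LIMIT`, and `PERF` are hypothetical names introduced for illustration, with defaults matching the values the original script hard-codes.

```shell
#!/bin/sh
# Sketch of a configurable variant of the profile script.
# TARGET, DELAY_MS, PERCENT_LIMIT and PERF are hypothetical variables
# for this example; the original script hard-codes ./bench/run_func,
# 250 and 0.1 and always calls perf directly.
profile_target() {
    target=${TARGET:-./bench/run_func}
    delay_ms=${DELAY_MS:-250}
    percent_limit=${PERCENT_LIMIT:-0.1}
    perf_cmd=${PERF:-perf}

    # Record with call graphs; stop if recording fails, as -e would.
    "$perf_cmd" record -g --delay "$delay_ms" "$target" "$@" || return $?

    # Report only entries at or above the chosen overhead threshold.
    "$perf_cmd" report --percent-limit "$percent_limit"
}
```

After defining the function, setting `DELAY_MS=500` and `PERCENT_LIMIT=1` and then calling `profile_target data/citm_catalog.json.xz loads` would record with a longer delay and a coarser report filter. Pointing `PERF` at a stub command allows a dry run that prints the command lines instead of executing them.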