
Profiling Rust Applications with macOS Instruments Time Profiler

This article explains how to use the macOS Instruments Time Profiler to perform CPU‑time profiling of Rust programs, demonstrates a sample π‑calculation benchmark, shows the required Cargo configuration, walks through recording and inspecting trace files, and applies the method to diagnose performance regressions in the Rspack project.

Rare Earth Juejin Tech Community

After submitting a large PR to improve SourceMapDevToolPlugin in Rspack, the author observed a significant performance regression and decided to investigate using macOS Instruments.

Instruments, built into Xcode, provides a suite of analysis tools; the article focuses on the Time Profiler, which samples stack traces at configurable intervals (e.g., 1 ms) to reveal where CPU time is spent without heavily impacting the program.
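To make the sampling idea concrete, the following is a toy sketch of how a sampling profiler might turn captured stacks into per-function time. It is not how Instruments is implemented; the `Stack` type, the `aggregate` function, and the sample data are all hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

/// One sample: the call stack captured at a sampling tick, outermost frame first.
/// (Hypothetical representation, chosen for illustration only.)
type Stack = Vec<&'static str>;

/// Returns two tables keyed by function name:
/// - inclusive: number of samples in which the function appears anywhere on the stack
/// - self_only: number of samples in which the function is the innermost frame
fn aggregate(samples: &[Stack]) -> (HashMap<&'static str, u32>, HashMap<&'static str, u32>) {
    let mut inclusive: HashMap<&'static str, u32> = HashMap::new();
    let mut self_only: HashMap<&'static str, u32> = HashMap::new();
    for stack in samples {
        // Count each function at most once per sample, even if it recurses.
        let mut seen: Vec<&'static str> = Vec::new();
        for frame in stack {
            if !seen.contains(frame) {
                seen.push(*frame);
                *inclusive.entry(*frame).or_insert(0) += 1;
            }
        }
        if let Some(leaf) = stack.last() {
            *self_only.entry(*leaf).or_insert(0) += 1;
        }
    }
    (inclusive, self_only)
}

fn main() {
    // Made-up samples, assumed to be taken 1 ms apart.
    let samples: Vec<Stack> = vec![
        vec!["main", "calculate_pi"],
        vec!["main", "calculate_pi"],
        vec!["main", "calculate_pi"],
        vec!["main", "println"],
    ];
    let interval_ms = 1;
    let (inclusive, self_only) = aggregate(&samples);
    for name in ["main", "calculate_pi", "println"] {
        let inc = inclusive.get(&name).copied().unwrap_or(0);
        let own = self_only.get(&name).copied().unwrap_or(0);
        // Estimated time = number of samples * sampling interval.
        println!("{}: inclusive ~{} ms, self ~{} ms", name, inc * interval_ms, own * interval_ms);
    }
}
```

Because the estimate is just a sample count multiplied by the interval, very short-lived functions can be missed entirely, while long-running hotspots show up reliably.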

To generate meaningful data, a Rust benchmark that approximates π using the Leibniz series is prepared. The function is deliberately marked with #[inline(never)] so that the compiler does not inline it into main and it stays visible as its own frame in the profile:

#[inline(never)]
fn calculate_pi(iterations: u64) -> f64 {
    let mut pi: f64 = 0.0;
    let mut denominator: f64 = 1.0;
    for i in 0..iterations {
        if i % 2 == 0 {
            pi += 4.0 / denominator;
        } else {
            pi -= 4.0 / denominator;
        }
        denominator += 2.0;
    }
    pi
}

fn main() {
    let pi = calculate_pi(1_000_000_000);
    println!("Calculated Pi is: {}", pi);
}

The Cargo.toml is configured to emit debug information in release builds:

[profile.release]
debug = 1 # limited debug info (no type or variable details)
strip = false # keep symbols

Building in release mode (cargo build --release, or cargo run --release) produces the optimized binary under target/release, so the profile reflects the code that end users will execute.

To record a profile, the following command is used:

xcrun xctrace record --template 'Time Profiler' --output ./output.trace --launch -- /path/to/your/rust/project/target/release/your_binary
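If the recording step needs to be scripted from Rust (for example from a small helper binary), the same invocation can be assembled with std::process::Command. This is only a sketch: the helper name `xctrace_record` is hypothetical, the template is spelled "Time Profiler" as it appears in Instruments, and the binary path is the same placeholder used in the command above.

```rust
use std::process::Command;

/// Builds (but does not run) the `xcrun xctrace record` invocation.
/// `output` is the trace path, `binary` the program to launch under the profiler.
fn xctrace_record(output: &str, binary: &str) -> Command {
    let mut cmd = Command::new("xcrun");
    cmd.args([
        "xctrace",
        "record",
        "--template",
        "Time Profiler",
        "--output",
        output,
        "--launch",
        "--",
        binary,
    ]);
    cmd
}

fn main() {
    // Placeholder path; replace with the real release binary.
    let cmd = xctrace_record(
        "./output.trace",
        "/path/to/your/rust/project/target/release/your_binary",
    );
    // Print the command instead of running it, so this sketch works on any OS.
    // On macOS with Xcode installed, calling `.status()` on a mutable
    // `cmd` would start the recording.
    println!("{:?}", cmd);
}
```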

After the trace is generated, it can be opened with open ./output.trace, which launches Instruments and displays the sampled call tree. In the example, the calculate_pi function accounts for about 4.90 seconds (≈99.9% of total CPU time), clearly identifying it as the hotspot.
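As a rough cross-check on the profiler's figure, the same function can be timed in-process with std::time::Instant. This is a sketch, not part of the original benchmark: wall-clock time only approximates CPU time (reasonably well for a single-threaded, CPU-bound loop like this one), and the command-line argument handling and the small default iteration count are assumptions added for convenience.

```rust
use std::env;
use std::hint::black_box;
use std::time::Instant;

#[inline(never)]
fn calculate_pi(iterations: u64) -> f64 {
    let mut pi: f64 = 0.0;
    let mut denominator: f64 = 1.0;
    for i in 0..iterations {
        if i % 2 == 0 {
            pi += 4.0 / denominator;
        } else {
            pi -= 4.0 / denominator;
        }
        denominator += 2.0;
    }
    pi
}

fn main() {
    // Iteration count from the first argument; the default is kept small
    // so the sketch finishes quickly. Pass 1000000000 to mirror the article.
    let iterations: u64 = env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .unwrap_or(10_000_000);

    let start = Instant::now();
    // black_box keeps the optimizer from treating the input as a known constant.
    let pi = calculate_pi(black_box(iterations));
    let elapsed = start.elapsed();

    println!("Calculated Pi is: {}", pi);
    println!("calculate_pi took {:.2?}", elapsed);
}
```

If the wall-clock number and the Time Profiler total disagree substantially, that is usually a hint that the process was waiting (I/O, locks, scheduling) rather than computing.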

The same technique is applied to the Rspack codebase. Comparing profiles of the main branch and the development branch revealed that the process_assets_stage_dev_tooling method took 2.22 seconds on the development branch versus 1.06 seconds on main, with most of the difference coming from the source.map call inside a filter_map iteration.

The regression stemmed from unintentionally replacing a parallel iterator (par_iter from Rayon) with a sequential iter, which removed the multi-threaded execution of source.map. Restoring par_iter eliminated the slowdown and brought the benchmark numbers back to their previous level.
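The effect of such a change can be illustrated with a small standalone sketch. It does not use Rayon or any Rspack code: to stay dependency-free, std::thread::scope stands in for par_iter, and `expensive_map` is a hypothetical placeholder for a costly per-asset call like source.map.

```rust
use std::thread;
use std::time::Instant;

/// Hypothetical stand-in for an expensive per-asset call such as `source.map`.
/// Returns None for odd ids to mimic `filter_map` dropping some items.
fn expensive_map(asset: u64) -> Option<u64> {
    if asset % 2 == 1 {
        return None;
    }
    // Arbitrary busy work so that each call has a measurable cost.
    let mut acc = asset;
    for i in 0..200_000u64 {
        acc = acc.wrapping_mul(6364136223846793005).wrapping_add(i);
    }
    Some(acc)
}

/// Sequential version: the shape the code had after `par_iter` became `iter`.
fn process_sequential(assets: &[u64]) -> Vec<u64> {
    assets.iter().filter_map(|&a| expensive_map(a)).collect()
}

/// Parallel version: one scoped thread per chunk, a rough approximation of
/// what a work-stealing parallel iterator does more cleverly.
fn process_parallel(assets: &[u64], threads: usize) -> Vec<u64> {
    let threads = threads.max(1);
    let chunk_size = ((assets.len() + threads - 1) / threads).max(1);
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = assets
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || process_sequential(chunk)))
            .collect();
        // Joining in spawn order keeps results in the same order as the input.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<u64>>()
    })
}

fn main() {
    let assets: Vec<u64> = (0..2_000).collect();
    let threads = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);

    let start = Instant::now();
    let seq = process_sequential(&assets);
    println!("sequential: {:.2?}", start.elapsed());

    let start = Instant::now();
    let par = process_parallel(&assets, threads);
    println!("parallel ({} threads): {:.2?}", threads, start.elapsed());

    assert_eq!(seq, par);
}
```

Both versions produce the same output, which is why a swap like this can slip through review and tests unnoticed: only a timing comparison or a profile reveals the difference.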

In conclusion, the author reflects that while Rust offers high performance, achieving optimal results requires deep understanding of its tooling and concurrency primitives.
