Compiler Optimization Recipes for Performance Engineers
MTA
Advanced optimization techniques, profiling strategies, and transformation recipes that improve runtime performance
2nd Edition
This book provides a comprehensive guide for systems programmers and performance engineers to optimize software by bridging the gap between high-level code and modern microarchitecture. It emphasizes a data-driven, systematic workflow known as Measure, Analyze, Hypothesize, Transform, and Verify (MAHTV). By standardizing on robust benchmarking and profiling methodologies, the text demonstrates how to identify bottlenecks related to CPU pipeline stalls, memory hierarchy latencies, and branch mispredictions before applying specific optimization "recipes."
The technical core of the book explores the inner workings of compilers, specifically focusing on intermediate representations (IR) like Static Single Assignment (SSA) form and various analysis passes. Detailed chapters cover scalar optimizations—such as constant propagation, common subexpression elimination (CSE), and global value numbering (GVN)—alongside critical inter-procedural techniques like inlining and devirtualization. These chapters teach engineers how to structure code to make it "analyzable" for the compiler, ensuring that high-level abstractions do not hinder the generation of efficient machine code.
A significant portion of the book is dedicated to loop and memory optimizations, which often dominate runtime performance. It provides practical instructions for loop unrolling, interchange, fusion, tiling, and strip-mining to enhance cache locality and instruction-level parallelism. The text offers an in-depth look at vectorization (SIMD), explaining the cost models compilers use to decide between scalar and vector paths. It also addresses the complexities of modern multi-core systems, offering strategies to mitigate false sharing and optimize for Non-Uniform Memory Access (NUMA) architectures.
The final section moves beyond local code changes to address whole-program strategies and automated tuning. It covers Profile-Guided Optimization (PGO) and Link-Time Optimization (LTO) as methods to break down translation-unit barriers and utilize real-world execution data for better heuristics. The book concludes with domain-specific patterns for Machine Learning, Graphics, and DSP, alongside a guide to building automated performance guardrails in CI/CD pipelines. Through real-world case studies, the book illustrates how these diverse techniques compose to deliver substantial, measurable gains in complex software systems.
This book is specifically designed for systems programmers, performance engineers, and C++/Rust developers who need to extract maximum efficiency from high-performance codebases. It is an essential resource for those working on latency-sensitive or throughput-heavy applications such as scientific simulations, graphics engines, and high-frequency microservices. Readers who want to transition from writing merely 'correct' code to mastering the dialogue between modern compilers and complex CPU microarchitectures will find this content most beneficial.
MixCache.com
January 14, 2026
62,060 words
4 hours 21 minutes
Get unlimited access to this book + all MixCache.com books for $11.99/month
Subscribe to MTA, or purchase this book individually below.
$6.99 USD
Click to buy this ebook:
Buy Now
Full ebook will be available immediately - read online or download as a PDF file.
$5 account credit for all new MixCache.com accounts!