
I attended EuroLLVM 2026 for work, since I write static analysis tools, but also out of my passion for compiler engineering. I felt welcomed by the LLVM community and had a lot of interesting conversations with people from all around the globe, who had come together to share their enthusiasm for compilers and for LLVM.

Looking back at the talks that stayed with me the most, they share a common thread: each one is about the cost of an assumption breaking somewhere downstream. Whether the assumption lives in a test suite, in an alias query, at a function boundary, in the ISA, or in a fusion pass, the interesting work is in catching it before someone else pays for it. That is the lens through which I'd like to recap the six sessions that mattered most to me.

The Keynote: Why Testing LLVM Is Hard

Reid Kleckner's keynote on testing LLVM at scale set the tone for the conference. The framing was deceptively simple: LLVM is foundational software, programming languages are a rich API surface, and semantics are the contract. When you change something in LLVM, you don't just break a test: you potentially break the semantics that dozens of downstream consumers have silently depended on. The software testing funnel he described (local testing, premerge CI, downstream acceptance, release qualification, production) is the kind of infrastructure that only becomes visible when it's missing, and Kleckner's argument was that most of the pain in the LLVM release process comes from downstream testing happening too late and too privately.

This resonated with me from a static analysis perspective. The problem of "my change broke something I didn't know I was responsible for" is one we deal with constantly when modifying an analysis that runs across millions of lines of code. The solution in both cases is the same: move the feedback earlier, make it shared, make it cheap to run. Easier said than done.

Alias Analysis: Deep Foundations

Nikita Popov's tutorial on alias analysis was the most technically dense session I attended. The core idea is simple but the architecture behind it is not. LLVM's alias analysis infrastructure answers queries of the form "can these two memory locations overlap?", and the answer to that question gates almost every interesting mid-end transformation: Loop-Invariant Code Motion, Global Value Numbering, Dead Store Elimination, the vectorizers. Get it wrong in one direction and you miscompile; get it wrong in the other and you leave performance on the table.
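In API terms the query surface is small: a pass hands two MemoryLocations to an AAResults instance and gets back an AliasResult. Here is a minimal sketch of what a consumer looks like (the helper below is my own illustration, not code from the tutorial):

```cpp
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Hypothetical helper showing the shape of an alias query: can the
// locations touched by a store and a load overlap? A NoAlias answer is
// what licenses a pass to reorder or eliminate one of the accesses.
bool mayInterfere(AAResults &AA, StoreInst &Store, LoadInst &Load) {
  AliasResult AR = AA.alias(MemoryLocation::get(&Store),
                            MemoryLocation::get(&Load));
  return AR != AliasResult::NoAlias; // MayAlias, PartialAlias or MustAlias
}
```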

What stuck with me was the discussion of BasicAA and provenance-based reasoning. The getUnderlyingObject function peels through GEPs (GetElementPtr) and casts to find the allocation a pointer derives from; if two pointers derive from different identified objects (two separate alloca instructions, say), they can't alias. This sounds obvious until you start thinking about how capture analysis interacts with it: whether a pointer has "escaped" a scope determines what you can prove about it downstream.
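The underlying-object argument can be sketched in a few lines. This is my own simplification (real BasicAA handles far more cases, including PHIs, selects, and escaped pointers), built on two functions that do exist in the LLVM API, getUnderlyingObject and isIdentifiedObject:

```cpp
#include "llvm/Analysis/AliasAnalysis.h" // isIdentifiedObject
#include "llvm/Analysis/ValueTracking.h" // getUnderlyingObject

using namespace llvm;

// Peel GEPs and casts back to the underlying allocations; two *distinct*
// identified objects (e.g. two separate allocas) occupy disjoint storage,
// so pointers derived from them cannot alias.
bool basedOnDistinctObjects(const Value *P1, const Value *P2) {
  const Value *O1 = getUnderlyingObject(P1);
  const Value *O2 = getUnderlyingObject(P2);
  return O1 != O2 && isIdentifiedObject(O1) && isIdentifiedObject(O2);
}
```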

The derived analyses built on top, like MemorySSA and Loop Access Analysis, are where this pays off practically. MemorySSA gives you a versioned view of memory that makes it possible to reason about which writes can affect which reads without iterating over every instruction in a function. I also found the future directions Popov mentioned fascinating: encoding more aliasing information directly in the IR and better reasoning about loop-varying pointers.
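As a concrete taste of that versioned view, here is the kind of query MemorySSA makes cheap (a sketch of my own, not from the tutorial): asking which write last clobbered the memory a given load reads.

```cpp
#include "llvm/Analysis/MemorySSA.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// The walker answers from MemorySSA's def-use graph over memory versions,
// so finding the clobbering write does not require scanning every
// instruction between the store and the load.
MemoryAccess *lastClobber(MemorySSA &MSSA, LoadInst &Load) {
  return MSSA.getWalker()->getClobberingMemoryAccess(&Load);
}
```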

Floating-Point Types in MLIR

Matthias Springer's talk on floating-point types in MLIR was a good reminder that even something as fundamental as "how do you represent a float" becomes surprisingly complex at the compiler IR level. The core of LLVM's floating-point infrastructure is APFloat, a software implementation of FP arithmetic used during constant folding: whenever the compiler evaluates a floating-point expression at compile time, it goes through APFloat so that the result matches what the hardware would produce. Adding a new FP type to MLIR means defining its fltSemantics in APFloat.h, which works but isn't modular: it requires patching a core LLVM file rather than extending the system from outside.

The practical motivation is the proliferation of low-precision formats in ML workloads: fp8, bf16, block-scaled variants, and NVIDIA's own extensions exposed through the NVVM dialect. The testing insight that stayed with me was simple but useful: to verify that a new FP type behaves identically on CPU and GPU, you can use software emulation via a dedicated pass, running the same program through APFloat on CPU and comparing against GPU execution. It's the kind of cross-layer testing problem that doesn't have an obvious solution until someone points it out.
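To make APFloat concrete, here is a small sketch of my own (not from the talk) of emulating bf16 arithmetic in software, similar in spirit to what an emulation pass has to do: convert the operands into the target format's fltSemantics, then let every operation round accordingly.

```cpp
#include "llvm/ADT/APFloat.h"

using namespace llvm;

// Compute (a * b) + c entirely in software, rounding to bfloat16 at each
// step, the way constant folding would for a bf16-typed expression.
APFloat bf16MulAdd(float A, float B, float C) {
  const fltSemantics &Sem = APFloat::BFloat();
  bool LosesInfo;
  APFloat X(A), Y(B), Z(C); // start from IEEE single precision
  X.convert(Sem, APFloat::rmNearestTiesToEven, &LosesInfo);
  Y.convert(Sem, APFloat::rmNearestTiesToEven, &LosesInfo);
  Z.convert(Sem, APFloat::rmNearestTiesToEven, &LosesInfo);
  X.multiply(Y, APFloat::rmNearestTiesToEven);
  X.add(Z, APFloat::rmNearestTiesToEven);
  return X; // a bf16 value, bit-exact per the format's semantics
}
```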

Finding Injection Vulnerabilities in the Clang Static Analyzer

Dániel Krupp presented improvements to the Clang Static Analyzer's taint analysis, specifically the optin.taint.GenericTaint checker, aimed at making it viable for industrial use.

The fundamental tension he described is one I recognize: the CSA is an opportunistic bug-finding tool, not a sound verifier. Taint analysis in particular suffers from a precision problem at function boundaries. When a function can't be inlined, which happens frequently in large codebases, the analyzer loses track of writable objects and global values, and the taint property silently disappears. The talk introduced three propagation modes to give users control over this tradeoff: a Forget mode (the default, which loses taint at unknown functions), a Keep mode (which conservatively preserves taint through unknown functions), and a Spread mode. The results have been encouraging: strong numbers on the Juliet test suite, and a true-positive / false-positive ratio on a handful of real-world projects that is more than acceptable for a checker aimed primarily at vulnerability research.
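To make the boundary problem concrete, here is a minimal example of my own (the normalize helper is hypothetical, not from the talk): a classic source-to-sink flow where the interesting question is what happens to the taint at the opaque call in the middle.

```cpp
#include <cstdio>
#include <cstdlib>

// Defined in another translation unit, so the analyzer cannot inline it.
// This is exactly where the propagation modes differ: the default Forget
// mode drops the taint on `cmd` here, while Keep conservatively carries
// it through to the sink below.
void normalize(char *s);

int main() {
  char cmd[128];
  // Source: attacker-controlled input taints `cmd`.
  if (!std::fgets(cmd, sizeof(cmd), stdin))
    return 1;
  normalize(cmd);
  // Sink: executing tainted data is a command injection, the kind of
  // flow optin.taint.GenericTaint is designed to flag.
  std::system(cmd);
  return 0;
}
```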

CHERI-Enabled Architecture: Pointers That Aren't Just Integers

Owen Anderson's keynote on CHERI was the talk I least expected to find interesting, and it ended up changing how I think about pointer semantics in compiler IR.

CHERI isn't a new ISA: it's an architectural extension (instantiated today by Arm's Morello and by CHERI-RISC-V) that merges fat pointers and capabilities. A CHERI pointer refers to a range of addresses, carries encoded permissions, and has a tag bit that acts as a 1-bit dynamic type system: is this a valid, unforgeable capability? The security guarantee this enables is striking: third-party code cannot tamper with a CHERI capability even if it knows every address in the process, because the tag bit is enforced in hardware and capabilities are monotonic (you can only derive a capability with equal or fewer permissions than the one you hold).

What made this talk technically compelling for me was the LLVM implications. CHERI breaks a fundamental assumption that much of LLVM's optimization infrastructure relies on: that pointers are integers. memcpy semantics become problematic: you can't bitwise-copy a capability and have it remain valid. LSR (Loop Strength Reduction) can produce imprecise bounds that move a pointer outside its valid range. Some optimizations that are correct for integer pointers are incorrect for capabilities, and the compiler has to know which world it's in. Two ABIs are being developed, pure-capability and hybrid, each requiring different lowering from Clang. It's the kind of change that touches almost every layer of the compiler stack, which is exactly why it's interesting.
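The memcpy point is easiest to see in a toy example of my own (not from the keynote). On a conventional target the two functions below are interchangeable; on a CHERI target only the first produces a usable pointer, because the tag bit lives out-of-band and survives only capability-width loads and stores.

```cpp
#include <cstddef>

void copyAsCapability(void **dst, void **src) {
  *dst = *src; // capability load/store: the tag travels with the value
}

void copyAsBytes(void **dst, void **src) {
  auto *d = reinterpret_cast<unsigned char *>(dst);
  const auto *s = reinterpret_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < sizeof(void *); ++i)
    d[i] = s[i]; // integer stores: on CHERI the tag is cleared, so the
                 // reassembled pointer is an untagged, invalid capability
}
```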

Atlas NPU: MLIR All the Way Down

The HIVM tutorial on Huawei's Atlas NPU compiler showed how their compilation pipeline runs: PyTorch → Triton-Ascend → AscendNPU-IR → BiSheng → binary. The interesting parts are in the middle.

They've built a custom HFusion dialect in MLIR that intentionally overlaps with the Linalg dialect, and they're explicit that this is a deliberate design choice rather than an oversight: rather than upstreaming NPU-specific patterns into Linalg and fighting the abstraction in both directions, they keep a parallel dialect tuned to the constraints of the Atlas hardware. The HIVM dialect sits below it as a multilevel hardware-aware dialect with explicit control over memory layout, which is known statically at compile time. The split-mix kernel strategy they described detects which operations map to the NPU's cube units (AIC, the matrix engines) and which to its vector units (AIV, the SIMD engines), then synchronizes between the two when necessary.
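As a toy sketch of what such a dispatch could look like (every name here is hypothetical; the tutorial showed MLIR passes, not this code): classify each op by the engine it targets, and insert a synchronization point wherever execution crosses from one engine to the other.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification: matrix-like ops go to the cube engines
// (AIC), elementwise/vector ops to the SIMD engines (AIV).
enum class Engine { AIC, AIV };

struct Op {
  bool isMatmulLike; // stand-in for real pattern matching on the dialect
};

Engine classify(const Op &op) {
  return op.isMatmulLike ? Engine::AIC : Engine::AIV;
}

// Whenever consecutive ops run on different engines, the consumer must
// wait for the producer, so a barrier is required before that op.
std::vector<std::size_t> syncPoints(const std::vector<Op> &kernel) {
  std::vector<std::size_t> syncBefore;
  for (std::size_t i = 1; i < kernel.size(); ++i)
    if (classify(kernel[i]) != classify(kernel[i - 1]))
      syncBefore.push_back(i);
  return syncBefore;
}
```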

The vectorization and tiling pipeline was also instructive: identify vectorizable CFG structure, create fusion groups, assign tile sizes per fused op. What makes this tractable is the assumption that memory layout is known at compile time. GPU compilers usually can't make that assumption, and a lot of what makes GPU compilation hard follows directly from its absence.
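A toy illustration of the last step, again entirely hypothetical (real tile selection weighs many more constraints): when shapes and layout are compile-time constants, tile sizes can be computed with plain constant arithmetic, with no runtime shape probing required.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed on-chip buffer size; pick the largest row tile that fits it,
// which is a compile-time computation when the shape is static.
constexpr std::int64_t kLocalBufferBytes = 192 * 1024;

constexpr std::int64_t tileRows(std::int64_t rows, std::int64_t cols,
                                std::int64_t elemBytes) {
  std::int64_t maxRows = kLocalBufferBytes / (cols * elemBytes);
  return std::clamp<std::int64_t>(maxRows, 1, rows);
}

// For a static 4096x512 fp16 operand, the whole decision folds away.
static_assert(tileRows(4096, 512, 2) == 192);
```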


Three days in Dublin. I came back with a notebook full of things I want to understand better, a clearer sense of where the static analysis and compiler backend communities overlap, and a concrete short list of LLVM source files I want to read more carefully: MemorySSA and the GenericTaintChecker are at the top of it. That's about as good an outcome as I could have hoped for.