I. Overview


Problematic Scene

Two optimizations:

When these two are performed together, the outcome can be problematic:

Untitled

So we have to forbid one transfomation.

A WMM should have unambiguous specification, be amenable to formal reasongning, but it’s not easy to get right.

<aside> 💡 WMM Desiderata

  1. Mathematically sane (e.g., monotone)
  2. Not too weak (good for programmers)
  3. Not too strong (good for hardware)
  4. Admits optimizations (good for compilers :-) </aside>

II. Define a WMM


1. Operational

3. Basic Operational Semantics for Concurrency

Define program semantics with an abstract machine.

Untitled

2. Transformational

Define the model as a sequence of program transformations over some basic operational model, such as SC.

$$ \textsf{\textcolor{grey}{Example:} TSO = SC + WR-reordering + RaW-elimination} $$

But it’s applicable in very few cases and doesn’t work for ARM:

The ARMv8 architecture employs a weakly-ordered model of memory.

In general terms, this means that the order of memory accesses is not required to be the same as the program order for load and store operations. The processor is able to re-order memory read operations with respect to each other. Writes may also be re-ordered (for example, write combining).

As a result, hardware optimizations, such as the use of cache and write buffer, function in a way that improves the performance of the processor, which means that the required bandwidth between the processor and external memory can be reduced and the long latencies associated with such external memory accesses are hidden.

[ARM Cortex-A Series Programmer's Guide for ARMv8-A](https://developer.arm.com/documentation/den0024/a/Memory-Ordering#:~:text=The ARMv8 architecture employs a,with respect to each other.)