
Java Memory Model and Concurrent Programming: Visibility, Ordering, and Atomicity

The article explains how the Java Memory Model addresses concurrency challenges by defining visibility, ordering, and atomicity guarantees through mechanisms such as volatile, synchronized, cache coherence, memory barriers, CAS operations, and happens‑before relationships, enabling correct and portable multi‑threaded programming.

Tencent Cloud Developer

With the rapid development of hardware technology, multi-core processors have become standard in computing devices, requiring developers to master concurrent programming knowledge to fully utilize multi-core potential. However, concurrent programming is not easy; it involves many complex concepts and principles. To better understand the internal mechanisms of concurrent programming, it is necessary to deeply study memory models and their applications in concurrent programming.

This article explores the root causes of bugs in concurrent programming using the Java Memory Model and the underlying implementation principles for handling these issues.

1. Concurrency Issues - Visibility and Ordering

First, let's examine an example: two shared variables, x and y, are each assigned in a separate thread. Even after both threads have been started and joined, whether the final result is x == 2 and y == 1 is unpredictable; the shared variables may end up with several different combinations of values.
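The article's original listing is not shown here, so the following is a hypothetical reconstruction consistent with the description (the class and method names are my own): one thread writes y, the other computes x from y, and without synchronization the outcome depends on scheduling and visibility.

```java
// Hypothetical reconstruction: two shared variables assigned across two
// threads, with an unpredictable combination of final values.
public class VisibilityDemo {
    static int x = 0;
    static int y = 0;

    static int[] run() throws InterruptedException {
        x = 0;
        y = 0;
        Thread a = new Thread(() -> y = 1);       // Thread A writes y
        Thread b = new Thread(() -> x = y + 1);   // Thread B reads y, writes x
        a.start();
        b.start();
        a.join();
        b.join();
        return new int[] { x, y };                // x may be 1 or 2; y is 1
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("x=" + r[0] + ", y=" + r[1]);
    }
}
```

If Thread B happens to observe Thread A's write, x ends up as 2; if it reads a stale y of 0, x ends up as 1. After both `join()` calls the main thread is guaranteed to see y == 1, but not any particular value of x.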

The main reasons for this problem are: 1) The speed difference between processor and memory when handling shared variables. 2) Code instruction reordering caused by compiler and processor optimizations. The former causes visibility issues, while the latter causes ordering issues.

1.1 Visibility Issues Caused by Processor Cache

Due to the large speed gap between processor and memory, the processor does not directly communicate with memory. Instead, it first reads system memory data into internal caches (L1, L2, or others) before operations. Based on the principle of locality, when reading memory data, the processor reads in blocks, each called a cache line. After processing data, instead of writing directly back to memory, it first writes to the cache and marks the current cache as dirty. When the current cache is replaced, data is written back to memory—this is called the write-back strategy.

To improve efficiency, processors also use a store buffer to temporarily save data being written to memory. However, because buffer data is not immediately written back to memory, and the store buffer is only visible to its own processor, other processors cannot perceive changes to shared variables. The processor's read/write order may differ from the actual memory operation order.

Achieving visibility requires the processor to promptly write the latest value of shared variables back to memory, and for other processors to promptly read the latest value of shared variables from memory. In Java, this is achieved through the volatile keyword.
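A minimal sketch of volatile providing visibility, using a simple stop-flag (the class name StopFlag is illustrative, not from the article): without volatile, the worker thread may keep reading a cached copy of the flag and spin forever; with volatile, the main thread's write is promptly visible.

```java
public class StopFlag {
    // volatile forces reads/writes of 'running' to go through main memory,
    // so the worker observes the update promptly.
    static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait standing in for real work
            }
        });
        worker.start();
        Thread.sleep(50);
        running = false;   // volatile write: visible to the worker
        worker.join(2000); // worker exits once it observes the new value
        System.out.println("worker alive: " + worker.isAlive());
    }
}
```

Remove the volatile modifier and, on many JVMs, the worker loop is optimized into a test of a stale cached value and never terminates.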

When code that writes a volatile-modified shared variable is compiled to assembly, it includes a LOCK-prefixed instruction. On multi-core processors, the LOCK prefix does two things: 1) it writes the current processor's cache line back to system memory; 2) this write-back invalidates copies of that memory address cached by other processors. These operations are implemented through bus snooping and bus arbitration, and give rise to cache coherence protocols such as MESI.

1.2 Ordering Issues Caused by Compiler Optimizations

Reordering refers to the rearrangement of instruction sequences by compilers and processors to optimize program performance. Reordering must follow two principles:

Data Dependency: If there is data dependency between two operations, compilers and processors cannot reorder them.

// Read after write
a = 1;
b = a;
// Write after write
a = 1;
a = 2;
// Write after read
a = b;
b = 1;

As-if-serial Semantics: This gives the illusion of sequential execution—the execution result after reordering must be consistent with sequential execution.

However, data dependency and as-if-serial semantics only constrain instruction sequences executed on a single processor and operations within a single thread; dependencies between processors and between threads are not considered. Therefore, in multi-threaded programs, reordering can change program execution results when another thread depends on the order of those operations.

Preventing such harmful reordering requires memory barriers: a set of processor instructions that impose ordering restrictions on memory operations. In Java, memory barriers are exposed through the volatile keyword; the compiler inserts barriers around accesses to volatile variables, prohibiting the problematic reorderings involving them.
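As a sketch of how this plays out, the classic publication pattern below (class and method names are my own, not from the article) relies on a volatile flag: the volatile write acts as a barrier, so the plain write to data cannot be reordered past it, and a reader that sees the flag set is guaranteed to see the data.

```java
public class Publication {
    static int data = 0;
    static volatile boolean published = false; // volatile: barrier on each access

    static void writer() {
        data = 42;          // (1) ordinary write
        published = true;   // (2) volatile write: (1) cannot move after (2)
    }

    static int reader() {
        if (published) {    // volatile read
            return data;    // guaranteed to observe 42, never a stale 0
        }
        return -1;          // flag not yet set
    }
}
```

If published were a plain boolean, the writes (1) and (2) could be reordered, and a concurrent reader could see published == true while data is still 0.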

2. Concurrency Issues - Atomicity

For high-level languages like Java, a single statement is ultimately converted into multiple CPU instructions. For example, count+=1 requires at least three CPU instructions:

Instruction 1: Load the variable count from memory into CPU registers

Instruction 2: Execute +1 operation in the register

Instruction 3: Write the result back to memory

If two threads A and B execute count+=1 at the same time, both may load count (value 0) from memory, each compute count + 1 in its own register, and each write 1 back, so one update is lost and the final result is count = 1 instead of 2.

To ensure correct count results, the three processes of reading, operating, and writing must not be interrupted. This process is called an atomic operation.
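At the Java level, one way to make the read-modify-write sequence atomic is the synchronized keyword, which the article introduces later as a JMM primitive. The following is a minimal sketch (the class name Counter is illustrative):

```java
public class Counter {
    private long count = 0;

    // synchronized makes the load / add / store sequence one indivisible unit
    synchronized void increment() { count++; }
    synchronized long get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(c.get()); // always 200000 with synchronized
    }
}
```

Without the synchronized modifiers, the two threads interleave their three-instruction sequences and the printed total is usually well below 200000.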

Processors mainly use cache locking or bus locking to implement atomic operations:

Bus Locking: Lock the bus using the LOCK# signal, giving the current processor exclusive access to memory. While the bus is locked, however, other processors cannot access memory at all, even at unrelated addresses, making this approach inefficient.

Cache Locking: Relies on the cache coherence protocol (e.g., MESI). The processor locks only the cache line containing the data; the write invalidates copies of that line in other processors' caches, forcing them to re-read the updated data from memory. For data that cannot be cached or that spans multiple cache lines, bus locking is still required.

The most important instruction here is CMPXCHG. On a single-core processor CMPXCHG alone is atomic, but on multi-core processors it still needs the LOCK prefix (LOCK CMPXCHG). Using this instruction, atomic operations can be implemented through looping CAS (Compare-And-Swap).

CAS operates on a memory location with two values: the expected old value and the new value. The operation checks whether the location still holds the expected value; if it does, it swaps in the new value; if the value has changed, no swap occurs and the caller typically retries.

Java provides many atomic operation classes for CAS operations, such as AtomicBoolean, AtomicInteger, AtomicLong, etc.
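A minimal sketch of the spin-CAS loop described above, built on AtomicInteger.compareAndSet (AtomicInteger also ships a built-in incrementAndGet; the explicit loop here just mirrors the cyclic-CAS description, and the class name CasCounter is my own):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    // Classic spin-CAS loop: reread and retry until compareAndSet succeeds.
    void increment() {
        int old;
        do {
            old = count.get();                          // expected value
        } while (!count.compareAndSet(old, old + 1));   // swap only if unchanged
    }

    int get() { return count.get(); }

    public static void main(String[] args) throws InterruptedException {
        CasCounter c = new CasCounter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(c.get()); // always 200000: lock-free yet atomic
    }
}
```

Unlike the synchronized approach, no thread ever blocks: a losing thread simply observes that the value changed and retries with the fresh value.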

3. Memory Model and Happens-Before Relationships

High-level languages provide an abstract memory model to describe memory access behavior in multi-threaded environments. Without worrying about specific implementations of underlying hardware and operating systems, efficient and portable concurrent programs can be written. For Java, this memory model is the Java Memory Model (JMM).

Java Memory Model provides synchronization primitives like volatile, synchronized, and final to implement atomicity, visibility, and ordering. Another important concept is the happens-before relationship, which describes the partial ordering between operations in concurrent programming.

Java Memory Model defines main memory, local memory, and shared variable abstract relationships to determine how shared variables communicate and synchronize between threads. Local memory encompasses caches, store buffers, registers, and other hardware and compiler optimization concepts.

If thread A needs to communicate with thread B, it must go through two steps: 1) Thread A flushes updated shared variables from local memory A to main memory. 2) Thread B reads the previously updated shared variables from main memory.

Java Memory Model defines the following happens-before relationships between threads:

Program Order Rule: In a single thread, each operation happens-before any subsequent operation in that thread.

Monitor Lock Rule: An unlock operation happens-before a subsequent lock operation on the same lock.

Volatile Variable Rule: A write operation on a volatile field happens-before a subsequent read operation on the same field.

Transitivity Rule: If A happens-before B, and B happens-before C, then A happens-before C.

start() Rule: If thread A executes ThreadB.start(), then A's ThreadB.start() operation happens-before any operation in thread B.

join() Rule: If thread A executes ThreadB.join() and returns successfully, then any operation in thread B happens-before thread A successfully returning from ThreadB.join().
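The start() and join() rules together mean a parent and child thread can safely exchange data through a plain field, with no volatile or locking, as this sketch shows (names are illustrative):

```java
public class HappensBefore {
    static int shared = 0; // plain field: no volatile or locking needed here

    static int run() throws InterruptedException {
        shared = 1;                            // happens-before t.start() below
        Thread t = new Thread(() -> shared++); // child is guaranteed to see 1
        t.start();                             // start() rule
        t.join();                              // join() rule: child's write visible
        return shared;                         // guaranteed 2
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(HappensBefore.run());
    }
}
```

Program order, the start() rule, program order in the child, the join() rule, and transitivity chain together to make the result deterministic, even though `shared` is an ordinary field.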

4. Memory Model Overview

This article provides a comprehensive overview of the Java Memory Model. JMM is part of the Java Virtual Machine specification, providing an abstract memory model for Java developers to describe memory access behavior in multi-threaded environments.

Java Memory Model focuses on atomicity, visibility, and ordering issues in concurrent programming and provides a series of synchronization primitives (such as volatile, synchronized, etc.) to implement these principles. Additionally, it defines happens-before relationships to describe partial ordering between operations, ensuring correctness and consistency of memory access.

The main advantage of Java Memory Model is that it provides a foundation for concurrent programming, simplifying complexity. It shields differences between processors, presents a consistent memory model across different processor platforms, and allows certain performance optimizations. These advantages make it easier for Java developers to write correct, efficient, and portable concurrent programs.
