Non-blocking Michael-Scott queue algorithm
Alexey Fyodorov
JUG.ru Group
What is this talk about?
• Programming
• Algorithms
• Concurrency
Are you sure you need it?
• For concurrency beginners: Sorry. Please go to another room.
• For non-blocking programming beginners: a short introduction
• For advanced concurrent programmers: CAS-based queue algorithm
You have another room!
12:10
• Non-blocking Michael-Scott queue algorithm (Alexey Fyodorov)
• Easily scale enterprise applications using distributed data grids (Ondrej Mihaly)
Main Models
• Shared Memory: write + read. Similar to how we program it. Concurrent Programming.
• Messaging: send + onReceive. Similar to how real hardware works. Distributed Programming.
Advantages of Parallelism
• Resource utilization: utilization of several cores/CPUs, aka PERFORMANCE
• Simplicity: complexity goes to magic frameworks
  • ArrayBlockingQueue
  • ConcurrentHashMap
  • Akka
• Async handling: responsive services, responsive UI
Disadvantages of Locking
• Deadlocks
• Priority Inversion
• Reliability
  • What happens if the lock owner dies?
• Performance
  • The scheduler can preempt the lock owner while it holds the lock
  • No parallelism inside a critical section!
Amdahl’s Law
α: non-parallelizable part of the computation
1-α: parallelizable part of the computation
p: number of threads
$$S = \frac{1}{\alpha + \frac{1 - \alpha}{p}}$$
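To get a feel for the formula, here is a minimal sketch that evaluates the speedup S for a few thread counts. The value α = 0.1 and the class and method names are assumptions chosen for illustration; they are not from the talk.

```java
class AmdahlDemo {
    // Speedup according to Amdahl's Law: S = 1 / (alpha + (1 - alpha) / p)
    static double speedup(double alpha, int p) {
        return 1.0 / (alpha + (1.0 - alpha) / p);
    }

    public static void main(String[] args) {
        double alpha = 0.1; // assumed: 10% of the work cannot be parallelized
        for (int p : new int[] {1, 2, 4, 8, 16, 1024}) {
            System.out.printf("p = %4d  S = %.2f%n", p, speedup(alpha, p));
        }
    }
}
```

As p grows, S approaches 1/α, so with α = 0.1 the speedup stays below 10 no matter how many threads are added.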
If-Modify-Write
volatile int value = 0;
if (value == 0) {
    value = 42;
}
Can we run it in a multithreaded environment?
No atomicity: another thread can change value between the check (value == 0) and the write (value = 42).
Compare-And-Set
int value = 0;
LOCK
if (value == 0) {
    value = 42;
}
UNLOCK
Introducing a Magic Operation
int value = 0;
value.compareAndSet(0, 42); // pseudocode: atomically set value to 42 only if it is still 0
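A Java int has no compareAndSet method, so the line above is best read as pseudocode. A minimal runnable sketch of the same idea, assuming java.util.concurrent.atomic.AtomicInteger as the holder (the class and method names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasDemo {
    // Atomically replaces 0 with 42; returns true only if the replacement happened.
    static boolean setIfZero(AtomicInteger value) {
        return value.compareAndSet(0, 42);
    }

    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(0);
        boolean first = setIfZero(value);  // value was 0, so it becomes 42
        boolean second = setIfZero(value); // value is 42, not 0, so nothing changes
        System.out.println(first + " " + second + " " + value.get()); // true false 42
    }
}
```

The return value tells the caller whether its assumption about the current value still held, which is what the retry loops later in the talk rely on.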
Simulated CAS
long value;

synchronized long get() {
    return value;
}

synchronized long compareAndSwap(long expected, long newValue) {
    long oldValue = value;
    if (oldValue == expected) {
        value = newValue;
    }
    return oldValue;
}

synchronized boolean compareAndSet(long expected, long newValue) {
    return expected == compareAndSwap(expected, newValue);
}
Compare and Swap: Hardware Support
• compare-and-swap (CAS): cmpxchg (x86)
• load-link / store-conditional (LL/SC): ldrex/strex (ARM), lwarx/stwcx (PowerPC)
Atomics in JDK (java.util.concurrent.atomic)
AtomicReference
• ref.get()
• ref.compareAndSet(v1, v2)
• ...
AtomicLong
• i.get()
• i.compareAndSet(42, 43)
• i.incrementAndGet()
• i.getAndAdd(5)
• ...
Example. Atomic Counter
AtomicLong value = new AtomicLong();

long get() {
    return value.get();
}

void increment() {
    long v;
    do {
        v = value.get();
    } while (!value.compareAndSet(v, v + 1));
}
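The counter above can be exercised from several threads. The following is a minimal sketch; the wrapper class, the countWith helper and the thread counts are assumptions added for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

class AtomicCounterDemo {
    static final class Counter {
        private final AtomicLong value = new AtomicLong();

        long get() {
            return value.get();
        }

        void increment() {
            long v;
            do {
                v = value.get();
            } while (!value.compareAndSet(v, v + 1)); // retry if another thread got in first
        }
    }

    // Hypothetical helper: runs the given number of threads and returns the final count.
    static long countWith(int threads, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 100_000)); // 400000: no increment is lost
    }
}
```

Each failed compareAndSet means some other thread's increment succeeded, so the loop only repeats when the counter has moved on.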
Atomics.
Questions?
Non-blocking Guarantees
• Wait-Free: per-thread progress is guaranteed
• Lock-Free: overall progress is guaranteed
• Obstruction-Free: overall progress is guaranteed if threads don’t interfere with each other
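CAS-loop
do {
    v = value.get();
} while (!value.compareAndSet(v, v + 1));
Which guarantee does this loop give?
A. Wait-Free
B. Lock-Free
C. Obstruction-Free
Answer: B. Lock-Free*. A CAS fails only because another thread's CAS succeeded, so some thread always makes progress.
*for modern hardware supporting CAS or LL/SC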
Stack & Concurrency
class Node<E> {
    final E item;
    Node<E> next;

    Node(E item) {
        this.item = item;
    }
}
[Figure: a stack of nodes E1, E2, E3 with a top pointer. Thread 1 pushes item1 and Thread 2 pushes item2 at the same time, and both try to update top.]
We need synchronization
Non-blocking Stack
AtomicReference<Node<E>> top = new AtomicReference<>();

void push(E item) {
    Node<E> newHead = new Node<>(item);
    Node<E> oldHead;
    do {
        oldHead = top.get();
        newHead.next = oldHead; // link the new node to the current top
    } while (!top.compareAndSet(oldHead, newHead)); // retry if top has changed
}
E pop() {
    Node<E> newHead;
    Node<E> oldHead;
    do {
        oldHead = top.get();
        if (oldHead == null) return null;
        newHead = oldHead.next;
    } while (!top.compareAndSet(oldHead, newHead));
    return oldHead.item;
}
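Putting push and pop together gives a complete class; this design is often called a Treiber stack. The sketch below is self-contained, and the class name and the demo in main are assumptions added for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

class LockFreeStack<E> {
    private static final class Node<E> {
        final E item;
        Node<E> next;

        Node(E item) {
            this.item = item;
        }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    void push(E item) {
        Node<E> newHead = new Node<>(item);
        Node<E> oldHead;
        do {
            oldHead = top.get();
            newHead.next = oldHead;
        } while (!top.compareAndSet(oldHead, newHead));
    }

    // Returns null when the stack is empty.
    E pop() {
        Node<E> oldHead;
        Node<E> newHead;
        do {
            oldHead = top.get();
            if (oldHead == null) return null;
            newHead = oldHead.next;
        } while (!top.compareAndSet(oldHead, newHead));
        return oldHead.item;
    }

    public static void main(String[] args) {
        LockFreeStack<String> stack = new LockFreeStack<>();
        stack.push("E1");
        stack.push("E2");
        stack.push("E3");
        System.out.println(stack.pop()); // E3
        System.out.println(stack.pop()); // E2
        System.out.println(stack.pop()); // E1
        System.out.println(stack.pop()); // null
    }
}
```

A single compareAndSet on top is enough here because both operations change exactly one shared reference.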
Non-blocking Stack.
Questions?
Non-blocking Queue
Michael and Scott, 1996
https://www.research.ibm.com/people/m/michael/podc-1996.pdf
Threads help each other
class LinkedQueue<E> {
    static class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next;

        Node(E item, Node<E> next) {
            this.item = item;
            // Always allocate the reference, so next.compareAndSet(...) never hits null
            this.next = new AtomicReference<>(next);
        }
    }

    Node<E> dummy = new Node<>(null, null);
    AtomicReference<Node<E>> head = new AtomicReference<>(dummy);
    AtomicReference<Node<E>> tail = new AtomicReference<>(dummy);
}
void put(E item) {
    Node<E> newNode = new Node<>(item, null);
    boolean success;
    do {
        Node<E> curTail = tail.get();
        success = curTail.next.compareAndSet(null, newNode);
        tail.compareAndSet(curTail, curTail.next.get());
    } while (!success);
}
[Figure: linked list of nodes dummy, 1, 2 with the head and tail pointers]
Possible outcomes of the two CAS operations in one iteration of put (CAS is shorthand for compareAndSet):
    success = curTail.next.CAS(null, newNode); // first CAS
    tail.CAS(curTail, curTail.next.get());     // second CAS
• true / true: newNode is linked and this thread moves tail to it.
• true / false: newNode is linked, and another thread has already moved tail.
• false / false: another thread linked its own node ("another") first and tail has already moved; retry.
• false / true: another thread linked its own node ("another") first but has not moved tail yet; this thread moves tail for it (HELP), then retries.
[Figure: queue dummy, 1, 2, item with the head, tail, curTail and newNode pointers; in the last two cases a node "another" from a different thread is linked after curTail.]
Synchronization
• Blocking (lock-based): lock + unlock. Invariant: before & after
• Non-blocking (CAS-based): CAS-loop. Semi-invariant
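public void put(E item) {
    Node<E> newNode = new Node<>(item, null);
    while (true) {
        Node<E> currentTail = tail.get();
        Node<E> tailNext = currentTail.next.get();
        if (currentTail == tail.get()) { // is the snapshot still consistent?
            if (tailNext != null) {
                // tail is lagging: help the other thread by advancing it
                tail.compareAndSet(currentTail, tailNext);
            } else {
                // try to link the new node after the last node
                if (currentTail.next.compareAndSet(null, newNode)) {
                    // linked; try to advance tail (another thread may do it for us)
                    tail.compareAndSet(currentTail, newNode);
                    return;
                }
            }
        }
    }
}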
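public E poll() {
    while (true) {
        Node<E> first = head.get(); // the dummy node
        Node<E> last = tail.get();
        Node<E> next = first.next.get(); // the first real element, if any
        if (first == head.get()) { // is the snapshot still consistent?
            if (first == last) {
                if (next == null) return null; // the queue is empty
                // tail is lagging: help advance it
                tail.compareAndSet(last, next);
            } else {
                E item = next.item;
                // next becomes the new dummy node
                if (head.compareAndSet(first, next))
                    return item;
            }
        }
    }
}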
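The put and poll methods above can be assembled into one self-contained class. This is a sketch under assumptions: the names MSQueueDemo, MSQueue and fillConcurrentlyAndDrain are hypothetical, and the Node constructor wraps next in an AtomicReference so that the CAS calls never see a null reference.

```java
import java.util.concurrent.atomic.AtomicReference;

class MSQueueDemo {
    static final class MSQueue<E> {
        private static final class Node<E> {
            final E item;
            final AtomicReference<Node<E>> next;

            Node(E item, Node<E> next) {
                this.item = item;
                this.next = new AtomicReference<>(next);
            }
        }

        private final AtomicReference<Node<E>> head;
        private final AtomicReference<Node<E>> tail;

        MSQueue() {
            Node<E> dummy = new Node<>(null, null);
            head = new AtomicReference<>(dummy);
            tail = new AtomicReference<>(dummy);
        }

        void put(E item) {
            Node<E> newNode = new Node<>(item, null);
            while (true) {
                Node<E> currentTail = tail.get();
                Node<E> tailNext = currentTail.next.get();
                if (currentTail != tail.get()) continue; // stale snapshot, retry
                if (tailNext != null) {
                    tail.compareAndSet(currentTail, tailNext); // help a lagging tail
                } else if (currentTail.next.compareAndSet(null, newNode)) {
                    tail.compareAndSet(currentTail, newNode); // may fail if we were helped
                    return;
                }
            }
        }

        // Returns null when the queue is empty.
        E poll() {
            while (true) {
                Node<E> first = head.get();
                Node<E> last = tail.get();
                Node<E> next = first.next.get();
                if (first != head.get()) continue; // stale snapshot, retry
                if (first == last) {
                    if (next == null) return null;
                    tail.compareAndSet(last, next); // help a lagging tail
                } else {
                    E item = next.item;
                    if (head.compareAndSet(first, next)) return item;
                }
            }
        }
    }

    // Hypothetical helper: several producers fill the queue, then one thread drains it.
    static int fillConcurrentlyAndDrain(int producers, int perProducer) {
        MSQueue<Integer> queue = new MSQueue<>();
        Thread[] workers = new Thread[producers];
        for (int t = 0; t < producers; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) queue.put(i);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int drained = 0;
        while (queue.poll() != null) drained++;
        return drained;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrentlyAndDrain(4, 10_000)); // 40000
    }
}
```

The sketch relies on the garbage collector: a node is never reused while some thread still holds a reference to it, so the counted pointers of the original paper are not needed.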
Non-blocking Queue in JDK
ConcurrentLinkedQueue is based on the Michael-Scott queue
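In application code the JDK class is normally used directly. A minimal usage sketch follows; the class name, the helper name and the thread counts are assumptions for illustration. offer never blocks, and poll returns null when the queue is empty.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class ClqDemo {
    // Hypothetical helper: producers add elements concurrently, then the caller drains them.
    static int produceAndDrain(int producers, int perProducer) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        Thread[] workers = new Thread[producers];
        for (int t = 0; t < producers; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    queue.offer(i); // unbounded queue: always succeeds, never blocks
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int drained = 0;
        while (queue.poll() != null) { // null means the queue is empty
            drained++;
        }
        return drained;
    }

    public static void main(String[] args) {
        System.out.println(produceAndDrain(4, 10_000)); // 40000
    }
}
```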
Non-blocking algorithms. Summary
• Based on CAS-like operations
• Use the CAS-loop pattern
• Threads help one another
Non-blocking Queue.
Questions?
ArrayBlockingQueue
[Figure: backing array with slots 0, 1, 2, 3, 4, ..., N-1]
ArrayBlockingQueue.put()
void put(E e) throws InterruptedException {
    checkNotNull(e);
    final ReentrantLock lock = this.lock;
    lock.lockInterruptibly();
    try {
        while (count == items.length)
            notFull.await(); // wait until there is free space
        final Object[] items = this.items;
        items[putIndex] = e;
        if (++putIndex == items.length)
            putIndex = 0; // wrap around
        count++;
        notEmpty.signal(); // wake up one waiting consumer
    } finally {
        lock.unlock();
    }
}
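From the caller's side the lock shows up as blocking on a full queue. A minimal single-threaded sketch, assuming a capacity of 2 (the class and method names are hypothetical); offer is used to probe the full queue, because put would block at that point.

```java
import java.util.concurrent.ArrayBlockingQueue;

class BoundedQueueDemo {
    static String demo() {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(2); // assumed capacity
        try {
            queue.put("a");
            queue.put("b");
            boolean accepted = queue.offer("c"); // false: the queue is full, put("c") would block here
            String first = queue.take();         // "a": elements leave in FIFO order
            queue.put("c");                      // there is room again
            return accepted + " " + first + " " + queue.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // false a 2
    }
}
```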
Modifications
Optimistic Approach
Ladan-Mozes, Shavit, 2004, 2008
Key idea: use a doubly linked list to avoid the second CAS
http://people.csail.mit.edu/edya/publications/OptimisticFIFOQueue-journal.pdf
Baskets Queue
Hoffman, Shalev, Shavit, 2007
http://people.csail.mit.edu/shanir/publications/Baskets%20Queue.pdf
• Throughput is better
• No FIFO any more
• Usually you don’t need strong FIFO in real life
Summary
• Non-blocking algorithms are complicated
• Blocking algorithms are easier
• Correctness checking is difficult
• Difficult to support
• Sometimes non-blocking algorithms have better performance
Engineering is the art of trade-offs
Links & Books
Books
Links
• Nitsan Wakart: http://psy-lob-saw.blogspot.com/
• Aleksey Shipilev: https://shipilev.net/
• concurrency-interest mailing list: http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Q & A
