Synchronization is Coming Back, But is it the Same?

Michel RAYNAL
[email protected]
Institut Universitaire de France
IRISA, Université de Rennes, France
Dpt of Computing, Polytechnic Univ., Hong Kong

© Michel Raynal
Synchronization is coming back!

Motivation
• Synchronization is coming back, but is it the same?
• Lots of new concepts and mechanisms since locks!
• The multicore (r)evolution
• The synchronization world has changed
A recent book on the topic
Concurrent Programming: Algorithms, Principles, and Foundations
by Michel Raynal
Springer, 515 pages, 2013
ISBN: 978-3-642-32026-2
6 Parts, composed of 17 Chapters
Balance: algorithms vs foundations

PART 1: Lock-Based Synchronization
• Chapter 1: The Mutual Exclusion Problem
• Chapter 2: Solving Mutual Exclusion
• Chapter 3: Lock-Based Concurrent Objects

PART 2: On the Foundations Side: the Atomicity Concept
• Chapter 4: Atomicity: Formal Definition and Properties

PART 3: Mutex-free Synchronization
• Chapter 5: Mutex-Free Concurrent Objects
• Chapter 6: Hybrid Concurrent Objects
• Chapter 7: Wait-Free Objects from Read/Write Registers Only
• Chapter 8: Snapshot Objects from Read/Write Registers Only
• Chapter 9: Renaming Objects from Read/Write Registers Only

PART 4: The Transactional Memory Approach
• Chapter 10: Transactional Memory

PART 5: On the Foundations Side: from Safe Bits to Atomic Registers
• Chapter 11: Safe, Regular, and Atomic Read/Write Registers
• Chapter 12: From Safe Bits to Atomic Bits: a Lower Bound and an Optimal Construction
• Chapter 13: Bounded Constructions of Atomic b-Valued Registers

PART 6: On the Foundations Side: the Computability Power of Concurrent Objects
• Chapter 14: Universality of Consensus
• Chapter 15: The Case of Unreliable Base Objects
• Chapter 16: Consensus Numbers and the Consensus Hierarchy
• Chapter 17: The Alpha(s) and Omega of Consensus

Summary
• Computing Model: Processes and Concurrent Objects
• Base read/write model, more powerful models
• On the safety side: Linearizability
• On the liveness side: Progress conditions
• Mutex-free concurrent objects
• Computability power, consensus number, etc.
• Hybrid concurrent objects
• Conclusion
Part I
MODEL and DEFINITIONS

Asynchronous process model (1)
• Computing model: n sequential processes Π = p1, . . . , pn (Turing machines)
• Timing model: Asynchrony: no upper bound on the time required to execute a computation step
• Failure model
  ⋆ No failure
  ⋆ Crash failure

Asynchronous process model (2)
• Process crash: a process behaves according to its specification until it possibly crashes, i.e., halts prematurely (after it has crashed a process is definitely stopped)
• Terminology wrt a run:
  ⋆ A Correct process is a pi that never crashes
  ⋆ A Faulty process is a pi that crashes

Asynchronous Base Communication Model (1)
Shared memory provides processes with
• Reliable Read/Write atomic registers
• Reliable Compare&Swap atomic registers
• Atomic means that:
  ⋆ The operations on a register appear as if they have been executed sequentially
  ⋆ Each operation appears as being executed instantaneously at some point of the time line between its start event and its end event
Atomic register
[Figure: the operations WRITE: 1, WRITE: 2, WRITE: 3 and READ: 1, READ: 2, READ: 2 issued on an atomic register, each placed at a point of the real time line; the value of the register is successively 1, 2, 3]
- Lamport L., On Interprocess Communication (Part 1: Basic Formalism, Part 2: Algorithms), Distributed Computing, 1(2):77-101, 1986
- Herlihy M.P. and Wing J.M., Linearizability: a Correctness Condition for Concurrent Objects. ACM Trans. on Programming Languages and Systems, 12(3):463-492, 1990

Other base (hardware) operations
• Test&set
• Fetch&add
• Compare&swap
• LL/SC
• Swap, Mem-to-Mem SWAP
• etc.
Structural view (Base level)
[Figure: processes p1, . . . , pi, . . . , pn, each with its own local memory, on top of an asynchronous shared memory]
Asynchronous shared memory: Abstraction (could be shared disks, SAN)

Part II
On the SAFETY side

SAFETY PROPERTY: LINEARIZABILITY
- Herlihy M.P. and Wing J.M., Linearizability: a Correctness Condition for Concurrent Objects. ACM TOPLAS, 12(3):463-492, 1990
- Lamport L., On Interprocess Communication (Part 1: Basic Formalism, Part 2: Algorithms), Distributed Computing, 1(2):77-101, 1986

Concurrent Object: example
An object accessed by concurrent processes
[Figure: processes p1, . . . , pi, . . . , pn invoke Enqueue(v) and r ← Dequeue() on a shared queue]
Sequential vs Concurrent (1)
SEQUENTIAL: [Figure: Enq(a), Enq(c), Enq(b), Deq(a), Deq(c) executed one after the other]
CONCURRENT: [Figure: p1 executes Enq(a), Enq(b), Deq(a|b|c)?; p2 executes Enq(c), Deq(a|b|c)?]

History (Run, Execution)
[Figure: p1 executes Enq(a), Enq(b), Deq(a|b|c)?; p2 executes Enq(c), Deq(a|b|c)?; the events Inv(Enq(b)) and Res(Enq(b)) are marked on the line of p1]
• History H: sequence of events
• A history defines a partial order on the operations
Two types of Concurrent Objects
• A concurrent object encapsulates a particular synchro problem
• Concurrent objects with a sequential specification
  Queue, file, graph, tree, stack, ...
  ⋆ Fifo queue: producer/consumer problem
  ⋆ File (register): Readers/writers problem
• Concurrent objects with a non-sequential specification
  Rendezvous Object, Interactive Consistency, Non-Blocking Atomic Commit Object, ...

Example: Interactive consistency
Each process pi proposes a value vi and has to decide a vector such that:
• Validity. Let D[1..n] be the vector decided by a process.
  ∀i ∈ [1..n] : D[i] ∈ {vi, ⊥} and D[i] = vi if pi is correct
• Agreement. No two processes decide different vectors
• Termination. Every correct process decides
Linearizability
An execution (or “history”) H is linearizable if the operations issued by the processes appear as if they have been executed in some sequential order that respects:
• The seq order in each process
• The seq specification of the object
• The real-time order of the non-overlapping operations

Sequential vs Concurrent (2)
[Figure: p1 executes Enq(a), Enq(b), Deq(a|b|c)?; p2 executes Enq(c), Deq(a|b|c)?; below, the sequential order Enq(a), Enq(c), Enq(b)]

Sequential vs Concurrent (3)
[Figure: the same history of p1 and p2, with the labels Deq(b), Deq(a) and the sequential order Enq(c), Enq(a), Deq(a), Enq(b), Deq(c)]

Locality of the Linearizability property (1)
• A property P of a concurrent system is Local if the system as a whole satisfies P whenever each individual object satisfies P
  Locality means that, given two objects X and Y, each satisfying the property P, the composite object [X, Y] satisfies the property P
• Linearizability is a local property (consistency criterion)
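The three conditions of the linearizability definition can be checked by brute force on a tiny queue history. The sketch below is illustrative only: the tuple encoding (start, end, kind, value) and the function names are assumptions, not notation from the slides. Since the operations of a single process never overlap, the real-time test also enforces the order in each process.

```python
from collections import deque
from itertools import permutations


def respects_real_time(order):
    """False if an operation is placed before one that ended before it started."""
    for i, (start_i, _end_i, _kind, _value) in enumerate(order):
        for (_start_j, end_j, _kind_j, _value_j) in order[i + 1:]:
            if end_j < start_i:
                return False
    return True


def respects_queue_spec(order):
    """Replay the candidate sequential order on a FIFO queue."""
    queue = deque()
    for _start, _end, kind, value in order:
        if kind == "enq":
            queue.append(value)
        elif not queue or queue.popleft() != value:
            return False
    return True


def is_linearizable(history):
    """Try every total order: exponential, only meant for very small histories."""
    return any(respects_real_time(order) and respects_queue_spec(order)
               for order in permutations(history))


# p1: Enq(a) then Enq(b); p2: Enq(c), overlapping both of them.
base = [(0, 2, "enq", "a"), (3, 4, "enq", "b"), (1, 5, "enq", "c")]
print(is_linearizable(base + [(6, 7, "deq", "a")]))
print(is_linearizable(base + [(6, 7, "deq", "b")]))
```

In this history a dequeue may return a or c, but not b, because Enq(a) terminates before Enq(b) starts.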
Locality of the Linearizability property (2)
• Theorem: A history H is linearizable iff, for each object X, H restricted to the operations on X is linearizable
  Locality means that, given two linearizable objects X and Y, the composite object [X, Y] is linearizable
• It follows that each object can be implemented independently from the others (this is a fundamental property from both theoretical and practical points of view)
• Sequential consistency and some forms of serializability are not local properties

Sequential Consistency
• An execution (or “history”) H is sequentially consistent if the operations issued by the processes appear as if they have been executed in some sequential order that respects each process order and the seq specification of every object
[Figure: one process executes Q.Enq(a); the other process executes Q.Enq(b) then Q.Deq(b)]
A “witness” seq history: Q.Enq(b), Q.Enq(a), Q.Deq(b)

Sequential Consistency is not Local
[Figure: one process executes Q.Enq(a), Q′.Enq(b′), Q.Deq(b); the other process executes Q′.Enq(a′), Q.Enq(b), Q′.Deq(b′)]
• Q and Q′ are sequentially consistent
• The whole history H is not sequentially consistent
  Impossibility to produce a “witness” sequential history
• See distributed caches

Part III
On the LIVENESS side
PROGRESS CONDITIONS
Two cases
• Reliable systems:
  ⋆ Lock-based implementations
  ⋆ Progress conditions
    ∗ Deadlock-freedom
    ∗ Starvation-freedom
• Crash-prone systems:
  ⋆ Mutex-free implementations
  ⋆ Associated progress conditions

MUTEX-FREEDOM
• No code is protected by a critical section
[Figure: O.op1() by p1, O.op2(b) by p2 and O.op1() by p3 concurrently access the shared registers R1, R2, R3]

Abortable object
• If an operation executes alone (concurrency-free context), it terminates and returns a correct value
• If an operation executes in a concurrency context, it terminates and returns a correct value or it returns ⊥ and has no effect on the object
This definition is different from the one used by Aguilera M.K., Frolund S., Hadzilacos V., Horn S.L. and Toueg S., Abortable and Query-abortable Objects and their Implementations. PODC’07, pp. 23-32, 2007

Obstruction-freedom
• If an operation executes alone (concurrency-free context), it terminates and returns a correct value
• If an operation executes in a concurrency context,
  ⋆ it may never terminate
  ⋆ if it terminates, its result is correct
Abortability is a stronger notion than obstruction-freedom

Non-blocking
• In a concurrency context, at least one operation terminates
• In a failure-free system: Non-blocking = Deadlock-free

Starvation-free
• Any operation invocation terminates
• Failure-prone systems where t processes may crash: called t-resilient systems
• Systems where t = n − 1: called wait-free systems
• Difference between wait-freedom and starvation-freedom is failure-free systems

Hierarchy of progress conditions in failure-prone systems
• Obstruction-freedom ≺ Non-blocking ≺ Wait-freedom
• Abortable ≺ Wait-freedom

Part IV
Simple mutex-free implementations of concurrent objects
Introductory example: the SPLITTER
[Figure: x processes enter the splitter; ≤ 1 process exits with stop, ≤ x − 1 processes exit with right, ≤ x − 1 processes exit with down]
Lamport 1987, Anderson-Moir 1995.

Splitter: Specification
• At most one process exits with stop
• At most (x − 1) processes exit with right
• At most (x − 1) processes exit with down

Splitter: WAIT-FREE Implementation
Two shared registers: LAST (init: any value) and DOOR (init open)

procedure direction()   % issued by pi %
  LAST ← idi;
  if DOOR = closed then movei ← right
  else DOOR ← closed;
       if (LAST = idi) then movei ← stop else movei ← down endif
  endif;
  return (movei)

Splitter: Proof (very easy)
[Figure: time line of pi: LAST ← i, then DOOR ← closed, then the read LAST = i; between the write and the read of LAST by pi, no process pj has modified the atomic register LAST]
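The splitter can be exercised under explicit interleavings with a small simulation. This is a sketch under stated assumptions: atomic registers are modelled as plain attributes, each `yield` marks a point where another process may take steps, and the names `run` and `two_process_spec_holds` are hypothetical helpers, not part of the slides.

```python
from itertools import permutations


class Splitter:
    def __init__(self):
        self.last = None        # LAST: its initial value is irrelevant
        self.door_open = True   # DOOR, initially open

    def direction(self, pid):
        """Generator version of direction(): one shared access per step."""
        self.last = pid                     # LAST <- id_i
        yield
        door_was_open = self.door_open      # read DOOR
        yield
        if not door_was_open:
            return "right"
        self.door_open = False              # DOOR <- closed
        yield
        return "stop" if self.last == pid else "down"


def run(schedule):
    """Run a fresh splitter; `schedule` lists which pid takes the next step."""
    splitter, running, moves = Splitter(), {}, {}
    for pid in schedule:
        if pid in moves:
            continue                        # this process has already returned
        gen = running.setdefault(pid, splitter.direction(pid))
        try:
            next(gen)
        except StopIteration as finished:
            moves[pid] = finished.value
    return moves


def two_process_spec_holds():
    """Check the specification (x = 2) on every interleaving of two processes."""
    for schedule in set(permutations([1] * 4 + [2] * 4)):
        moves = list(run(schedule).values())
        if len(moves) != 2 or any(moves.count(m) > 1
                                  for m in ("stop", "right", "down")):
            return False
    return True


print(run([1, 1, 1, 1]))            # a process that runs alone
print(run([1, 2, 1, 2, 1, 2, 1, 2]))  # two processes in lock step
```

A process needs at most four steps here, which is why each pid appears four times in a schedule.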
An OBSTRUCTION-FREE timestamp object: definition
• Validity. No two invocations of get_timestamp() return the same value
• Consistency. Let gt1() and gt2() be two distinct invocations of get_timestamp(). If gt1() returns before gt2() starts, the timestamp returned by gt2() is greater than the one returned by gt1()
• Termination. Obstruction-freedom

An obstruction-free timestamp object: base objects
• NEXT: next integer timestamp value (init to 1)
• LAST: unbounded array of atomic registers. A process pi deposits its index i in LAST[k] to indicate it is trying to obtain the timestamp k
• COMP: unbounded array of atomic Boolean registers (all init to false). A process pi sets COMP[k] to true to indicate that it is competing for the timestamp k (several processes can write true into COMP[k])

An obstruction-free timestamp object: implementation
operation get_timestamp(i) is
  k ← NEXT;
  repeat forever
    LAST[k] ← i;
    if (¬COMP[k]) then
      COMP[k] ← true;
      if (LAST[k] = i) then NEXT ← NEXT + 1; return(k) end if
    end if;
    k ← k + 1
  end repeat
end operation.
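The timestamp algorithm above can be traced with a step-by-step simulation. This is only a sketch: the unbounded arrays are modelled with `defaultdict`, every `yield` separates two shared-memory accesses, and `run_alone` and `run_round_robin` are hypothetical drivers introduced for the illustration.

```python
from collections import defaultdict


class TimestampObject:
    def __init__(self):
        self.next = 1                            # NEXT, init to 1
        self.last = defaultdict(lambda: None)    # LAST[k]
        self.comp = defaultdict(bool)            # COMP[k], all init to False

    def get_timestamp(self, i):
        k = self.next                            # read NEXT
        yield
        while True:
            self.last[k] = i                     # LAST[k] <- i
            yield
            competing = self.comp[k]             # read COMP[k]
            yield
            if not competing:
                self.comp[k] = True              # COMP[k] <- true
                yield
                owner = self.last[k]             # read LAST[k]
                yield
                if owner == i:
                    current = self.next          # NEXT <- NEXT + 1 is a read...
                    yield
                    self.next = current + 1      # ...followed by a write
                    return k
            k += 1


def run_alone(gen):
    """Drive one invocation to completion without interference."""
    while True:
        try:
            next(gen)
        except StopIteration as finished:
            return finished.value


def run_round_robin(gens, max_steps=10_000):
    """Give each pending invocation one step per round."""
    results = {}
    for _ in range(max_steps):
        for pid, gen in list(gens.items()):
            try:
                next(gen)
            except StopIteration as finished:
                results[pid] = finished.value
                del gens[pid]
        if not gens:
            break
    return results


ts = TimestampObject()
print(run_alone(ts.get_timestamp(1)), run_alone(ts.get_timestamp(2)))
```

The step bound in `run_round_robin` is there because obstruction-freedom alone does not promise termination under contention.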
A wait-free stack: model and base objects
• Read/write model enriched with Fetch&add and swap operations
• Base atomic objects
  ⋆ REG[0..∞): array of atomic registers which contains the elements of the stack (init to ⊥); REG[0] is a sentinel
  ⋆ NEXT: atomic register that contains the index of the next entry where a value can be deposited (init to 1)

A wait-free stack: push algorithm
operation push(v) is
  in ← NEXT.fetch&add() − 1;
  REG[in] ← v;
  return()
end operation.

A wait-free stack: pop algorithm
operation pop() is
  last ← NEXT − 1;
  for x from last to 0 do
    aux ← REG[x].swap(⊥);
    if (aux ≠ ⊥) then return(aux) end if
  end for;
  return(empty)
end operation.

wait-freedom vs bounded wait-freedom
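A sequential model of the push and pop algorithms above helps to check the index arithmetic. The sketch assumes that fetch&add adds 1 and returns the new value (with NEXT initialized to 1, the first push then writes REG[1] and leaves the sentinel REG[0] untouched). Splitting push into `reserve` and `deposit` is a hypothetical device to emulate a process that is delayed between its two steps; ⊥ is modelled by `None`.

```python
EMPTY = "empty"


class WaitFreeStack:
    def __init__(self):
        self.reg = {}       # REG[0..inf): missing entries read as None (bottom)
        self.next = 1       # NEXT, init to 1

    def _fetch_and_add(self):
        """Assumed atomic: adds 1 to NEXT and returns the new value."""
        self.next += 1
        return self.next

    def _swap(self, x, new):
        """Assumed atomic: writes `new` into REG[x], returns the old content."""
        old = self.reg.get(x)
        self.reg[x] = new
        return old

    def reserve(self):
        return self._fetch_and_add() - 1    # in <- NEXT.fetch&add() - 1

    def deposit(self, slot, v):
        self.reg[slot] = v                  # REG[in] <- v

    def push(self, v):
        self.deposit(self.reserve(), v)

    def pop(self):
        last = self.next - 1
        for x in range(last, -1, -1):       # scan from last down to 0
            aux = self._swap(x, None)
            if aux is not None:
                return aux
        return EMPTY


s = WaitFreeStack()
slot = s.reserve()          # p1 reserves an entry, then is delayed
s.push("b")                 # p2 pushes b entirely
print(s.pop())              # the pop skips nothing yet: b is on top
s.deposit(slot, "a")        # p1 finally deposits a
print(s.pop(), s.pop())
```

A pop scans at most NEXT entries, so its cost grows with the number of pushes issued so far, which is the distinction the slide draws between wait-freedom and bounded wait-freedom.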
Part V
Boosting IMPLEMENTATIONS

Boosting progress with contention managers
• From Obstruction-freedom to wait-freedom
  Failure detector: ✸P (eventually perfect)
• From non-blocking to wait-freedom
  Failure detector: ΩX

Boosting progress with contention managers
[Figure: the operations operation1(), . . . , operationm() of an obstruction-free implementation invoke need_help() and stop_help() on a failure detector-based contention manager]

From an obstruction-free to a non-blocking timestamp
operation get_timestamp(i) is
  k ← NEXT;
  repeat forever
    LAST[k] ← i;
    if (¬COMP[k]) then
      COMP[k] ← true;
      if (LAST[k] = i) then NEXT ← NEXT + 1; CM.stop_help(i); return(k) end if
    end if;
    k ← k + 1; CM.need_help(i)
  end repeat
end operation.

From obstruction-freedom to non-blocking
operation CM.need_help(i) is
  NEED_HELP[i] ← true;
  repeat x ← {j | NEED_HELP[j]} until (ev_leader(x) = i) end repeat;
  return()
end operation.

operation CM.stop_help(i) is
  NEED_HELP[i] ← false;
  return()
end operation.

Part VI
Read/write-based WAIT-FREE IMPLEMENTATIONS
The (Static) Renaming Problem
• id1, . . . , idn are the initial identities of the processes
• id1, . . . , idn ∈ [0..N − 1], where N >> n
• Each process has to acquire a new name in the set [0..M − 1]
• No two processes can get the same name
• Type of renaming:
  ⋆ Static renaming: a name is acquired once for all
  ⋆ Dynamic renaming: names are (acquired; released)∗
• Theoretical result: M ≥ 2n − 1 (Herlihy-Shavit)

Moir-Anderson’s Protocol
• Considers M = n(n + 1)/2
• Basic idea: a grid of splitters

Grid of Splitters
[Figure: triangular grid of splitters for n = 5, rows indexed by di (down) and columns by ri (right), both from 0 to 4; the splitters carry the names
  row 0:  0  1  2  3  4
  row 1:  5  6  7  8
  row 2:  9 10 11
  row 3: 12 13
  row 4: 14]

The wait-free algorithm
operation get_name(idi)
  di ← 0; ri ← 0; movei ← down;
  while (movei ≠ stop) do
    movei ← splitter[di, ri].direction();
    case (movei = right) then ri ← ri + 1
         (movei = down)  then di ← di + 1
         (movei = stop)  then exit loop
    end case
  end while
  return (n × di + ri − (di(di − 1)/2))
  % the new name is the position [di, ri] in the grid %
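The grid-based renaming can be simulated end to end. The sketch below makes several assumptions that are not in the slides: registers are plain attributes, every `yield` is a scheduling point, the processes are scheduled round-robin, and `grid_name` and `rename` are hypothetical helper names. The splitter is re-declared so that the block is self-contained.

```python
from collections import defaultdict


class Splitter:
    def __init__(self):
        self.last = None
        self.door_open = True

    def direction(self, pid):
        self.last = pid                     # LAST <- id_i
        yield
        door_was_open = self.door_open      # read DOOR
        yield
        if not door_was_open:
            return "right"
        self.door_open = False              # DOOR <- closed
        yield
        return "stop" if self.last == pid else "down"


def grid_name(n, d, r):
    """Name attached to position [d, r]: n*d + r - d(d-1)/2."""
    return n * d + r - d * (d - 1) // 2


def get_name(pid, n, grid):
    d = r = 0
    while True:
        move = yield from grid[d, r].direction(pid)
        if move == "right":
            r += 1
        elif move == "down":
            d += 1
        else:                               # stop: the position is owned
            return grid_name(n, d, r)


def rename(pids):
    """Run all processes round-robin, one shared-memory step at a time."""
    n, grid = len(pids), defaultdict(Splitter)
    running = {p: get_name(p, n, grid) for p in pids}
    names = {}
    while running:
        for p in list(running):
            try:
                next(running[p])
            except StopIteration as finished:
                names[p] = finished.value
                del running[p]
    return names


print([grid_name(5, d, r) for d in range(5) for r in range(5 - d)])
print(rename([1017, 42, 99999, 7, 256]))
```

The first line enumerates the grid row by row; with n = 5 it yields the names 0 to 14, i.e. M = n(n + 1)/2 = 15 names.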
A snapshot Object
[Figure: an array SM[1], . . . , SM[i], . . . , SM[n]; update(v) by pi writes SM[i]; snapshot() by pj, ∀j, reads the whole array]

Snapshot operations
• Keeps data provided by processes
• When pi invokes update(v) it defines v as its last deposited value
• A process invokes snapshot() to get the values deposited by the processes
• Everything has to appear as if the operations were executed instantaneously (at some time between their invocation and their termination)

Underlying idea
operation update(v)
  sni ← sni + 1;      % local seq number generator %
  SM[i] ← (v, sni)    % atomic write %

operation snapshot
  while true do
    Ai ← scan; Bi ← scan;   % double “asynchronous” scan %
    if (∀j : Ai[j].sn = Bi[j].sn) then return (Ai.val) end if
    % Ai.val = [Ai[1].val, . . . , Ai[n].val] %
  end while

Snapshot: Partial proof (easy)
[Figure: time line of a snapshot() operation on REG[1], . . . , REG[4]. The first scan() obtains aai[1].sn = a = SM[1].sn, aai[2].sn = b, aai[3].sn = c, aai[4].sn = d. The second scan() obtains bbi[1].sn = a, bbi[2].sn = b, bbi[3].sn = c, bbi[4].sn = d = SM[4].sn. The linearization point of the snapshot() operation is marked on the time line]
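The double scan can be observed with a simulation in which an update slips in between the two scans. This is a sketch under assumptions: processes are indexed from 0, `scan` reads one register per step (each `yield` is a scheduling point), and `drive` is a hypothetical helper that runs an invocation to completion.

```python
class Snapshot:
    def __init__(self, n):
        self.sm = [(None, 0)] * n       # SM[i] = (value, sequence number)
        self.sn = [0] * n               # sn_i, local to process p_i

    def update(self, i, v):
        self.sn[i] += 1                 # local seq number generator
        self.sm[i] = (v, self.sn[i])    # one atomic write

    def scan(self):
        """Read SM[0..n-1] asynchronously, one register per step."""
        collected = []
        for j in range(len(self.sm)):
            collected.append(self.sm[j])
            yield
        return collected

    def snapshot(self):
        while True:
            a = yield from self.scan()
            b = yield from self.scan()
            if all(x[1] == y[1] for x, y in zip(a, b)):
                return [x[0] for x in a]    # successful double scan


def drive(gen):
    """Run a generator-based operation until it returns."""
    while True:
        try:
            next(gen)
        except StopIteration as finished:
            return finished.value


s = Snapshot(3)
s.update(0, "x")
snap = s.snapshot()
for _ in range(3):
    next(snap)          # the first scan has read SM[0], SM[1], SM[2]
s.update(0, "z")        # concurrent update: the double scan must fail once
print(drive(snap))
```

The first pair of scans disagrees on SM[0], so the operation retries; a snapshot that keeps being disturbed this way would never return, which is why this version is only a first step toward a wait-free snapshot.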
How an update can help a snapshot
[Figure: pj executes the updates up1 and up2 while pi executes snap; snap int denotes the internal snapshot invoked by an update]

Afek et al.'s algorithm (1)
operation update(v):
  help_arrayi ← snapshot();
  sni ← sni + 1;
  SM[i] ← (v, sni, help_arrayi)

Afek et al.'s algorithm (2)
operation snapshot:
  could_helpi ← ∅;
  while true do
    Ai ← scan; Bi ← scan;   % double “asynch” collect %
    if (∀j : Ai[j].sn = Bi[j].sn) then return (Ai.val)
    else for j : 1 ≤ j ≤ n do
           if (Ai[j].sn ≠ Bi[j].sn) then
             if (j ∈ could_helpi) then return (Bi[j].help_array)
             else could_helpi ← could_helpi ∪ {j}
             end if
           end if
         end for
    end if
  end while

Snapshot: Proof
[Figure: time lines of pi, pj and pk. The snapshot() of pi overlaps two update() operations of pj; the snapshot() invoked inside the second update() of pj provides its help array; in turn it overlaps two update() operations of pk, whose internal snapshot() terminates with a successful double scan and provides a help array]
Part VII
HYBRID IMPLEMENTATIONS

Definition
An implementation of a concurrent object is
• Mutex-free if it does not use locks
• Static Hybrid if
  ⋆ It uses locks only for some operations
• Dynamic Hybrid if
  ⋆ It never uses locks in “favorable circumstances”

Example
• Static hybrid set:
  ⋆ add() and remove(): lock-based
  ⋆ belong(): no lock
• Dynamic hybrid stack/queue
  ⋆ In concurrency-free contexts: no lock and a bounded number of steps
  ⋆ In concurrency contexts: locks can be used

Hybridism vs adaptivity
• Hybridism is wrt locks and concurrency
• Adaptivity is wrt the concurrency degree (the cost has to depend on the number of competing processes only)
Part VII
A MUTEX-FREE ABORTABLE STACK

Underlying system
• Atomic read/write registers
• Compare&swap objects

Compare&Swap: definition
Compare&Swap is a conditional write

primitive X.C&S(old, new):
  if (X = old) then X ← new; return(true)
  else return(false)
  end if.

Compare&Swap: use by pi
• Let X = a
• pi reads a from register X
• It then does computation and computes a new value c for X
• Finally pi wants to assign c to X if and only if X has not been modified since it read it
  To that end pi issues X.C&S(a, c)
Compare&Swap: the ABA problem
Unfortunately the previous use of Compare&Swap is incorrect!
• Initially X = a
• At time τ1: pi reads a from X
• At time τ2 > τ1: pj successfully executes X.C&S(a, b) (X = b)
• At time τ3 > τ2: pj successfully executes X.C&S(b, a) (X = a)
• At time τ4 > τ3: pi successfully executes X.C&S(a, c) and erroneously believes that X has not been modified by another process in the interval [τ1..τ4]

Solving the ABA problem
Associate a new sequence number with every X.C&S
• X is now a pair ⟨a, sn⟩
• At time τ1: pi reads ⟨a, sn⟩ from X
• At time τ2 > τ1: pj successfully executes X.C&S(⟨a, sn⟩, ⟨b, sn + 1⟩)
• At time τ3 > τ2: pk successfully executes X.C&S(⟨b, sn + 1⟩, ⟨a, sn + 2⟩)
• At time τ4 > τ3: when pi executes X.C&S(⟨a, sn⟩, ⟨c, sn + 1⟩), the write into X fails and returns false to pi

Stack operations
• The stack is of size k
• Operation push(v)
  ⋆ returns full if the stack is full, otherwise
  ⋆ adds v to the top of the stack and returns done
• Operation pop()
  ⋆ returns empty if the stack is empty, otherwise
  ⋆ suppresses the value from the top of the stack and returns it

Stack representation (1)
• An array STACK[0..k] of atomic registers
• ∀x : 0 ≤ x ≤ k : STACK[x] has two fields
  ⋆ STACK[x].val contains a value
  ⋆ STACK[x].sn contains a seq number (used to prevent the ABA problem on this register)
    It counts the number of successful writes on STACK[x]
• ∀x : 1 ≤ x ≤ k : STACK[x] initialized to ⟨⊥, 0⟩
• STACK[0] always stores a dummy entry (init to ⟨⊥, −1⟩)
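The ABA scenario and its sequence-number fix can be replayed on a toy register. This is a minimal sketch: `CASRegister` is a hypothetical sequential model of a Compare&Swap register (real hardware makes the test and the write one atomic step), and the interleaving is written out by hand.

```python
class CASRegister:
    def __init__(self, value):
        self.value = value

    def cas(self, old, new):
        """X.C&S(old, new): conditional write, modelled sequentially."""
        if self.value == old:
            self.value = new
            return True
        return False


# Plain values: the A-B-A change goes unnoticed.
x = CASRegister("a")
seen_by_pi = x.value                # tau1: pi reads a
x.cas("a", "b")                     # tau2: pj writes b
x.cas("b", "a")                     # tau3: pj writes a back
aba_undetected = x.cas(seen_by_pi, "c")     # tau4: pi succeeds

# Pairs <value, sn>: every successful write increases the sequence number.
y = CASRegister(("a", 0))
seen_by_pi = y.value                # tau1: pi reads <a, 0>
y.cas(("a", 0), ("b", 1))           # tau2: pj
y.cas(("b", 1), ("a", 2))           # tau3: pk
aba_detected = not y.cas(seen_by_pi, ("c", 1))  # tau4: pi fails

print(aba_undetected, aba_detected)
```

With pairs, the register holds ⟨a, sn + 2⟩ at time τ4, so the comparison with the pair read at τ1 fails even though the value field is a again.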
Stack representation (2)
• A register TOP that contains the index of the top of the stack plus the corresponding pair ⟨v, sn⟩
• TOP initialized to ⟨0, ⊥, 0⟩
• Both STACK[x] and TOP are modified with Compare&Swap

Principle: laziness + helping mechanism
• A push or pop operation
  ⋆ updates TOP, and
  ⋆ leaves to the next operation the corresponding update of the stack
  Hence it helps the previous (push or pop) operation by modifying the stack accordingly
Shafiei N., Non-blocking Array-based Algorithms for Stacks and Queues. Proc. Int’l Conference on Distributed Computing and Networking (ICDCN’09), Springer Verlag LNCS #5408, pp. 55-66, 2009

Abortable stack: help procedure
procedure help(index, value, seqnb):
  stacktop ← STACK[index].val;
  STACK[index].C&S(⟨stacktop, seqnb − 1⟩, ⟨value, seqnb⟩).

Abortable push: weak_push()
operation weak_push(v):
  (index, value, seqnb) ← TOP;
  help(index, value, seqnb);
  if (index = k) then return(full) end if;
  sn_of_next ← STACK[index + 1].sn;
  newtop ← ⟨index + 1, v, sn_of_next + 1⟩;
  if TOP.C&S(⟨index, value, seqnb⟩, newtop) then return(done) else return(⊥) end if.

Abortable pop: weak_pop()
operation weak_pop():
  (index, value, seqnb) ← TOP;
  help(index, value, seqnb);
  if (index = 0) then return(empty) end if;
  belowtop ← STACK[index − 1];
  newtop ← ⟨index − 1, belowtop.val, belowtop.sn + 1⟩;
  if TOP.C&S(⟨index, value, seqnb⟩, newtop) then return(value) else return(⊥) end if.

From an abortable to a non-blocking stack
operation non_blocking_push(v):
  repeat res ← weak_push(v) until res ≠ ⊥ end repeat;
  return(res).

operation non_blocking_pop():
  repeat res ← weak_pop() until res ≠ ⊥ end repeat;
  return(res).
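The abortable stack can be transcribed almost line by line into a sequential model. The following sketch rests on assumptions: `CASRegister` is a hypothetical stand-in for a Compare&Swap register, ⊥ stored in the stack is `None`, an aborted operation returns the sentinel `ABORT`, and the `interfere` parameter is a test hook (not part of the algorithm) that lets another operation run just before the final Compare&Swap on TOP.

```python
FULL, EMPTY, DONE = "full", "empty", "done"
ABORT = object()                    # plays the role of the bottom result


class CASRegister:
    def __init__(self, value):
        self.value = value

    def cas(self, old, new):
        if self.value == old:
            self.value = new
            return True
        return False


class AbortableStack:
    def __init__(self, k):
        self.k = k
        # STACK[0] is the dummy entry <bottom, -1>; the others are <bottom, 0>.
        self.stack = [CASRegister((None, -1))] + \
                     [CASRegister((None, 0)) for _ in range(k)]
        self.top = CASRegister((0, None, 0))    # <index, value, seqnb>

    def _help(self, index, value, seqnb):
        # Lazily complete the previous operation on STACK[index].
        stacktop = self.stack[index].value[0]
        self.stack[index].cas((stacktop, seqnb - 1), (value, seqnb))

    def weak_push(self, v, interfere=None):
        top = index, value, seqnb = self.top.value
        self._help(index, value, seqnb)
        if index == self.k:
            return FULL
        sn_of_next = self.stack[index + 1].value[1]
        newtop = (index + 1, v, sn_of_next + 1)
        if interfere:
            interfere()             # hypothetical hook: a concurrent operation
        return DONE if self.top.cas(top, newtop) else ABORT

    def weak_pop(self):
        top = index, value, seqnb = self.top.value
        self._help(index, value, seqnb)
        if index == 0:
            return EMPTY
        below_val, below_sn = self.stack[index - 1].value
        newtop = (index - 1, below_val, below_sn + 1)
        return value if self.top.cas(top, newtop) else ABORT

    def non_blocking_push(self, v):
        while True:
            res = self.weak_push(v)
            if res is not ABORT:
                return res


s = AbortableStack(2)
aborted = s.weak_push("a", interfere=lambda: s.weak_push("b"))
print(aborted is ABORT, s.weak_pop())
```

In the usage lines the push of a is overtaken by the push of b, so its Compare&Swap on TOP fails and it has no effect on the stack.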
Part VIII
A Dynamic Hybrid STARVATION-FREE STACK

From a non-blocking lock to a starvation-free lock (1)
• A non-blocking lock denoted LOCK
• A boolean register FLAG[i] (initialized to false) that process pi sets to true when it wants to obtain the starvation-free lock

From a non-blocking lock to a starvation-free lock (2)
operation starvation_free_lock():
  FLAG[i] ← true;
  wait ((TURN = i) ∨ (¬FLAG[TURN]));
  LOCK.lock().

operation starvation_free_unlock():
  FLAG[i] ← false;
  if (¬FLAG[TURN]) then TURN ← (TURN mod n) + 1 end if;
  LOCK.unlock().

Operation and additional data structures
• Operations strong_push() or strong_pop(), denoted strong_push_or_pop()
• A boolean register CONTENTION (initialized to false) that a process sets to true while it executes the underlying weak_push_or_pop(par) operation with the lock held

The implementation (par = v for push() and ⊥ for pop())
operation strong_push_or_pop(par):
  if (¬CONTENTION) then
    res ← weak_push_or_pop(par);
    if (res ≠ ⊥) then return(res) end if
  end if;
  starvation_free_lock();
  CONTENTION ← true;
  repeat res ← weak_push_or_pop(par) until (res ≠ ⊥);
  CONTENTION ← false;
  starvation_free_unlock();
  return(res).

Proof
• Lemma 1: If a process pi returns from its strong_push(v) or strong_pop() invocation, it returns a non-⊥ value
• Lemma 2: If, while executing a strong_push(v) or strong_pop(), a process pi reads true from CONTENTION at the first line or obtains res = ⊥ at the 2nd line, it eventually obtains the lock
• Theorem: Any invocation of strong_push() or strong_pop() returns a non-⊥ value, and all invocations are linearizable
  Moreover, the algorithm is contention-sensitive: any operation invoked in a contention-free context is lock-free and accesses six times the shared memory
When there are failures
• The abortable and non-blocking implementations cope with any number of process crashes
• The contention-sensitive implementation is wait-free IF no process crashes between the invocation of lock() and the return of unlock()

Part IX
THE fundamental ISSUE

The fundamental problem
• Wait-free implement linearizable concurrent objects in presence of process crashes
  Compose base objects to get “more powerful” objects
  Problem: Given two objects A and X, is there a wait-free implementation of A by X in a system of n processes (prone to crashes)?
• Wait-free = live with respect to any number of process crashes

Example
• Example: Build a wait-free queue A from read/write atomic registers in a system where up to n − 1 processes can crash
[Figure: X: t-fault tolerant Compare&Swap object built from k = t + 1 base objects x_base[1], x_base[2], · · · , x_base[k]]
• Is it possible??
Herlihy’s Result
• An object is universal if it can be used to wait-free implement any object (that has a seq specification)
• Main result (Herlihy 1991): Consensus is a universal object
• Any concurrent object (that has a sequential specification) can be built from a consensus object (and read/write registers)
• A Universal construction is an algorithm that, given the sequential specification of an object, constructs a corresponding concurrent object (from consensus objects and registers)

The CONSENSUS Problem: Definition
Each process proposes a value and has to decide a value in such a way that:
• Termination: Every correct process eventually decides some value
• Validity: If a process decides v, then v was proposed by some process
• Uniform Agreement: No two (correct or not) processes decide differently

CONSENSUS in Action! (Concept)
Let X be a consensus object (a single operation Propose(v))
• Consensus is a Synchronization tool: Consensus makes a decision irreversible (only one action/value succeeds)
  If x1 ← X.Propose(a) at t1 and x2 ← X.Propose(b) at t2, we always have x1 = x2 (either a or b) whatever the invocation times t1 and t2
• Consensus solves Non-determinism: Consensus makes a result UNIQUE
  Let f be a non-deterministic function
  If x1 ← X.Propose(f(a)) and x2 ← X.Propose(f(a)) we always have x1 = x2

... But there is bad news: The Main Result
Fischer-Lynch-Paterson’s Impossibility result (1985)
There is no deterministic protocol that solves the consensus problem in asynchronous systems (shared memory or message passing) that are subject to even a single process crash failure
Fischer M.J., Lynch N.A. and Paterson M.S., Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2):374-382, 1985.
Herlihy’s Results
• Which objects allow implementing a consensus object?
• Idea: investigate the synchro power of base objects
  Each object has a consensus number
  The consensus number of X is the largest n for which X solves consensus among n processes
• FLP ⇒ CN(Read/Write atomic variables) = 1
  Which means that read/write operations are not powerful enough to solve consensus in presence of even a single crash when n > 1
• Which is the synchronization power needed to solve consensus in a SM asynchronous system?

Herlihy’s Results
• As the synchronization power of read/write atomic variables is too weak to implement consensus, are there synchronization primitives powerful enough to solve consensus in a SM asynchronous system?
• A few synchronization primitives (synchr objects):
  ⋆ Test&Set (shared) = [prev ← shared; shared ← 1; return (prev)]
  ⋆ Swap (local, shared) = [local ↔ shared]
  ⋆ Move (shared1, shared2) = [shared1 ↔ shared2]
  ⋆ Compare&Swap (shared, old, new) = [prev ← shared; if prev = old then shared ← new fi; return (prev)]

Herlihy’s Results
• Consensus numbers define a hierarchy on the power of synch primitives
  ⋆ CN(Test&Set) = CN(Swap) = CN(Stack) = CN(Fetch&Add) = CN(Fifo Queue) = . . . = 2
  ⋆ CN(Move) = CN(Compare&Swap) = +∞

Consensus with Test&Set (2 processes)
• Let shared be a shared variable init to 0
• Let prefer[0], prefer[1] be two shared variables init to ⊥

Propose (v) =   % issued by pi (i = 0 or 1) %
  prefer[i] ← v;
  val ← Test&Set (shared);
  case val = 0 then return (v)
       val = 1 then return (prefer[1 − i])
  endcase

The “winner” is the first that executes Test&Set(shared)
Consensus with a FIFO Queue (2 processes)
• Let queue be a shared queue init to < 0, 1 > (a dequeue on an empty queue returns ⊥)
• Let prefer[0], prefer[1] be two shared variables init to ⊥
Propose (v) = % issued by p_i (i = 0 or 1) %
  prefer[i] ← v;
  val ← Dequeue (queue);
  case val = 0 then return (v)
       val = 1 then return (prefer[1 − i])
  endcase
The “winner” is the first process that dequeues 0

Consensus with Compare&Swap (n processes)
• Let shared be a shared variable init to ⊥
Propose (v) = % issued by p_i (i = 1, 2, . . . , n) %
  Let first be a local variable;
  first ← Compare&Swap (shared, ⊥, v)
  if first = ⊥ then return (v) else return (first) endif
The “winner” is the first that deposits its v in shared
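The n-process Propose built on Compare&Swap can be sketched in Python in the same spirit. The names `CASRegister`, `CASConsensus`, and `BOTTOM` are hypothetical, and the lock merely simulates an atomic Compare&Swap; this is a sketch of the idea, not a production primitive.

```python
import threading

BOTTOM = object()   # sentinel standing for ⊥ (assumption: never proposed as a value)

class CASRegister:
    """Simulated Compare&Swap register; the lock emulates hardware atomicity."""
    def __init__(self, initial=BOTTOM):
        self._lock = threading.Lock()
        self._value = initial

    def compare_and_swap(self, old, new):
        # [prev <- shared; if prev = old then shared <- new; return prev]
        with self._lock:
            prev = self._value
            if prev == old:
                self._value = new
            return prev

class CASConsensus:
    """Consensus for any number of processes."""
    def __init__(self):
        self.shared = CASRegister()

    def propose(self, v):
        first = self.shared.compare_and_swap(BOTTOM, v)
        # The register both elects the winner and stores the decided value.
        return v if first is BOTTOM else first
```

Unlike the Test&Set version, no `prefer` array is needed, and the code does not depend on process identities or on n.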
An Observation
• With Test&Set or FIFO Queue: the shared register used to synchronize (determine a winner) and the shared register used to store the decided value are distinct registers
• With Compare&Swap: the shared register used to synchronize (determine a winner) and the shared register used to store the decided value are the same register

Herlihy’s Hierarchy
CN = 1: Read/Write objects
CN = 2: Test-and-Set, Swap, Queue, Stack, ...
CN = +∞: Consensus object, Comp-and-Swap, LLCS, Move, ...
What has been learnt from Herlihy’s Hierarchy
• Test&Set, Swap, Fetch&Add, etc. are synchronization primitives that are too weak to implement reliable objects in presence of process crashes
• A hierarchy on the power of synchr primitives when addressing fault-tolerance
• Additional synchr power is required to design objects tolerant to process crashes. Object combination does not always work! (e.g., atomic read/write objects do not allow the construction of more sophisticated objects)
FLP means (here) that fault-masking can be impossible to achieve when solving non-trivial problems if we do not rely on powerful enough synch primitives

Part X
A UNIVERSAL CONSTRUCTION
- Chandra T. and Toueg S., Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43(2):225-267, 1996.
- Herlihy M.P., Wait-free synchronization. ACM TOPLAS, 13(1):124-149, 1991.
- Guerraoui R. and Raynal M., Fault-Tolerance Techniques for Concurrent Objects. Tech Report # 1667, 22 pages, IRISA, Université de Rennes 1 (France), December 2004 http://www.irisa.fr/bibli/publi/pi/2004/1667/1667.html
The Specification of the Objects
• X an object of type T
• Operations: X.op(param) (always returns a response)
• The type of X is defined by a transition function δ()
  ⋆ δ(sx, op(param)) is a non-empty set of (sx′, res) pairs defining all the possible “results” we can obtain when the object X is in the state sx
  ⋆ Each pair (sx′, res) is such that sx′ is a possible new state of X, and res is the corresponding result returned by op(param)
  ⋆ If the set has only one pair, the type T is deterministic
    Otherwise, it is non-deterministic

Local data structures
• sx_i is a local variable containing the state of the object X as currently known by p_i
• next_sn_i[1..n] is a local array of sequence numbers
  next_sn_i[j] (initialized to 1) is the next sequence number that, to p_i’s knowledge, p_j will associate with its next operation on X
• prop_i (a list), exec_i (a list), result_i (an invocation result) and k_i (an integer) are auxiliary variables
Notation: exec_i[r] = r-th element of the list exec_i
          |exec_i| = size of the list exec_i
Shared Base Objects
• LAST_OP[1..n]: shared array of 1WnR atomic registers
  Only p_i can write LAST_OP[i]
  Each register LAST_OP[j] has two fields:
  ⋆ LAST_OP[j].op: last operation invoked by p_j
  ⋆ LAST_OP[j].sn: associated sequence number
  ⋆ Each entry of the array is initialized to (⊥, 0)
• A list of consensus objects CONS[k] for k = 1, 2, . . . (used sequentially by each process)
  A process invokes CONS[k].propose (v)

Universal construction (1): User interface
when operation X.op(param) is invoked by p_i:
  result_i ← ⊥;
  LAST_OP[i] ← (op(param), next_sn_i[i]);
  wait until (result_i ≠ ⊥);
  return (result_i)
Universal construction (2): Background task
while (true) do
  Step 1: Build a proposal
    prop_i ← ε; % empty list %
    for 1 ≤ j ≤ n do
      if (LAST_OP[j].sn ≥ next_sn_i[j]) then add (LAST_OP[j].op, j) to prop_i end if;
    end for;
  Step 2: Try to commit the proposal
    if (prop_i ≠ ε) then k_i ← k_i + 1; ....... end if
end while

Universal construction (3): Background task
Assume first that the type T of the object is deterministic
  exec_i ← CONS[k_i].propose (prop_i);
  let ℓ = |exec_i|;
  for r from 1 to ℓ do
    (sx_i, res) ← δ(sx_i, exec_i[r].op);
    let j = exec_i[r].proc;
    next_sn_i[j] ← next_sn_i[j] + 1;
    if (i = j) then result_i ← res end if
  end for
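The universal construction for a deterministic type can be sketched in Python as follows. Several simplifications are assumptions of this sketch, not part of the slides: the background task is folded into the invoking call (a process only helps while it has an operation pending), consensus objects are simulated with a lock, indices start at 0, and the names `Consensus`, `SharedMemory`, `Process`, and `counter_delta` are hypothetical.

```python
import threading

class Consensus:
    """One-shot consensus object; the lock stands in for a CN = +inf primitive."""
    def __init__(self):
        self._lock = threading.Lock()
        self._decided = None

    def propose(self, v):
        with self._lock:
            if self._decided is None:       # proposals are non-empty lists, never None
                self._decided = v
            return self._decided

class SharedMemory:
    """LAST_OP[0..n-1] and the consensus objects CONS[1], CONS[2], ..."""
    def __init__(self, n):
        self.last_op = [(None, 0)] * n      # each entry is (op, sn), initially (⊥, 0)
        self._cons = {}
        self._guard = threading.Lock()      # protects only the lazy creation of CONS[k]

    def cons(self, k):
        with self._guard:
            return self._cons.setdefault(k, Consensus())

class Process:
    def __init__(self, i, shm, n, initial_state, delta):
        self.i, self.shm, self.delta = i, shm, delta
        self.sx = initial_state             # sx_i: local copy of the object state
        self.next_sn = [1] * n              # next_sn_i[j]
        self.k = 0                          # k_i: last consensus instance used

    def invoke(self, op):
        self.shm.last_op[self.i] = (op, self.next_sn[self.i])
        while True:
            # Step 1: the proposal holds every operation seen as pending (helping).
            prop = [(o, j) for j, (o, sn) in enumerate(self.shm.last_op)
                    if sn >= self.next_sn[j]]
            # Step 2: try to commit it; prop contains at least our own operation.
            self.k += 1
            decided = self.shm.cons(self.k).propose(prop)
            result, done = None, False
            for o, j in decided:            # apply the decided list in order
                self.sx, res = self.delta(self.sx, o)
                self.next_sn[j] += 1
                if j == self.i:
                    result, done = res, True
            if done:
                return result

def counter_delta(state, amount):
    """Deterministic transition function of a counter: returns (new state, result)."""
    new_state = state + amount
    return new_state, new_state
```

As a usage example, with `shm = SharedMemory(2)` and two `Process` objects sharing it, a first `invoke(1)` by process 0 returns 1 and a later `invoke(1)` by process 1 returns 2: process 1 first replays the decided list of CONS[1] before its own operation is committed in CONS[2].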
The case of non-deterministic types
• Brute force strategy:
  Consider a deterministic reduction δ′() of δ()
• A nicer solution:
  Use the consensus invocation to solve ordering + single result of each operation + single resulting state

Correctness proof: The object is live and safe
• Wait-free property:
  Show that each operation invoked by a correct process terminates whatever the behavior of the other processes
  This is the liveness property stating that the implementation ensures the operations do terminate
• Linearizability (semantics of the object):
  From an external observer point of view, the operations on the concurrent object occur as if the object was accessed sequentially by the processes
  This is the safety property stating that the implementation of the object is linearizable
A Variant
“Simplified” version of the construction where the shared array LAST_OP[1..n] and the seq numbers are suppressed, prop_i is a simple variable (init. to ⊥) and δ(sx_i, ⊥) = sx_i

when operation X.op(param) is invoked by p_i:
  result_i ← ⊥; prop_i ← op(param);
  wait until (result_i ≠ ⊥); return (result_i)

while (true) do % Background Task %
  k_i ← k_i + 1;
  exec_i ← CONS[k_i].propose (prop_i);
  (sx_i, res) ← δ(sx_i, exec_i.op);
  let j = exec_i.proc;
  if (i = j) then result_i ← res; prop_i ← ⊥ end if
end while

Wait-free vs Non-Blocking
• The previous construction:
  ⋆ Satisfies the non-blocking property: if processes propose operations, at least one process progresses
  ⋆ Does not satisfy the wait-free property: the progress of a correct process cannot be ensured
• The universal construction algorithm ensures the wait-free property thanks to a helping mechanism
  The shared array LAST_OP[1..n] (and the associated seq numbers) allows a process to propose to the consensus instances not only its own operations but all the pending operations
  Helping mechanisms: feature of wait-free computing
Part XI
WHEN OBJECTS CAN FAIL
- Jayanti P., Chandra T.D. and Toueg S., Fault-Tolerant Wait-Free Shared Objects. Journal of the ACM, 45(3):451-500, 1998
- Guerraoui R. and Raynal M., Fault-Tolerance Techniques for Concurrent Objects. Tech Report # 1667, 22 pages, IRISA, Université de Rennes 1 (France), December 2004 http://www.irisa.fr/bibli/publi/pi/2004/1667/1667.html

From Process Failures to Object Failures
• Until now: we have considered that only processes may fail, objects (atomic registers) were implicitly assumed reliable
• From now on: base objects can fail
Object Failure Modes
• Crash failure of an object: after some time all operations answer ⊥ (responsive failure mode)
• Omission failure of an object: after some time the answers to some processes are always ⊥
• Arbitrary failure of an object: the answers can be arbitrary

Failure Mode Hierarchy
• An implementation of a shared object is t-fault tolerant with respect to a failure mode F (crash, omission, arbitrary) if the object remains correct and wait-free despite the failure, according to F, of up to t base objects
• Failure mode F is less severe than the failure mode G (denoted F ≺ G) if a protocol that is t-fault tolerant for the failure mode G is also t-fault tolerant for the failure mode F
• crashes ≺ omissions ≺ arbitrary failures
State Machine Replication
• Replication management using the State Machine approach (Thomas, Lamport-Schneider’s approach): concurrent RPC-like + votes
• Replication + Sequential iteration on replicas
  The type of control structure applied to replicas becomes decisive (concurrent = independent vs sequential = dependent)

A Wait-Free t-Omission Tolerant Implementation
t + 1 copies of the base object: base_cons[1..(t + 1)]
Sequential traversal of the base consensus objects
procedure PROPOSE (v)
  % est, aux, k: local variables of the invoking process %
  est ← v;
  for k from 1 to (t + 1) do
    aux ← base_cons[k].propose (est);
    if (aux ≠ ⊥) then est ← aux endif
  endfor;
  return (est)
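The sequential traversal can be sketched in Python as follows. The failure model is an assumption of this sketch: a faulty `BaseConsensus` object answers ⊥ (here `None`) to the processes listed in the hypothetical parameter `omits_for`, and an omitted invocation has no effect on the object. The explicit `pid` argument exists only to model these selective omissions.

```python
import threading

BOTTOM = None   # stands for ⊥ (assumption: None is never proposed as a value)

class BaseConsensus:
    """Base consensus object that may commit omission failures toward some processes."""
    def __init__(self, omits_for=()):
        self._lock = threading.Lock()
        self._decided = BOTTOM
        self._omits_for = set(omits_for)    # processes that always receive ⊥

    def propose(self, pid, v):
        if pid in self._omits_for:
            return BOTTOM                   # omission: no answer, no effect (assumption)
        with self._lock:
            if self._decided is BOTTOM:
                self._decided = v
            return self._decided

def propose(base_cons, pid, v):
    """Wait-free t-omission tolerant PROPOSE over t + 1 base consensus objects."""
    est = v
    for obj in base_cons:                   # sequential traversal, k = 1 .. t + 1
        aux = obj.propose(pid, est)
        if aux is not BOTTOM:
            est = aux                       # adopt the value decided by this base object
    return est
```

Since at most t of the t + 1 base objects fail, every process goes through at least one correct object and leaves it with the same estimate; later objects can then only confirm that estimate or answer ⊥.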
Graceful Degradation
• A wait-free implementation of a shared object is gracefully degrading if it never fails more severely than the base objects it is derived from, whatever the number of base objects that fail
• Remark: the “severity” relation on failure modes (≺) involves only the existence of a fault-tolerant protocol (it does not involve the notion of graceful degradation)

Graceful Degradation: Example
• Let us assume that base objects can fail by crashing. If the implementation remains wait-free and correct despite the crash of any number of processes and the crash of up to t base objects, then this implementation is t-fault tolerant with respect to the crash failure mode
• If the implementation is wait-free and fails only by crash (if it fails) despite the crash of any number of processes and the crash of any number of base objects, then it is gracefully degrading
A Gracefully Degrading t-Omission Tolerant Impl.
2t + 1 copies of the base object: base_cons[1..(2t + 1)]
procedure PROPOSE (v)
  % V[1..2t + 1], est, k: local variables of the invoking process %
  est ← v;
  for k from 1 to (2t + 1) do
    V[k] ← base_cons[k].propose (est);
    if (V[k] ≠ ⊥) ∧ (V[k] ≠ est) then
      est ← V[k]; V[1..(k − 1)] ← [⊥, . . . , ⊥]
    endif
  endfor;
  if (#⊥(V) > t) then return (⊥) else return (est) endif

Impossibility for the Crash Failure Mode
• It is impossible to design gracefully degrading t-fault tolerant wait-free implementations for the crash failure mode (for any object)
• It is possible to design gracefully degrading t-fault tolerant wait-free implementations for the omission (or arbitrary) failure mode
• Hence (Jayanti-Chandra-Toueg 1998): Combining fault-tolerance and graceful degradation is not possible for all failure modes
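For the omission failure mode, the gracefully degrading PROPOSE procedure over 2t + 1 base consensus objects given in this part can be sketched in Python as follows. The failure model is an assumption of this sketch: a faulty base object answers ⊥ (`None`) to the processes in the hypothetical parameter `omits_for`, and an omitted invocation has no effect. The function name `propose_gd` is also hypothetical.

```python
import threading

BOTTOM = None   # stands for ⊥ (assumption: None is never proposed as a value)

class BaseConsensus:
    """Base consensus object that may commit omission failures toward some processes."""
    def __init__(self, omits_for=()):
        self._lock = threading.Lock()
        self._decided = BOTTOM
        self._omits_for = set(omits_for)

    def propose(self, pid, v):
        if pid in self._omits_for:
            return BOTTOM                   # omission: no answer, no effect (assumption)
        with self._lock:
            if self._decided is BOTTOM:
                self._decided = v
            return self._decided

def propose_gd(base_cons, t, pid, v):
    """Gracefully degrading PROPOSE; base_cons is expected to hold 2t + 1 objects."""
    V = [BOTTOM] * len(base_cons)
    est = v
    for k, obj in enumerate(base_cons):
        V[k] = obj.propose(pid, est)
        if V[k] is not BOTTOM and V[k] != est:
            est = V[k]                      # adopt the new estimate ...
            for r in range(k):
                V[r] = BOTTOM               # ... and discard the answers that supported the old one
    # More than t missing answers: fail by omission instead of returning a doubtful value.
    if V.count(BOTTOM) > t:
        return BOTTOM
    return est
```

When more than t base objects fail, the procedure can answer ⊥, which is itself an omission: the implementation does not fail more severely than its base objects.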
Conclusion
• Linearizability
• Wait-free computing
• “Wait-Free” concept
• Consensus number, Herlihy Hierarchy
• Wait-free objects
• Universal construction
• Process failure vs object failures
• Object failure modes
• Fault-tolerant wait-free objects
• Gracefully degrading wait-free objects
• t-Resilient wait-free objects
• Gracefully degrading objects
Three books
• Taubenfeld G., Synchronization algorithms and concurrent programming. Pearson Education/Prentice Hall, 423 pages, 2006 (ISBN 0-131-97259-6)
• Herlihy M. and Shavit N., The art of multiprocessor programming. Morgan Kaufmann, 508 pages, 2008 (ISBN 978-0-12-370591-4)
• Raynal M., Concurrent programming: algorithms, principles and foundations. Springer, 510 pages, November 2012 (ISBN 978-3-642-32026-2)

The other “only slide to remember”!
Asynchrony and failures do modify our view of synchronization