ETH Zurich Systems Group Prof. G. Alonso

12 May - 15 May Data Modeling and Databases Exercise Sheet 11

Transactions - Solution 1

Read Only Transactions

Transactions are used to manage parallel changes to a database. In cases where all applications only read the database, transactions can be omitted. However, if some applications make changes to the database, can transactions still be omitted for read-only applications? If not, why? Can you give some examples?

Solution In general, read operations are not in conflict. Therefore, in read-only application there is no need for synchronization (i.e. the history is always serializable). However, a write operation conflicts with all read operations that read the same data object. Hence, not all such histories are serializable and therefore transactions are required.

2

2PL and Snapshot Isolation

Let T1 , T2 , T3 , T4 be transactions that operate on objects A, B, C, D, E. Now consider the following histories: H1 H2 H3

H4

r1 (A), r2 (B), r3 (B), r3 (C), w2 (A), r2 (D), r1 (A), w1 (B), w2 (D), r1 (A), w2 (C), w2 (B), r2 (B), w3 (B), r2 (B), c2 , w4 (C), r4 (C), w4 (A), c1 , c4 , c3 r1 (A), r2 (C), w3 (D), w1 (A), r1 (D), w2 (A), r2 (B), r2 (C), w2 (B), w3 (C), r2 (A), w1 (B), r1 (B), r3 (D), w1 (B), c1 , r3 (B), c2 , c3 r1 (E), r2 (B), r2 (A), w2 (B), w2 (A), w1 (B), r2 (D), r2 (E), r3 (E), r2 (A), r2 (C), w2 (A), w2 (D), r1 (A), w2 (C), w1 (A), r1 (C), r2 (E), r3 (D), r1 (A), w3 (D), w1 (A), r3 (A), w1 (C), r3 (A), w1 (B), r3 (C), r3 (B), r3 (C), w3 (A), c1 , c2 , c3 r3 (A), r2 (C), r1 (B), w1 (A), r1 (C), r2 (A), a1 , w2 (C), c2 , r3 (C), c3

Now for each of the above histories, answer the following questions: 1. Is the history serializable? 2. If yes, what is the equivalent serial history? (i.e. sequential history with no transaction overlap in time) 3. Is it possible to execute the given history using 2PL? How does 2PL behave? 4. How does snapshot isolation behave? If the history is accepted, are there any inconsistencies?

1

Solution H1 1. The history is not serializable. The subsequence r1 (A), w2 (A) indicates T1 → T2 , however r2 (B), w1 (B) indicates T2 → T1 . Hence, there is a cycle in the dependency graph. 3.

• 2PL: Non-serializable histories cannot be generated by 2PL. • SI: The history could not have been generated by Snapshot Isolation; both T1 and T2 write to B and T2 commits after the beginning of transaction T1 , but before the commit of T1 .

H2 1. The history is not serializable. The subsequence w1 (A), . . . , w2 (A) indicates T1 → T2 whereas the subsequence r2 (B), . . . , w1 (B) indicates T2 → T1 . Hence, there is a cycle in the dependency graph. 3.

• 2PL: Non-serializable histories cannot be generated by 2PL. • SI: The history could not have been generated by Snapshot Isoluation; T2 needs to be aborted, since both T1 and T2 write A and B and T1 commits after the beginning, but before the commit of transaction T2 .

H3 1. The history is serializable. The corresponding dependency graph is:

T1 T2

T3

2. Hence, an equivalent serial history is: T2 → T1 → T3 . 3.

• SI: The history could not have been generated by Snapshot Isolation: Both T1 and T2 write A and B and T1 committed after the beginning of transaction, but before the commit of T2 . • 2PL: The history could have been generated by 2PL. In the following, LS (A) and LX (A) denote the attempt of acquiring a lock on data object A for shared and exclusive access, respectively; US (A) and UX (A) denote the release of the shared and exclusive lock:

2

T1 BOT LS (E), r1 (E)

T2

T3

BOT LX (A), LX (B), LX (C), LX (D), LS (E) r2 (B), r2 (A), w2 (B), w2 (A), UX (B)

LX (B), w1 (B) r2 (D), r2 (E)

BOT LS (E), r3 (E)

r2 (A), r2 (C), w2 (A), w2 (D) UX (A), UX (D) LX (A), r1 (A) w2 (C), UX (C) LX (C), w1 (A), r1 (C) r2 (E), US (E) LX (D), r3 (D) r1 (A) w3 (D) w1 (A), UX (A) LX (A), r3 (A) w1 (C), U X(C) r3 (A) w1 (B), UX (B) LS (B), LS (C) r3 (C), r3 (B), r3 (C), w3 (A) UX (A), UX (D), US (B), US (C), US (E) Commit Commit Commit H4 1. The history is not serializable: The subsequence r3 (A), w1 (A), r2 (A) indicates the dependency T3 → T1 → T2 and the subsequence r2 (C), r1 (C), w2 (C), c2 , r3 (C) indicates the dependency T1 → T2 → T3 . The resulting dependency graph thus already contains a cycle, however it is not complete. An abort of a transaction is similar to a write on all object which have been written by the aborted transaction. Hence, since T1 was aborted, the history H4 can be rewritten by replacing a1 by w1 (A): H4′ :

r3 (A), r2 (C), r1 (B), w1 (A), r1 (C), r2 (A), w1 (A), w2 (C), c2 , r3 (C), c3

Now, the subsequence r2 (A), w1 (A) indicates an additional dependency T2 → T1 , resulting in the following dependency graph:

T2 T1 3.

T3

• 2PL: The history is not serializable and thus could not have been generated by 2PL. • SI: The history can be generated by SI. T1 is aborted, so it can be neglected. T3 is read-only, therefore T2 and T3 do not conflict.

3

3

Serializability reloaded

Is the following history serializable? If yes, what is the equivalent serial history? If it is not serializable, indicate why. 1: 2: 3: 4: 5:

T1 T2 T2 T1 T1

select sum(balance) from account; insert into account(no, balance) values (4711, 0); commit select avg(balance) from account; commit

Solution Two histories are equivalent if • all read-operations of committed transactions return the same result, and • at the end, the state of the database is the same. The above history is equivalent to a serial history in which T2 is executed before T1 : the database state is the same for both histories and the read operations (select sum and select avg) return the same result (the newly inserted value is 0 so it’s neutral with respect to sum). Although the above history and the serial history T2, T1 do not order the conflicts in the same way (the sum on the accounts and the insert are not in the same order in both), they are equivalent. This makes the condition in the lecture’s lemma on histories equivalence sufficient but not necessary.

4

ACID

Why is it reasonable to check the integrity constraints at the end of a transaction? Which if the ACIDproperties is required for this? Solution The integrity constraints are only checked at the end of a transaction, because during the lifetime of a transaction the database could be in an inconsistent state. A transaction should transform a consistent database state to another consistent database state (the answer to the question is therefore Consistency).

5

ACID II

The following is a true story. On May 30th 2009, D.A.K. flew from Zurich to San Fransisco via Munich. At the Munich airport, there are electronic barriers equipped with bar-code readers. At the gate, the passengers insert their boarding pass into the bar-code reader, the barrier opens, the passenger can pass and the barrier closes. Unfortunately, just when D.A.K. inserted his boarding pass, the electronic barrier broke down. Hence, he tried again at a different barrier, only to find out that he was rejected because the same boarding pass had allegedly already been used. Nonetheless, the staff let D.A.K. enter the airplane. Later on, when D.A.K. was already seated in the airplane, enjoying a cool drink, it was announced that the take-off was delayed for 30 minutes: Some passenger’s luggage had to be removed from the airplane, because he or she apparently had not boarded the airplane. It was only in San Fransisco that D.A.K. realized that it was actually his luggage that was removed from the airplane. This only happened because an important principle of transaction management was violated. Which one is it? (The answer comprises only one word).

4

Solution Atomicity: the last part of the transaction (passing the barrier) was not executed but still the transaction was committed to the database.

6

2PL vs. Snapshot Isolation

Give an example of a history that is accepted by Snapshot Isolation but rejected by 2PL. Is your history serializable? Does Snapshot Isolation lead to inconsistencies? Solution History H4 from exercise 3 is an example; Snapshot Isolation cannot generate inconsistencies in the given history because after reading C (the only object that could generate inconsistencies because it is concurrently written by T2 ), T3 makes no changes to the database. Another example: T1

T2 r2 (A)

r1 (B) w1 (A) w2 (B) This history could not have been generated by 2PL, because it leads to a deadlock. T1 waits on A and T2 waits on B. However, the history could have been generated by Snapshot Isolation, since the two transactions write to different objects. Inconsistencies could arise if, for example, in transaction T2 the value written to B depends on the value read for A (A was changed in the meantime by T1 ).

5