Theorem 2: A schedule S is conflict-serializable iff the corresponding precedence graph of schedule S does not contain any cycles. (Proof)
How to prove serializability?
Based on Theorems 1 and 2:
acyclic precedence graph => conflict-serializable => serializable
Thus, one way to prove that a system is serializable is to show that the precedence graphs of ALL possible schedules of the system are acyclic.
Serializability of FoundationDB
We are not going to cove all details of FoundationDB in this post. If you are interested, you can refer to this earlier post. Here, we briefly review the overall mechanism that FoundationDB uses to guarantee serializability. Then, we should the proof of serializability. Note that in this post, we abstract away the details of FoundationDB such as scaling components, crash/recovery, etc., and focus on the main OCC mechanism used by FoundationDb to provide an example of how one can prove serializability. Summary of Concurrency Control used by FoundationDB
FoundationDB uses Optimistic Concurrency Control (OCC) for enforcing serializability. In this approach, we don't lock any item as the client is performing its read/write operations. Instead, the client keeps track of its reads and buffers its writes in the read and write sets, respectively. For the read set, the client keeps track of keys it read with the version number of the value that it read for each key. For the write set, the client buffers <key, value> pairs that it wants to write. Note that before starting its transaction, the client gets a read version from the database. The read version is guaranteed to be larger than any previously issued committed version. The client cannot perform a read operation on replica if the replica has not received all the versions with commit versions smaller than or equal to the read version of the transaction. Once the client is ready to commit its transaction, it submits its read and write sets to one of the proxies. The proxy gets a commit version from a special node called Master. Then, the proxy submits the read and write sets and the commit version to the resolver. The resolver keeps track of the latest version written for each key. To decide whether to commit or abort the transaction, the resolver checks the read set of the transaction. If any of the versions read by the client is overwritten by a newer version, the resolver aborts the transaction. Otherwise, the resolver commits the transaction. When the transaction commits a transaction, it updates its cache of versions by the write set of the transaction and its commit version. Specifically, for each key in the write set of the transaction, it updates the version to the commit version of the transaction. Once the proxy receives the commit decision from the resolver, it appends the write set of the transaction to the transaction log and returns success to the client. The storage nodes later pull updates from the transaction log and apply them.
Obviously, a resolver cannot keep the write versions forever, as this way the memory will be exhausted. Thus, in FoundationDB, each resolver keeps track of the written version only for a limited amount of time which is 5 (s) by default. After 5 seconds cache entries will be deleted. This cleanup requires us to abort any transaction that lives longer than 5 (s). Otherwise, the serializability may be violated. We prove serializability first without the cache cleanup and then with cache cleanup.
Proof of Serializability without Cache Cleanup
We have to show that the precedence graph of any arbitrary history H: ...opi, ..., opj, ... (where opi belongs to committed transaction Ti and opj belongs to committed transaction Tj) of the system is acyclic.
FoundationDB guarantees that it appends the write set of the transaction to the transaction log in the order with which it conflict-checks the transaction and storage nodes also apply writes in the order defined by the transaction log. If you are interested to learn how it guarantees this ordering despite having multiple proxies and resolvers you can refer to this earlier post and this conversation on FoundationDB forum. Here, we assume that. We denote Ti <conflict-checking Tj , when transaction Ti is checked by the resolver before Tj. Since conflict-checking is done sequentially, <conflict-checking is a total order.
For any two conflicting operations opi and opj in H: ...opi, ..., opj ..., such that opi and opj belong to different transactions, based on Definition 4, both operations should be accessing the same element and one of them should be a write operation. Thus, we have one of the following cases:
Case 1: opi = W(A), opj = W(A)
Since storage nodes apply transactions in the order of the transaction log, and the transactions are appended to the transaction log in the order the resolver conflict-check them, and since opi happens before opj in the history, we conclude Ti <conflict-checking Tj.
Case 2: opi = W(A), opj = R(A)
The storage servers apply a transaction only after a transaction is conflict-checked and appended to the log. Thus, opi is done after Ti is conflict-checked. Thus,
time (conflict-checking of Ti) < time (opi) (1)
Since opi happens before opj in the history
time (opi) < time (opj) (2)
As we explained above, the client performs all of its read operations before submitting and conflict-checking the transaction. Thus, the conflict-checking of transaction Tj is done after R(A) operation. Thus,
time(opj) < time (conflict-checking of Tj) (3)
from (1), (2), and (3), we conclude Ti <conflict-checking Tj.
Case 3: opi = R(A), opj = W(A)
For this case, we prove Ti <conflict-checking Tj by contradiction. Assume Tj <conflict-checking Ti :
Since opi happens before opj in H, opi could not have read the value written by opj. Since storage servers apply transactions in the order of the transaction log, opi must have read a value older than what is written by opj. By assuming Tj <conflict-checking Ti, we know the write set of Tj is already applied in the cache of the resolver when the resolver is conflict-checking opi. Thus, since opi reads a value older than what is written by opj, the resolver will abort Ti (contradiction!). Thus, the assumption of Tj <conflict-checking Ti cannot be valid. Hence, Ti <conflict-checking Tj.
In all cases, we showed for any two conflicting operations opi and opj in H: ...opi, ..., opj ..., Ti <conflict-checking Tj. Thus, in the precedence graph of H, for any edge Ti --> Tj, we have Ti <conflict-checking Tj. Since the order of conflict-checking is a total order, the precedence graph does not have any cycle. Thus, according to Theorem 1, H is conflict-serializable, and according to Theorem 2, it is serializable. Thus, we proved any arbitrary schedule H is serializable.
Proof of Serializability with Cache Cleanup and 5s Limit
The proof with cache cleanup is similar to the case without cache cleanup except for Case 3:
Case 3: opi = R(A), opj = W(A)
Like before, we prove Ti <conflict-checking Tj by contradiction. Assume Tj <conflict-checking Ti :
Since opi happens before opj in H, opi could not have read the value written by opj. Since storage servers apply transactions in the order of the transaction log, opi must have read a value older than what is written by opj. By assuming Tj <conflict-checking Ti, we know the write set of Tj is already applied in the cache of the resolver when the resolver is conflict-checking opi.
Now, we have two cases:
Case 3.1: The write-set of Tj is not yet removed from the cache at the time of conflict-checking of Ti
Since opi reads a value older than what is written by opj, and the write-set of Tj is in the cache, the resolver will abort Ti (contradiction!).
Case 3.2: The write-set of Tj is removed from the cache at the time of conflict-checking of Ti
Since the write-set of Tj is removed from the cache, we know that Tj must be older than 5s. Since Ti is committed, we know that at the time of conflict-checking, it is younger than 5s. Since Tj is older than 5s and Ti is younger than 5s, we know that the read-version of Ti must be larger than the commit version of Tj. Thus, Ti must perform its read operation (i.e. opi) after the version written by Tj is applied. Thus, opi happens after opj (contradiction!).
Thus, in both cases, the assumption of Tj <conflict-checking Ti cannot be valid. Hence, Ti <conflict-checking Tj.
Proof of Linearizability
To show
strict serializability, we have to also show that the system is linearizable. To prove linearizability we have to show that for any committed transactions T1 and T2, when T1 starts after T2 commits, T1 sees the effect of T2. Suppose T1 does not see the effect of T2. Since T1 starts after T2 commits, we know that conflict-checking of T1 is done after that of T2. With the same argument provided for Case 3 above (with/without cache cleanup), we reach a contradiction.
Summary
In this post, we took a closer look into serializability. We reviewed the definition of conflict-serializability that is stricter than serializability but is easier to verify. Specifically, using the precedence graph, we can easily check whether a schedule is conflict-serializable or not. Finally, we showed how we can prove the serializability of the OCC mechanism used by FoundationDB by showing that the precedence graph of any history of the system is acyclic. If you are interested to see how to prove conflict-serializability for 2PL as another approach for concurrency control see this. References
[1] Hector Garcia-Molina, Jeffrey Ullman, and Jennifer Widom
, "Database Systems: The Complete Book (2nd Edition)" (buy on
Amazon)
Comments
Post a Comment