Chapter 17 notes:
Transactions
What is a transaction? an executing program that forms a logical unit of database processing
A transaction needs to be fully done or fully undone and needs to be consistent, having integrity.
Why are we concerned with transactions? We have multiprogramming environments in which more than one user may be trying to access the same data, and there maybe transaction failures that need recovery.
Access Operations: read_item and write_item
read_item(X) -
1) Find the address of the disk block that contains item X
2) Copy that disk block into a buffer in main memory (if that disk block is not already in main memory)
3) Copy item X from the buffer to the program variable X
write_item(X)
1) Find the address of the disk block that contains item X
2) Copy that disk block into a buffer in main memory (if that disk block is not already in main memory)
3) Copy item X from the program variable X into its correct location in the buffer (X has been modified by the program)
4) Store the updated buffer block from the buffer back to disk (either immediately or at some later point in time)
Why do we need concurrency control?
Lost Update Problem
Temporary Update (Dirty Read) Problem
Incorrect Summary Problem
Why do we need recovery?
Types of Failures:
1) Computer failure/system crash
2) Transaction or system error
3) Local errors or exception conditions detected by transaction
4) Concurrency control enforcement
5) Disk failure
6) Physical problems/catastrophes
Transaction States -
Begin_transaction
Read or Write
End_transaction
Commit_transaction
Rollback (or abort)
How progress steps of transactions are recorded in the System Log -
[start_transaction, T1]
[write_item, T1, X, old_value, new_value]
[read_item, T1, X]
[commit, T1]
[abort, T1]
Desirable properties of Transactions -
1) Atomicity
2) Consistency preservation
3) Isolation
4) Durability or permanency
Schedules of Transactions -
Concept of conflict between 2 operations/transactions in a schedule -
2 operations are in conflict if -
1) They belong to 2 different transactions
2) They access the same data item X
3) one of those operations is a write_item
Serial, Nonserial and Conflict-Serializable Schedules -
A schedule is (conflict) serializable if it is equivalent to some serial schedule.
Using a precedence graph to test for conflict serializability -
How to construct the graph -
1) for each transaction Ti participating in schedule S, create a node Ti in the precedence graph
2) for each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an edge Ti --> Tj in the precedence graph
3) for each case in S where Tj executes a write_item(X) after Ti executes a read_item(X), create an edge Ti --> Tj in the precedence graph
4) for each case in S where Tj executes a write_item(X) after Ti executes a write_item(X), create an edge Ti --> Tj in the precedence graph
5) the schedule S is serializable if and only if the precedence graph has no cycles
Cascading Rollback - Occurs when a transaction has to be rolled back because it read an item from a transaction that failed. For e.g. - on P. 565 we have the schedule -
Se: r1(X);w1(X);r1(Y); w2(X);w1(Y);a1; a2
The a2 was forced (is a part of a cascading rollback) since T2 must also be rolled back since it read X after T1 altered it and T1 was rolled back.