Chapter 17 notes:

Transactions

 

What is a transaction?  an executing program that forms a logical unit of database processing

 

A transaction needs to be fully done or fully undone and needs to be consistent, having integrity.

 

Why are we concerned with transactions?  We have multiprogramming environments in which more than one user may be trying to access the same data, and there maybe transaction failures that need recovery.

 

Access Operations:  read_item  and write_item

 

read_item(X) -

 

1) Find the address of the disk block that contains item X

2) Copy that disk block into a buffer in main memory (if that disk block is not already in main memory)

3) Copy item X from the buffer to the program variable X

 

write_item(X)

 

1) Find the address of the disk block that contains item X

2) Copy that disk block into a buffer in main memory (if that disk block is not already in main memory)

3) Copy item X from the program variable X into its correct location in the buffer (X has been modified by the program)

4) Store the updated buffer block from the buffer back to disk (either immediately or at some later point in time)

 

Why do we need concurrency control?

 

Lost Update Problem

Temporary Update (Dirty Read) Problem

Incorrect Summary Problem

 

Why do we need recovery?

Types of Failures:

1) Computer failure/system crash

2) Transaction or system error

3) Local errors or exception conditions detected by transaction

4) Concurrency control enforcement

5) Disk failure

6) Physical problems/catastrophes

 

Transaction States -

 

Begin_transaction

Read or Write

End_transaction

Commit_transaction

Rollback (or abort)

 

How progress steps of transactions are recorded in the System Log -

 

[start_transaction, T1]

[write_item, T1, X, old_value, new_value]

[read_item, T1, X]

[commit, T1]

[abort, T1]

 

Desirable properties of Transactions -

 

1) Atomicity

2) Consistency preservation

3) Isolation

4) Durability or permanency

 

Schedules of Transactions -

 

Concept of conflict between 2 operations/transactions in a schedule -

2 operations are in conflict if -

1) They belong to 2 different transactions

2) They access the same data item X

3) one of those operations is a write_item

 

Serial, Nonserial and Conflict-Serializable Schedules -

 

A schedule is (conflict) serializable if it is equivalent to some serial schedule.

 

Using a precedence graph to test for conflict serializability -

 

How to construct the graph -

 

1) for each transaction Ti participating in schedule S, create a node Ti in the precedence graph

2) for each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an edge Ti -->  Tj in the precedence graph

3) for each case in S where Tj executes a write_item(X) after Ti executes a read_item(X), create an edge Ti -->  Tj in the precedence graph

4) for each case in S where Tj executes a write_item(X) after Ti executes a write_item(X), create an edge Ti -->  Tj in the precedence graph

5) the schedule S is serializable if and only if the precedence graph has no cycles

 

Cascading Rollback - Occurs when a transaction has to be rolled back because it read an item from a transaction that failed.  For e.g. - on P. 565 we have the schedule -

Se: r1(X);w1(X);r1(Y); w2(X);w1(Y);a1; a2

The a2 was forced (is a part of a cascading rollback) since T2 must also be rolled back since it read X after T1 altered it and T1 was rolled back.