A transaction is
a mechanism for describing logical units of database processing.
Transaction processing systems are systems with large databases and hundreds of concurrent
users. Some examples include:
Airline Reservations
Banking Systems
Credit Card Processing
Online retail Purchases
Stock Markets
Supermarket Checkouts
Single-User vs Multi-User Systems
A DBMS is single
user if only one user at a time can use the system
A DBMS is
multi-user if the system is access by multiple users at the same time (concurrently)
Concurrency is
achieved using the concept of multiprogramming. It allows the operating system to execute
multiple processes at the same time
Interleaving
Process A runs,
at some time later is switched out (or on I/O)
Process B runs,
at some point later is switched out (or on I/O)
Process A is
switched in and continues, at some point later is switched out (or on I/O)
This process
repeats for the number of processes
Hence, although
we think these processes are running at the same time, they are actually interleaving
Transactions
Transactions:
are executing programs that forms a logical unit of database processing
include one or more database access operations
can be embedded within the application program or within the SQL server
Whenever a
transaction is submitted to a DBMS for execution, the system is responsible for making sure
that all the operations in the transaction are completed successfully (COMMITTED) or
If something goes wrong the database must not be updated (ABORTED)
A transaction is
a logical unit of instructions, if the transaction fails, all changes for that transaction
must be rolled back
Transaction
States and Additional Operations:
Concurrency Control
Several Problems
can occur when concurrent transactions execute in an uncontrolled manner
The Lost Update
The Temporary Update (Dirty Read)
The Incorrect Summary
The Unrepeatable Read
ACID Properties
Transactions
should possess several properties which are called the ACID properties
Atomicity: A sequence of operations either all complete successfully or all
fail
Consistency: Data is never in an invalid state
Isolation: Each operation is independent of other concurrent operations
Durability: Data is safely stored in case of a system failure
Schedules
A scheduled is
defined as: S of n transaction T1, T2, … , Tn.
When executing
transactions concurrently in an interleaved fashion we need to create a schedule.
A shorthand
notation describing a schedule uses symbols b, r, w, e, c and a.
Example:
Conflicting
Operations in a Schedule could result in inconsistent data! Two operations in a schedule are
said to conflict if they satisfy all three conditions
1: They belong to different transactions
2: They access the same item X
3: At least one of the operations is a write
Example:
A Complete Schedule
The goal is to
establish a complete schedule, which is defined as:
The operations in S are exactly those operations in T1, T2, …
, Tn.
For any pair of operations from the same transaction, their relative order in S is
preserved
For any two conflicting operations, one of the two must occur before the other in
the schedule
The Easy Approach:
Serial Schedules which run after the other in series.
With two transactions there can only be two possible schedules using a
serial schedule.
Unacceptable in practice due to long I/O waits (no switching). A large
transaction might hog the CPU
Allowing Interleaving:
Interleaving of Transactions results in many possible schedules, here are
two examples:
Non-serial is a bit hit or miss and can give us wrong results but we want
non-serial because of concurrency.
Serialisable Schedule:
- A schedule S of n transactions is serialisable if it is equivalent to some
serial schedule of the same n transactions
There are n! possible schedules of serial transactions (many more for non
serial schedules)
Result Equivalent: In the simplest terms, compare the results of a serial
schedule to that of a non serial schedule.
The safe approach is to focus only on the read and write operations of the
transactions. Don't make assumptions about the other internal operations
included within the transactions.
Two schedules are equivalent if the operations applied to each data item
affected by the schedules should be applied to that item in both schedules
in the same order.
The most common approach to equivalence of schedules is the conflict
equivalence.
Conflict Equivalence Of Two Schedules & Serialisable
Schedules
Two schedules
are said to be conflict equivalent if the relative order of any two conflicting operations
is the same in both schedules.
Different transactions
Access the same data item
Both or one is a write operation
Not Conflicting
Equivalence is when conflicting operations are applied in different orders in two schedules.
S is
serialisable if its (conflict) equivalent to some serial schedule
Therefore, by
reordering the operations in a schedule it could become serialisable
Practice Question
Which of the following
statements about transaction processing and scheduling in database systems are correct?
End Of Week
Quiz
Note: A score of 70% or higher would mean you
have successfully completed the week 8 workshop