Explain recovery techniques with example.
Of course. Here is a detailed explanation of database recovery techniques, starting with the foundational concept of the transaction log and then moving to specific techniques with a complete example.
Database Recovery is the process of restoring a database to its last correct and consistent state after a system failure. Failures can range from a software crash or hardware malfunction to a power outage. The primary goal of recovery is to ensure the ACID properties of transactions are maintained, specifically Atomicity (all-or-nothing) and Durability (committed changes are permanent).
Recovery techniques are almost entirely dependent on one critical component: the Transaction Log (or journal). This is a special file on the disk where the DBMS writes a record of all modifications before they are written to the actual database files.
The log contains entries for each transaction, including:
Transaction start (<T1, begin>
).
The data modification itself, often including the before-image (the value before the change) and the after-image (the value after the change).
* Transaction end (<T1, commit>
or <T1, abort>
).
This log is written sequentially and is the single source of truth for what has happened in the database.
Here are some of the main techniques a DBMS uses to handle recovery.
commit
record in the log, their changes might need to be redone to ensure they are durable (in case the disk write didn't complete before the crash). This is the REDO operation.commit
record, their changes must be undone from the database to ensure atomicity. This is the UNDO operation.commit
record.<CHECKPOINT>
record to the transaction log.Let's walk through a scenario using the Immediate Modification technique with checkpoints.
We have a simple Accounts
table.
| AccountID | Holder | Balance |
| :--- | :--- | :--- |
| 101 | Alice | 1000 |
| 102 | Bob | 500 |
The following events occur in order, and each is written to the transaction log.
BEGIN TRANSACTION T1
T1: UPDATE Accounts SET Balance = 900 WHERE AccountID = 101
(Alice's balance changes from 1000 to 900).COMMIT T1
<CHECKPOINT>
- The system performs a checkpoint. At this moment, we are certain that Alice's balance of 900 is saved permanently on disk.BEGIN TRANSACTION T2
T2: UPDATE Accounts SET Balance = 600 WHERE AccountID = 102
(Bob's balance changes from 500 to 600).COMMIT T2
BEGIN TRANSACTION T3
T3: UPDATE Accounts SET Balance = 850 WHERE AccountID = 101
(Alice's balance changes from 900 to 850).(At this exact moment, a power failure occurs and the system crashes!)
When the database restarts, the recovery manager takes over.
Find the Last Checkpoint: It scans the log backward and finds the last <CHECKPOINT>
record (at step 4). It knows the database was consistent at that point.
Scan Forward from the Checkpoint: It scans the log forward from the checkpoint to identify all transactions that were active after it.
It finds Transaction T2
, which has a COMMIT
record.
It finds Transaction T3
, which has a BEGIN
record but no COMMIT
record.
Create UNDO and REDO Lists:
REDO-list: Contains all transactions that committed after the last checkpoint. -> [T2]
UNDO-list: Contains all transactions that started but did not commit. -> [T3]
Execute the REDO Phase: The recovery manager re-applies all operations for transactions in the REDO-list
.
It re-executes the operation from T2
: UPDATE Accounts SET Balance = 600 WHERE AccountID = 102
.
Purpose: This ensures the durability of T2
. Even if this change hadn't made it to the disk before the crash, it is now guaranteed to be there.
Execute the UNDO Phase: The recovery manager reverts all operations for transactions in the UNDO-list
in reverse order.
It undoes the operation from T3
by using the "before-image" from the log. It changes Alice's balance back from 850 to 900.
Purpose: This ensures the atomicity of T3
. Since it didn't complete, none of its changes should be permanent.
After the recovery process is complete, the Accounts
table is in the following state:
| AccountID | Holder | Balance |
| :--- | :--- | :--- |
| 101 | Alice | 900 |
| 102 | Bob | 600 |
The database is now correct and consistent. The committed changes from T2
are present, and the uncommitted changes from T3
have been completely removed.