Recovery algorithms are techniques to ensure database consistency, transaction atomicity, and durability despite failures.
The key primitives used in recovery algorithms are UNDO and REDO.
- UNDO: The process of removing the effects of an incomplete or aborted transaction.
- REDO: The process of reinstating the effects of a committed transaction for durability.
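The two primitives can be illustrated with a minimal sketch. It assumes a toy in-memory key-value "database" and a log that records a before-image and an after-image for each write; `LogRecord`, `undo`, `redo`, and `recover_example` are hypothetical names chosen for illustration, not part of any real DBMS.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """One logged write (hypothetical format, for illustration only)."""
    txn_id: int
    key: str
    before: Optional[int]  # value before the write; None means the key did not exist
    after: int             # value after the write


def undo(db: Dict[str, int], log: List[LogRecord], txn_id: int) -> None:
    """Remove the effects of txn_id by restoring before-images, newest first."""
    for rec in reversed(log):
        if rec.txn_id != txn_id:
            continue
        if rec.before is None:
            db.pop(rec.key, None)  # the transaction created this key, so drop it
        else:
            db[rec.key] = rec.before


def redo(db: Dict[str, int], log: List[LogRecord], txn_id: int) -> None:
    """Reinstate the effects of txn_id by reapplying after-images, oldest first."""
    for rec in log:
        if rec.txn_id == txn_id:
            db[rec.key] = rec.after


def recover_example() -> Dict[str, int]:
    log = [
        LogRecord(1, "A", 10, 20),   # T1 committed before the crash
        LogRecord(2, "B", 5, 7),     # T2 was still running at the crash
        LogRecord(2, "C", None, 1),  # T2 also created key C
    ]
    # Assumed state found after the crash: T1's write is missing,
    # while T2's uncommitted writes are present.
    db = {"A": 10, "B": 7, "C": 1}
    redo(db, log, txn_id=1)  # durability: committed work is reinstated
    undo(db, log, txn_id=2)  # atomicity: incomplete work is removed
    return db
```

In this sketch, `recover_example()` ends with `A` at T1's committed value and every trace of T2 removed. UNDO walks the log backwards so that the oldest before-image is the one restored last, while REDO replays forwards so that the newest after-image wins.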
1. Failure Classification
<aside>
💡 Storage Types
- Volatile Storage
- Non-volatile Storage
- Stable Storage → A non-existent (idealized) form of non-volatile storage that survives all possible failure scenarios.
</aside>
1.1 Transaction Failures
A transaction reaches an error condition and must be aborted.
- Logical Errors → Transaction cannot complete due to some internal error condition (e.g., integrity constraint violation).
- Internal State Errors → DBMS must terminate an active transaction due to an error condition (e.g., deadlock).
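A logical error can be observed from the application side with a small sketch. It uses Python's standard `sqlite3` module; the `accounts` table, its `CHECK` constraint, and the helper names `make_db` and `transfer` are assumptions made up for this example. SQLite is used only because it ships with Python, and the UNDO work itself happens inside the engine.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 50), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst; return False if the transaction aborted."""
    try:
        # Used as a context manager, the connection commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This statement violates the CHECK constraint when src is short of funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # Logical error: the credit to dst has already been undone by the rollback.
        return False
```

Calling `transfer(conn, "alice", "bob", 80)` on a fresh `make_db()` connection returns `False` and leaves both balances unchanged, even though the credit to `bob` had already been applied when the constraint violation occurred.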
1.2 System Failures
Unintended failures in hardware or software that must also be accounted for in crash recovery protocols.
- Software Failure → There is a problem with the DBMS implementation (e.g., uncaught divide-by-zero exception) and the system has to halt.
- Hardware Failure → The computer hosting the DBMS crashes (e.g., power plug gets pulled).
- Fail-stop Assumption: The contents of non-volatile storage are assumed not to be corrupted by a system crash.
1.3 Storage Media Failure
Non-repairable failures that occur when the physical storage device is damaged.
When the storage media fails, the DBMS must be restored from an archived version.
- Non-Repairable Hardware Failure → A head crash or similar disk failure destroys all or part of non-volatile storage.
- Destruction is assumed to be detectable.