Database Recovery
Database recovery is the process of restoring the database to a correct state after a failure, whether the failure occurs during a transaction or after a process has finished. Database systems, like any other computer system, are subject to failures, but the data stored in them must be available when required. A database must therefore provide facilities for fast recovery after a failure. It must also guarantee atomicity: either a transaction completes successfully and is committed (its effect is recorded permanently in the database), or it has no effect on the database at all.
Kinds of failures
1. Transaction failure: A transaction has to abort when it fails to execute or when it reaches a point from which it cannot proceed any further. This is known as a transaction failure, and it affects only a few transactions or processes. The reasons for transaction failure are:
- Logical errors
- System errors
Logical errors: A transaction cannot complete because of an error in its code or an internal error condition.
System errors: The database system itself terminates a transaction because the DBMS is unable to execute it due to some system condition.
2. System crash: Problems external to the system can cause it to stop abruptly and crash. For instance, an interruption in the power supply might cause the underlying hardware or software to fail. Other examples include operating system errors.
3. Disk failure: In the early days of the technology, it was a common problem that hard-disk drives or other storage drives failed frequently. Disk failures include the formation of bad sectors, an unreachable disk, a disk head crash, or any other failure that destroys all or part of disk storage.
The sources of failure are:
- Hardware or software errors that crash the system, resulting in the loss of main memory.
- Media failures, such as head crashes or unreadable media, that result in the loss of portions of secondary storage.
- Application software errors, such as logical errors in programs that access the database, which can cause one or more transactions to abort or fail.
- Natural physical disasters such as fires, floods, or earthquakes, as well as power failures.
- Careless or unintentional destruction of data or directories by operators or users.
- Intentional damage to or corruption of data (for example, using malicious software or files), hardware, or software facilities.
Whatever the reason for the failure, there are two principal things that have to be considered:
- Failure of main memory, including the database buffers.
- Failure of the disk copy of the database.
Failure Control Methods
There are two kinds of techniques that can help a database management system recover and maintain the atomicity of a transaction:
- Maintaining a log of every transaction and writing it to stable storage before actually modifying the database.
- Using shadow paging, where the changes are first made in volatile memory and the actual database is updated later.
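The shadow-paging idea can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not how any particular DBMS implements it: the class name `ShadowPagedStore` and its methods are hypothetical, and the whole table is copied instead of individual pages. The point it shows is that changes go to a working copy, and the committed state is replaced only at commit time, so an abort simply discards the working copy.

```python
class ShadowPagedStore:
    """Toy model of shadow paging (hypothetical names, for illustration only)."""

    def __init__(self, data):
        self._committed = dict(data)  # shadow copy: the last committed state
        self._working = None          # working copy used by the active transaction

    def begin(self):
        # Start a transaction by copying the committed state.
        self._working = dict(self._committed)

    def write(self, key, value):
        # Changes touch only the working copy, never the committed state.
        self._working[key] = value

    def commit(self):
        # Switching one reference makes all changes visible at once.
        self._committed = self._working
        self._working = None

    def abort(self):
        # Dropping the working copy leaves the committed state untouched.
        self._working = None

    def read(self, key):
        return self._committed[key]


store = ShadowPagedStore({"X": 100})
store.begin()
store.write("X", 150)
store.abort()            # X is still 100 in the committed state
store.begin()
store.write("X", 200)
store.commit()           # X is now 200 in the committed state
```

Because the committed state changes in a single step, a transaction is either fully applied or not applied at all, which is the atomicity property described earlier.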
Log-based recovery (Manual Recovery):
A log is a sequence of records that keeps track of the actions performed by a transaction. It is important that the log records are written before the actual modification and stored on stable storage media, which is failsafe.
Log-based recovery works as follows:
- The log file is kept on stable storage media.
- When a transaction enters the system and starts execution, it writes a log record about it.
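The "log first, modify later" rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `WriteAheadLog` class, the `update` helper, and the JSON record layout are all hypothetical, and an ordinary file flushed with `os.fsync` stands in for stable storage.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log; each record is flushed to disk before the data changes."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Write the record and force it to disk before returning.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def records(self):
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f]


def update(log, db, txn, key, new_value):
    # Log the old and new values first, then modify the in-memory data.
    log.append({"txn": txn, "op": "update", "key": key,
                "old": db.get(key), "new": new_value})
    db[key] = new_value


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
log = WriteAheadLog(log_path)
db = {"X": 100}  # hypothetical in-memory database

log.append({"txn": "T1", "op": "start"})   # transaction enters the system
update(log, db, "T1", "X", 150)            # logged before the change is applied
log.append({"txn": "T1", "op": "commit"})
```

Storing the old value alongside the new one is a design choice made here so that the same record could serve either an undo or a redo; the source does not specify the record contents.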
Recovery with concurrent transactions (Automated Recovery):
When more than one transaction is being executed in parallel, the logs are interleaved. At recovery time, it would be hard for the recovery system to backtrack through all the logs before starting to recover. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.
Checkpoint:
Keeping and maintaining logs in real time and in a real environment may fill up all the memory space available in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner:
- The recovery system reads the logs backwards from the end to the last checkpoint.
- It maintains two lists, an undo-list and a redo-list.
- If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the transaction in the redo-list.
- If the recovery system sees a log with <Tn, Start> but finds no commit or abort log, it puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. For all the transactions in the redo-list, their previous logs are removed, and the transactions are then redone before their logs are saved.
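The classification rules above can be sketched in Python. This is a minimal sketch, assuming a simplified log of `(transaction, action)` tuples with a `(None, "checkpoint")` marker; the function name `build_recovery_lists` and the record format are hypothetical. It follows the rules exactly as listed, so it does not handle a transaction that started before the checkpoint and was still running at the crash.

```python
def build_recovery_lists(log):
    """Scan the log backwards to the last checkpoint and classify transactions."""
    undo_list, redo_list = [], []
    finished = set()  # transactions already seen with a commit or abort record

    for txn, action in reversed(log):
        if action == "checkpoint":
            break  # stop at the last checkpoint
        if action == "commit":
            finished.add(txn)
            redo_list.append(txn)   # committed: must be redone
        elif action == "abort":
            finished.add(txn)       # aborted: neither undone nor redone here
        elif action == "start" and txn not in finished:
            undo_list.append(txn)   # started but never finished: must be undone

    return undo_list, redo_list


# Hypothetical interleaved log of concurrent transactions.
log = [
    ("T1", "start"),
    ("T1", "commit"),
    (None, "checkpoint"),
    ("T2", "start"),
    ("T3", "start"),
    ("T2", "commit"),
    ("T4", "start"),
    ("T4", "abort"),
]

undo_list, redo_list = build_recovery_lists(log)
print("undo:", undo_list)  # undo: ['T3']
print("redo:", redo_list)  # redo: ['T2']
```

T1 committed before the checkpoint, so the backward scan never reaches it; T4 has an abort record, so it lands in neither list.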