Database Management Systems: Database Recovery - Kinds of failures - Failure Control Methods

Database Recovery

Database recovery is the process of restoring the database to a correct state after a failure that occurs during a transaction or after a process has finished. Database systems, like any other computer system, are subject to failures, but the data stored in them must be available as and when required. A database must therefore provide facilities for fast recovery after a failure. It must also guarantee atomicity, i.e. either a transaction completes successfully and is committed (its effect is recorded permanently in the database) or it has no effect on the database at all.
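Atomicity can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the accounts table, the transfer function and the balances helper are hypothetical names chosen only for this example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single atomic transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                # Simulated logical error in the middle of the transaction.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

In this sketch a transfer that fails halfway leaves both balances exactly as they were, because the first update is rolled back together with the rest of the transaction.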

Kinds of failures

1. Transaction failure: A transaction has to abort when it fails to execute or when it reaches a point from which it cannot proceed any further. This is known as transaction failure, and it affects only a few transactions or processes. The reasons for transaction failure are:

· Logical errors

· System errors

Logical errors: The transaction cannot complete because of an error in its code or an internal error condition.

System errors: The database system itself terminates an active transaction because the DBMS is not able to execute it, or because some system condition forces it to stop (for example, a deadlock or resource unavailability).

2. System crash: Problems external to the system can cause it to stop abruptly and crash. For instance, an interruption in the power supply might cause the underlying hardware or software to fail. Operating system errors are another example.

3. Disk failure: In the early days of the technology, it was a common problem for hard-disk drives or storage drives to fail frequently. Disk failures include the formation of bad sectors, the disk becoming unreachable, a disk head crash, or any other failure that destroys all or part of disk storage.

The sources of failure are:

Hardware or software errors that crash the system, ultimately resulting in loss of main memory.
Media failures, such as head crashes or unreadable media, that result in the loss of portions of secondary storage.
Application software errors, such as logical errors in programs that access the database, which can cause one or more transactions to abort or fail.
Natural or physical disasters such as fires, floods, earthquakes, or power failures.
Carelessness or unintentional destruction of data or directories by operators or users.
Intentional damage or corruption of data, hardware, or software facilities (for example, using malicious software or files).

Whatever the cause of the failure, there are two principal effects that have to be considered:

Loss of main memory, including the database buffers.
Loss of the disk copy of the database.

Failure Control Methods

Two kinds of techniques help a database management system recover from a failure while maintaining the atomicity of transactions:

Maintaining a log of every transaction and writing it to stable storage before actually modifying the database.
Maintaining shadow paging, where the changes are made in volatile memory and the actual database is updated later.
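The shadow paging idea can be sketched as follows. This is a toy in-memory model under stated assumptions, not a real storage engine: the class ShadowPagedStore and its method names are hypothetical, and a Python dictionary stands in for the page table on disk.

```python
class ShadowPagedStore:
    """Toy model: updates go to a working copy; commit swaps it in."""

    def __init__(self, pages):
        # Committed ("shadow") page table: page id -> page contents.
        self._pages = dict(pages)
        # Working copy; it exists only while a transaction is active.
        self._current = None

    def begin(self):
        self._current = dict(self._pages)

    def write(self, page_id, value):
        if self._current is None:
            raise RuntimeError("no active transaction")
        # Only the working copy changes; committed pages stay untouched.
        self._current[page_id] = value

    def read(self, page_id):
        table = self._current if self._current is not None else self._pages
        return table[page_id]

    def commit(self):
        if self._current is None:
            raise RuntimeError("no active transaction")
        # A single reference swap makes all changes visible at once.
        self._pages, self._current = self._current, None

    def abort(self):
        # Dropping the working copy discards every uncommitted change.
        self._current = None

    def committed(self):
        return dict(self._pages)
```

Aborting a transaction, or crashing before commit, simply leaves the committed page table as it was, which is what preserves atomicity in this model.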

Log-based recovery (or Manual Recovery):

A log is a sequence of records that keeps track of the actions performed by a transaction. It is important that the log records are written before the actual modification and stored on a stable storage medium, which is failsafe. Log-based recovery works as follows:

The log file is kept on a stable storage medium.
When a transaction enters the system and starts execution, it writes a log record about it.
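The log-before-modify rule can be sketched in a few lines. In this minimal example a Python list stands in for stable storage and a dictionary for the database; the class WalDatabase and the update record layout (transaction, item, old value, new value) are assumptions made for illustration.

```python
class WalDatabase:
    """Toy write-ahead log: every change is logged before it is applied."""

    def __init__(self):
        self.log = []   # stands in for the log file on stable storage
        self.data = {}  # stands in for the actual database

    def start(self, txn):
        # Written when the transaction enters the system.
        self.log.append((txn, "Start"))

    def write(self, txn, key, new_value):
        old_value = self.data.get(key)
        # 1. Record the change (old and new value) in the log first.
        self.log.append((txn, key, old_value, new_value))
        # 2. Only then modify the database itself.
        self.data[key] = new_value

    def commit(self, txn):
        self.log.append((txn, "Commit"))
```

Because the old and new values are both recorded before the change is applied, the log alone carries enough information to undo or repeat each update.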

Recovery with concurrent transactions (Automated Recovery):

When more than one transaction is being executed in parallel, the logs are interleaved. At recovery time, it would be hard for the recovery system to backtrack through all the logs before starting to recover. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.

Checkpoint: Keeping and maintaining logs in real time in a real environment may fill up all the storage space available in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a mechanism by which all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.
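A checkpoint in this simplified sense might be sketched as below. The function names, the tuple-based record layout, and the rule of refusing a checkpoint while a transaction is still open are assumptions made for this example; real systems use more elaborate schemes.

```python
CHECKPOINT = ("Checkpoint",)


def open_transactions(log):
    """Return the transactions that have started but not yet finished."""
    started, finished = set(), set()
    for record in log:
        if len(record) == 2:
            txn, kind = record
            if kind == "Start":
                started.add(txn)
            elif kind in ("Commit", "Abort"):
                finished.add(txn)
    return started - finished


def take_checkpoint(active_log, archive):
    """Move all earlier records to the archive and leave only a marker."""
    pending = open_transactions(active_log)
    if pending:
        # Simplification: checkpoint only when every transaction is done.
        raise RuntimeError(f"transactions still open: {sorted(pending)}")
    archive.extend(active_log)      # store the old records permanently
    active_log.clear()              # remove them from the active log
    active_log.append(CHECKPOINT)   # mark the consistent point
```

After a checkpoint the active log holds only the marker, so recovery never has to look at records older than it.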

Recovery

When a system with concurrent transactions crashes and recovers, it behaves in the following manner:

The recovery system reads the logs backwards from the end to the last checkpoint.
It maintains two lists, an undo-list and a redo-list.
If the recovery system sees a log with <T_n, Start> and <T_n, Commit> or just <T_n, Commit>, it puts the transaction in the redo-list.
If the recovery system sees a log with <T_n, Start> but finds no commit or abort log, it puts the transaction in the undo-list.

All the transactions in the undo-list are then undone and their logs are removed. For all the transactions in the redo-list, their previous logs are removed, and the transactions are then redone before their logs are saved.
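The recovery procedure described above can be sketched as follows. The record layout is an assumption: (txn, "Start"), (txn, "Commit") and (txn, "Abort") markers, update records of the form (txn, item, old value, new value), and a ("Checkpoint",) marker. The function names are hypothetical, and log cleanup after recovery is left out.

```python
CHECKPOINT = ("Checkpoint",)


def records_since_checkpoint(log):
    """Return the log records written after the last checkpoint."""
    for i in range(len(log) - 1, -1, -1):
        if log[i] == CHECKPOINT:
            return log[i + 1:]
    return list(log)


def build_lists(log):
    """Read the log backwards and build the undo-list and redo-list."""
    started, committed, aborted = set(), set(), set()
    for record in reversed(records_since_checkpoint(log)):
        if len(record) != 2:
            continue  # update records do not affect the lists
        txn, kind = record
        if kind == "Start":
            started.add(txn)
        elif kind == "Commit":
            committed.add(txn)
        elif kind == "Abort":
            aborted.add(txn)
    undo_list = started - committed - aborted
    redo_list = committed
    return undo_list, redo_list


def recover(log, data):
    """Undo unfinished transactions, then redo committed ones."""
    undo_list, redo_list = build_lists(log)
    tail = records_since_checkpoint(log)
    # Undo: walk backwards, restoring the old value of each update.
    for record in reversed(tail):
        if len(record) == 4 and record[0] in undo_list:
            _, key, old_value, _ = record
            if old_value is None:
                data.pop(key, None)  # the item did not exist before
            else:
                data[key] = old_value
    # Redo: walk forwards, reapplying the new value of each update.
    for record in tail:
        if len(record) == 4 and record[0] in redo_list:
            _, key, _, new_value = record
            data[key] = new_value
    return undo_list, redo_list
```

In this sketch a transaction that committed after the checkpoint has its updates reapplied, while one that was still running at the crash has its updates rolled back to the logged old values.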

Tuesday, 24 December 2019