Monday, 01 September 2008
I got a question at a .NET Community Event a few days ago about what would happen if SQL Server were to lose the log (LDF) or data (MDF/NDF) files for a database while in operation (e.g. the disk holding the data or log file crashes). If I've got my SQL Server disaster recovery right, this is what should happen:

First, if both the data and log files are lost, it's simple: SQL Server will stop servicing requests for that database and we'll need to restore everything from our last backup (possibly minutes, hours or days old, depending on the backup scheme).
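As a minimal sketch of what that restore could look like, assuming a full-plus-differential backup scheme (the database name and backup paths below are placeholders, not from any real setup):

    -- Restore the most recent full backup, then the latest differential.
    RESTORE DATABASE SalesDb
        FROM DISK = N'D:\Backups\SalesDb_full.bak'
        WITH NORECOVERY, REPLACE;

    -- Bring the database online; anything done after this differential
    -- was taken is lost unless log backups exist and are applied first.
    RESTORE DATABASE SalesDb
        FROM DISK = N'D:\Backups\SalesDb_diff.bak'
        WITH RECOVERY;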

Second, if the data file is lost while the log is good, SQL Server will probably stop servicing requests pretty quickly here too, but we shouldn't lose any data (assuming we're running under the full recovery model, have taken at least one full backup and have the log chain intact - that is, we haven't truncated the transaction log and we've got all log backups since the last full or differential backup ready for restore). We first back up the tail of the log that is still good (BACKUP LOG ... WITH NO_TRUNCATE), then restore the last full backup, the last differential backup and all log backups consecutively, up to and including that tail.
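A rough sketch of that sequence, under the assumptions above (full recovery model, intact log chain); the database and file names are made up for the example:

    -- Back up the tail of the log; NO_TRUNCATE lets this work even though
    -- the data file is gone, as long as the log file itself is readable.
    BACKUP LOG SalesDb
        TO DISK = N'D:\Backups\SalesDb_tail.trn'
        WITH NO_TRUNCATE;

    -- Restore the full and differential backups WITH NORECOVERY (same
    -- pattern as the sketch above), then apply every log backup in order,
    -- finishing with the tail to bring the database online with no data loss.
    RESTORE LOG SalesDb
        FROM DISK = N'D:\Backups\SalesDb_log_1.trn' WITH NORECOVERY;
    RESTORE LOG SalesDb
        FROM DISK = N'D:\Backups\SalesDb_tail.trn' WITH RECOVERY;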

Third, if the log file is lost, while the data file is good, we may have bigger problems. SQL Server will at least stop servicing any requests involving writing to the database, and we now have the potential to lose data.
But wait - we have the complete data file - why would we lose data? The reason is the way SQL Server handles buffering and recovery, using the ARIES algorithm. ARIES uses a so-called STEAL/NO-FORCE approach to optimize performance of the buffer pool (SQL Server's in-memory data cache): pages dirtied by uncommitted transactions may be written to the MDF/NDF files on disk (STEAL), while pages changed by committed transactions may not have been written to disk yet and exist only in memory (NO-FORCE).

This means that if, at the time of the crash, there are open transactions, or any transactions have written data to the database since the last checkpoint (and possibly in other scenarios too), the data file is potentially in an inconsistent state. Losing the log file in such a situation can cause database corruption, broken constraints, half-finished transactions, lost data and all sorts of crap, since SQL Server will not be able to roll back uncommitted transactions or roll forward committed ones.

If the log is lost, it can be rebuilt using Emergency Mode Repair, but as Paul S. Randal (formerly of the SQL Server storage engine team at Microsoft) describes here, this is something that shouldn't be done unless you're out of other options.
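For reference, emergency mode repair boils down to something like the following - a sketch only, with a placeholder database name, and REPAIR_ALLOW_DATA_LOSS does exactly what its name suggests:

    -- Last resort: put the database in emergency mode and let CHECKDB
    -- rebuild the log, accepting that transactional consistency is lost.
    ALTER DATABASE SalesDb SET EMERGENCY;
    ALTER DATABASE SalesDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DBCC CHECKDB (N'SalesDb', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS;
    ALTER DATABASE SalesDb SET MULTI_USER;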

So, the only way to ensure you don't lose data is, once again, a plan for backup and disaster recovery. Murphy's law says that if you don't have one, you WILL find yourself in deep shit at some point in the future.

And while we're on the topic of losing the log - I've seen some pretty ridiculous ways of reducing the size of your log file around different forums. I've seen posts advising people to just delete or rebuild the log file whenever it gets too big. That is a pretty bad piece of advice (unless you know what you're doing and checkpoint or detach the database first). Rebuilding the log is, for the reasons above, a pretty quick and handy way of inducing corruption into your database. To reduce the size of your transaction log, back it up using the BACKUP LOG statement, optionally shrinking the log files afterwards.
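A minimal sketch of the sane way to do it (assuming the full recovery model; the logical log file name and backup path are placeholders):

    -- Back up the log so SQL Server can reuse its inactive portion ...
    BACKUP LOG SalesDb TO DISK = N'D:\Backups\SalesDb_log.trn';

    -- ... then, if the physical file really must be smaller, shrink it.
    -- The logical file name can be found in sys.database_files.
    USE SalesDb;
    DBCC SHRINKFILE (SalesDb_log, 512);  -- target size in MB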

So, do you agree with me? Feel free to post comments if I've got something wrong.
Monday, 05 January 2009 08:24:02 (W. Europe Standard Time, UTC+01:00)
Usually I use an automatic backup system, so I can restore it when it breaks.