Friday, January 30, 2009
I'm in my last semester here at the Norwegian University of Science and Technology, which means that I'm starting on my master thesis these days. Actually, I'm not starting now, I'm continuing on the "pre-project" I worked on together with my friend, Alex Brasetvik from August to December last year.

Our master thesis is about building a query optimizer for Fast's (now a subsidiary of Microsoft, they bought them) new Enterprise Search solution.

The report itself can be found here: MasterProjectReport.pdf, and the abstract is included below:
This document is the report for the authors’ joint effort in researching and designing a query optimizer for fast’s next-generation search platform, known as MARS. This work was done during the pre-project to the master thesis at the Department of Computer and Information Science at the Norwegian University of Science and Technology, autumn 2008.

MARS does not currently employ any form of query optimizer, but does have a parser and a runtime system. The report therefore focuses on the core query optimizing aspects, like plan generation and optimizer design. First, we give an introduction to query optimizers and selected problems. Then, we describe previous and ongoing efforts regarding query optimizers, before shifting focus to our own design and results.

MARS supports
DAG-structured query plans, which means that the optimizer must do so too. This turned out to be a greater task than what it might seem like. The optimizer also needed to be extensible, including the ability to deal with query operators it does not know, as well as supporting arbitrary cost models.

During the course of the project, we have laid out the design of an optimizer we believe satisfies these goals. DAGs are currently not
fully supported, but the design can be extended to do so. Extensibility is solved by loose coupling between optimizer components. Rules are used to model operators, and the cost model is a separate, customizable component. We have also implemented a prototype that demonstrates that the design actually works.

posted on Friday, January 30, 2009 3:08:08 AM (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Monday, September 1, 2008
Here are the slides and T-SQL code I used during Thursday's Norwegian .NET User Group presentation.
I presented basic transaction processing with emphasis on concurrency and isolation. I hope everyone had a good time. I definitely had a good time presenting.

I also got some really good questions, one of them being what would happen if SQL Server were to lose the log file for one of its databases during operation. Since I didn't give the full explanation at the presentation, I've written a blog post about it. It can be found here.

Also, I mentioned that SQL Server 2008 RTM'ed (it's done!) on or sometime before 6th of August with build number 10.0.1600.22. I didn't blog about it here since I was OOF at the time :-)

Transaksjoner, isolasjonsnivåer og låsing i SQL Server.pptx (151.12 KB) (2.92 KB)

posted on Monday, September 1, 2008 9:32:15 PM (W. Europe Standard Time, UTC+01:00)  #    Comments [3]
I got a question at a .NET Community Event a few days ago about what would happen if SQL Server were to lose the log (LDF) or data (MDF/NDFs) file for a database while in operation (e.g. the disk with the data or log file on crashes). If I've got my SQL Server disaster recovery right, this should be what would happen:

First, if both data and log are lost, it's simple - SQL Server will stop servicing requests for that DB and we'll need to restore everything from our last backup (possibly some minutes/hours/days old, depending on your backup scheme).

Second, if the data file is lost, while the log is good, SQL Server will probably stop servicing requests pretty quickly here too, but we shouldn't lose any data (assuming we're running under the full recovery model and have taken at least one full backup and have the log chain intact - that is, we haven't truncated the transaction log and we've got all log backups since the last full or differential backup ready for restore). We can just restore the last full backup, then the last differential one and then all log backups consecutively, up to and including the tail of the log that is still good.

Third, if the log file is lost, while the data file is good, we may have bigger problems. SQL Server will at least stop servicing any requests involving writing to the database, and we now have the potential to lose data.
But wait - we have the complete data file - why would we lose data? The reason for this is the way SQL Server handles buffering and recovery, using the ARIES algorithm. ARIES uses a so-called STEAL/NO-FORCE approach to optimize performance for the buffer pool (SQL Server's in-memory data cache), which basically means that data from uncommitted transactions can be written to the MDF/NDFs on disk and that data from committed transactions can still only reside in memory.

This means that if there are open transactions or any transactions have been writing data to the database since the last checkpoint at the time of the crash (and possibly more scenarios), the data file is potentially in an inconsistent state. Losing the log file in such a situation can cause database corruption, broken constraints, half-finished transactions, lost data and all sorts of crap, since SQL Server will not be able to roll back uncommitted transactions or roll forward committed ones.

If the log is lost, it can be rebuilt using Emergency Mode Repair, but as Paul S. Randall (former SQL Server employee) describes here, this is something that shouldn't be done unless you're out of other options.

So, the only way to ensure you don't lose data is, once again, a plan for backup and disaster recovery. Murphy states that if you don't, you WILL find yourself in deep shit at some time in the future.

And when we're on the topic of losing the log - I've seen some pretty ridiculous ways of reducing the size of your log file around different forums. I've seen posts advising people to just delete or rebuild the log file whenever it gets too big. That is a pretty bad piece of advise (unless you know what you're doing and are checkpointing or detaching the database first). Rebuilding the log is, due to the reasons above, a pretty quick and handy way of inducing corruption into your database. To reduce the size of your transaction log, back it up using the BACKUP LOG statement, optionally shrinking the log files afterward.

So, do you agree with me? Feel free to post comments if I've got something wrong.
posted on Monday, September 1, 2008 6:37:09 PM (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Saturday, June 7, 2008
A few months ago I wrote that a SQL Server 2008 RC (Release Candidate) was scheduled for Q2 this year.

Looks like Microsoft is staying on their schedule - RC0 was just released to MSDN and TechNet subscribers!

EDIT: RC0 is now available here for the public as well.

posted on Saturday, June 7, 2008 4:58:05 AM (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Wednesday, June 4, 2008
Microsoft has published SQL Server's new logo:

I think it looks good :-)

Courtesy of Wesley from

posted on Wednesday, June 4, 2008 2:58:52 AM (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Friday, May 9, 2008
I recently wrote a paper at school about how flash memory impacts the database world. Those who are interested can read it here: How Flash Memory Changes the DBMS World - An Introduction

posted on Friday, May 9, 2008 3:30:30 PM (W. Europe Standard Time, UTC+01:00)  #    Comments [4]
 Thursday, March 6, 2008
As promised, here are the resources from the SQL Server 2008 presentation I did at NTNU 5/3. For those of you that didn't attend, I still recommend you to take a look at the links below. I hope everyone had a good time hearing about PowerShell, Reporting Services, and geospatial support - I sure did have a good time presenting it.

Visual Studio
How to configue Visual Studio to Debug .NET Framework Source Code:

SQL Server and PowerShell:
The PowerShell script for selecting data:
Microsoft SQL Server Homepage:

Reporting Services
The report I created with Reporting Services: (rename to .rdl)

The spatial queries:
Utilities for loading shapefiles into SQL Server and visualizing spatial query results:
Background image for the visualization tool:
Some shapefiles:

Another visualization tool that I didn't use:
Isaac's blog on MSDN:
The SQL Server Battleship game that I mentioned:

Imagine Cup
More information about Imagine Cup:

Feel free to send me questions at hansolav *AT*

posted on Thursday, March 6, 2008 2:44:03 AM (W. Europe Standard Time, UTC+01:00)  #    Comments [2]
 Thursday, February 21, 2008
posted on Thursday, February 21, 2008 3:50:37 PM (W. Europe Standard Time, UTC+01:00)  #    Comments [3]