Monday, 01 September 2008
Here are the slides and T-SQL code I used during Thursday's Norwegian .NET User Group presentation.
I presented basic transaction processing with emphasis on concurrency and isolation. I hope everyone had a good time. I definitely had a good time presenting.

I also got some really good questions, one of them being what would happen if SQL Server were to lose the log file for one of its databases during operation. Since I didn't give the full explanation at the presentation, I've written a blog post about it. It can be found here.

Also, I mentioned that SQL Server 2008 RTM'ed (it's done!) on or sometime before 6th of August with build number 10.0.1600.22. I didn't blog about it here since I was OOF at the time :-)

Transaksjoner, isolasjonsnivåer og låsing i SQL Server.pptx (151.12 KB)
NNUGDemos-2008-08-28-HON.zip (2.92 KB)


posted on Monday, 01 September 2008 21:32:15 (W. Europe Standard Time, UTC+01:00)  #    Comments [3]
I got a question at a .NET Community Event a few days ago about what would happen if SQL Server were to lose the log (LDF) or data (MDF/NDF) file for a database while in operation (e.g. the disk holding the data or log file crashes). If I've got my SQL Server disaster recovery right, this is what would happen:

First, if both data and log are lost, it's simple - SQL Server will stop servicing requests for that DB and we'll need to restore everything from our last backup (possibly some minutes/hours/days old, depending on your backup scheme).

Second, if the data file is lost, while the log is good, SQL Server will probably stop servicing requests pretty quickly here too, but we shouldn't lose any data (assuming we're running under the full recovery model and have taken at least one full backup and have the log chain intact - that is, we haven't truncated the transaction log and we've got all log backups since the last full or differential backup ready for restore). We can just restore the last full backup, then the last differential one and then all log backups consecutively, up to and including the tail of the log that is still good.
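A sketch of that restore sequence, with hypothetical database and backup file names (and assuming one differential and one log backup exist):

```sql
-- Sketch of the restore sequence when the data file is lost but the log
-- survives. Database and file names (MyDB, C:\Backup\...) are placeholders.

-- 1. Back up the tail of the log (the part not yet backed up).
--    NO_TRUNCATE lets this work even though the data file is gone.
BACKUP LOG MyDB TO DISK = 'C:\Backup\MyDB_tail.trn' WITH NO_TRUNCATE;

-- 2. Restore the last full backup, without recovering yet.
RESTORE DATABASE MyDB FROM DISK = 'C:\Backup\MyDB_full.bak' WITH NORECOVERY;

-- 3. Restore the most recent differential backup, if any.
RESTORE DATABASE MyDB FROM DISK = 'C:\Backup\MyDB_diff.bak' WITH NORECOVERY;

-- 4. Restore all log backups taken since, in order, ending with the tail.
RESTORE LOG MyDB FROM DISK = 'C:\Backup\MyDB_log1.trn' WITH NORECOVERY;
RESTORE LOG MyDB FROM DISK = 'C:\Backup\MyDB_tail.trn' WITH NORECOVERY;

-- 5. Bring the database online.
RESTORE DATABASE MyDB WITH RECOVERY;
```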

Third, if the log file is lost, while the data file is good, we may have bigger problems. SQL Server will at least stop servicing any requests involving writing to the database, and we now have the potential to lose data.
But wait - we have the complete data file - why would we lose data? The reason lies in the way SQL Server handles buffering and recovery, using the ARIES algorithm. ARIES uses a so-called STEAL/NO-FORCE approach to optimize performance for the buffer pool (SQL Server's in-memory data cache). This basically means that data from uncommitted transactions can be written to the MDF/NDFs on disk (STEAL), and that data from committed transactions need not be written to the data files at commit time - it may still reside only in memory, with durability guaranteed by the log alone (NO-FORCE).

This means that if there are open transactions or any transactions have been writing data to the database since the last checkpoint at the time of the crash (and possibly more scenarios), the data file is potentially in an inconsistent state. Losing the log file in such a situation can cause database corruption, broken constraints, half-finished transactions, lost data and all sorts of crap, since SQL Server will not be able to roll back uncommitted transactions or roll forward committed ones.

If the log is lost, it can be rebuilt using Emergency Mode Repair, but as Paul S. Randal (former SQL Server employee) describes here, this is something that shouldn't be done unless you're out of other options.
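For reference only, this is roughly what that last-resort sequence looks like (the database name is a placeholder, and REPAIR_ALLOW_DATA_LOSS does exactly what its name says - it can silently discard data):

```sql
-- LAST RESORT ONLY: rebuilds the log and may discard transactionally
-- inconsistent data. MyDB is a placeholder name.
ALTER DATABASE MyDB SET EMERGENCY;
ALTER DATABASE MyDB SET SINGLE_USER;
DBCC CHECKDB ('MyDB', REPAIR_ALLOW_DATA_LOSS) WITH ALL_ERRORMSGS;
ALTER DATABASE MyDB SET MULTI_USER;
```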

So, the only way to ensure you don't lose data is, once again, a plan for backup and disaster recovery. Murphy states that if you don't, you WILL find yourself in deep shit at some time in the future.

And while we're on the topic of losing the log - I've seen some pretty ridiculous ways of reducing the size of your log file on various forums. I've seen posts advising people to simply delete or rebuild the log file whenever it gets too big. That is a pretty bad piece of advice (unless you know what you're doing and are checkpointing or detaching the database first). Rebuilding the log is, for the reasons above, a quick and handy way of introducing corruption into your database. To reduce the size of your transaction log, back it up using the BACKUP LOG statement, optionally shrinking the log files afterward.
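A minimal sketch of the sane approach (database and file names are placeholders; the log's logical file name can be found in sys.database_files):

```sql
-- Back up the log; this is what allows SQL Server to reuse log space.
BACKUP LOG MyDB TO DISK = 'C:\Backup\MyDB_log.trn';

-- Optional: release unused log space back to the OS, down to ~512 MB here.
-- MyDB_log is the logical name of the log file, not the physical path.
USE MyDB;
DBCC SHRINKFILE (MyDB_log, 512);
```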

So, do you agree with me? Feel free to post comments if I've got something wrong.
posted on Monday, 01 September 2008 18:37:09 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Saturday, 07 June 2008
A few months ago I wrote that a SQL Server 2008 RC (Release Candidate) was scheduled for Q2 this year.

Looks like Microsoft is staying on their schedule - RC0 was just released to MSDN and TechNet subscribers!

EDIT: RC0 is now available here for the public as well.

posted on Saturday, 07 June 2008 04:58:05 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Wednesday, 04 June 2008
Microsoft has published SQL Server's new logo:




I think it looks good :-)

Courtesy of Wesley from http://blogs.msdn.com/wesleyb/archive/2008/06/03/sql-server-logo.aspx

posted on Wednesday, 04 June 2008 02:58:52 (W. Europe Standard Time, UTC+01:00)  #    Comments [1]
 Thursday, 06 March 2008
As promised, here are the resources from the SQL Server 2008 presentation I did at NTNU on 5 March. For those of you that didn't attend, I still recommend taking a look at the links below. I hope everyone had a good time hearing about PowerShell, Reporting Services and geospatial support - I sure had a good time presenting it.

Visual Studio
How to configure Visual Studio to Debug .NET Framework Source Code: http://blogs.msdn.com/sburke/archive/2008/01/16/configuring-visual-studio-to-debug-net-framework-source-code.aspx

SQL Server and PowerShell:
The PowerShell script for selecting data: http://hansolav.net/blog/content/binary/powershellsql.txt
Microsoft SQL Server Homepage: http://www.microsoft.com/sql

Reporting Services
The report I created with Reporting Services: http://hansolav.net/blog/content/binary/report.rdl.txt (rename to .rdl)

Geospatial
The spatial queries: http://hansolav.net/blog/content/binary/spatial.sql.txt
Utilities for loading shapefiles into SQL Server and visualizing spatial query results: http://www.sharpgis.net/post/2007/12/Shapefile-to-SqlServer-2008-to-WPF.aspx
Background image for the visualization tool: http://visibleearth.nasa.gov/view_rec.php?id=7105
Some shapefiles:
http://www.census.gov/geo/www/cob/bdy_files.html
http://www.geog.ucsb.edu/~ta176/g176b/lab6/lab6_data.zip
http://www.cipotato.org/diva/data/MoreData.htm
http://biogeo.berkeley.edu/bgm/gdata.php

Another visualization tool that I didn't use: http://conceptdevelopment.net/Database/Geoquery/
Isaac's blog on MSDN: http://blogs.msdn.com/isaac/default.aspx
The SQL Server Battleship game that I mentioned: http://www.conceptdevelopment.net/Database/Battleship2008/

Imagine Cup
More information about Imagine Cup: http://msstudent.no/ImagineCupNorge/ImagineCup2008/tabid/647/Default.aspx

Feel free to send me questions at hansolav *AT* hansolav.net.


posted on Thursday, 06 March 2008 02:44:03 (W. Europe Standard Time, UTC+01:00)  #    Comments [2]
 Sunday, 27 January 2008

Francois Ajenstat has published news about the SQL Server 2008 release roadmap. Summarized:

  • Feature complete CTP for the launch wave in Q1 2008
  • Release Candidate (RC) in Q2 2008
  • Release to manufacturing (RTM) expected in Q3 2008

Exciting!

posted on Sunday, 27 January 2008 04:59:40 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
 Tuesday, 08 January 2008

When living on the edge (read: running the latest betas of everything) as I do, you run into trouble from time to time. This time it was Visual Studio 2008 "Data Dude" combined with SQL Server 2008. For those of you not familiar with it, "Data Dude" is an extension to Visual Studio for working with databases: unit testing of your database, automatic data generation, etc.

My problem was that I kept getting a message saying "Object reference not set to an instance of an object." whenever I tried to create a new database project. I suspected that SQL 2008 was the cause, and I was right. It turns out that "Data Dude" creates a temporary database when you open or create a project. In my case this happened in the default SQL Server instance on the machine, which is SQL Server 2008 CTP5 Dev Edition. "Data Dude" didn't like that very much.

The solution (at least in my case) was to configure it to use the installed SQL Server 2005 Express Edition instead. You do that by going to Tools -> Options -> Database Tools -> "Data Connections" and "Design-time Validation Database", and filling in the instance name of a SQL 2005 instance in the text boxes. So now you know!

posted on Tuesday, 08 January 2008 01:23:28 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
 Tuesday, 11 December 2007

I've been working a little with SQL Server Compact Edition (which is a *very* cool product, by the way) lately. For those of you that haven't heard about SQL Compact: it's a small version of SQL Server that you host in-process in your .NET applications (it's just a couple of MBs), with full support for tables, indexes, transactions and so on (but no views, stored procedures or triggers). Much like SQLite and Apache Derby, just better integrated with .NET.

So far I've been using Visual Studio to create and manage databases. This works great, but I didn't like to start up VS and everything just to peek into a database, and I've always felt more at home in SQL Server Management Studio (SSMS) when working with databases. Well, it turns out that SSMS can be used to manage SQL Compact databases as well! Cool, I didn't know that.

This is basically what you do:

Robert has some more on this on his blog.

posted on Tuesday, 11 December 2007 12:38:13 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
 Monday, 19 November 2007

Yep, the November CTP was released just hours, maybe minutes ago! As of now, they still haven't posted the link on the SQL Server home page. The downloads have been posted on the Microsoft Download Center for easier availability, follow this direct link to get to the bits:
http://www.microsoft.com/downloads/details.aspx?FamilyId=3BF4C5CA-B905-4EBC-8901-1D4C1D1DA884&displaylang=en

There you can find DVD images and self-extracting executables for x86, x64 and IA64. This CTP includes SQL Server Express Edition as well.

Happy downloading!

posted on Monday, 19 November 2007 20:21:55 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
 Wednesday, 14 November 2007

While we're waiting for the next SQL Server CTP, I thought I could mention a couple of small things already in the current CTP: multiple-row INSERT and combined DECLARE and SET.

To insert multiple rows in SQL Server 2005 and earlier, you had to execute multiple INSERT statements or use UNION. The first would look like this:

INSERT INTO MyTable (Col1, Col2) VALUES (1, 'One')
INSERT INTO MyTable (Col1, Col2) VALUES (2, 'Two')
INSERT INTO MyTable (Col1, Col2) VALUES (3, 'Three')

And using UNION it would be

INSERT INTO MyTable (Col1, Col2)
          SELECT 1, 'One'
UNION ALL SELECT 2, 'Two'
UNION ALL SELECT 3, 'Three'

In SQL Server 2008, this can be done like this:

INSERT INTO MyTable (Col1, Col2)
VALUES (1, 'One'), (2, 'Two'), (3, 'Three')

I did some quick performance testing and found that the UNION version and the multiple-row insert run in about the same time, while the individual inserts are slower. But this only holds as long as the number of rows being inserted isn't too high. When inserting 10,000 rows in one go, for instance, the individual inserts are actually faster, probably because of the amount of in-memory data the UNION and multiple-row forms produce, and because those two run the whole batch as one large statement and transaction.
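If you want to try the comparison yourself, here is a minimal sketch (assuming the MyTable used above exists, with an Int and a character column); SET STATISTICS TIME reports CPU and elapsed time per statement in the Messages tab:

```sql
-- Rough timing comparison; MyTable as used in the examples above.
SET STATISTICS TIME ON

-- Multiple-row VALUES (SQL Server 2008)
INSERT INTO MyTable (Col1, Col2)
VALUES (1, 'One'), (2, 'Two'), (3, 'Three')

-- UNION ALL equivalent
INSERT INTO MyTable (Col1, Col2)
          SELECT 4, 'Four'
UNION ALL SELECT 5, 'Five'
UNION ALL SELECT 6, 'Six'

SET STATISTICS TIME OFF
```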

Then, DECLARE and SET combined: it has long irritated me that I had to use two lines to declare a variable and assign a value to it. This is what it looks like in SQL Server 2005 and earlier:

DECLARE @var int
SET @var = 50

Now, in SQL 2008, we can do this:

DECLARE @var int = 50

We can also declare and assign to multiple variables in one go:

DECLARE @var1 int         = 50,
        @var2 varchar(10) = 'Hello!'

Pretty basic, and it's probably long overdue, but it's neat and it's finally here :-)

posted on Wednesday, 14 November 2007 05:06:28 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
 Tuesday, 30 October 2007

...does not work, it looks like.

I upgraded my installation (hosting this site) of Windows Server 2008 from Beta 3 (June CTP) to RC0 today - over remote desktop! My server is located in a locked server room, so I wanted to try a remote upgrade before asking for the key, and it went well. I extracted the ISO onto the hard drive, started the installation from remote desktop and selected the upgrade option. It turns out that the Windows setup doesn't (luckily) show any dialogs where you have to click next during the upgrade. I had a continuous ping trace running and observed the server going up and down a few times, before it completed the installation and enabled incoming remote desktop connections again. Pretty cool!

But, back to the title. Everything worked well after the upgrade, except SQL Server 2008 July CTP. I kept getting error messages from Management Studio when connecting, saying "No process is on the other end of the pipe", which led me to think I had a certificate problem. See this blog post.

But, it turns out that this wasn't the issue. The issue is that Windows Server 2008 RC0 ships with a version of SQL Native Client that is newer than what the SQL Server 2008 July CTP supports, so it just doesn't work. This thread says it will be addressed in the next SQL Server CTP. Until then you have two options:

  • Use Management Studio 2008 from another machine to administer SQL Server. This will probably still give you problems if you run SharePoint on the same server; at least my SharePoint installation gets the same "No process is on the other end of the pipe" error
  • Install SQL Server 2005

For now, I'm going with SQL Server 2005 while I'm waiting for the next CTP of SQL Server 2008. I read somewhere that it is expected in the next one to two weeks. It will probably contain lots of new stuff, like spatial data support, Intellisense in Management Studio and so on. Looking forward to it!

posted on Tuesday, 30 October 2007 18:37:16 (W. Europe Standard Time, UTC+01:00)  #    Comments [0]
 Wednesday, 20 December 2006

School is out for Christmas (I had my last exam last Friday), so now I have time to do fun things, like coding (surprise).

I think algorithmic programming, databases and SQL queries are cool things, so why not combine them? Yesterday I got the idea of implementing some well-known algorithm in SQL, and figured that Dijkstra's shortest path algorithm would be fun to implement.

Dijkstra's shortest path algorithm finds, well, the shortest path from one vertex to the other vertices in a weighted graph. The edges have lengths (or costs or whatever), and the shortest path from one vertex to another is the path where the sum of these lengths is as small as possible. Take a look at the illustration below, showing a graph with some Norwegian cities. The shortest path from Trondheim to Fredrikstad has been highlighted (for those of you that know Norway: not very realistic, but let's pretend it is, just for the fun of it).
djikstra.gif

The algorithm works like a breadth-first search that takes the edge weights into account, starting at one vertex and traversing the graph.

So, how do we implement this in Transact-SQL (MS SQL Server's SQL dialect)? Well, first we need some way to represent the graph. I've created two tables:

City
city.jpg
Road
road.jpg
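
The screenshots show the two schemas; reconstructed from the column names the stored procedure below uses, they look roughly like this (exact types and constraints in my database may differ):

```sql
-- Sketch of the two tables, reconstructed from the columns the
-- procedure references; exact types and constraints may differ.
CREATE TABLE City
(
    CityID Int PRIMARY KEY,
    Name   NVarChar(50) NOT NULL
)

CREATE TABLE Road
(
    FromCity Int NOT NULL REFERENCES City (CityID),
    ToCity   Int NOT NULL REFERENCES City (CityID),
    Distance Int NOT NULL,
    PRIMARY KEY (FromCity, ToCity)
)
```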

The City table is pretty straightforward. The Road table contains one row for every road from one city to another, along with the length of that road. Notice that there are two rows for every pair of cities with a road between them, one for each direction. But now to the real stuff: the implementation of the algorithm:

CREATE PROCEDURE [dbo].[Dijkstra]
    @StartCity Int
AS
BEGIN
    -- Automatically rollback the transaction if something goes wrong.    
    SET XACT_ABORT ON    
    BEGIN TRAN
    
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Create a temporary table for storing the estimates as the algorithm runs
    CREATE TABLE #CityList
    (
        CityId Int NOT NULL,    -- The City Id
        Estimate Int NOT NULL,    -- What is the distance to this city, so far?
        Predecessor Int NULL,    -- The city we came from to get to this city with this distance.
        Done bit NOT NULL        -- Are we done with this city yet (is the estimate the final distance)?
    )

    -- Fill the temporary table with initial data
    INSERT INTO #CityList (CityId, Estimate, Predecessor, Done)
    SELECT CityId, 2147483647, NULL, 0 FROM City    -- 2147483647 (Int max) acts as infinity
    
    -- Set the estimate for the city we start in to be 0.
    UPDATE #CityList SET Estimate = 0 WHERE CityID = @StartCity
    IF @@rowcount <> 1
    BEGIN
        RAISERROR ('Couldn''t set start city', 11, 1)
        ROLLBACK TRAN        
        RETURN
    END

    DECLARE @FromCity Int, @CurrentEstimate Int

    -- Run the algorithm until we decide that we are finished
    WHILE 1=1
    BEGIN
        -- Reset the variable, so we can detect getting no records in the next step.
        SELECT @FromCity = NULL

        -- Select the CityID and current estimate for a city not done, with the lowest estimate.
        SELECT TOP 1 @FromCity = CityId, @CurrentEstimate = Estimate
        FROM #CityList WHERE Done = 0 AND Estimate < 2147483647
        ORDER BY Estimate
        
        -- Stop if we have no more unvisited, reachable cities.
        IF @FromCity IS NULL BREAK

        -- We are now done with this city.
        UPDATE #CityList SET Done = 1 WHERE CityId = @FromCity

        -- Update the estimates to all neighbour cities of this one (all the cities
        -- there are roads to from this city). Only update the estimate if the new
        -- proposal (to go via the current city) is better (lower).
        UPDATE #CityList SET #CityList.Estimate = @CurrentEstimate + Road.Distance,
            #CityList.Predecessor = @FromCity
        FROM #CityList INNER JOIN Road ON #CityList.CityID = Road.ToCity
        WHERE Road.FromCity = @FromCity AND (@CurrentEstimate + Road.Distance) < #CityList.Estimate
        
    END
    
    -- Select the results.
    SELECT city1.Name AS ToCity, Estimate AS Distance, city2.Name AS Predecessor FROM #CityList
    INNER JOIN City city1 ON #CityList.CityId = city1.CityID
    LEFT OUTER JOIN City city2 ON #CityList.Predecessor = city2.CityID
    
    -- Drop the temp table.
    DROP TABLE #CityList
    
    COMMIT TRAN
END

If we run it with Trondheim as start city (@StartCity = 1), we get this result table:

result.jpg

This says that from Trondheim, we have a distance of 0 to Trondheim, 2 to Bergen, and so on, up to 6 to Fredrikstad. The Predecessor column says which city we came from when we reached each city. We can see that to get to Fredrikstad, we came from Oslo; to get to Oslo, we came from Bergen; and to get to Bergen, we came from Trondheim. Therefore, to get to Fredrikstad, we took the path Trondheim, Bergen, Oslo, Fredrikstad.
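If you'd rather have the full path as one string than follow the Predecessor links by hand, here is a sketch of one way to do it. It assumes the procedure is modified to keep its results in a permanent table Result (CityId, Estimate, Predecessor) instead of dropping #CityList; the destination id used is hypothetical:

```sql
-- Sketch: walk the Predecessor chain backwards to build the whole path.
-- Assumes a permanent Result table (CityId, Estimate, Predecessor).
DECLARE @Current Int, @Path VarChar(500)
SET @Current = 4    -- hypothetical destination id, e.g. Fredrikstad
SET @Path = ''

WHILE @Current IS NOT NULL
BEGIN
    -- Prepend the current city's name, then step to its predecessor.
    -- The start city has Predecessor NULL, which ends the loop.
    SELECT @Path = c.Name + CASE WHEN @Path = '' THEN '' ELSE ' -> ' END + @Path,
           @Current = r.Predecessor
    FROM Result r INNER JOIN City c ON r.CityId = c.CityID
    WHERE r.CityId = @Current

    IF @@ROWCOUNT = 0 BREAK    -- unknown city; avoid looping forever
END

PRINT @Path    -- e.g. Trondheim -> Bergen -> Oslo -> Fredrikstad
```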

I have included the SQL script to create the database:

Dijkstra.txt (8,03 KB)
TestScript.txt (1,26 KB)

posted on Wednesday, 20 December 2006 17:45:10 (W. Europe Standard Time, UTC+01:00)  #    Comments [12]