Deadlocked!

You may have noticed that my posting frequency has declined over the last three weeks. That's because I've been busy building that Stack Overflow thing we talked about.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2008/08/deadlocked.html

As many people have already pointed out, the advantage of NOLOCK is higher performance; the risk is the possibility of reading uncommitted data.

Prior to implementing WITH (NOLOCK) or SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED, you should ask two questions:

  1. What (bad things) can happen if my application reads dirty data?
  2. How often will a dirty read occur?

For most applications (except banking, airline reservations, etc.), the answer to #1 is nothing and the answer to #2 is hardly ever.
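For illustration, here is a minimal sketch of the two equivalent ways to opt in to dirty reads ([Posts] and [ParentId] are borrowed from later comments in this thread; the query itself is hypothetical):

-- Option 1: a per-table hint
SELECT COUNT(*)
FROM [Posts] WITH (NOLOCK)
WHERE [ParentId] = 12345

-- Option 2: a per-session isolation level
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT COUNT(*)
FROM [Posts]
WHERE [ParentId] = 12345

Either way, the two questions above still apply: both reads can return rows from transactions that later roll back.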

This is probably WAY too far down for anyone to read it now, but another possibility is that it ISN’T a deadlock - the message just SAYS that it is. Maybe (deep down in the SQL stack) an exception is thrown, and something slightly higher up just ASSUMES that all exceptions thrown by x() are deadlocks, and spits out the message.

No Syd - SQL Server does not falsely report a deadlock. As I mentioned earlier, it is a very specific condition that is simple to identify but impossible to resolve once it occurs, so SQL Server kills one of the queries. If SQL Server says you have a deadlock, that's what it is.

Oh, and Jeff, this line:

"With nolock / read uncommitted / dirty reads, data may be out of date at the time you read it, but it's never wrong or garbled or corrupted in a way that will crash you."

is plain wrong. Read this to see why:

http://blogs.msdn.com/craigfr/archive/2007/06/12/query-failure-with-read-uncommitted.aspx

So… why are you not just using MySQL ???

Just for the record, unlike SQL Server and MySQL, Oracle does not have a READ UNCOMMITTED isolation level. As Tom Kyte says, "The goal of providing a dirty read is to supply a nonblocking read, whereby queries are not blocked by, and do not block, updates of the same data. However, Oracle Database doesn't need dirty reads to achieve this goal, nor does it support them. Dirty reads are an implementation other databases must use to provide nonblocking reads…"

Simplest case:

tran1:
BEGIN TRAN
  SELECT * FROM [A] WITH (HOLDLOCK)   -- keep the shared lock on A until commit
  UPDATE [B] SET [Val] = [Val] + 1    -- [Val] is a placeholder column
COMMIT TRAN

tran2:
BEGIN TRAN
  SELECT * FROM [B] WITH (HOLDLOCK)   -- keep the shared lock on B until commit
  UPDATE [A] SET [Val] = [Val] + 1
COMMIT TRAN

If you happen to run these two simultaneously, each transaction ends up holding a lock the other one needs, and you get a deadlock. (Note that under the default READ COMMITTED level the shared locks are released as soon as each SELECT finishes; it's holding them across the subsequent UPDATE - via HOLDLOCK or a higher isolation level - that makes the lock orders cross.)

I suspect that the framework you are using is initiating transactions even when you are not doing so explicitly. I ran into this when I created a middle layer using .NET: the end users (developers) were hitting deadlocks even though they were not using transactions in their SQL batches.

I say this because deadlocking on plain selects vs. updates is very rare (if it happens at all), and certainly nowhere near the rate you are experiencing.

Scoble is a moron

Not sure if this has already been submitted above (I didn't read each post in that much detail), but you could be seeing an index deadlock. While deadlocks between tables are easier to understand, you can get into the same situation with indexes as well.

For example, let's say you have two indexes, A and B: A is a clustered index on Id, so it contains all the data, and B is a nonclustered index on ParentResultId and LastActivityDate.
The select statement starts with index B, puts a read lock on it, and then tries to get the rest of the data from the data page (trying to take a read lock on the page). The update statement does the opposite: first it takes a lock on the data page (index A), then tries to lock index B so it can update it. Classic deadlock.

Not sure if this is what is happening here, but it is a less straightforward version of deadlock that you sometimes miss.
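A sketch of that shape, reusing the column names above (the [Results] table name and the [Body] column are my own invention):

-- Index A: the clustered index, which carries the full rows
CREATE TABLE [Results] (
    [Id] INT NOT NULL PRIMARY KEY CLUSTERED,
    [ParentResultId] INT NOT NULL,
    [LastActivityDate] DATETIME NOT NULL,
    [Body] NVARCHAR(MAX)
)

-- Index B: the nonclustered index the reader seeks on
CREATE INDEX [IX_Parent_Activity]
    ON [Results] ([ParentResultId], [LastActivityDate])

-- Reader: locks index B first, then does a bookmark lookup into A for [Body]
SELECT [Body]
FROM [Results]
WHERE [ParentResultId] = 1 AND [LastActivityDate] > '20080101'

-- Writer: locks the row in A first, then has to maintain index B
UPDATE [Results]
SET [LastActivityDate] = GETDATE()
WHERE [Id] = 42

The reader acquires locks in the order B then A, the writer in the order A then B, so with the wrong timing they deadlock exactly as described.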

Also, I think the claims above of two select statements deadlocking describe something impossible. You need some process requesting exclusive access to a data page to get a deadlock. Any number of connections can hold read locks on the same page, which makes sense: one process reading the data shouldn't prevent another process from reading it.

I'd say the fact that "SQL Server deadlocks" gets twice as many search results as "MySQL deadlocks" is simply because SQL Server has many more users than MySQL.

The "stupid question" was why should my database need to lock on a read. If calling that out flips the bozo bit, I'm happier not interviewing with your organization.

To elaborate slightly: everything works this way. Try to get a read lock on a file locked for writing - by default you can't. Try to get a read lock on a semaphore-controlled variable - by default you can't. Taking a read lock before reading is one of the most common concurrency patterns there is.

I bet you do not have an index on [Posts].[ParentId]. I would also not be surprised at all to find that adding that index fixed your problem.
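If that's right, the fix is a one-liner (the index name is my own):

CREATE NONCLUSTERED INDEX [IX_Posts_ParentId] ON [Posts] ([ParentId])

With that index in place the reader can seek straight to the matching child rows instead of scanning, which shrinks both the number and the duration of the locks it takes.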

Personally, I'm pretty darn happy that SQL Server has good defaults that put data integrity first instead of hoping programmers will know what to do. Maybe you'd prefer otherwise, but people would scream bloody murder if that were the case.

Automatic retries … you really don't want that, do you? Retry it yourself. If you have the system software do it for you, it almost certainly isn't going to make the correct choice. Besides, should every layer between your app and the database retry? If n classes each wait rand(m) seconds between p retries, you press a button and the app goes away for five minutes, because a failure takes n * m * p seconds to get back to you. No thanks. Fail fast and let the application retry, if it can, at the appropriate level. As long as you have concurrency-detection mechanisms in place, there is nothing wrong with retrying the write.
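To make "retry at the appropriate level" concrete, here's a minimal deadlock-retry sketch in T-SQL (the retry count and delay are arbitrary, and the actual work is elided):

DECLARE @retries INT
SET @retries = 3
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRAN
        -- ... the contested reads/writes go here ...
        COMMIT TRAN
        BREAK   -- success: stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRAN
        SET @retries = @retries - 1
        -- Error 1205 = chosen as deadlock victim; anything else should fail fast
        IF ERROR_NUMBER() <> 1205 OR @retries = 0
        BEGIN
            DECLARE @msg NVARCHAR(2048)
            SET @msg = ERROR_MESSAGE()
            RAISERROR(@msg, 16, 1)
            BREAK
        END
        WAITFOR DELAY '00:00:01'   -- brief pause before the next attempt
    END CATCH
END

The same shape works one level up in application code, where you can also decide whether the operation is even worth retrying.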

From an application standpoint, read-committed locks are only necessary if you plan to update the database using that data, because you must first see a snapshot of the data you need to change, and then change it. If you're reading just to grab the replies to a post, you should be able to use NOLOCK. Just make sure your code can handle the case where the count doesn't match what was actually returned.

Also, to avoid locking, you could read from views, and you can partition the hot data (i.e. statistics, counts, etc.) out of the table into its own, so you're not locking pages of a heavily accessed table just to update a count that is, for most purposes, probably useless.

But is this a premature optimization? Is storing the count meant to save you time while in effect causing the deadlocks and slowing you down? The statistics are pretty simple, so I have to ask: why aren't you using a view? That's what they're meant for; let SQL Server work it out.
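As a sketch of what that might look like (the [PostStats] name is hypothetical):

CREATE VIEW [PostStats]
AS
SELECT [ParentId], COUNT_BIG(*) AS [AnswerCount]
FROM [Posts]
GROUP BY [ParentId]

Computed on demand, there is no stored counter to UPDATE, so there is no write for readers to deadlock against; if it ever becomes too slow, it could be converted to an indexed view.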

Sounds like you also need some application-level caching. While it's good to optimize the lower layers, you should be stopping every request from hitting the database without reason. Even a light cache is better than nothing: if you're getting 1,000 hits a minute and the data isn't expected to change in that window, even a 30- or 60-second cache lifetime will save you 500 to 1,000 round trips to the database per cache period. That's something to think about.

Writing software is easy. Writing good software isn’t.

A great read on isolation levels, concurrency, and locking:

http://www.oracle.com/technology/oramag/oracle/05-nov/o65asktom.html

While Oracle-centric, this article is not necessarily Oracle-specific. It compares how Oracle performs (or avoids) locking through multi-versioning with how other databases do it, and provides detailed examples.

Actually, I think this article is even more relevant to developers NOT using Oracle, since with Oracle deadlocks are rare. In any case, understanding the basics can go a long way toward solving application issues.

Regarding SQL Server vs MySQL deadlocks, doesn’t MySQL use some form of multi-versioning?

wow + suck

I think there’s a good reason that Microsoft put SQL Server where it is on the Mistic Scale (opti-pessi) - it’s better to design to support the high-risk environment at the expense of the low-risk one than vice versa. Since MS can’t know how smart their clients’ DBAs are (and I’ve met some donkeys responsible for databases running billions of dollars’ worth of trading positions - even been one myself) they make it ultra-pessimistic out of the box.

Better financially, say, that stackoverflow.com gets some tricky deadlocks than a trader makes the wrong decision because he doesn’t have full-on query consistency. Especially if the only problem is a configuration option.

Of course, MS should be providing better out-of-the-box configuration options - reasonable safe setups for several common types of application, with explanations of the pros and cons of each, would be a start.

But simplicity of installation and configuration is something Microsoft often gets wrong. Rick Brewster's exposition of Paint.NET's worst-case installation scenario is a case in point.

Why doesn’t each philosopher take the chopstick to his immediate left, break it in half, and use two small chopsticks?

We use NOLOCK a fair bit here for reporting, given that our production systems are doing quite a few writes per second. That said, I've found that good indexing, and using views to normalise data rather than using normalised tables, is a good way to go.

You can also put the NOLOCK hint inside the view. And for a few situations like the ones you describe, it might be better to use a view to cache a user's last update, x answers, etc. in its own table, rather than keeping them in a denormalised structure - that way you aren't doing the reads directly on the base tables.
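For instance, a minimal sketch of baking the hint into a view (the view name is made up):

CREATE VIEW [PostsDirty]
AS
SELECT [Id], [ParentId], [LastActivityDate]
FROM [Posts] WITH (NOLOCK)

Reporting queries can then select from [PostsDirty] and get dirty reads without every caller having to remember the hint.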

Also, for any detailed stored procs, I'd use the WITH keyword and CTEs rather than temp, in-memory, or @table tables.

I've found that they speed my queries up no end.
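A quick sketch of the CTE shape (the names and the seven-day window are arbitrary):

WITH [RecentAnswers] AS (
    SELECT [Id], [ParentId], [LastActivityDate]
    FROM [Posts]
    WHERE [LastActivityDate] > DATEADD(DAY, -7, GETDATE())
)
SELECT [ParentId], COUNT(*) AS [AnswerCount]
FROM [RecentAnswers]
GROUP BY [ParentId]

Unlike a #temp table, the CTE is inlined into the query plan, so there's no separate population step to pay for.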

Also, remember that when using TransactionScope with LINQ, you have to wrap the first call for the data inside your transaction's using block; otherwise that call fires outside the transaction scope:
http://www.madprops.org/blog/linq-to-sql-and-nolock-hints/

Looking forward to getting my sign-up approved for the site! :)