The last place I worked had a strict nolock-on-every-query policy.
Most databases treat every application like a banking system, and that seems to hurt performance greatly. The percentage of applications that actually require that level of transactional rigor is extremely small. If you remove the locking, and assume that reading slightly old data is not detrimental, then you can speed things up quite a bit.
This is all well and good, but I would refer you to the following post (you may not be familiar with the blogger, but he makes a good point):
If you have a simple bit of code with relatively low usage, and you’re getting deadlocks, then I’d be very concerned that something very bad is going on under the surface. Closing your eyes, applying a band-aid and hoping that the underlying issue goes away is the road to lots of future pain.
Why not just cache some of the reads in memcached? That should minimize stress on the DB and thus make deadlocks less likely.
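For what it’s worth, the suggestion above is a read-through cache. Here is a minimal Python sketch, with a plain dict standing in for memcached (a real deployment would use a client such as pymemcache) and a made-up key and loader:

```python
import time

# Minimal read-through cache sketch. A plain dict stands in for memcached;
# with a real server you would use a memcached client library instead.
_cache = {}  # key -> (value, expires_at)

def cached_read(key, load_from_db, ttl=30):
    """Return a cached value, falling back to the database loader on a miss."""
    entry = _cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                       # cache hit: no DB query, no lock
    value = load_from_db(key)                 # cache miss: one DB read
    _cache[key] = (value, time.time() + ttl)
    return value

# Hypothetical loader standing in for the real SELECT.
answer_counts = {"post:42": 7}
count = cached_read("post:42", lambda k: answer_counts[k])
```

Within the TTL, repeated reads never touch the database at all, which is exactly what makes the deadlock less likely.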
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0
WHERE [ParentId] = @p0
And, the problem is that you are seeing a deadlock…
First, we must consider WHY this is the case… Note that a SINGLE lock can’t deadlock, unless the resource is never returned. Neither of these SQL statements qualifies as an infinite loop. So, we must presume at least two resources. Note that, equally, we cannot assume NO locks, as that cannot deadlock either.
A possible answer to this is that there are separate READ and WRITE locks (there must be at least two locks). Any number of READ locks can be held by readers of a table. A WRITE lock is then exclusive (it can be obtained only if there are no readers). This would make sense; it is a normal design pattern.
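The READ/WRITE scheme described here is the classic readers-writer lock. A toy Python model (not SQL Server’s actual lock manager) might look like:

```python
import threading

class ReadWriteLock:
    """Toy model of the READ/WRITE scheme above: any number of readers
    may hold the lock at once, while a writer needs exclusive access."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:                  # wait until no writer holds it
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers: # exclusive: no readers, no writer
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

The deadlock in the post corresponds to two transactions both holding the read side and both waiting to upgrade to the write side – neither can proceed.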
This gives us the necessary two resources. Now, the READ resource is acquired by the SELECT, and the WRITE resource by the UPDATE. Acquiring WRITE of course implies ownership of READ as well. Note that the UPDATE will of course also read – but it needs no separate READ lock, because the WRITE resource implies the READ resource. Indeed, it would deadlock against itself if this were not the case.
If the two resources (READ and WRITE) were not acquired atomically (relative to each other), then there may be a subtle race condition within the SQL server itself. Of course, this implies that the resources are implemented as two separate locks.
Another issue could be that the SQL optimizer is mis-compiling the request and issuing a sequence like acquire(WRITE) … do something … acquire(READ), which would also lead to the deadlock.
If the optimizer ASSUMED that no updates were going on, it might begin with acquisition of READ and start the transaction. Later, it might determine a result set and attempt to acquire WRITE – which of course would fail (this would be a nasty bug indeed). Of course, it may assume that IT owns READ, and thus that attempting to acquire WRITE is OK (and this naive optimization would work, AS LONG AS THERE ARE NO CONCURRENT READERS).
Actually, there may be other causes of this. But (and this is important), if the problem is the result of a simple select/update pair, it is probably a defect within the database itself.
Indeed, there is NO direct solution to this deadlock, under the circumstances you have outlined. There is, however, a work-around.
Ensure that the database is NEVER UPDATEd while concurrent reads could be going on. Inserts are fine, deletes are fine, but ANY UPDATE would be suspect with this database.
Alternatively (and my STRONG recommendation is to immediately get rid of the database), use explicit locking.
Seems to me that anyone doing any kind of database work for more than a few minutes has come across this deadlock situation. Locks on a database table are never taken because of a write; they are always taken because of a read – reads are what lock database table rows. Again, this is a fundamental concept of database design that I would expect a seasoned developer to know.
MS SQL Server’s default isolation level (which controls the locking scheme) is READ COMMITTED, which means every read takes shared locks on the rows it touches for the duration of the statement. For reference, MySQL (InnoDB)’s default isolation level is REPEATABLE-READ, one down from SERIALIZABLE, but it would also allow the deadlock situation described in this article.
Since nolock hints in MS SQL Server can be ignored by the query optimizer, it is advisable to change the isolation level instead, which can usually be done in client code (in .NET, for example, the isolation level can be set on the TransactionScope object), in a stored procedure, or at the connection level. In this scenario, the READ UNCOMMITTED isolation level is perfectly acceptable since, as stated, no one really cares if the post count of a particular user is off by one or so (and the data is not used in a subsequent transaction, which is KEY), and it will save you some memory on the database server (in contrast to READ COMMITTED SNAPSHOT, which versions rows in tempdb). In a banking scenario, however, I’m sure the bank (but not the customers) would be somewhat perturbed if a customer made two quick money transfers between two of his accounts and suddenly had double the amount of money he started out with.
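As a runnable illustration of setting the isolation level per connection rather than hinting every query: the comment is about SQL Server and TransactionScope, so SQLite’s shared-cache `read_uncommitted` pragma is used here purely as a stand-in for the same idea.

```python
import sqlite3

# SQLite's shared-cache mode exposes a per-connection dirty-read switch,
# analogous in spirit to choosing READ UNCOMMITTED for a SQL Server
# connection instead of sprinkling NOLOCK hints on every query.
conn = sqlite3.connect("file:demo?mode=memory&cache=shared", uri=True)
conn.execute("PRAGMA read_uncommitted = 1")   # allow dirty reads on this handle
level = conn.execute("PRAGMA read_uncommitted").fetchone()[0]
```

The point of the design is the same as in the comment: the isolation choice is made once, in one place, rather than repeated (and possibly ignored) query by query.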
Lastly, this is why database transactions should be deliberately designed rather than be the bastard step-child of the code’s business layer use-cases.
In my previous comment, I must amend my first paragraph so that it is read in the context of the deadlock situation given in the post. The update of course attempts to lock the row, which is what eventually manifests the deadlock situation; however, it is not the root cause of the problem. The root cause is the locks acquired during the initial read and a subsequent read of the identical rows, with the intervening update waiting on the initial read’s lock to release. Since the data read in those reads is not used in a subsequent database transaction to update any data, it is perfectly acceptable to permit dirty reads in that situation.
I understand your concerns – you may be able to tolerate slightly older data, because most likely you only have ONE instance of your webapp that sees the old data.
Problems will occur if you have multiple instances of your webapp, where the instances can see different data (or different versions of the data) from the database. Then you run into data consistency issues.
I can tell that web applications use the DB as shared memory – they rely on the fact that once an update is done, every other instance sees it immediately.
You have to pick a locking strategy for your applications.
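One strategy worth naming here is optimistic concurrency: read without holding locks, then make the UPDATE conditional on the row being unchanged since the read. A sketch, using SQLite and a hypothetical `posts` table with a made-up `version` column:

```python
import sqlite3

# Optimistic concurrency sketch: nothing is locked during the read; the
# UPDATE only succeeds if the row's version is unchanged since we read it.
# Table and column names are illustrative, not from the original post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, answer_count INTEGER, version INTEGER)")
conn.execute("INSERT INTO posts VALUES (1, 7, 0)")

def bump_answer_count(conn, post_id):
    count, version = conn.execute(
        "SELECT answer_count, version FROM posts WHERE id = ?", (post_id,)).fetchone()
    cur = conn.execute(
        "UPDATE posts SET answer_count = ?, version = version + 1 "
        "WHERE id = ? AND version = ?", (count + 1, post_id, version))
    return cur.rowcount == 1   # False means another writer won the race: retry

ok = bump_answer_count(conn, 1)
```

The losing writer simply re-reads and retries; no reader ever blocks a writer, so the lock cycle described in the post cannot form.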
Of course, as much as everyone likes to talk about banking transactions, the reality is that banks heavily optimize their systems to avoid these problems most of the time anyway. The majority of deposits and withdrawals in banking systems do not happen in real time. Most of the transactions begin at midnight, with deposits running first for a given account, followed by withdrawals. You have problematic categories like deposits that come from another account within the bank, but those are probably still handled as separate transactions, with the appropriate penalties if someone doesn’t have the funds to cover their withdrawals after their deposits are processed.
In other words, in real time your bank statements don’t do a whole lot anyway, and most people aren’t checking their account balance at midnight.
Of course, investment banking and similar ventures are different, but even they have long periods of inactivity in a given 24-hour period that permit them to do large batch processes that have fewer chances of locking rows and tables in a database.
Is this post for all those who dabble in code without ever having set foot in a CompSci class?
Is this the real audience for stackoverflow?
If you are getting deadlocks in your app, the most likely cause is poor application design. Try queuing requests, gating with mutexes, … it has been done before.
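The queuing suggestion can be sketched as a single writer thread draining a queue, so that contended updates are serialized and can never deadlock with each other. In this sketch `apply_update` is a stand-in for executing the real SQL:

```python
import queue
import threading

# All writes to a hot row are funneled through one worker thread; since
# only one update runs at a time, updates cannot deadlock with each other.
updates = queue.Queue()
applied = []

def apply_update(u):
    applied.append(u)              # stand-in for executing the UPDATE

def writer_loop():
    while True:
        u = updates.get()
        if u is None:              # sentinel: shut the worker down
            break
        apply_update(u)
        updates.task_done()

worker = threading.Thread(target=writer_loop)
worker.start()
for n in range(3):
    updates.put(("post", 42, n))   # hypothetical (table, id, payload) tuples
updates.put(None)
worker.join()
```

The trade-off is latency: callers enqueue and move on rather than waiting for the row to be committed.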
Liar! Posting has declined because you are too busy twittering! You my friend have become a twitter whore.
Well, MySQL was certainly much more optimistic, to the point of not handling transactions at all. That changed a while back (a very recent while, IMHO). Whether they optimize for lots of reads and few writes is another story.
And speaking of stories, normalized databases are optimized for heavy reads and low writes. Is it that practical experience teaches people not to normalize their databases? Mmmmm.
As for deadlocks, the write lock can’t be taken until all shared locks are released… or made to release. Emphasis on all, not on the time each one takes.
If you’ve got a SELECT that operates on tables A and B and an UPDATE that operates on the same ones, then the SELECT will grab a lock on table A, the UPDATE will grab a lock on table B, and then both will sit there while they try to get a lock on the other table.
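The textbook fix for this A/B pattern is to make every transaction acquire its locks in one global order. A toy model, with two `threading.Lock` objects standing in for the two tables:

```python
import threading

# If every transaction locks table A before table B (never B before A),
# the cycle "each holds one lock and waits on the other" cannot form.
table_a, table_b = threading.Lock(), threading.Lock()
log = []

def transaction(name):
    # Both the "SELECT" and the "UPDATE" follow the same lock order.
    with table_a:
        with table_b:
            log.append(name)

t1 = threading.Thread(target=transaction, args=("select",))
t2 = threading.Thread(target=transaction, args=("update",))
t1.start(); t2.start()
t1.join(); t2.join()
```

One transaction simply waits for the other to finish; slower under contention, but never deadlocked.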
The database should be smart enough to handle that deadlock. It should just execute the first incoming query first and then the other. It may be complicated to implement (or they would have done it already), but it should be possible.
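In practice SQL Server does handle it, just not by reordering: it detects the lock cycle and kills one transaction as the deadlock victim (error 1205), and the client is expected to retry. A minimal retry-loop sketch, where the exception class is a stand-in for the real driver error:

```python
import time

class DeadlockVictim(Exception):
    """Stand-in for the driver exception raised for SQL Server error 1205."""

def run_with_retry(work, attempts=3, backoff=0.05):
    """Re-run a transaction that was chosen as a deadlock victim."""
    for attempt in range(attempts):
        try:
            return work()
        except DeadlockVictim:
            if attempt == attempts - 1:
                raise                              # give up after the last try
            time.sleep(backoff * (2 ** attempt))   # brief exponential backoff

# Hypothetical workload that is killed once, then succeeds on retry.
calls = {"n": 0}
def flaky_update():
    calls["n"] += 1
    if calls["n"] == 1:
        raise DeadlockVictim()
    return "committed"

result = run_with_retry(flaky_update)
```

Because the victim’s transaction is rolled back in full, re-running it from the top is safe.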
Why do read locks by default? So people who don’t understand the issues with dirty reads don’t sit around for hours trying to figure out why they get inconsistent results.
In the examples given I wonder why there would be a common set of rows returned?
Where is the value of the [AnswerCount] coming from?
- I work with MySQL/PostgreSQL/Firebird.
is the only mention of PostgreSQL in all these entries. I would like to know why no one is using it. I find it much easier than SQL Server, it’s free, and it has an awesome optimizing tool.
There is more than just a dirty read issue when using NOLOCK.
Rare, but you can actually get 2 records instead of one if your NOLOCK’d query executes during a page split operation.
DB2 also has an unconditional read (WITH UR) that you can add to the end of your SQL.
As we know from reading this blog for many years:
- SQL Server is not broken
- You, the developer, have caused this problem
I love the comments on this one - assumptions, conjecture, talk about unrelated things. A deadlock is an extremely rare and specific condition. It is not caused by inadequate hardware, or even ‘over-adequate’ hardware. I seriously doubt it could be caused by LINQ either.
There is a very specific situation that causes this, and I think the wrong queries have been focused on here. I can’t imagine the SELECT statement causing any kind of deadlock on its own, because it locks one row at best and one table at worst. There have to be two locks on separate objects held across both transactions (a total of four locks - two granted and two waiting) for a deadlock to occur. SQL Server does not falsely report deadlocks. If it says there is a deadlock, there is. I love the way developers always take these discussions into unrelated territory - there’s a blog post in that for you, Jeff.