This episode takes a close look at transactions from different angles, starting with their fundamental properties of Atomicity, Consistency, Isolation, Durability but also investigating advanced topics like distributed or business transactions.
If you've got a SELECT that operates on tables A and B and an UPDATE that operates on the same ones, then the SELECT will grab a lock on table A, the UPDATE will grab table B, and then both will sit there while they try to get a lock on the other table. Or if you've got a block of statements in a single transaction, you'll get the same effect.
I get that, but I don't understand why simple read SELECTs need to lock at all. I could see it if that database were storing my bank account balance, in which case transactional ordering is hugely important.
I guess what I'm saying is that Starbucks Doesn't Use Two-Phase Commit.
It's OK to store the number of replies for every post.
But instead of storing it in the Post table, store the number of replies in a PostCommentCount table.
(PostId uniqueidentifier, CommentCount int)
One record in the Post table would match zero or one records in the PostCommentCount table.
Another problem with locks in SQL Server is that it typically locks not just one record but a whole page, or even the whole table.
P.S.: I'm really missing the ability to edit my comments (or at least delete them so I can replace them with corrected ones).
I agree with Hoffmann; this sounds like a hack. I've deployed dozens of enterprise-level .NET/SQL and Classic ASP/SQL applications and have not come across this issue.
It sounds like a transaction is being created on the call to the database, either by LINQ or in code, and it's not being released. I would pose the question to Scott Guthrie; I bet he can point you to one of the LINQ developers who can shed more light on this subject.
I think you misunderstand what a deadlock really is. A deadlock is not a lock that is taking a long time. It is a lock that will never be released until one of the two processes is killed. Here's what happens in a deadlock:
Query 1 begins by locking resource A
Query 2 begins by locking resource B
Query 1, in order to continue, needs a lock on resource B, but Query 2 is locking that resource, so Query 1 starts waiting for it to release
In the meantime, Query 2 tries to finish, but it needs a lock on resource A to do so, and it can't get that because Query 1 holds the lock on it.
So, you have two queries fighting over SEPARATE objects, not fighting over the same object (in which case one query simply has to wait). That's the important part - these queries will never finish. This is quite different from something that is simply taking a long time. There is no solution to the deadlock race condition other than to kill one of the processes.
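The cycle described above can be sketched with plain threads and locks (a minimal illustration in Python, not SQL Server internals; the names are hypothetical). It also shows the classic prevention technique: if every "query" acquires the two resources in the same global order, neither can hold one resource while waiting for the other, so the wait-for cycle cannot form.

```python
import threading

# Two shared resources, standing in for rows or tables A and B.
resource_a = threading.Lock()
resource_b = threading.Lock()
results = []

def query(name, first, second):
    # Take the locks in the order given. Because BOTH queries below are
    # handed (resource_a, resource_b) in the same order, the deadlock
    # cycle described in the comment cannot form.
    with first:
        with second:
            results.append(name)

# If query 2 were given (resource_b, resource_a) instead, the two
# threads could each grab one lock and wait forever on the other -
# exactly the scenario described above.
t1 = threading.Thread(target=query, args=("query 1", resource_a, resource_b))
t2 = threading.Thread(target=query, args=("query 2", resource_a, resource_b))
t1.start()
t2.start()
t1.join()
t2.join()
```

Consistent lock ordering is a prevention strategy for code you control; SQL Server itself instead detects the cycle after the fact and kills one participant.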
For this reason, I would suggest focusing only on queries that involve multiple rows or multiple tables. Single-row, single-table queries, such as the one in your example, can never be deadlocked. They can be waiting on a locked resource that is involved in a deadlock between two further queries, but in that case, killing the single-lock query still won't fix the problem.
I do not advise using the NOLOCK hint, but you are correct that it can't really hurt anything on website reads. Someone will see out-of-date information, but in most cases that should be OK.
SQL Server does not re-issue your command because the command itself could depend on the data changes being made by the query that wasn't killed. This gives the client application a chance to run the transaction again, including any reads of data that was changed and is now in a consistent state.
SQL Server 2005 sucks, but you only find out after you've got several million records and it's too late to migrate off. I've lost all faith in SQL Server as an enterprise DB since performance went into the toilet under heavy reads, despite the fact that the behemoth it's running on shows almost no disk or CPU activity.
The description of the pessimistic behavior and the description of "read committed snapshot" make the latter sound like multi-version concurrency control (MVCC). Near the beginning of the post, when you said that a SELECT was part of the problem, I was thinking: why? If it were using MVCC, a SELECT wouldn't even lock.
Then you got to the part about snapshots. It seems SQL Server (which I do not use*) can do MVCC, but not by default.
Been there, done that. Anyone using SQL Server in a large system will run into this. As a rule, when we select from a table we use WITH (NOLOCK). Here's another mostly unknown gem you should know about:
[QUERY] OPTION (MAXDOP 4) -- use a max of 4 processors
The hint above tells SQL Server the maximum number of processors it can use for a query. We ran into a query that would run MUCH MUCH faster on 1 processor than on multiple. Our 8-core superserver was trying to be too smart with the query, I guess.
I agree with Jasmine. To add to it: because you don't show the comments in conversation style but rather in order of "votes", the appearance of out-of-date data will be minimal as well.
It should always be okay to be viewing out-of-date info on the web, shouldn't it? That's the nature of a web page - it's as stale as the last refresh. This web page may have had new comments added since I started viewing it two minutes ago… what's the point in using fancy locking in that case?
Wasn't the diner a drive system for some ship in one of the Hitchhiker's Guide to the Galaxy books?
Yup. It was from Douglas Adams's Life, the Universe and Everything:
Bistromathics:
The most powerful computational force known to parascience. Bistromathics is a way of understanding the behavior of numbers. Just as Einstein observed that space was not an absolute, but depended on the observer's movement in time, so it was realized that numbers are not absolute, but depend on the observer's movement in restaurants.
The first nonabsolute number is the number of people for whom the table is reserved. This will vary during the course of the first three telephone calls to the restaurant, and then bear no apparent relation to the number of people who actually turn up, or to the number of people who subsequently join them after the show/match/party/gig, or to the number of people who leave when they see who else has turned up.
The second nonabsolute number is the given time of arrival, which is now known to be one of those most bizarre of mathematical concepts, a recipriversexclusion, a number whose existence can only be defined as being anything other than itself. In other words, the given time of arrival is the one moment of time at which it is impossible that any member of the party will arrive. Recipriversexclusions now play a vital part in many branches of math, including statistics and accountancy, and also form the basic equations used to engineer the Somebody Else's Problem field.
The third and most mysterious piece of nonabsoluteness of all lies in the relationship between the number of items on the check, the cost of each item, the number of people at the table and what they are each prepared to pay for. (The number of people who have actually brought any money is only a subphenomenon in this field.)
Numbers written on restaurant checks within the confines of restaurants do not follow the same mathematical laws as numbers written on any other pieces of paper in any other parts of the Universe…
I see JC already suggested this, but I have worked on two financial apps using SQL Server and both have had issues with deadlocks (and yes, we tried all the usual deadlock analysis). We arrived at two approaches:
NOLOCK hint. As Jeff mentions, this is rarely as bad as people make it out to be. Gee, so the data is 500 ms out of date. Tough. There are only a few cases where this actually matters.
Put in your own retry logic (detect the deadlock error and resubmit the query). This may feel like a hack, but guess what, it works. And would you rather have this hack, or have your users see stupid database errors? If you have centralized database access methods, it is fairly simple to add retry logic.
You can't fix SQL Server, so you just have to deal with the real world here and do whatever it takes to make sure your users don't experience database errors.
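The retry approach above can be sketched as follows - a minimal illustration in Python with a hypothetical stand-in connection class, since the real thing would catch the driver's exception for SQL Server's deadlock-victim error (error number 1205) instead of `DeadlockError`:

```python
class DeadlockError(Exception):
    """Stand-in for a driver exception carrying SQL Server error 1205."""

class FlakyDatabase:
    # Hypothetical stand-in for a real connection: reports being chosen
    # as a deadlock victim twice, then succeeds, so the retry loop has
    # something to exercise.
    def __init__(self, failures=2):
        self.failures = failures

    def execute(self, query):
        if self.failures > 0:
            self.failures -= 1
            raise DeadlockError("chosen as deadlock victim (error 1205)")
        return "rows"

def execute_with_retry(db, query, attempts=3):
    # Centralized access method: catch ONLY the deadlock error and
    # resubmit the query; any other failure propagates to the caller.
    for attempt in range(attempts):
        try:
            return db.execute(query)
        except DeadlockError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error after all

result = execute_with_retry(FlakyDatabase(), "SELECT * FROM Posts")
```

The key design point is catching only the deadlock error: retrying blindly on every exception would mask genuine bugs, while a deadlock victim is, by definition, safe to resubmit because its transaction was rolled back.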
My guess, without having access to your logs and such, is that your deadlocked transaction (the one with the SELECT) is probably part of a longer outer transaction that acquired a shared lock on the rows being UPDATEd by the other one.
Setting your database transactions to snapshot isolation mode can still result in problems, since it will raise an error if you try to update rows that have changed since the beginning of your transaction (version mismatch), and it also takes exclusive (X) locks on writes, so you can still get deadlocks.
The correct way to fix this problem, beyond using snapshot isolation mode (which will help greatly, because it removes the use of shared locks), is to a) minimize the length of your transactions, to minimize the number of locks held, and b) always set up your web server to re-run requests in the case of an update conflict or deadlock.
We've built a large service (Earth Class Mail) pretty much exclusively on LINQ to SQL. It's worked great, except that we're still battling all the transaction issues almost a year later (deadlocks, leaky transactions, etc.). Make sure you FULLY understand the intersection between TransactionScope (if you're using it) and SqlTransaction. We didn't, and we're still paying for it (with daily deadlocks just like what you describe). Make especially sure you're aware of how Read Committed Snapshot (and any other transaction isolation level) can leak into your connection pool; otherwise, you'll end up running stuff at the wrong isolation level, often causing more deadlocks and "how the he*l did that happen? we're not even in a transaction!" moments. We ended up building a bunch of infrastructure into our own DataContext-extended base class to deal with it, and to ensure that the connection pool stays free of connections stuck in transactions.
OK, but why doesn't SQL Server re-issue the transaction on my behalf, then? Automatic retries of some kind?
Think of a banking application… you wouldn't want anything re-issued automatically if you know something slipped… it IS better that the SQL Server team be pessimistic by default.