It’s another T-SQL Tuesday. Thanks to Wayne Sheffield (b|t) for hosting this month. His T-SQL Tuesday challenge is to write about Brick Walls we have faced. (If you are unfamiliar with the T-SQL Tuesday party, check out the website for the full backstory.)
It took me a bit to figure out what to write about for this topic. But that’s not the type of brick wall I want to post about today.
The Brick Wall:
I was working with a highly transactional system. The table with the most writes was also the table with the most reads. To add to the problem, the data being updated the most was the same data we need to read the most. The next set of popular tables, i.e. the next most read and updated tables, had triggers that used the most popular table for some of the data needed. As you can guess, we had a lot of deadlocks in our system. Oh, did I mention that this was running SQL Server 2000?
So I was given the task of doing the deep dive on deadlocks to solve the problem. I spent about 2 weeks studying this. I could read the deadlock trace flag output in the error log and identify the page involved in the deadlock. I learned how SQL Server decided a query would be the victim. I became more familiar with indexing and how indexes were used – when it would scan vs. seek, etc. As you can imagine, I got to know deadlocks very well.
The problem was at the end of the two weeks, I still had the same problem. Sure I had the greater understanding of locking, blocking, and deadlocks and why they were happening but knowing all of that didn’t give me the way to get rid of them altogether.
The brick wall was the architecture and requirements of the application along with the database design. To truly get rid of all of the deadlocks, we would have to do some major restructuring and redesign, which was not going to be possible. It would take too long and this was a major issue the clients were reporting.
Finding the Hole in the Wall:
So my challenge became: how can I take the deep dive of what I had been studying and apply it to the main problem? I ended up figuring out workarounds. I identified the main queries and indexes involved with the deadlocks. I created new indexes for query hints to be used by specific queries in the triggers. I also found there was one view that was frequently involved with the deadlocks so we purposely set it to run with deadlock priority low, meaning that if it was going to be part of a deadlock chain, it would be a victim. The application would then handle any errors returned so if the view was part of a deadlock chain, it would immediately run the SELECT again without returning the error to the end user.
At the time, I was a little frustrated that I spent so much time studying something when I knew the solution all along was that we would have to rewrite and handle the errors. Looking back now, even though I couldn’t use the details of what I learned, I probably was able to make smarter choices about what indexes to create and which queries need which types of hints or to be handled in certain ways because I had a better understanding of how SQL Server worked.
In the end, the brick wall was still there – we still had frequent deadlocks in our system. But we were able to find a way to work with it rather than get rid of it altogether. It may not have been a pretty solution or even the best solution, but it was one that worked. And that’s what we needed at the time.