Happy T-SQL Tuesday! It's the first one of 2023, so in the spirit of starting the new year on the right foot, it feels wrong not to join in.
This month, our host is Raul Gonzalez (b | t). His challenge for us: are there cases where the commonly agreed-upon "worst practices" actually turn out to be useful?
I feel like this is where I should say something like, "Hi, my name is Deborah and I've used nolock in production." I would also have to confess to doing things like using correlated subqueries, not using a foreign key, implementing complicated triggers, etc. I often talk about how the first real SQL script I wrote had cursors running over temp tables in SQL Server 6.5, which I'm fairly certain was one of the first things I read you were NOT supposed to do. And oh, hello there, denormalized table and dynamic SQL! I'm sure I've done more things than this too. These are just the ones I can remember doing, or at least the ones I'm willing to admit to in public.
I'm not sure I can remember all the specifics, but what I do remember is that there was a reason for each one. More importantly, these were also exceptions to the rule.
Why did we use nolocks? Because after more than two weeks of deep-dive investigation into the deadlock situations, and because we were on an older version of SQL Server (or at least had the requirement to support a legacy version) where snapshot isolation wasn't an option, a combination of nolock, index hints, and the application developer adding logic to immediately retry the query if a deadlock happened mitigated the problem. We tested the solution thoroughly before it went to production.
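To make that concrete, here's a rough sketch of the kind of pattern I'm describing. The table, column, and index names are made up, and in our case the retry loop actually lived in the application code rather than in T-SQL:

```sql
-- Hypothetical names: dbo.Orders and IX_Orders_CustomerID are for illustration only.
DECLARE @CustomerID INT = 42;
DECLARE @Retry INT = 3;

WHILE @Retry > 0
BEGIN
    BEGIN TRY
        SELECT OrderID, OrderDate, OrderStatus
        FROM dbo.Orders WITH (NOLOCK, INDEX (IX_Orders_CustomerID))
        WHERE CustomerID = @CustomerID;

        BREAK;  -- success, no retry needed
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205 AND @Retry > 1  -- 1205 = chosen as deadlock victim
            SET @Retry = @Retry - 1;             -- retry immediately
        ELSE
            THROW;
    END CATCH
END;
```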
Why did I OK a correlated subquery? Because I knew the SELECT statement was only going to return one or two rows, and while it wasn't great, it shouldn't have been much of a problem. I made a note in the pull request that this should be looked at with the next change so we could keep track of the problem. There may have been other times where we used the STUFF + FOR XML string aggregation trick to take a bunch of related rows and return them as a single column value as part of a larger dataset.
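If you haven't seen that last pattern, here's a rough sketch with made-up table and column names; on newer versions, STRING_AGG does the same job more cleanly:

```sql
-- Hypothetical names: dbo.Customers, dbo.Orders, and their columns are for illustration only.
-- The correlated subquery rolls each customer's order numbers into a single column.
SELECT c.CustomerID,
       c.CustomerName,
       STUFF((SELECT ', ' + o.OrderNumber
              FROM dbo.Orders AS o
              WHERE o.CustomerID = c.CustomerID
              ORDER BY o.OrderNumber
              FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),
             1, 2, '') AS OrderNumbers
FROM dbo.Customers AS c;
```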
The cursor over the temp table? I blame a really, really bad database design that even I, a very new and junior DBA at the time, could tell was just a really, really bad design. And redesigning the database wasn’t an option. But I always come back to this code because it’s really how I learned to write SQL.
Sometimes the worse practices are there because people haven't run into the issues yet. Or you question whether something is a worse practice because you personally don't like it, or because it's actually a problem, or because it's only a problem under certain circumstances. I like CTEs, but I'm cautious with them because I've been burned by bad performance. Does that fall under "worse" practice or does that fall under "do extra testing first"? What about the MERGE syntax? I know people who use it and haven't run into an issue, but I've read about all the bugs with it, so it really feels like something we don't want to use. So is that a "worse" practice, or is it a "do extra testing first" or a "better safe than sorry so let's just avoid it" situation? More importantly, is it a worse practice to use CTEs or MERGE because of the problems associated with them in certain use cases, or do we just make sure we use them when appropriate, with the proper testing of course?
Here's the thing – there is a reason these things we consider bad practices are available in the first place. There are situations where they can be useful and help solve problems. I think what makes them bad practices is implementing them as a rule instead of as an exception: "This solved/prevented this one problem, so therefore it must solve/prevent all the problems." Another issue is when those practices are embedded in either legacy code or legacy work culture. It's harder to educate people on why you have to change to the best practices, especially if you run into that case where you really do need that "worst" practice. But when the worse practice is embedded in the code and culture to the point you can't change it later, you start to learn the true cost of Tech Debt the hard (and very expensive) way.
The other catch to all of this is that while there was a reason these "worse" practices were implemented at the time, we have to remember to keep track of them, so that when newer features and functionality come along that could be better solutions, we can use those to get rid of the things we shouldn't be doing. For example, introducing snapshot isolation where nolock was used, remembering that any leftover nolock hints and their dirty reads will still take precedence over the optimistic row versioning.
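As a sketch of what that cleanup looks like, assuming a made-up database named Sales, the switch is something like this; the old hints have to come out of the queries too, because an explicit NOLOCK still does a dirty read even under read committed snapshot:

```sql
-- Hypothetical database name: Sales.
-- Enable optimistic row versioning so readers stop blocking writers (and stop needing nolock).
ALTER DATABASE Sales SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE Sales SET READ_COMMITTED_SNAPSHOT ON;  -- needs no other active connections, or WITH ROLLBACK IMMEDIATE

-- Then remove the old hints, because a table-level NOLOCK hint overrides the new behavior:
SELECT OrderID, OrderDate, OrderStatus
FROM dbo.Orders            -- was: dbo.Orders WITH (NOLOCK)
WHERE CustomerID = 42;
```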
I guess it comes down to this: you have to test your code and be willing to defend your decision to use a "worse" practice to someone else. It is very much a calculated risk, so you have to make sure you understand what you're doing and why, and keep track of when you make those calls. More importantly, you have to make sure that you can pivot to fix the problem should you accidentally make the wrong choice. Good luck!
Thanks again to Raul for a great topic. Looking forward to reading all the “bad” things you have been doing in your databases.