Posted in T-SQL, T-SQL Tuesday

T-SQL Tuesday #158 – Using the Worse Practices

Happy T-SQL Tuesday! It’s the first one of 2023 so in the spirit of starting the new year on the right foot, it feels wrong to not join in.

This month, our host is Raul Gonzalez (b | t). His challenge for us: are there cases where the commonly agreed upon “worst practices” turn out to be useful?

I feel like this is where I should say something like, “Hi, my name is Deborah and I’ve used nolock in production.” I would also have to confess to things like using correlated subqueries, not using foreign keys, implementing complicated triggers, etc. I often talk about how the first real SQL script I wrote had cursors running over temp tables in SQL Server 6.5, which I’m fairly certain was one of the first things I read you were NOT supposed to do. And oh, hello there, denormalized table and dynamic SQL! I’m sure I’ve done more things than this too. These are just the ones I can remember doing, or at least am willing to admit in public.

I’m not sure if I can remember all the specifics but what I do remember is, there was a reason for it. More importantly, these were also exceptions to the rule.

Why did we use nolocks? Because after more than two weeks of deep-dive investigation into the deadlock situations, and being on an older version of SQL Server (or at least having the requirement to support a legacy version) where snapshot isolation wasn’t an option, using a combination of nolock, index hints, and the application developer adding logic to immediately retry the query if a deadlock happened mitigated the problem. We tested the solution thoroughly before sending it to production.
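The retry piece lived in the application in our case, but the same idea can be sketched in T-SQL. This is a hedged illustration, not our actual code – the table, columns, and retry count are hypothetical:

```sql
-- Hypothetical sketch: retry a read up to 3 times if chosen as a deadlock victim.
-- The nolock hint means dirty reads are possible - that was the accepted trade-off.
DECLARE @CustomerId int = 42, @Retries int = 0;

WHILE @Retries < 3
BEGIN
    BEGIN TRY
        SELECT o.OrderId, o.OrderStatus
        FROM dbo.Orders AS o WITH (NOLOCK)
        WHERE o.CustomerId = @CustomerId;

        BREAK;  -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205  -- deadlock victim
            SET @Retries += 1;
        ELSE
            THROW;  -- anything else is a real error
    END CATCH;
END;
```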

Why did I OK a correlated subquery? Because I knew the SELECT statement was only going to return one or two rows, and while it wasn’t great, it shouldn’t have been much of a problem. I made a note in the pull request that this should be looked at with the next change so we could keep track of the problem. There may have been other times where we used the STUFF + FOR XML string aggregation trick to take a bunch of related rows and return them in a single column value as part of a larger dataset.
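For anyone who hasn’t seen that pattern, here’s a rough sketch of the STUFF + FOR XML trick with made-up table and column names. (On SQL Server 2017 and later, STRING_AGG does the same job much more cleanly.)

```sql
-- Hypothetical example: collapse each order's line items into one
-- comma-separated column alongside the rest of the dataset.
SELECT o.OrderId,
       o.OrderDate,
       STUFF((SELECT ',' + li.ProductName
              FROM dbo.OrderLines AS li
              WHERE li.OrderId = o.OrderId
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
             1, 1, '') AS ProductList
FROM dbo.Orders AS o;
```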

The cursor over the temp table? I blame a really, really bad database design that even I, a very new and junior DBA at the time, could tell was just a really, really bad design. And redesigning the database wasn’t an option. But I always come back to this code because it’s really how I learned to write SQL.

Sometimes the worse practices are there because people haven’t run into the issues yet. Or you question whether something is a worse practice because you personally don’t like it, or because it’s an actual problem, or because it’s only a problem under certain circumstances. I like CTEs but I’m cautious with them because I’ve been burned by bad performance. Does that fall under “worse” practice or does that fall under “do extra testing first”? What about the MERGE syntax? I know people who use it and haven’t run into an issue, but I’ve read about all the bugs with it so it really feels like something we don’t want to use. So is that a “worse” practice to use, or is it a “do extra testing first” or a “better safe than sorry so let’s just avoid it” situation? But more importantly, is it a worse practice to use CTEs or MERGE because of the problems associated with certain use cases, or do we just make sure we use them when appropriate, with the proper testing of course?
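For context, a typical MERGE upsert looks something like the sketch below (the tables are made up). The HOLDLOCK hint is one of the commonly recommended mitigations for its concurrency race conditions, which is exactly the kind of “do extra testing first” baggage I mean:

```sql
-- Hypothetical upsert from a staging table. Many teams prefer a
-- separate UPDATE then INSERT inside one transaction to avoid MERGE's
-- documented edge cases; if you do use MERGE, test it thoroughly.
MERGE dbo.Customers WITH (HOLDLOCK) AS tgt
USING dbo.Customers_Staging AS src
    ON tgt.CustomerId = src.CustomerId
WHEN MATCHED THEN
    UPDATE SET tgt.Email = src.Email
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerId, Email)
    VALUES (src.CustomerId, src.Email);
```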

Here’s the thing – there is a reason these things we consider bad practices are available in the first place. There are situations where they can be useful and help solve problems. I think what makes them bad practices is implementing them as a rule instead of as an exception. “This solved/prevented this one problem so therefore it must solve/prevent all the problems.” Another issue is when those practices are embedded in either legacy code or legacy work culture. It’s harder to educate people on why you have to change to the best practices, especially if you run into that case where you really do need that “worst” practice. But when the worse practice is embedded in the code and culture to the point you can’t change it later, you start to learn the true cost of tech debt the hard (and very expensive) way.

The other catch to all of this is that while there was a reason these “worse” practices were implemented at the time, we have to keep track of them so that when newer features and functionality come along that could be better solutions, we can use those to get rid of the things we shouldn’t be doing. For example, introducing snapshot isolation where nolocks were used – remembering that any remaining nolock hints and their dirty reads will take precedence over the optimistic row versioning.
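As a sketch (the database name is hypothetical), the switch looks like this – with the important caveat that any nolock hints still in the code have to be removed, because a table hint overrides the database-level setting:

```sql
-- Turn on optimistic row versioning for read committed reads.
-- Requires briefly having exclusive access to the database; test first.
ALTER DATABASE MyAppDb SET READ_COMMITTED_SNAPSHOT ON;

-- Optionally also allow full SNAPSHOT isolation per session:
ALTER DATABASE MyAppDb SET ALLOW_SNAPSHOT_ISOLATION ON;
```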

I guess it comes down to this: you have to test your code and be willing to defend your decision to use a “worse” practice to someone else. It is very much a calculated risk, so you have to make sure you understand what you’re doing and why, and keep track of when you make these choices. More importantly, you have to make sure that you can pivot to fix the problem should you accidentally make the wrong choice. Good luck!

Thanks again to Raul for a great topic. Looking forward to reading all the “bad” things you have been doing in your databases.

Posted in T-SQL Tuesday

T-SQL Tuesday #156 – Production Grade Code

Happy T-SQL Tuesday, everyone! I’ve missed the past couple of ones but I’m happy to jump back in. This month, Tom Zika (t | b) asks us to talk about what makes code “production grade”. You can find his full invitation here.

I have to confess that I struggled with this idea. What does “production grade” really mean?

To me, the most obvious part is that code going out into production can pass code review and testing. You don’t want to send out code that has obvious bugs – like incorrect joins, inaccurate calculations, or poor query plans. Code should also go through reviews, not just to make sure the obvious bugs aren’t there but that other basic development standards, like naming conventions and proper commenting, have been followed. Hopefully all code has been tested and gone through QA cycles before release. These feel like basic requirements before code goes out. But this also assumes that your company and development team have the means to support this setup. If yours doesn’t, you can (and should) find ways to start implementing aspects of basic reviews and testing to help catch these things before they get sent to production.

This works well to help make sure the small things that we see by themselves are “production ready.” But then we start looking at the bigger picture and cases where things go out in production but they’re not well designed. It’s always “fun” to get into “discussions” with the developers on these issues, point out the problems and potential solutions, have them agree with you, and then spend the next week putting together the scripts for the bad design because they’ve already done the code and it has to go out on Monday. By the way, the promise of “we’ll go back and fix this later” almost never happens. I say “almost” because it may have happened once and I want to make sure those developers get credit for it. But is this code really production ready? Probably not, but it’s going to be anyway. As long as you can make sure it’s as clean as it can be under suboptimal conditions and can pass all of the other requirements of a code review and testing, it’s the best that you can do.

Along those same lines, we have the proof-of-concept code that works just well enough that it finds its way into production. Lately, I feel like these are also called “Minimum Viable Products”, or MVPs. I cannot stress enough how much I have grown to dislike using the same MVP initials to mean both “minimally viable” and “most valuable” – it does a disservice to both concepts. These can then be turned into the “alpha” or “beta” releases. We tested it enough that we’re OK letting clients do the real testing for us, with the caveat that they know it’s going to be buggy or not quite work the way they wanted. Hopefully, the functionality and code are designed well enough that the needed changes can be made without requiring a major redesign of the database. But again, these should follow the same basic rules: all code should pass a review and testing.

So the question becomes: does “production grade” mean the best possible design along with code that would pass a review and QA? We know that even code that meets the standards and is well designed can still cause issues in production and have to be reworked for a variety of reasons. We also know that we may have no choice but to send out code that probably shouldn’t go into production, for a variety of reasons, but that can still pass the basic standards we’ve set up. The basic requirement for me is that everything can pass a code review (no obvious errors in the code, the development team’s coding styles and best practices are used, etc.) and that it passes QA testing. As for the other issues, the goal may just have to be creating a strategy to note the things that need to be looked at, as well as a plan to get the support needed to address them.

In the end, if the little things are “production grade”, it sometimes has to be enough, whether we like it or not.

Thanks for hosting this month, Tom! Looking forward to seeing what everyone else’s “production grade” standards are.

Posted in PASS Summit, Professional Development, SQL Saturday, SQL Server, T-SQL Tuesday

T-SQL Tuesday #153 – The Conference That Changed Everything

Happy T-SQL Tuesday! Kevin Kline (t | b) is our host this month. You can read his full invite here, but here’s his request for us:

Tell us the story of how attending an IT conference or event resulted in an amazing career or life opportunity.

This seemed like an easy thing for me to write about when I first sat down. It should be simple enough. But it feels a lot harder because it’s not been a straight line or a single event.

When I think about conferences or events, the one that everything seems to revolve around is PASS Summit, with SQL Saturdays running close behind. I have been to some virtual events, but of the in-person conferences, I’ve only made it to those two and they stand out to me. But is there really one that was a game changer?

Looking back at attending conferences, I try to think of an obvious moment that many would expect me to pinpoint. An obvious one could be winning Speaker Idol at PASS Summit 2019. I know – we’re all tired of me mentioning this, including me. But I worked really hard on that presentation and, to be honest, it gave me a needed distraction from some personal stuff going on at the time. Plus it’s a concrete moment that seemed like it should be a game changer at the time. But did it translate to changing the speaking game for me? Let’s see, I got at least one SQL Saturday rejection after winning and before the pandemic hit. Almost 4 months later, everything else in-person was being canceled and my presentations were all moved to virtual. My session for this upcoming Summit is still going to be pre-recorded/virtual as part of a Learning Path (my in-person abstracts were not selected), which means I have to wait longer to present at Summit in person. So it didn’t feel like an avalanche of speaking opportunities came with the title; it felt like “business as usual”. Do not misunderstand me – I’m so grateful for all of the presentations I have been selected to do and I really do love presenting. But in this context, I’m not sure I can claim winning Speaker Idol as a career or life changing event.

But on the other hand, I became engaged in what I was doing by attending Summit, SQL Saturdays, and the other virtual events that have popped up over the years. I learned how to be better at my job. I learned how to share what I was learning with my coworkers so we could see how we could do better. By blogging and speaking at SQL Saturdays and other events, I started getting involved in the community and have expanded my involvement when I’ve gotten the chance. I’ve been able to be a mentor and moderator at virtual events like New Stars of Data, which I know have been game changers for so many others. All of these activities together helped in the interview process for new jobs because my future employer knew who I was and what I had to offer before I entered the door – physical or virtual. It was all the little things that added up to be the game changer in my career and not a single event.

On a personal note, the connections that I have made with different people have meant the most. Whether it was a SQL Saturday or PASS Summit, I was able to connect with other members of the #sqlfamily community. I would meet someone at one event and get to know them better at the next one. And yes, I even married someone I met through this community. There are still some people that I’ve only really gotten to know virtually over the past couple of years. In some ways, it’s been harder to create these connections virtually, yet many have still happened in spite of and because of it. If I have questions, I know I can reach out, or get feedback on something I’m working on. But more importantly, so many of these people I can now call “friends” and I hope they know they can say the same about me. And that means more to me than anything else. Which, by itself, is a true game changer.

But can I really claim any of these connections as stemming from a single event? Connections take time to develop so I don’t feel as if I can.

I guess if I did have to focus on one, it would be PASS Summit 2016. This was the conference where I started making the connections to people in the community and understood how to make the most of these events in terms of what I needed to bring back to my day job. It’s where I found my passion for this community, which led to everything else that I’m doing. I’ve written about this before and I still feel the same way. One of the things I’m looking forward to by attending Summit this year in person is that I can get back to what made this such an important conference to me to begin with – focusing on learning the things I need to learn and re-connecting with this community where I feel at home.

Thanks to Kevin for hosting this month! Looking forward to reading everyone’s contributions!

Posted in SQL Server, T-SQL Tuesday

T-SQL Tuesday #152 – Round up

It’s time to do the round up of this month’s T-SQL Tuesday entries. It was great to see so many people responding, including at least one new participant. I think there are a lot of kindred spirits here, as in we’ve all felt each other’s pain. (I know that I personally can relate to way too many of these things.) And I truly enjoyed reading everyone’s post.

So let’s go through the run down:

Continue reading “T-SQL Tuesday #152 – Round up”
Posted in Database Development, Source Control, T-SQL Tuesday

T-SQL Tuesday #152 – My rant

Welcome to T-SQL Tuesday! I’m your host this month…

You can find the full invitation here. But to summarize, I’m asking everyone to share that rant they have with their coworkers, the one that makes them throw “It Depends…” out the window.

When I said this was inspired by conversations with coworkers, I wasn’t lying. I realized that I would finish each “rant” with a “I really should write a blog about this”. (And yes, I may have a new session I want to put together that includes this.) I don’t know if this is a rant but as a philosophy I’ve developed. In either case, here’s that blog post:

My coworkers and I were talking about database deployments. The vast majority of my career has been centered around making sure that whatever we develop can be deployed to clients cleanly and repeatably (if that’s a word). I’ve only worked for software companies, so in most of these places, our clients maintained their databases on their own systems or we managed their instances for them. We have multiple versions of the software out in production at any time. One product I worked on was supported on 3 different RDBMS systems. So we had to have a reliable way to make sure that the database changes deployed successfully every single time. I had to make sure that whether we were installing the database at a new client or upgrading an existing client, the database at the end was the same and had everything that it should have.

That last sentence is key: At any point in time, you should be able to create a brand new version of your database with the basic data set needed to get started. And you should be able to deploy changes to any existing database to get to that same version of your database. How do you do this? You start with a “source of truth”. <begin rant>

The “source of truth” is my newly made up phrase for whatever you are using to say: this is my database schema and the initial data needed to start up the application. This can be your script directory; this can be a dacpac or bacpac; this can be your data model; this can be a combination of these things. My go-to “source of truth” right now is my source control repository. I’ve got both the schema and the default data needed in the same location. In the past, I would have probably included the data model as a way to help me make sure whatever database table changes I have in my source control are there, especially for that one database which only had tables and views. (A different rant for a different time.) Whatever you use, it absolutely CANNOT be an actual database. There are two main reasons for this:

First, an actual database is being used for multiple purposes, whether for production or testing, and is actively being modified, so it is no longer a reliable source – you can’t trust what work is being done at any point in time. If you take a production database, you have production data that you will need to account for and be prepared to clean out, if you even have a good way of knowing what can be deleted, what the client can and has changed, and what you must have. And as much as you’d like to think that no one is making changes to a production database without you knowing, you know that’s just not the case. You need to be able to identify what those changes are and determine if and how they need to be backported and made standard for all databases. That gets back to the Catch-22: how do you know that a database change is standard if you don’t have a way to know what is standard? You can’t trust a QA or Release environment either, because there may still be ongoing changes to make sure that the database schema is correct.

(Ask me about the time that the consultant took the QA test database, cleaned out the data, and restored to client production environments only for me to discover this when installers failed in production because he grabbed a backup that had objects that only existed for one day since I completely redesigned them the following day…)

Second, you should have your source of truth checked into source control. This is not just so you can keep track of the changes you’re making now. You will also need to have multiple versions of your “source of truth” at any time. If your clients are all on the same production version, consider yourself really lucky. But what if you need to look at the version before the one that’s currently in production? Or the one before that? I’m used to multiple versions being out in production at the same time so I had to be able to find out what’s in those versions as well. I’ve had to go back to older versions of the database schema in order to troubleshoot a problem because the client was on an older version than the one I was testing in development. This is where source control can help you keep track of what was released and when.
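One common way to make “what was released and when” visible in the database itself is a schema-version table that every deployment script updates. This is a hedged sketch of the general pattern, not a prescription – the table and version format are made up:

```sql
-- Hypothetical version-tracking table plus an idempotent upgrade step.
IF OBJECT_ID('dbo.SchemaVersion') IS NULL
    CREATE TABLE dbo.SchemaVersion (
        VersionNumber varchar(20) NOT NULL PRIMARY KEY,
        AppliedOn     datetime2   NOT NULL
    );

-- Each deployment script checks whether its version has already been
-- applied, so it can run safely against any starting point.
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersion WHERE VersionNumber = '5.2.1')
BEGIN
    -- ...apply the 5.2.1 schema changes here...

    INSERT INTO dbo.SchemaVersion (VersionNumber, AppliedOn)
    VALUES ('5.2.1', SYSUTCDATETIME());
END;
```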

The implication of this is that you will end up having multiple “sources of truth” – one for each production release version, plus the version(s) you need as you go through the development lifecycle, such as QA and Release. Why multiple production versions? You need to know what has gone out into production at any point. But the QA and Release versions can help ensure that you know what’s being tested. Those “sources of truth” are changing, but that’s to be expected. You can still use them to make sure you know what is supposed to be in QA and Release and make adjustments if something is missing or was added before it should have been.

But here’s the real thing about having a “source of truth” – there are likely multiple people working on any database at any given point. There needs to be a reliable way to look in source control and find the definition of the object you need. If I want to see what a table looks like, there should be a standard place in source control for me to go find it. If a stored procedure is different in two different databases, I should know where I can find out what that procedure is supposed to look like to help me understand why they are different. If you are on a newer team – maybe the development teams got reshuffled or you just joined a new company – you may not have the institutional knowledge and history of the databases, so you can’t guarantee that what you’re working with is correct and, more importantly, that what’s going out into production is correct. This is what a “source of truth” is for. It creates shared knowledge that you can trust.

My career has been about making sure that we can always do a clean install of the database and upgrade any existing database to the given schema version, regardless of where we started. I constantly ran compares between new installs and upgrades – between the databases, and between the databases and the data modeling tool. I made sure that everyone knew how to work the upgrades, and I was responsible for teaching developers and helping them as the process changed. I could tell you where to find an object in source control and you could be sure that it was the correct version. There were times where I felt like my job was “SQL Script Cop” because I spent so much time making sure that the scripts we were using and testing were correct. I couldn’t have done any of this if I didn’t create and maintain a reliable “source of truth”. This wasn’t just about having a “source of truth”; it was taking ownership and responsibility for the database. And the “source of truth” could go through QA cycles to make sure everything was valid.

When we talk about the importance of data to our applications, it’s this level of ownership that makes our products better for our customers. It streamlines upgrades, and downtime for upgrades, and makes things more reliable. It makes our processes repeatable because we know what to expect so we can better test them to make sure everything works the way we expect it. Every process that we do regarding the database will be built around this concept of a “source of truth”.

Whatever you deem as your “source of truth”, it must be something that everyone knows about and is committed to maintaining. This has to be built into the development cycle. Creating a “source of truth” for a database that has been around for a while is a monumental task to get correct, and you need heavy buy-in and support from the top.

Let’s be honest – this is not an easy thing to create. It’s an even bigger pain to maintain, especially if you don’t have the same level of support from those around you. But to me, it’s a central and essential place to start if we are going to make sure we handle database development and deployment properly. After all, you have to know what your database looks like. Because in the end, you need to be able to create a new database or upgrade any existing database to a given version at any time – consistently and reliably.

</end rant>

Thank you for coming to my TED Talk. Did I answer your question? Wait, what was the question again?

I am looking forward to reading all your posts!

Posted in SQL Server, T-SQL, T-SQL Tuesday

T-SQL Tuesday #151 – Coding Standards

It’s another T-SQL Tuesday! This month, Mala Mahadevan (t | b) is hosting. Her question for us this month is:

What are the T-SQL coding rules that are appropriate for where you work and what you do? 

When I think of what my coding standards are, I tend to think of it as how would I review scripts for a pull request (PR). I think my past coworkers can attest that I can get quite picky when I look at code. So if I were your reviewer, what sort of things would I comment on?

Continue reading “T-SQL Tuesday #151 – Coding Standards”
Posted in Professional Development, SQL Server, T-SQL Tuesday

T-SQL Tuesday #150 – My First Technical Job

Welcome to May’s T-SQL Tuesday! Thanks to Kenneth Fisher (t|b) for hosting! This month, he’d like to hear about our first technical job. As he says: “I know most DBAs don’t start out working with databases so tell us how you did start.”

My first technical job actually came out of my first job out of college. I was an English and History major who wanted nothing to do with math or science. I was always pretty good with computers; I always credited that to having a computer in my room my senior year of high school – not because I was into computers but my dad was and he got tired of me kicking him off his computer to write papers for class. But I never wanted to take a computer science class or learn to program. (Cue laughter!)

Continue reading “T-SQL Tuesday #150 – My First Technical Job”
Posted in T-SQL, T-SQL Tuesday

T-SQL Tuesday #148 – Advice for Running a User Group

Welcome to another T-SQL Tuesday!

Thanks to Rie Merritt (t) for this month’s topic. She’s asked us to write about various aspects about running a user group. You can find the official invitation from Rie here.

For those of you who don’t know me, I’m a board member of the NESQL User Group, the Microsoft Data Platform user group in the greater Boston, Massachusetts (USA) area. I mostly help out with communications – sending out the monthly tweets and posting reminders to our group on LinkedIn.

Essentially, I want to make sure you know about and remember to attend the great event that we’re hosting each month.

Continue reading “T-SQL Tuesday #148 – Advice for Running a User Group”
Posted in SQL Server, T-SQL Tuesday

T-SQL Tuesday #146 – Upending Preconceived Notions

Happy 2022! Let’s start the year off right by joining this month’s T-SQL Tuesday blog party.

Andy Yun (t | b) is this month’s host. Our challenge this month is:

 …to think about something you’ve learned, that subsequently changed your opinion/viewpoint/etc. on something.

There’s a whole lot of things I could probably talk about, but I think we’ll stick to SQL Server for now.

I was always taught to clean up after myself, so I thought dropping my local temp tables after creating them in a proc did just that. I knew that once the session ended, the table was no longer there, so SQL Server would eventually take care of it. But then you had the cases where you ran the same stored proc multiple times in the same session, so you wanted to make sure the table was gone at the end of your procedure to prevent issues.

Pam Lahoud (t | b) has a great session called: TempDB, The Good, The Bad, The Ugly. She presented it at the EightKB conference but I happened to catch it at a NESQL meetup. I highly recommend that you watch it; it’s linked in the title. TempDB is one of my weaker areas and I learned so much by watching this.

My “a-ha!” moment was when she mentioned that you don’t need to drop temp tables that you created at the end of the stored procedure.

Wait, wha???

Yes! In SQL Server 2005, Microsoft added temp table caching as a way to solve metadata contention in tempdb. When you “drop” the table as part of a stored procedure, it doesn’t really get dropped at all. There is a copy of the table’s metadata – and its statistics – that gets left in the cache. So when you call the stored procedure again, the statistics are available for the procedure so SQL Server can use that to help create the execution plan.

There were two things about this that kind of blew my mind. First, the statistics of that temp table are reusable. If you set up a temp table and it’s similar in scope each time you call the stored proc, you already have an execution plan cached that can be reused. Second, I’ve been doing this wrong since SQL Server 2005. That’s a really long time to not know that you don’t have to drop the temp table at the end of the proc. You still have to make sure your stored procedure works properly if you run it multiple times in the same session.
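As an illustration (the proc and table names are hypothetical), a procedure like this qualifies for temp table caching – created once, no DDL on the table afterwards, no named constraints – and the explicit DROP at the end is simply unnecessary:

```sql
CREATE OR ALTER PROCEDURE dbo.GetRecentOrders
AS
BEGIN
    -- Eligible for caching: created once, no ALTER afterwards,
    -- no named constraints on the temp table.
    CREATE TABLE #Results (OrderId int NOT NULL, OrderTotal money NOT NULL);

    INSERT INTO #Results (OrderId, OrderTotal)
    SELECT OrderId, OrderTotal
    FROM dbo.Orders
    WHERE OrderDate >= DATEADD(DAY, -7, SYSDATETIME());

    SELECT OrderId, OrderTotal
    FROM #Results;

    -- No DROP TABLE #Results needed; its metadata and statistics
    -- stay cached for the next call.
END;
```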

I know my overly simplified explanation doesn’t go into all of the details. Luckily, there are other people who can explain it much better than I can. A couple of resources if you’re interested in learning more:

  • Watch Pam’s session. Here’s the link to her EightKB Conference session again.
  • Paul White has a couple of great blog posts about this. You can read them here and here.
  • Franklin Yamamoto at SQL Grease also has a nice explanation here.

There is a lot that goes into this and a lot of different caveats. It’s worthwhile to read up on these.

I can’t wait to read everyone else’s posts and unlearn something that I had thought was good practice.

Thanks for hosting, Andy!