T-SQL Tuesday #144 – Data Governance

Happy T-SQL Tuesday! I’m really excited for this month’s topic, hosted by Victoria Holt (t | b). This month, Victoria invites us to “share our experiences on data governance.”

Almost exactly two years ago, I gave a lightning talk about an update statement. It wasn’t just about understanding how that update could affect more than one table; it expanded to show how that update was also part of the larger picture of the entire database ecosystem.

[Image: The data ecosystem from my Speaker Idol talk. Multiple databases flow into a warehouse that is consumed by Excel, Power BI, and Azure tools, while other databases are consumed by Python and machine learning.]

I didn’t realize it then, but looking back, this is essentially what data governance is about: understanding your data at both the micro and macro levels. It’s understanding where your data lives (data assets), how it flows through data sources (data lineage), and how it’s consumed and used (data catalogs and data profiling). More importantly, this is knowledge that can be shared to make data even more valuable.
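To make the lineage idea a little more concrete, here is a minimal Python sketch. It is only an illustration: the asset names and the `downstream` helper are hypothetical, loosely modeled on the ecosystem picture above, and a real governance tool tracks far more than this. It records which asset feeds which, then answers "if this source changes, what is affected?"

```python
from collections import defaultdict, deque

# Hypothetical lineage edges (source -> consumer). The names are invented
# for illustration and do not come from any real system.
LINEAGE = [
    ("sales_db", "warehouse"),
    ("inventory_db", "warehouse"),
    ("warehouse", "excel_reports"),
    ("warehouse", "power_bi"),
    ("warehouse", "azure_tools"),
    ("research_db", "python_ml"),
]


def downstream(asset, edges=LINEAGE):
    """Return every asset that directly or indirectly consumes `asset`."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    # Breadth-first walk over the consumers of each asset we reach.
    seen, queue = set(), deque([asset])
    while queue:
        for consumer in graph[queue.popleft()]:
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


print(downstream("sales_db"))
# ['azure_tools', 'excel_reports', 'power_bi', 'warehouse']
```

Even a toy map like this shows why the macro view matters: a change to one source database reaches every report built on the warehouse.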

Why is data governance so important? The easy answer is that it lets you answer questions about your data as they come up. To expand on that: if regulators ask questions, you know where to look. If a customer makes a CCPA or GDPR request, you can start figuring out how to accommodate it. If you have a new team member, they have the tools and documentation to understand what’s available and where, so they can contribute sooner. You can also look at your processes and understand how they work together, where they create duplicate work, or even how they cause some of the conflicts or bugs you’re trying to straighten out. You may identify other ways to improve your data quality, which improves everything that touches your data.

If you have only a few databases in your ecosystem, it’s much easier to get started documenting and understanding all of the different areas involved in implementing data governance. But as the number of databases and the complexity of your systems grow, the job of governance becomes a lot harder. That is also why it becomes that much more important.

Now, why am I so excited that Victoria chose this as her topic? I’ve also joined the roster of #sqlfamily community members who’ve made a switch at work. I’m still at the same company, but I’ve moved to a different team. Our goal is to implement data governance. I’ve always been a fan of understanding your data from the big-picture perspective, and now I get a chance to do that on a larger scale. I’m excited to look at new functionality, new tools, new skills, and new terminology, as well as to use what I already know in very different ways. With the announcement of SQL Server 2022 and some of the new functionality to look into, such as the integration with Purview, it seems that we will be hearing a lot more about data governance in the time to come.

There’s still a lot I have to learn on this subject. (To be honest, I’m already worried that I’m using the wrong terms or definitions for things. As I said, I have a lot to learn.) Getting started with data governance seems like a very daunting task. But it’s a challenge I’m looking forward to learning more about.

Thanks so much for hosting this topic, Victoria! I’m so looking forward to reading everyone’s posts and learning about the different tools.

