Posted in The Survey Project, SQL Server, data modeling

Rethinking & Redirecting

I feel like I made the classic blunder. Well, one of the classic blunders at least. The good news is that I didn’t get involved in a land war in Asia or go against a Sicilian when death was on the line. But I did fall for one of the even lesser-known ones: I didn’t practice what I preach.

If I think about the main theme I’ve been talking about lately, it would be this: don’t just go through the motions of doing things, but really think about why you’re doing what you’re doing. There are so many places where this applies in life and I really do think it’s important.

I feel like I’ve fallen down on that idea when it comes to this blog. Let me try to explain:

When I started, I gave myself a project based on a research paper I did in college. But I haven’t worked on it for a while now. It’s taken me some time, but I think I’ve figured out some of the roadblocks I’ve been struggling with. And I mean the problems I was facing before I allowed the excuses of “I’ve been busy” or “I’m focusing more on speaking these days” to get in the way.

When I gave myself this project, I thought it would be good practice for building a database because I saw the potential of how it could be used. I wanted to use a database as a way to see the connections between the people who returned the survey: where they were from, what unit they served in, where they fought, who they served with, etc. The way I approached this was by taking each question, breaking it down, then building on what I came up with using the next question.

The biggest problem with this approach was that the questions aren’t written in a way to get standard answers. How do you extract the data for a relational database structure? Non-standard answers make it very hard to interpret the data points.

Another problem with the questions is that they were designed for the data to lean a certain way, or they left out information that a modern viewpoint needs. Unfortunately, some of that bias – especially in today’s world – makes it even harder. Some of the implications of these questions – even just having this as a subject in some cases – are beyond what I have the means to handle properly and, to be honest, I’m still struggling a little with what to do about that.

Letting the questions alone dictate the database design made me lose sight of what I was hoping to gain by doing this. I stopped focusing on why I was creating the database and I lost the direction I wanted to go with the project.

If I were to start over, I think I would take a different approach. Some ideas on how I might do this are:

  • I would clearly define the goals of the database:
    • As a historian, I want to see the statistics:
      • how many from the same regiment
      • how many from the same town
      • where did people enlist
    • As a genealogist, I want to be able to research family members:
      • find a particular person and find parent and grandparent information
      • find related members
      • be able to add known associations from outside sources or link these records to those 3rd party sources
  • I would divide the questions in a way to see which ones answer or partially fit into each of the goals listed above.
  • I would allow myself to modify questions in such a way that I could standardize the answers. Obviously, I can’t standardize the answers I have. But let’s face it – I’m taking something that was never designed for a standardized data model and forcing it into one. I have to make allowances. I’m not populating the data but creating logical models. If I had more time and more incentives, this might even be a case where something like a NoSQL database could help create the standardization I don’t have. Plus, AI may offer options for interpreting the written responses and turning them into data on subjects such as literacy rates. But that’s much farther down the road than where I am; I’m just setting up the basic structure.
  • I would research other data sources where similar things have already been done to see if I can understand how those models have been set up. After one of my sessions where I mentioned this project, one of the attendees sent me a link to a similar project. Reading through that may give me some of the additional knowledge that I don’t have that can help me with this.
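
To make the “historian” goals above a little more concrete, here is a minimal sketch of what a goal-driven model could look like. Everything in it is an assumption for illustration: the table and column names (Respondent, Town, Regiment, HomeTownID, EnlistmentTownID) are hypothetical, the rows are made-up placeholders rather than survey data, and it uses SQLite through Python’s standard library as a stand-in for SQL Server so that it can be run anywhere.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, goal-driven schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Regiment (
            RegimentID   INTEGER PRIMARY KEY,
            RegimentName TEXT NOT NULL UNIQUE
        );
        CREATE TABLE Town (
            TownID   INTEGER PRIMARY KEY,
            TownName TEXT NOT NULL UNIQUE
        );
        CREATE TABLE Respondent (
            RespondentID     INTEGER PRIMARY KEY,
            FullName         TEXT NOT NULL,
            HomeTownID       INTEGER REFERENCES Town(TownID),
            EnlistmentTownID INTEGER REFERENCES Town(TownID),
            RegimentID       INTEGER REFERENCES Regiment(RegimentID)
        );
    """)
    # Placeholder rows only; none of this comes from the actual surveys.
    conn.executemany("INSERT INTO Regiment VALUES (?, ?)",
                     [(1, "Regiment A"), (2, "Regiment B")])
    conn.executemany("INSERT INTO Town VALUES (?, ?)",
                     [(1, "Town X"), (2, "Town Y")])
    conn.executemany("INSERT INTO Respondent VALUES (?, ?, ?, ?, ?)",
                     [(1, "Respondent 1", 1, 1, 1),
                      (2, "Respondent 2", 1, 2, 1),
                      (3, "Respondent 3", 2, 2, 2)])
    return conn


def count_by_regiment(conn):
    """How many respondents served in each regiment?"""
    return conn.execute("""
        SELECT r.RegimentName, COUNT(*)
        FROM Respondent AS p
        JOIN Regiment AS r ON r.RegimentID = p.RegimentID
        GROUP BY r.RegimentName
        ORDER BY r.RegimentName
    """).fetchall()


def count_by_town(conn, town_column):
    """Count respondents per town, by home town or by enlistment town."""
    # Only the two known foreign-key columns are allowed into the SQL text.
    if town_column not in ("HomeTownID", "EnlistmentTownID"):
        raise ValueError("unknown town column: " + town_column)
    return conn.execute(f"""
        SELECT t.TownName, COUNT(*)
        FROM Respondent AS p
        JOIN Town AS t ON t.TownID = p.{town_column}
        GROUP BY t.TownName
        ORDER BY t.TownName
    """).fetchall()
```

Calling `count_by_regiment(build_demo_db())` answers “how many from the same regiment,” and `count_by_town` covers both “how many from the same town” and “where did people enlist.” The point of the sketch is the order of work: the questions I want to ask shape the tables, rather than the survey questions shaping them.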

One of the reasons I started this project was so I could get better at database design. While I’m not working on this project directly, I have been doing sessions on database design, using the first lines of books or baseball as my examples. And I’ve enjoyed working on those.

However, I have a history of starting projects and sometimes not following through. (Would you like to see my “collection of craft supplies” or my poor neglected mandolin?) I’m not ready for this project to fall in that category. Even if I’m not working on it regularly, it is in the back of my mind and I am reminding myself to think about how I can make it better. It’s inspired me to do some of the sessions I’ve given. Maybe this just falls under one of those “Things don’t always go the way you expect them to.” I’m not quite ready to give up, but I’m not quite ready to start from scratch either. It may take me a while, but I want to spend more time figuring out what I want to do and where to take this. It’s hard work to do by yourself. Luckily, this project was for when I couldn’t find other projects to do or blog about, and I definitely have found a lot of those.

So stay tuned… I’m using y’all to keep me accountable.

Posted in data modeling, SQL Server

How I Really Feel about Surrogate Primary Keys

When someone goes on a rant about something they truly feel strongly about – good or bad – my typical joking response is “You’re being a little wishy-washy about this. Tell me how you really feel.” I recently had a conversation with co-workers about surrogate primary keys and realized I need to use that line on myself.

So let me tell you how I really feel about surrogate primary keys. Where’s my soapbox?


Continue reading “How I Really Feel about Surrogate Primary Keys”

Posted in data modeling, SQL Server

Designing Super Types of Tables

I want to talk a little about database design patterns. When working with a relational database, there are a couple of patterns that exist to help you normalize your data. I think one of the most useful of these is the supertype-subtype relationship.

You don’t see a supertype-subtype relationship defined as such when you’re looking at the physical database. You’ll only see it explicitly in the logical data model. So what is the pattern and how do you know that you have one in your database?
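
As a rough illustration of what that can look like physically, here is a minimal sketch of one common way to implement a supertype with subtypes: a supertype table holds the shared columns, and each subtype table reuses the supertype’s primary key as both its own primary key and a foreign key. The names (Party, Person, Organization) and the rows are hypothetical placeholders, and SQLite through Python’s standard library stands in for SQL Server here, so treat it as a sketch rather than the definitive design.

```python
import sqlite3


def build_party_db():
    """Create a hypothetical supertype (Party) with two subtype tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        -- Supertype: columns shared by every kind of party.
        CREATE TABLE Party (
            PartyID     INTEGER PRIMARY KEY,
            PartyType   TEXT NOT NULL
                        CHECK (PartyType IN ('Person', 'Organization')),
            DisplayName TEXT NOT NULL
        );
        -- Subtypes: the primary key is also a foreign key to the supertype.
        CREATE TABLE Person (
            PartyID   INTEGER PRIMARY KEY REFERENCES Party(PartyID),
            BirthDate TEXT
        );
        CREATE TABLE Organization (
            PartyID     INTEGER PRIMARY KEY REFERENCES Party(PartyID),
            FoundedYear INTEGER
        );
    """)
    # Placeholder rows for illustration only.
    conn.execute("INSERT INTO Party VALUES (1, 'Person', 'Example Person')")
    conn.execute("INSERT INTO Person VALUES (1, NULL)")
    conn.execute("INSERT INTO Party VALUES (2, 'Organization', 'Example Org')")
    conn.execute("INSERT INTO Organization VALUES (2, NULL)")
    conn.commit()
    return conn


def people(conn):
    """Read Person rows together with the columns stored on the supertype."""
    return conn.execute("""
        SELECT p.PartyID, p.DisplayName
        FROM Party AS p
        JOIN Person AS per ON per.PartyID = p.PartyID
        ORDER BY p.PartyID
    """).fetchall()


def orphan_subtype_rejected(conn):
    """Return True if a subtype row without a supertype row is refused."""
    try:
        conn.execute("INSERT INTO Person VALUES (99, NULL)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Nothing in these tables says “supertype” or “subtype.” The shared key and the one-to-one foreign keys are the only physical trace of the relationship, which is why it is easier to spot in the logical model.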

Continue reading “Designing Super Types of Tables”