I am someone who works well with goals. I need that structure and something to work towards. If I’m going to do this, I think I need to follow the “Julie & Julia” concept, but without the amazing food.
One area of database work that I’m always trying to do more of is design. I like the idea of reading through the requirements and figuring out how you would create the tables and, if need be, how you would move the data around if it were all done purely within the database.
So my thought is: Let’s create a brand new database.
The next question is: What would we store in this database? And how would we get our requirements?
I remember this paper I did back in college. I mentioned I was a History major, right? Between 1914 and 1922, the state of Tennessee sent surveys to all of its Civil War veterans – regardless of the side they served on. I think it was the only state to do something on this scale. The survey asked 46 or 44 questions, depending on what version the veteran got, ranging from where they grew up to their regiments, battles and skirmishes. In the end, the state collected about 1,650 responses.
I only had time (and 5-10 pages) to take a brief look at the surveys. If I remember correctly, I saw a lot of what you’d expect to find. But there’s probably enough in those surveys for a longer thesis or book or something more than what I put together. The strange thing is that over the years, these surveys keep coming to mind. And as I’ve been working with databases, my thought has always been: How would you create a database to connect the various people, places and stories in the survey responses?
So my thought is that we can look at the questions and figure out how we would want to design a database to hold the information.
There are a lot of reasons why this is a horrible choice for this type of project:
- The answers are mostly stories. You have to read through the answers to find the data points we would want to identify.
- The veterans, or whoever filled out the survey for them, had varying levels of education, from dropping out of school at an early age to maybe some college, and it shows in their writing. There is no guarantee that the veterans would spell everything correctly. (Let’s be honest – how many of us can spell Appomattox correctly without auto-correct or the uncontrollable need to confirm it through spell-check or Google if there’s no wavy red line underneath it?) So even if you do have a clear way to identify the data as something like a person’s name, how can you confirm whether two spellings from different surveys are in fact the same person?
- How would you take the data and import it into the database? Can it be done without human intervention? I don’t believe there are data sheets available for download like there are for so many other subjects, like baseball or aviation.
But there are a lot of good reasons for trying this:
- I don’t think I have ever had the opportunity to really see if I can design a database from scratch. Why wait for someone to give me the chance when there’s an idea right in front of me?
- We don’t have to worry about importing the data right away, right? Eventually, we may want to figure out how to get the data in there, but it doesn’t have to be the initial focus.
- Once we get to the point of trying to import data, maybe there is functionality we could tap to help solve the problem of reading through narratives. Could we use something like full-text search, SOUNDEX functions, etc.? Is there other technology that we could use? These are options that I don’t normally use, so it’s a chance to see what’s out there and explore.
- There are a lot of moving pieces when it comes to designing a database, regardless of what the subject is. The subject will definitely help shape what the design is, but there are still considerations which are universal. We may decide that this is a horrible topic to use, but I think we would learn a lot from the background work that we do to get to that point.
- I’m just fascinated by the thought of trying to see if it can be done. I’m not going for an advanced degree in history or computer science or any other subject for that matter. I’m not on any timeline. This means there’s time to explore and figure this out.
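To make the SOUNDEX idea from the list above a little more concrete, here is a minimal sketch of the classic American Soundex coding, written in Python just so it can be run anywhere. The function name `soundex` and the sample spellings are my own illustration, not anything taken from the surveys, and a database engine's built-in SOUNDEX function may handle edge cases (such as H and W) differently, so treat this as a way to get a feel for the technique rather than a reference implementation.

```python
# Letter groups used by classic American Soundex; vowels, y, h and w get no digit.
_CODES = {}
for _letters, _digit in (
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return a 4-character Soundex code, or "" if name has no ASCII letters."""
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""

    first = letters[0]
    result = first.upper()
    prev = _CODES.get(first, "")  # the first letter's code still suppresses repeats

    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not break up a run of same-coded letters
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        # a vowel sets prev to "", so the same code after a vowel is kept
        prev = code
        if len(result) == 4:
            break

    return result.ljust(4, "0")


# Different spellings of the same word collapse to the same code.
for spelling in ("Appomattox", "Appomatox", "Apomattox"):
    print(spelling, soundex(spelling))
```

All three spellings share one code, which is the appeal for messy handwritten answers. The catch is that unrelated words can share a code too (Robert and Rupert are the usual example), so a match would be a hint that two entries might be the same person or place, not proof.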
Don’t worry – I still intend to throw in posts on the day job problems I run across and their solutions, T-SQL Tuesdays, and other such random posts. But as I said, I like the idea of working on a bigger project.
I’m ready to see what happens if you are…