Posted on October 8, 2007
What journalists should know about databases
In an interview, Derek Willis, a database editor at washingtonpost.com, recommended this for journalism students:
There are plenty of academic disciplines that use data all the time. So if a student knows a professor who does survey research, that’s usually database-oriented. Or political scientists who study election results or voter participation — they usually deal with datasets.
But the easiest and probably best way is to just start. Start keeping some information in a spreadsheet. Pretty soon you’ll learn how to deal with issues and problems. Then you’ll outgrow the spreadsheet and start looking around for a more robust solution.
Free or cheap database software and hosting is abundantly available, too, especially on university campuses. It’s a matter of asking for it.
Why should you care about databases if you are a journalist? Think of it first as keeping track of bits of information. Well, heck — that’s essential for every journalist! How do you do it? With a messy stack of old reporter’s notebooks, perhaps. Uh-huh. How’s that working for you?
So . . .
Step 1: Get into Google Docs and start using some spreadsheets. Manage your beat, your sources, your stories. Need help? You’ll do just fine. Anyone can learn to use a spreadsheet, and doing so will make you feel comfortable with the same structure that a database requires.
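To see why a spreadsheet gets you comfortable with database structure, here is a minimal sketch in Python. It copies spreadsheet-style rows (one row per source, one column per fact) into a small SQLite table and asks it a question. The column names, the sample people and the helper functions `load_sources` and `sources_for_beat` are all made up for illustration; your own sheet will look different.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export: one row per source, one column per fact.
SHEET = """name,organization,beat,phone
Ana Ruiz,City Clerk's Office,city hall,555-0101
Sam Lee,School Board,education,555-0102
Pat Moore,Parks Department,city hall,555-0103
"""


def load_sources(csv_text):
    """Copy spreadsheet rows into an in-memory SQLite table; return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sources (name TEXT, organization TEXT, beat TEXT, phone TEXT)"
    )
    # Each spreadsheet row becomes a dictionary keyed by the column headers.
    rows = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
    conn.executemany(
        "INSERT INTO sources VALUES (:name, :organization, :beat, :phone)",
        rows,
    )
    return conn


def sources_for_beat(conn, beat):
    """Return the names of the sources on one beat, in alphabetical order."""
    cursor = conn.execute(
        "SELECT name FROM sources WHERE beat = ? ORDER BY name", (beat,)
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    conn = load_sources(SHEET)
    print(sources_for_beat(conn, "city hall"))
```

The header row of the sheet turns into the column names of the table, and every later row turns into one record. That is the whole trick: if your spreadsheet is tidy, it is already most of the way to being a database.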
Step 2: Take a stats course. I wish I had. Sadly, I was one of those math-phobic journalism students. Luckily, I did learn the basics of programming while still an undergrad. (I guess I didn’t think programming was too much like math.) I would strongly urge you to learn a bit about statistics BEFORE you follow Derek’s advice about offering your services to some professor who’s using data. Professors do not have time to teach you one-on-one if you are starting from zero. That’s why we have courses, credits and tuition!
Step 3: Learn some XML. Matt Waite gave me a wonderful suggestion via a comment on his blog: Use Timeline, from the SIMILE project. Here is the step-by-step tutorial. Whoa! Some of you will be very afraid! But wait, before you run away, please look at the XML file. See how simple that is? You can do this! Here’s an easy intro to XML.
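If typing XML by hand feels risky, a few lines of Python can write it for you. This is only a sketch: the element and attribute names (`data`, `event`, `start`, `title`) are my assumption about what an event-style XML file might look like, not a statement of what Timeline requires, so check them against the XML file in the tutorial. The sample events and the function `build_timeline_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical events a reporter might track on a beat.
EVENTS = [
    {
        "start": "Oct 01 2007",
        "title": "Budget proposed",
        "text": "Mayor presents draft budget.",
    },
    {
        "start": "Oct 15 2007",
        "title": "Public hearing",
        "text": "Residents comment on cuts & fees.",
    },
]


def build_timeline_xml(events):
    """Build an XML string with one <event> element per dictionary."""
    root = ET.Element("data")  # assumed root element name
    for item in events:
        # Attributes hold the short facts; the element text holds the description.
        event = ET.SubElement(
            root, "event", start=item["start"], title=item["title"]
        )
        event.text = item["text"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_timeline_xml(EVENTS))
```

One nice side effect of letting a library write the file: characters that would break hand-typed XML, such as the ampersand in the second event, get escaped for you.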
Step 4: If Timeline scared you, don’t give up. Try this instead. You can generate custom Google Maps with those pop-up balloons supplying added info. The one thing you might not already know is how to get the correct latitude and longitude for the map, in the correct format. Matt told me this is the easiest way (it could not possibly be any easier!):
- Go to Google Maps and type in an address, as you normally would.
- Click Search Maps to get the location and map.
- Then copy and paste this command (be careful to copy ALL of it) into your browser’s address bar, and press Enter:
Instant latitude and longitude! Sweet!
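Once you have the two numbers, it is worth a quick sanity check before you paste them into a map. Here is a minimal sketch in Python. It assumes the pair is written in decimal degrees, latitude first, separated by a comma and possibly wrapped in parentheses; that layout is my assumption, and the function `parse_lat_lng` is hypothetical.

```python
def parse_lat_lng(text):
    """Turn text like '(38.8977, -77.0365)' into a (lat, lng) pair of floats."""
    # Drop surrounding whitespace and optional parentheses.
    cleaned = text.strip().strip("()")
    parts = [part.strip() for part in cleaned.split(",")]
    if len(parts) != 2:
        raise ValueError("expected two comma-separated numbers")
    lat, lng = float(parts[0]), float(parts[1])
    # Valid latitudes run from -90 to 90, longitudes from -180 to 180.
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= lng <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return lat, lng


if __name__ == "__main__":
    print(parse_lat_lng("(38.8977, -77.0365)"))
```

The range check catches some mix-ups, such as a longitude typed where the latitude belongs, but not all of them: two swapped numbers that both fall between -90 and 90 will pass, so look at the map too.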
Step 5: Now you’re ready to make friends with some computer science students, or go and seek out that professor who needs your cheap student labor to get some data beaten into submission. Or — if you’re a working full-time journalist — find some public records for your state or city, and get started!
Step 6: Join IRE. It might be the smartest career move you’ll ever make. Journalism students pay $25 per year for membership. You know those TV commercials where people using a particular cell phone service are accompanied by a giant crowd of people — their “network”? That’s what IRE is like for people who “interrogate data,” as they like to say. You’ll find lots of information, tutorials and support there. Even better, if you can manage to attend one of IRE’s reasonably priced regional training sessions, you will get face-to-face instruction. (Don’t be put off by the outdated term “computer-assisted reporting.”)
And finally, there’s a book I can recommend: Database Design for Mere Mortals (2nd edition). It’s not a how-to for any specific software or type of database. Instead, it explains the principles of database structure in a very clear way. I only read about half of the book, but it helped me a lot. It’s surprisingly readable.
And why, you ask — why would I do this? So that you could produce an awesome work of journalism such as this D.C. schools package, from washingtonpost.com. That’s why!