What journalists should know about databases

In an interview, Derek Willis, a database editor at washingtonpost.com, recommended this for journalism students:

There are plenty of academic disciplines that use data all the time. So if a student knows a professor who does survey research, that’s usually database-oriented. Or political scientists who study election results or voter participation — they usually deal with datasets.

But the easiest and probably best way is to just start. Start keeping some information in a spreadsheet. Pretty soon you’ll learn how to deal with issues and problems. Then you’ll outgrow the spreadsheet and start looking around for a more robust solution.

Free or cheap database software and hosting is abundantly available, too, especially on university campuses. It’s a matter of asking for it.

Why should you care about databases if you are a journalist? Think of it first as keeping track of bits of information. Well, heck — that’s essential for every journalist! How do you do it? With a messy stack of old reporter’s notebooks, perhaps. Uh-huh. How’s that working for you?

So . . .

Step 1: Get into Google Docs and start using some spreadsheets. Manage your beat, your sources, your stories. Need help? You’ll do just fine. Anyone can learn to use a spreadsheet, and doing so will make you feel comfortable with the same structure that a database requires.
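To see why a spreadsheet prepares you for a database, here is a minimal sketch in Python. It takes a small spreadsheet export (CSV text) and loads it into a SQLite table. The source names, column headings and the function names load_sources and sources_on_beat are all made up for illustration. The point is only that the spreadsheet's columns become the table's fields and each row becomes a record.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export: one header row, then one row per source.
SOURCES_CSV = """name,organization,beat,phone
Ann Lee,City Hall,government,555-0101
Raj Patel,School Board,education,555-0102
Maria Gomez,Police Dept,crime,555-0103
Tom Baker,School Board,education,555-0104
"""


def load_sources(csv_text):
    """Copy spreadsheet rows into an in-memory SQLite table; return the connection."""
    conn = sqlite3.connect(":memory:")
    # The table's columns mirror the spreadsheet's header row.
    conn.execute(
        "CREATE TABLE sources (name TEXT, organization TEXT, beat TEXT, phone TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["name"], r["organization"], r["beat"], r["phone"]) for r in reader]
    conn.executemany("INSERT INTO sources VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def sources_on_beat(conn, beat):
    """Return the names of all sources on one beat, in alphabetical order."""
    cur = conn.execute(
        "SELECT name FROM sources WHERE beat = ? ORDER BY name", (beat,)
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = load_sources(SOURCES_CSV)
    print(sources_on_beat(conn, "education"))
```

Asking for everyone on the education beat is one line here. In a spreadsheet you would be sorting and scrolling, which is roughly the moment you start to outgrow it.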

Step 2: Take a stats course. I wish I had. Sadly, I was one of those math-phobic journalism students. Luckily, I did learn the basics of programming while still an undergrad. (I guess I didn’t think programming was too much like math.) I would strongly urge you to learn a bit about statistics BEFORE you follow Derek’s advice about offering your services to some professor who’s using data. Professors do not have time to teach you one-on-one if you are starting from zero. That’s why we have courses, credits and tuition!

Step 3: Learn some XML. Matt Waite gave me a wonderful suggestion via a comment on his blog: Use Timeline, from the SIMILE project. Here is the step-by-step tutorial. Whoa! Some of you will be very afraid! But wait, before you run away, please look at the XML file. See how simple that is? You can do this! Here’s an easy intro to XML.
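If you would rather generate the XML than type it by hand, here is a rough sketch in Python. It assumes Timeline reads a data element containing event elements, each with start and title attributes and a description as its text. That is my recollection of the format, so check it against the tutorial's own XML file before relying on it. The events and the function name build_timeline_xml are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical events for a story timeline: (start date, title, description).
EVENTS = [
    ("Oct 01 2007 00:00:00 GMT", "Budget proposed", "The mayor releases a draft budget."),
    ("Oct 09 2007 00:00:00 GMT", "Council vote", "The council approves the budget 5-2."),
]


def build_timeline_xml(events):
    """Build one <data> document with one <event> element per tuple."""
    root = ET.Element("data")
    for start, title, description in events:
        # Element and attribute names are assumed, not taken from the tutorial.
        event = ET.SubElement(root, "event", {"start": start, "title": title})
        event.text = description
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_timeline_xml(EVENTS))
```

Notice that the structure is the same as a spreadsheet again: each event is a row, and start and title are columns.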

Step 4: If Timeline scared you, don’t give up. Try this instead. You can generate custom Google Maps with those pop-up balloons supplying added info. The one thing you might not already know is how to get the correct latitude and longitude for the map, in the correct format. Matt told me this is the easiest way (it could not possibly be any easier!):

  1. Go to Google Maps and type in an address, as you normally would.
  2. Click Search Maps to get the location and map.
  3. Then copy and paste this command (be careful to copy ALL of it) into your browser’s address bar, and press Enter:

Instant latitude and longitude! Sweet!
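Once you have a latitude and longitude, it is worth checking the pair before you paste it into a map. Here is a small sketch in Python, assuming the pair arrives as decimal degrees in a form like "(38.8977, -77.0365)". That format is my assumption, and parse_lat_lng is a made-up helper name.

```python
import re


def parse_lat_lng(text):
    """Pull a (latitude, longitude) pair of decimal degrees out of pasted text."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    if len(numbers) != 2:
        raise ValueError(f"expected two numbers, found {len(numbers)} in {text!r}")
    lat, lng = float(numbers[0]), float(numbers[1])
    # Latitude runs from -90 to 90 and longitude from -180 to 180.
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lng <= 180:
        raise ValueError(f"longitude out of range: {lng}")
    return lat, lng


if __name__ == "__main__":
    print(parse_lat_lng("(38.8977, -77.0365)"))
```

A common slip is swapping the two numbers. For many U.S. locations the range check will catch it, because a longitude such as -122 cannot be a latitude.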

Step 5: Now you’re ready to make friends with some computer science students, or go and seek out that professor who needs your cheap student labor to get some data beaten into submission. Or — if you’re a working full-time journalist — find some public records for your state or city, and get started!

Step 6: Join IRE. It might be the smartest career move you’ll ever make. Journalism students pay $25 per year for membership. You know those TV commercials where people using a particular cell phone service are accompanied by a giant crowd of people — their “network”? That’s what IRE is like for people who “interrogate data,” as they like to say. You’ll find lots of information, tutorials and support there. Even better, if you can manage to attend one of IRE’s reasonably priced regional training sessions, you will get face-to-face instruction. (Don’t be put off by the outdated term “computer-assisted reporting.”)

And finally, there’s a book I can recommend: Database Design for Mere Mortals (2nd edition). It’s not a how-to for any specific software or type of database. Instead, it explains the principles of database structure in a very clear way. I only read about half of the book, but it helped me a lot. It’s surprisingly readable.

And why, you ask — why would I do this? So that you could produce an awesome work of journalism such as this D.C. schools package, from washingtonpost.com. That’s why!

12 Comments on “What journalists should know about databases”

  1. Worth repeating, over and over:

    You can do this.

    You can.

    It won’t feel natural at first. You’ll feel out of place. Don’t quit. You can do this.

  2. Thanks for the tip, Craig. I’ve been looking at Zoho for the past hour. Zoho Creator looks amazing. The big gap seems to be: No WordPress plug-in! Horrors!

    But for getting started on a database, the new DB & Reports application seems perfect.

    Thanks again!

  3. This isn’t clear from the transcript (which is on me – I didn’t fully explain myself), but I didn’t intend to suggest that students offer their services to professors without knowing about data. Instead, my thinking was that students who can only conceive of “data” as a strictly computer-related concept could ask a social science prof who works with data if they could perhaps look at/play with some of the datasets that have been gathered.

    In other words, if you’re comfortable with elections in general, chances are you might find election-related data less daunting than other subjects.

  4. Pingback: links for 2007-10-09 | SOJo: Student of Online Journalism

  5. @Mindy

    Not a WordPress user myself, but you can include both the entry fields and results into a website.

    So you could build a simple What’s On form and display future events in a calendar format.

    Just check the Embed in your Website options, under More Actions.


  6. Thanks, David! I can also suggest these two sites for customizable Flash data charts and graphs:

    XML/SWF Charts


    Both simply use XML to render data in a SWF. You don’t need to own or understand Flash to use them.

