Delivering data: Which solution fits best?

I’m not a data expert like journalists Adrian Holovaty, Matt Waite and Derek Willis (to name only a few), but I deeply appreciate the work they do. I see a great potential for a better-informed public — if journalism organizations could make certain kinds of large data sets easier to understand.

I’ve been watching a variety of newspapers adopt Caspio for their online data presentation needs, and I’ve been listening to people in the inner circle of data-driven journalism talk about this. I’ve also listened to our students talk about how they used Caspio during their internships, or on their first job. I’ve gathered there are concerns that today some news organizations might be throwing bunches of data onto their Web sites, but without the context necessary to make the data comprehensible, or even useful (see Matt’s post about data ghettos).

Now Derek has written a brief, clear post (which even a dabbler such as I can understand) about why using Caspio might NOT be the ideal long-term strategy for a news organization’s data needs. The post could be useful in newsrooms where some editor (or publisher) is talking about how great Caspio is. It also might be helpful to journalism educators trying to teach the kids some basic Excel, when some wise guy in the class says, “In my internship, I learned to use Caspio. So I don’t need this.”

It’s also a post you should consider reading if you ever felt annoyed at Adrian’s answer to the question, “How can we get this kind of data on our Web site?” Adrian’s answer is: “Hire a programmer.” He’s not disrespecting the question (or the questioner) — he’s giving the answer that, in the long term, makes the most sense for a news organization.

13 Comments on “Delivering data: Which solution fits best?”

  1. In thinking about this subject (and I know this is something that Mindy and I discuss ad nauseam), I wonder: what should journalism students know about data when they get out of school?

    At UF, students are required to learn the basics of Excel, including charts and graphs for presentation–just to get them thinking about how to produce user-friendly information. I’ve been thinking about adding a mapping component, as that would be fairly easy to add.

    We’ve also talked about a course that expands on this idea to more in-depth analysis. But I wonder, should the focus of a data analysis course be more on the variety of programming and presentation options (e.g. graphics, Caspio, Django), or should the focus be on digging down into the data (using SQL, Access, Excel, SPSS) and understanding how to clean, manipulate and interpret the numbers? I certainly don’t want it to be a statistics course, but I’ve learned that students really don’t understand how to check their work or what they’ve really found.

    What are the key skills that a student needs to be a solid “data-driven” reporter?

  2. David A. Milliron, a vice president at Caspio, e-mailed me these examples of Caspio Bridge in use at news organizations.

    You can judge for yourself how user-friendly they are.

    The Florida Times-Union

    The Cleveland Plain-Dealer

    The Indianapolis Star

    ABC News 20/20

    Canadian Broadcasting Corporation (CBC News)

    WZZM TV – ABC Channel 13 (Grand Rapids)

  3. And in so judging, you should keep in mind that if you didn’t know about these links, you’d be hard-pressed to find the information these apps contain. For example, search for “gainesville regional luggage theft”. You won’t see this page in the results.

    The fact that a lot of media companies bought Caspio does not prove that it’s a great service.

  4. Thanks, Derek. Nice example.

    What I dislike about these search-based data dumps is that they lack all context. It does not seem very valuable to me to look up one airport’s luggage losses, for example. I would like to compare my airport to other airports, at the very least.

  5. By Dan D. Gutierrez
    CEO of

    My firm launched the web’s first Database-as-a-Service offering in 1999 (before Caspio) and in that time we’ve been very pleased to see our industry grow from a point where few people could see the benefit of moving the database function to the cloud to now where SaaS solutions are commonly used in the enterprise.

    We found that folks who looked at our service early on could understand not having to hire a database programmer to implement their solution but still had concerns about reliability and security. We recognized these concerns upon our launch, and they remain important today.

  6. The “invisible web” or “deep web” is not a Caspio phenomenon.

    Experts have been trying to figure out a solution for this problem since the appearance of the first search engine. I can confidently say that Caspio is ahead of the pack.

    1. The content of most databases is so generic (street addresses, people’s names, transaction amounts) that it holds no real value for search engines. Our customers make their database assets visible to search engines through an intro paragraph outside the DataPage.

    2. The problem of search engines not being able to index database-driven content is not unique to Caspio or to our highly flexible and unique JavaScript deployment method. Note that search engines also ignore URL parameters (the values that come after a question mark), and therefore almost all homegrown databases remain invisible to search engines. Furthermore, if an interactive search form is in front of a database, no search engine can reach it regardless of how it is created or deployed.

    3. JavaScript is only one of the deployment methods Caspio provides. Domain mapping, along with iFrame and link deployment models, provides more diversity and choices than any homegrown solution. It is up to our clients to decide how they want to configure their DataPages and which of the half dozen or so deployment models we make available to use.

    4. Caspio is engineering innovative ways to provide our customers more choices. Some of these new options will be available in the fourth quarter of 2008.

    5. The providers of search engines realize this as one of their shortcomings and they are looking for ways to reach more content and be able to index databases. This problem will not be a problem in a year or two.

    6. As for you not being able to compare one airport to another, again, it is up to the end user to decide what type of database web app they want to build using our product. Our customers’ success depends on their ability to execute their ideas quickly. Traditional development methods are too slow, expensive, and unnecessary for most types of web applications. Caspio Bridge is designed to simplify web application creation and online data management while making them highly affordable.

    Caspio is proud to be trusted by the online news media. It is an excellent product with a huge following.


    David A. Milliron, Vice President
    Media Services Division, Caspio, Inc.
    877-820-9100 x741

  7. David’s correct in saying that the responsibility for SEO also lies with the user, but even then it’s not working. Try a generic airport luggage theft query. Does ABC not want this stuff indexed? Contrast that with a framework like Django, which helps with SEO (and also addresses the issue Milliron raises about query parameters, since you can design URLs that don’t have them).

    The question in my mind isn’t whether you can put data online using Caspio; clearly you can (although saying that JavaScript isn’t the only method belies the fact that most Caspio apps I’ve seen use this method). The question is, to what end? Do you want to put up apps for a single story that don’t get indexed? Can news organizations accurately anticipate all the ways people might be searching for their content?

    As Justin Lilly said in a comment on my original post, “If you slight the amount of work and knowledge involved in putting out a good product … you’re going to end up with a communist—’everyone gets the same thing’—style future.” Is that what we want?

  8. It looks like the Caspio people think we are slighting them specifically. I don’t think that’s the case. The problem isn’t that Caspio exists, or the services they provide. The problem is that many newspaper companies don’t know how to create or use databases in a way that helps tell the story. They hear that databases are important and turn to simple solutions without thinking critically about what the point of the database is.
    And the simple solution (i.e., Caspio) just isn’t gonna cut it.

  9. Megan, I think you hammered an important nail in that comment. It isn’t that Caspio (or any other company) is evil, or selling a bad product. As David Milliron mentioned, Caspio can be used to do more than what people typically use it to do.

    But if you buy a tool that makes it easy to, say, shred the lettuce for a salad, you’ll get a beautiful bowl full of the base for a good salad.

    You still might not know how to make a good salad.

  10. Pingback: Teaching Online Journalism » How you look at data: Graphics vs. text

  11. I think Megan has a good point. It is all about what your aims are when you turn to online data handling. From my own experience, I can tell you that using an online service is a piece of cake, even for complete non-techies. I started with TeamDesk without knowing a thing about how to work within the system, but managed to build an application without any problems (I have no experience with Caspio, though, so I can’t say anything about it).

  12. Pingback: Teaching Online Journalism » Why you should learn to love data

  13. The wise guy who thinks he doesn’t need Excel in the Caspio world probably hasn’t done anything very useful during his internship.
    In order to create effective Caspio databases, you have to be very comfortable with basic Excel. Most data is not “Caspio-ready” when you receive it; you’d better be able to sort, append, do calculations, etc. And if you’re talking about a larger amount of data, be prepared to get very comfortable with MS Access.
