Data and Visualization: Predictions for 2011
A lot of my time these days goes into planning DataMarket’s efforts in the new year. An essential part of that is trying to grasp the major trends in areas that matter to us.
DataMarket is building an active marketplace for statistics and structured data. We believe in a “visual data exploration” approach: a user’s first experience with any data should be a visualization that gives a quick overview of what the data is all about. From there, users can dig deeper to see the raw numbers, download the data in various formats, embed it in other web content or connect to the data live using our API.
This vision, and our goals for the coming year – including our launch of an international data offering – frame the topics that I’ve been thinking about. For links to broader predictions in the fields of Big Data and Data-as-a-Service see the bottom of this post.
That said, here are the things I believe will shape our key areas of interest in 2011:
Data markets will become widely accepted as an emerging field.
I’ve previously defined these as “Services that make it easy to find data from a range of secondary data sources, then consume or acquire the data in a usable – and often unified – format. Several of these services are trying to create marketplaces for data, envisioning that data providers can offer their data sets for sale to data seekers.” I used the analogy of “Amazon-for-data”, but I see that others have started using “Data app stores”, which may in fact be closer to home. With at least 8-10 efforts to build such services, some already with significant VC backing or led by large corporations, the space is heating up.
I believe we’ll see many of these services differentiate themselves in 2011 by focusing on specific types of data. There are definitely opportunities in building specialized data markets for geospatial data, for statistics and for enormous scientific data sets – to name a few types – and each comes with its own challenges, target audiences and preferred approaches. In the spirit of doing one thing and doing it well, I think most of these projects will want to see success in one such segment of the market before generalizing – or consolidating.
A couple of chart solutions will stand out in this important but crowded space by maturing and gaining a large developer following.
As our application has matured, our requirements have expanded:
- We need more control to fix bugs, control the look and feel and implement features that are not supported by default in amCharts (or other similar solutions for that matter) = Open source.
- We need to support iPad, iPhone and other devices and software that don’t run Flash + we kind of like standards = HTML5
- We need to be able to render beautiful charts for high-resolution printing and bitmaps alike = Vector based
Based on these (and other) requirements, we are betting heavily on Protovis, a brilliant solution written by Mike Bostock and Jeff Heer of the Stanford Visualization Group. The drawback is lack of support for Internet Explorer 7 and 8 – something that can by no means be ignored for a business-oriented solution with a broad target audience – but we believe we’ve found suitable workarounds.
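To make the “vector based” requirement above a bit more concrete, here is a minimal sketch in Python. It is not Protovis and not our production code; the function name `line_chart_svg`, its parameters and the sample values are all made up for illustration. It only shows the general idea: if a chart is described as SVG shapes, the same description can be scaled to any resolution for print or rasterized to a bitmap.

```python
from xml.sax.saxutils import escape


def line_chart_svg(values, width=400, height=200, padding=20, title="chart"):
    """Return a minimal SVG line chart (as a string) for a list of numbers.

    Hypothetical helper for illustration only; not part of any library.
    """
    if len(values) < 2:
        raise ValueError("need at least two points to draw a line")

    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    step = (width - 2 * padding) / (len(values) - 1)

    points = []
    for i, v in enumerate(values):
        x = padding + i * step
        # The SVG y axis grows downward, so larger values get smaller y.
        y = height - padding - (v - lo) / span * (height - 2 * padding)
        points.append(f"{x:.1f},{y:.1f}")
    joined = " ".join(points)

    # viewBox makes the drawing resolution independent: the same markup
    # can be shown at any size without losing sharpness.
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">'
        f"<title>{escape(title)}</title>"
        f'<polyline fill="none" stroke="black" stroke-width="2" points="{joined}"/>'
        "</svg>"
    )


if __name__ == "__main__":
    # Placeholder values, not real statistics.
    print(line_chart_svg([1, 3, 2], title="Example series"))
```

The output is plain text that a browser with SVG support can display directly, which is also why the Internet Explorer 7 and 8 limitation mentioned above matters.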
Online media, data journalists and bloggers will increasingly use ready-made or reusable technologies to enrich their stories with data, data visualization and charts.
Much of the great work by leading media in the field, such as the NY Times and The Guardian, is built around one specific data set to tell one specific story. This is a high-cost, high-return approach, and even the largest media can only afford to do a few such stories a month. Many stories, however, would benefit from a simple chart or a map showing the big picture, then allowing readers to dive in deeper if they want to test their own theories or do some analytics of their own.
An example could be an article on unemployment. The piece could include a graph showing nationwide unemployment for the last decade, then allow readers to dive in to compare the latest figures for different areas, see the development over a longer period or compare unemployment with – say – inflation rates. This should not be a specific project for that piece, but a generic solution that gives journalists themselves access to the underlying data and the ability to configure the setup, so that they can easily attach any relevant time series data to any article. Similar examples are easy to think of in geospatial data, large collections of text documents (think the Wikileaks files) and any other kind of data.
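As a rough sketch of what “attach any relevant time series to any article” could look like as a data model, here is a small Python example. Everything in it is an assumption made for illustration: the `TimeSeries` and `Article` classes, the `attach` and `align` names, and the numbers, which are placeholders rather than real statistics. It is not a description of any existing product or API.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    """A named series; points maps a period label (e.g. "2009") to a value."""
    name: str
    unit: str
    points: dict


@dataclass
class Article:
    """An article with zero or more charts, each a list of series."""
    headline: str
    charts: list = field(default_factory=list)

    def attach(self, *series):
        # One call adds one chart; several series in a call share that chart.
        self.charts.append(list(series))


def align(a, b):
    """Return (period, a_value, b_value) rows for periods both series cover."""
    shared = sorted(set(a.points) & set(b.points))
    return [(p, a.points[p], b.points[p]) for p in shared]


if __name__ == "__main__":
    # Placeholder values, not real statistics.
    unemployment = TimeSeries("Unemployment", "%", {"2008": 1.0, "2009": 2.0, "2010": 3.0})
    inflation = TimeSeries("Inflation", "%", {"2009": 4.0, "2010": 5.0})

    story = Article("Unemployment over the last decade")
    story.attach(unemployment)             # the headline chart
    story.attach(unemployment, inflation)  # a comparison readers can explore

    for row in align(unemployment, inflation):
        print(row)
```

The point of the design is that nothing here is specific to unemployment: the journalist picks the series, and the same generic pieces handle the alignment and the charting.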
There will be plenty of VC activity in the Big Data / Data-as-a-Service / Data Markets fields in 2011.
Most people seem to agree that VC spending in general will be on the rise in 2011, and early stage funding for companies such as InfoChimps and Timetric and larger funding rounds for companies such as Socrata and Factual indicate that this space will be one of the hot topics for VCs in the coming year.
Some big-name VC funds, such as Andreessen Horowitz, Morgenthaler Ventures and Benchmark Capital, are already involved, and many others are at least exploring options in the space. There is even at least one early-stage investment fund dedicated to Big Data.
- - -
There are obviously many more trends and developments that will shape the year to come, but from our perspective these are some of the most important.
If you’re interested in a broader look at what 2011 may bring in the field of Data-as-a-Service or “Big Data”, here are a few prediction posts I’ve come across: