29 Mar 10

Freebase Gridworks

Gridworks sounds very exciting. Check out the screencasts linked from Jon Udell’s blog.

While Gridworks doesn’t attempt to penetrate the semantics of the source data, this is enormous progress from the feeble tabular-data import tools available on Swivel, Many Eyes, Timetric and other sites. From the demo I can’t tell whether it will be possible to save the set of transformation rules applied to one dataset and use it again later. This would be extremely useful for massaging periodic datasets every time the source is updated.
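To make the idea concrete, here is a minimal sketch of what a saved, replayable set of transformation rules might look like. This is not Gridworks's actual format or API (the demo doesn't show one); the rule names, the `apply_rules` helper and the sample row are all hypothetical.

```python
import json

# Hypothetical rule set: each rule names an operation and the column it applies to.
# Stored as plain data so it can be saved to disk and replayed later.
RULES = [
    {"op": "strip", "column": "country"},
    {"op": "replace", "column": "country", "old": "U.S.", "new": "United States"},
    {"op": "to_number", "column": "gdp"},
]


def apply_rules(rows, rules):
    """Return new rows with each rule applied in order; input rows are not modified."""
    result = [dict(row) for row in rows]
    for rule in rules:
        col = rule["column"]
        for row in result:
            value = row[col]
            if rule["op"] == "strip":
                row[col] = value.strip()
            elif rule["op"] == "replace":
                if value == rule["old"]:
                    row[col] = rule["new"]
            elif rule["op"] == "to_number":
                row[col] = float(value.replace(",", ""))
            else:
                raise ValueError(f"unknown op: {rule['op']}")
    return result


# Round-trip the rules through JSON, as if saved after the first cleanup session.
saved = json.dumps(RULES)

# Later, the source publishes an update: replay the same rules on the new rows.
# The values here are made up for illustration.
new_rows = [{"country": " U.S. ", "gdp": "1,234.5"}]
cleaned = apply_rules(new_rows, json.loads(saved))
```

Because the rules are plain data rather than code, they could be stored alongside the dataset and rerun unchanged whenever the source publishes a new release.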

According to the Freebase blog, Gridworks will be released as open source next month.

04 Jun 09

Google Squared – vaguely cool, but really crude

I don’t get Google Squared.

At first glance it seems interesting: returning search results as a list of items, with common attributes in columns.

But in its current state I don’t see how to make this actually useful for anything. James Turner seems much more excited about it, but he points out a lot of deficiencies. There’s no About or FAQ on the site, so there’s no telling what the Google Labs team is up to with this. It’s clearly in a very early alpha state. Comparisons with Wolfram Alpha are off the mark; I don’t see any similarity between the two services.

For example, consider a search for US presidents. A reasonable expectation is that you’d get back something similar to the Wikipedia List of Presidents. But no:

  • The list caps out at 7 (apparently random) items. The number of U.S. presidents is well-known and small; in such cases the list should automatically be comprehensive.
  • There’s no sorting option, and no column reordering. Those are presumably oversights.
  • Recording data provenance for each cell in the square is very useful. But you should be able to identify a preferred source for a particular column, rather than having to do it for every individual cell.
  • No sharing! If I build a comprehensive square on a topic, then I should be able to make that available to others as the default search result, replacing the minimal square automatically generated by Google.

If Google adds some basic community features to Squared, it might just become minimally useful.

29 May 09

Timetric: new time-series data visualization service

Promising new entrant in the online data visualization field: Timetric, developed by a small start-up team in the UK.

The current repository of almost 100,000 series is skewed heavily towards UK economic data sources. Users can contribute additional data series, but the supported import format is currently fairly restrictive; Timetric cannot interpret and convert files that do not correspond to its specification.

Data series can also be accessed as RSS feeds, though this seems to be just an XML version of the tabular data. Data series acquired from external URLs do not appear to be “active”; there’s no apparent way to refresh the series from the source.

Charts are in Flash. I’m not a big fan of Flash, but the Timetric presentation is quite polished. HTML embedding is supported. There are also nice PNG sparklines, likewise embeddable.

Unique to Timetric is their support for data manipulation within the system: you can combine two or more existing series using an Excel-like syntax for specifying calculations.
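As a rough illustration of what combining series means, the sketch below derives a new series from two existing ones by applying a formula date by date. It is plain Python, not Timetric's formula syntax; the `combine` helper, the series names and the numbers are invented for the example.

```python
def combine(series_a, series_b, formula):
    """Apply formula(a, b) at every date present in both series."""
    shared_dates = sorted(set(series_a) & set(series_b))
    return {date: formula(series_a[date], series_b[date]) for date in shared_dates}


# Made-up quarterly figures keyed by ISO date strings.
exports = {"2008-03-31": 100.0, "2008-06-30": 110.0, "2008-09-30": 120.0}
imports = {"2008-06-30": 90.0, "2008-09-30": 125.0, "2008-12-31": 130.0}

# Equivalent in spirit to a spreadsheet formula such as "=A - B".
trade_balance = combine(exports, imports, lambda a, b: a - b)
```

Restricting the result to dates that appear in both series is one design choice among several; a real service would also have to decide how to handle series with different frequencies or gaps.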

Timetric is off to an impressive start. They’re already publishing two blogs and they are twittering. I see their biggest challenge as the same one that faces Swivel, Many Eyes and others: what’s the business model?

26 May 09

Wolfram Alpha has a long way to go

I expressed some excitement earlier about the forthcoming release of the Wolfram Alpha “computational knowledge engine”. Alpha went live over a week ago; here are my impressions so far.

“Wolfram|Alpha’s long-term goal is to make all systematic knowledge immediately computable and accessible to everyone.”

Alpha is still a long way short of achieving this goal of making knowledge “computable” and “accessible to everyone”. In its current form, Alpha gives answers, but provides no access to the underlying data or any way to explore further.

Let’s use a sample query: united states gdp

For results, Alpha provides a recent (2007) GDP figure and a chart showing the annual trend since 1970.

Unfortunately, there’s no way to download the data table behind the chart, which is to say, the data is not “accessible to everyone”. Apparently “computable” refers only to the ability of the Mathematica engine (which powers Alpha) to do computation; the service does not help users to do further computation of their own. There is a promising “Live Mathematica” link on every results page, but this is worthless: the notebook that is presented in Mathematica is not live at all, and there is no way to do further local computation on the data. The chart displayed by Alpha was produced by Mathematica running on their server, but there are no options to adjust the chart format, and since you can’t get the data table you can’t create your own chart.

The obvious next step would be to get the data from the original source. But the “Source information” link gives only a list of references, with no links to actual data.

[Image: alphasource.png]

These limitations make Alpha, for now, fairly useless for real research and investigation. But it’s a promising start. There’s a wealth of factual information in the repository and the search-like user interface is not intimidating. I look forward to future releases that might bring some of Wolfram’s goals truly within reach.

29 Apr 09

Still no progress on open data visualization

I wrote a year ago that visualization services are stagnating. Other than yesterday’s announcements of Wolfram Alpha and public data in Google search, has there been any progress on online analytics and data visualization?

No.

Swivel has completely refocused on business services, but business.swivel.com is still in beta. A new competitor, Good Data, has appeared, but has nothing new to offer. Many Eyes is now powering visualizations for the New York Times, but there have been no significant new capabilities added to the service.

Currently my hopes are pinned on Wolfram Alpha. What will the next year bring?

29 Apr 09

Public data in Google search results

Google search now returns links to Google-generated data visualizations.

Currently, there are only two datasets available through this mechanism, “unemployment rate” and “population”. It’s unclear when we’ll see additional datasets offered through this channel.

The Google capability may be useful for quick viewing, but (so far) this is a toy:

* there’s no way to download the visualization, or the raw data;
* links are provided to data source information pages (US Bureau of Labor Statistics and Census Bureau), but not to the data displayed on the chart;
* no analytic capabilities.

Google’s decision to announce on the same day as the Alpha debut was a childish prank. I suspect that Nova Spivack has it right and Erick Schonfeld is off the mark – but Erick is correct that Wolfram’s service is not live yet and we can’t know for certain what has been achieved until we see it for real.

29 Apr 09

Wolfram Alpha is a big deal

After months with little activity in the open analysis area, today brings two major announcements. This post covers the first, and most significant: Wolfram Alpha, from the same company that produces the excellent Mathematica software.

Today Stephen Wolfram presented a preview of the upcoming Alpha product at Harvard’s Berkman Center (archived video to be available shortly). As had been rumored for weeks, the system is impressive. But it isn’t open to the public yet, and a firm launch date has not been announced.

Wolfram refers to Alpha as a “computational knowledge engine”, whatever that means. Alpha acts like a search engine (you enter some text, and get back a page of results), but instead of finding information located elsewhere on the web, it delivers answers to factual questions. Examples demonstrated by Wolfram included “gdp of france”, “weather in lexington”, and tracking the orbit of the international space station. In all these cases, Alpha returned not only a current value but also time-series historical data presented graphically plus other relevant contextual data.

Alpha brings together a powerful combination of capabilities behind a simple web interface. Alpha gathers data from a large and growing set of external sources (including nearly-live financial and weather feeds); allows human curators to provide the semantic instructions for transforming the source data into a form that can be used by the engine; and uses custom software built on top of Mathematica to respond to queries across the entire repository.

If Alpha is truly as impressive as it looks in the demo, and it can handle the scalability challenge of public availability, then this is a huge advance on the state of online analytical tools. Having a mathematical engine available behind the scenes makes possible a much wider range of inquiry than the search-and-retrieve model provided today by search engines and data visualization services. Alpha’s capabilities won’t be duplicated quickly or easily by competitors.



