How Software Developers are Turning Data into Knowledge

This article directs your attention to three Web sites that give a glimpse into the future of business information and knowledge creation.

This article originally appeared on the BeyeNETWORK.

Giving away razors in order to sell razor blades is hardly a new idea, but that's pretty much where we are in the open source business intelligence (BI) software business these days. In fact, it's still pretty much where we are in a lot of businesses these days, and have been for the past few million days. 

Think, “free trial.” Magazine and periodical publishers have used this concept for years: send in the card for a free issue, and then they send you an invoice a month later. If you don't pay it, they'll send increasingly threatening letters, as well as a couple more issues, and then if you still don't pay, they let the matter drop. That's the way they do business, and it works, more or less.

The same goes for the old “free trial” of physical goods. It is a surefire way to sell a product that you know will be useful to your customer and become indispensable. The free trial works for anything the customer can become accustomed to, whether that is improved performance, greater luxury or whatever, as long as it also has high margins. So you used to see salesmen giving away vacuum cleaners to housewives. You don't often see people getting cars on a free trial, or other things that offer minimal benefits over the old model, nor do you see free trials for products that have razor-thin margins, such as TVs or other electronics.

On a global scale, we're still in the early days of integrating computing into our daily lives. Think about 100 or 200 years ago, before things such as engines, electric motors, electricity in general, steam power, and so on. As those things became a part of our lives, they started out as things in themselves: you didn't buy a specialized machine that incorporated electric motors or steam engines, you bought electric motors and steam engines (maybe a tractor of some sort), and then you hooked them up to the application you wanted to drive. 

It's the same with computers: you buy general-purpose boxes that compute like crazy, but you've got to do your own system integration. Instead of buying the digital equivalent of a washing machine, you buy a motor (the computer) and all of the parts appropriate to your application – operating systems, RDBMSs, application software, disk storage and other peripherals – and put it together yourself. 

All that is going away. Now, you can buy a data warehouse or digital entertainment center or smart vacuum cleaner in a box. Eventually, the whole business will become service-oriented rather than product-oriented. 

Open source software is an ideal lubricant for this kind of industry, as it is ubiquitous, supports almost every imaginable platform, and can be had for nothing – or it can create its own businesses that provide a higher level of support. 

Increasingly, we're seeing companies using software, whether open source or proprietary, as a key component of some application they provide as a service. When open source software is involved, all the better, but that's not necessarily a selling point. Merely offering access to the information service (or, more often, a knowledge mining application) is the primary benefit offered to customers. Software doesn't even come into the picture because the product is the service. The service, in turn, is the knowledge you are able to mine from the vendor's data sets.

I'd like to direct your attention to three Web sites that give a glimpse into the future of business information and knowledge creation, and show how some people are thinking about data, data representation, and how we generate knowledge and understanding from data, information and raw business intelligence. 

First, meet Swivel. "If you're curious about data, Swivel is the place for you," reads the tag line for startup Swivel, a place to explore and play with data. The home page is still labeled "Preview," which means "beta." While sometimes beta means "still figuring out how to turn a profit," Swivel has convinced CNET founder Halsey Minor, as well as other deep-pocket investors, that there will be a payday in the not-so-distant future.

Swivel's plan is to build up a respectable volume of public data sets, and of graphs built from those data sets, and then offer commercial services to companies wishing to do graphing with their own data sets as well as the public ones. The first step seems to be going well: well over half a million graphs and close to 2,000 data sets are available to the public at this writing. The next step involves offering business users options for working with their own private data sets and combining them with public data sets already on the site.

In many ways, Swivel looks like an open source software company: they accept contributions under the open-ish Creative Commons Attribution 2.5 license. They offer open access to information provided for public use, and you can get something for nothing. They even use an open source graphics package, Ploticus, to generate graphs.

However, if I read the fine print accurately, once you run data through Swivel and get output, Swivel slaps its own license on that content, a license that limits your rights to redistribute the graphs you generated with data you contributed. 

But Swivel is not (yet, anyway) an “open source play.” Despite depending on open source code for the service, some of the Swivel legal boilerplate seems to hint that at least some of the software is proprietary (though I am not a lawyer, of course). The magic buzzwords "open source" don't show up in the corporate/press pages either, so you couldn't really call Swivel an "open source" BI company in any practical sense. 

Swivel isn't about software, anyway. They enable aggregating and integrating data and using it to create knowledge. In other words, they provide a service. Just like any other company that sells BI products: you pay them (or, you will eventually be able to pay them), and they'll give you the tools to derive knowledge from data. 

They add value in two ways. First, you don't have to pay for the server farms and system administrators and BI stack software licenses and so on. And second, more importantly, they give you access to an aggregation of much more than your own data. You can chart your internal, private results against any (or all) of the relevant data that has been provided to Swivel from public sources.
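To make that second point concrete, here is a minimal sketch in Python of what charting private results against public data amounts to underneath: lining up two series on a shared key before plotting them. It is not Swivel's implementation. The function name combine_by_key and both data sets are hypothetical, and the numbers are invented purely for illustration.

```python
# Hypothetical illustration only: these values are made up and come from
# neither Swivel nor any real data set.
private_sales = {2004: 1.2, 2005: 1.5, 2006: 1.9}             # a company's own results
public_index = {2003: 98.0, 2004: 100.0, 2005: 104.5, 2006: 103.0}  # a shared public series


def combine_by_key(private, public):
    """Line up two {key: value} series on the keys they have in common.

    Returns a list of (key, private_value, public_value) tuples sorted by key,
    which is the shape a charting tool would need to plot both series together.
    """
    shared_keys = sorted(private.keys() & public.keys())
    return [(key, private[key], public[key]) for key in shared_keys]


rows = combine_by_key(private_sales, public_index)
for year, sales, index in rows:
    print(f"{year}\t{sales}\t{index}")
```

Only the years present in both series survive the join, so 2003 drops out here. A real service would also have to reconcile units, date formats and naming across contributors, which is a large part of the value of doing the aggregation in one place.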

Second, there is Trampoline Systems. They sell their own proprietary software, and they're not even providing any public platforms for playing with data. But they do have an interesting demonstration of how you could use their software to extract knowledge from great big ugly conglomerations of data.

Few things are bigger and uglier than Enron's e-mail archives, which were put into the public domain by the courts. What better resource for a company selling a cool BI metadata knowledge browser tool? At the Enron Explorer site, Trampoline Systems shows what its SONAR (Social Networks and Relevance) technology can do with 200,000 Enron messages sent by 150 executives. Use the tool to see who was talking to whom about what, what they knew and how they knew it, and what it all meant to Enron's customers and victims.

But the only thing “open” about Trampoline Systems is that they are using data that fell into the public domain: they've leveraged the raw data by making available an interesting – and ultimately, utterly useless for almost all of us – resource that does little more than showcase Trampoline's product. Not that there's anything wrong with that, either. 

Finally, we come to IBM's new plaything from the Visual Communication Lab: Many Eyes. Many Eyes is very cool; it's similar to Swivel in that you share your data sets and visualizations, and you can chat about the data and the visuals on the Web site – so it's a sort of community for data junkies. 

Many Eyes has only a few “eyes” so far: just a couple hundred data sets compared to the almost 2,000 on Swivel. But quality is more important than quantity, and some Swivel data sets cover things such as “How many drinks I've had in 2007.”

Even allowing that the quality of its data sets will surely drop as it becomes more popular, Many Eyes is still clearly superior. The big differences have to do with ease of use and interactivity. Graphical visualization of data is just a few clicks away from the Many Eyes home page. What's more, as you pick and choose the rows and columns you want to visualize from a data set, the actual visualization will shift to reflect your choices. Individual data points are displayed on mouse-overs, and the whole process is fast enough to be easily interactive for anyone.

Will IBM turn Many Eyes into a product? They should, and I hope they do. Even more, I hope they let the Many Eyes staff, mostly brilliant and driven scientists and researchers, turn it into a community-based service. That could turn into something very interesting.
