There is a tremendous amount of information stored in satellite imaging data, but discoverability is a persistent problem. The way in which images are processed and indexed can make it difficult for users to search through databases for meaningful information. But companies using Hadoop for analytics are helping to make this data more relevant.
Skybox Imaging is one company working to apply analytics to satellite images to leverage the data they contain. The company builds and operates high-resolution imaging satellites and makes the images, as well as data, available to customers. The image data can tell users the speed of a container ship as it moves through a waterway, the completion percentage of a construction project or how many empty spaces are in a parking lot.
Accomplishing this is no small feat. Julian Mann, a Skybox aerospace engineer and the company's vice president of product development, said the data pulled from its satellites wouldn't be recognizable to the average user on its own.
"If you took a single frame that came off the camera and tried to open it up in any imaging software that you might have on your computer, you wouldn't be able to see anything," Mann said. "It's a very raw sensory output."
Things get interesting when the company starts to organize and make sense of the data. Skybox partnered with Cloudera to implement Cloudera's distribution of Hadoop. The software framework is necessary to handle the data-intensive nature of satellite imaging, Mann said. Skybox customers can then embed their own algorithms in the company's platform and use the analytics engine to crunch the data for their own purposes. Agriculture clients can monitor crop yields. Shipping and supply chain companies can monitor vehicles. Oil and gas companies can evaluate land areas. All of this is based on looking for changes in imaging data by using Hadoop for analytics.
The availability of satellite images isn't necessarily new, but the ability to study changes is, and bringing Hadoop into image databases is what enables it. Mann said this is what allows users to assemble data, normalize it, index it and make meaningful connections. The satellite imaging industry may be generating up to 15 petabytes of data per year within the next four years, he said. It takes a strong analytics engine like Hadoop to organize and make sense of all this data.
This explosion of data is putting pressure on imaging and location intelligence companies to find ways to manage it all, but companies can avoid devoting too many internal resources to database management operations. Mann said this is why Skybox chose to partner with Cloudera. The arrangement allowed Skybox to focus on the things it's good at, like analyzing geospatial sensing data, while letting Cloudera handle the things it has more expertise in, such as implementing Hadoop, he said.
While it may be good for companies engaging in heavy analytics to have staff members who understand databases, Mann said it is not necessary to devote resources to nuts-and-bolts implementation steps when there are vendors out there who are willing to do these tasks. Contracting out these pieces allows businesses to leverage analytics to make meaningful business improvements rather than focusing on becoming experts at implementing Hadoop, Mann said.