It's a common problem that even the U.S. government hasn't yet solved: How to get departments to share data across organizational boundaries.
In the most recent illustration of the problem, the government's failure to prevent the so-called underwear bomber from boarding a plane to the U.S. with explosives in his trousers on Christmas Day wasn't due to a lack of intelligence data, President Obama said.
Rather, it happened because U.S. intelligence agencies failed to share data and make sense of it -- or "connect the dots," as some say -- to identify the threat before it materialized.
"I will not accept that," Obama said three days after the failed attack. "We have to do better. We will do better, and we have to do it quickly. American lives are on the line."
While organizational and cultural issues are partly to blame for the lack of data sharing, a number of data management technologies, if used correctly, could help the president, as well as organizations of any type, achieve the goal, experts and vendors say.
Enterprise search to find missing pieces
One such technology is enterprise search. Enterprise search allows users to search among an organization's data via keywords regardless of which department or division "owns" the data, said Sid Probstein, CTO at Newtonville, Mass.-based Attivio.
"The government has a very real problem filtering through that data to find the pieces that will actually help analysts," Probstein said. "The government needs tools which are able to help organize filtered data."
Users can set up queries such that only results that fit particular requirements are returned. This type of technology, had it been in use, might have connected the suspect's name -- which was on a TSA watch list -- with data held on him by the U.S. Embassy in Nigeria, prompting an alert to be sent to analysts, Probstein said.
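As a rough illustration of the kind of filtered, cross-department query Probstein describes, the minimal Python sketch below builds a keyword index over records held by different owners and returns matches only when they span more than one department. It is not Attivio's product or API; the `Record` type, the department labels, the `min_owners` requirement and the placeholder names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    owner: str  # department that holds the record (hypothetical label)
    text: str   # free-text content of the record


def build_index(records):
    """Map each lowercase keyword to the positions of the records containing it."""
    index = {}
    for pos, rec in enumerate(records):
        for word in rec.text.lower().split():
            index.setdefault(word.strip(".,"), set()).add(pos)
    return index


def search(records, index, keywords, min_owners=1):
    """Return records containing every keyword, but only if the matches
    come from at least `min_owners` different departments."""
    hits = None
    for kw in keywords:
        found = index.get(kw.lower(), set())
        hits = found if hits is None else hits & found
    matched = [records[i] for i in sorted(hits or set())]
    owners = {rec.owner for rec in matched}
    return matched if len(owners) >= min_owners else []


# Placeholder data: the names and departments are invented for illustration.
records = [
    Record("watch_list", "Jane Doe flagged for review"),
    Record("embassy", "Report filed about Jane Doe by a relative"),
    Record("embassy", "Routine visa renewal for John Roe"),
]
index = build_index(records)

# Only surface a name when more than one department holds data on it.
cross_agency_hits = search(records, index, ["jane", "doe"], min_owners=2)
```

The index ignores who "owns" each record when matching; ownership only comes into play as a filter on the results, which is the behavior the article attributes to enterprise search.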
CEP and data federation tap multiple data sources
Complex event processing (CEP) and data federation technologies can also help connect disparate data sources, analysts agree.
CEP technology automates the process of integrating large volumes of data in near-real time from multiple data sources to identify events of interest based on sets of preconfigured conditions.
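To make the idea of "preconfigured conditions" concrete, here is a small Python sketch of one such condition: raise an alert when the same subject shows up in events from at least two different sources within a time window. It is a toy stand-in rather than any vendor's CEP engine; the event format, the one-hour window and the source names are assumptions.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # assumed correlation window of one hour


def detect(events, window=WINDOW_SECONDS, min_sources=2):
    """Yield (timestamp, subject) whenever a subject has been reported by at
    least `min_sources` distinct sources within the last `window` seconds.

    `events` is an iterable of (timestamp, source, subject) tuples in time order.
    """
    recent = defaultdict(deque)  # subject -> deque of (timestamp, source)
    for ts, source, subject in events:
        queue = recent[subject]
        queue.append((ts, source))
        # Drop events that have aged out of the window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        if len({src for _, src in queue}) >= min_sources:
            yield ts, subject


# Hypothetical merged event stream from three sources.
stream = [
    (0, "watch_list", "subject-17"),
    (100, "airline", "subject-42"),
    (1800, "embassy", "subject-17"),
    (9000, "airline", "subject-17"),
]
alerts = list(detect(stream))  # [(1800, "subject-17")]
```

The event at 9000 seconds raises no alert because the two earlier reports on the same subject have already fallen out of the window by then.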
Data federation technology, also called enterprise information integration and data virtualization, likewise draws data from disparate source systems but integrates it in a middleware or virtualized layer before passing it on to a business intelligence (BI) or other analytic application, according to Jim Kobielus, an analyst with Cambridge, Mass.-based Forrester Research.
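The virtualized layer can be pictured with a short Python sketch: each source keeps its own data and field names, a thin adapter maps it to a shared schema, and a combined view is assembled only when a query is made. The source systems, field names and adapter functions below are hypothetical, and real federation products handle far more, such as query pushdown and security.

```python
# Two hypothetical source systems with different field names.
VISA_SYSTEM = [{"applicant": "Jane Doe", "visa_status": "valid"}]
WATCH_SYSTEM = [{"full_name": "Jane Doe", "listed": True}]


def visa_adapter():
    """Expose visa rows under the shared schema without copying the source."""
    for row in VISA_SYSTEM:
        yield {"name": row["applicant"], "visa_status": row["visa_status"]}


def watch_adapter():
    """Expose watch-list rows under the shared schema."""
    for row in WATCH_SYSTEM:
        yield {"name": row["full_name"], "on_watch_list": row["listed"]}


def federated_view(adapters):
    """Merge rows from every adapter into one record per name at query time."""
    merged = {}
    for adapter in adapters:
        for row in adapter():
            merged.setdefault(row["name"], {}).update(row)
    return list(merged.values())


# The analytic application sees a single combined record per person.
view = federated_view([visa_adapter, watch_adapter])
```

Because the view is rebuilt on each call, a change in either source system appears in the next query without a separate load step.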
But being able to identify possible threats by integrating and analyzing disparate data through CEP or data federation technology solves only part of the problem, according to Randy Wood, vice president of public sector technology for Informatica. The technology itself must also be simple and agile enough for intelligence analysts to use and amend as conditions on the ground evolve, Wood said.
"It's one thing to capture data, it's another thing to integrate all your disparate data into a single view," he said. "The key is to expose all of those capabilities ... to end users so they themselves can use the technology and do so in a powerful but simple way."
For example, Informatica's CEP technology taps a rules-based user interface, Wood said, so analysts can change the conditions the system is set to detect by filling in text fields, with no coding or IT assistance needed.
Other BI vendors, including SAP BusinessObjects and IBM Cognos, are meanwhile working to make the interfaces for their software more user-friendly.
Data quality to match identity data
Yet another missing piece of the puzzle is data quality. There are often multiple spellings of suspected terrorists' names -- Osama bin Laden or Usama bin Laden, for example -- that must be resolved.
It has been reported that State Department officials did check to see whether the so-called underwear bomber -- Umar Farouk Abdulmutallab -- had a valid U.S. visa. Abdulmutallab's name was misspelled in the visa database, however, and the State Department's search came up empty.
In another recent high-profile case, an eight-year-old Cub Scout from New Jersey was patted down and searched when he tried to board a flight because he had the same name -- Michael Winston Hicks -- as a person of interest on a TSA watch list.
Data quality software from vendors like Informatica, DataFlux, and Trillium Software can search and match identity data, even taking into account small differences in spelling or language.
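The basic idea of tolerating small spelling differences can be sketched with Python's standard `difflib` module. This is a generic string-similarity stand-in, not the matching method any of the vendors named above uses; the `normalize` and `same_identity` helpers and the 0.85 threshold are assumptions chosen for illustration.

```python
import difflib
import re


def normalize(name):
    """Lowercase the name, drop punctuation and collapse whitespace."""
    letters_only = re.sub(r"[^a-z\s]", "", name.lower())
    return " ".join(letters_only.split())


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_identity(a, b, threshold=0.85):
    """Treat two spellings as a likely match above an assumed threshold."""
    return similarity(a, b) >= threshold


# The two spellings mentioned above differ by one letter and still match.
likely_match = same_identity("Osama bin Laden", "Usama bin laden")

# An unrelated name falls well below the threshold.
unrelated = same_identity("Osama bin Laden", "Michael Winston Hicks")
```

A looser threshold catches more misspellings but also produces more false matches, and spelling similarity alone cannot tell apart two different people who share a name, as in the Hicks case.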
Organizational issues remain
Technology can go only so far, of course. Ultimately, to improve the government's ability to "connect the dots," intelligence and defense agencies have to do a better job of cooperating with one another and limiting "turf battles," analysts and vendors agree.
After all, if the State Department won't let the C.I.A. -- or, for example, the sales department won't let the marketing department -- tap into its data sources, no amount of enterprise search, CEP, data federation or any other technology will be of much use.
A lack of technology, in other words, is not the sole issue. "I think the problem that governments and intelligence agencies and other organizations face generally are political and organizational problems," Attivio's Probstein said. "The government probably needs to invest in more technologies but also in breaking down these silos."