This article originally appeared on the BeyeNETWORK.
In high school geometry we referred to it as “reductio ad absurdum.” This is the method of proof that proceeds by assuming a proposition and then showing that it leads to a contradiction, thus demonstrating the proposition to be false. While there are many tools in the business intelligence toolkit, we want to talk a bit about knowledge discovery and how the “reductio ad absurdum” approach can sometimes be useful.
As we know, data mining is a part of the knowledge discovery process and they are both strongly intertwined with data warehousing. The advances leading to mainstream data warehousing constitute the seminal point of departure from which we have to mark the start of contemporary analytical processing. It was data warehousing that served as the first thrust for the rigorous and methodical production of knowledge from our data.
Data mining is about finding meaningful new correlations, patterns and trends in large amounts of data and thus extracting previously unknown information from mountains of bits and bytes. It applies pattern recognition technologies as well as machine learning, statistical and visualization techniques to accomplish its objective.
It is in the manipulation of an enterprise’s data through these business intelligence and database management tools that we start the ascent from bits and bytes, to data, information, intelligence and knowledge. All sound knowledge management environments are richly supported through data warehouses and data marts that are mined to produce business intelligence and hence important contributions for the different communities of practice. Furthermore, it has been through the emergence of information technology and computer systems that we have been able to make substantial inroads.
These insights tie very directly to our concept of business intelligence; not so much in terms of the software tools, but rather the approach. What we are often involved with here is the continuation of the age-old process of obtaining meaning from a collection of data points or observations. Daniel Boorstin, the former Librarian of Congress, sheds light on this phenomenon in a fascinating essay titled “The Age of Negative Discovery,” which appears in his book Cleopatra’s Nose. Boorstin points out that “for most of Western history interpretation has far outrun data.” For example, faced with the magnificence of a clear night sky replete with millions of light points visible to the naked eye, ancient man soon surmised the existence of a heavenly bear, a hunter, a fish or a swan. Out of this approach to data visualization were born the constellations, long before we knew much about stars, planets, black holes or the big bang. He goes further to show how discovery becomes more difficult when we already know a fair amount, as he recounts Capt. James Cook’s second voyage (1772–1775), during which the British explorer made the first recorded crossings of the Antarctic Circle. Cook managed to sail round Antarctica until pack ice blocked his route, yet was often befuddled as to whether what he was seeing were islands or truly a new continent. The annotations and comments in his log show him almost engaging in the process of negative discovery: trying to deduce what is by discarding what cannot be.
Boorstin notes this “modern tendency…[where] we see data outrun meaning.” He attributes this “outrun” to the advent of “mechanized observers,” machines that generate such vast numbers of observations, or data points, that it becomes essential for us to learn to navigate these oceans of facts. The sensors of our day produce data points in numbers such that we have had to rely on a new vocabulary to quantify them. We now feel comfortable going beyond the kilobytes and the megabytes, and work with new terms such as terabytes, petabytes, exabytes or even yottabytes (or brontobytes, take your pick).
The essential insight in all this has to do with the importance of negative discovery. In other words, discovering that which is not and hence allowing us to discard all data, through analysis, that does not contribute to a better understanding of reality. The implication for knowledge management is that in order to “use our collective intelligence” we must increasingly use tools and techniques that enable us to interpret large amounts of data as we strive to achieve understanding. And in order to do this, “reductio ad absurdum” can be quite useful.
Today, it becomes extremely difficult to actually establish, for example, that a new star or subatomic particle has been discovered. Let’s take the former example first. Usually we start with a small but known segment of the sky and use computers to minutely analyze changes in the observations conducted with a telescope on that segment over a period of time. Then we proceed to a very detailed analysis in which we look for change. What is different now that was not there in our previous patterns or observations? From there we often formulate a proposition, show that it leads to a contradiction, and hence prove it false. This process will often lead us to the conclusion we are looking for.
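The “what is different now?” step above can be sketched in a few lines of code. Everything here is a hypothetical illustration: the coordinate pairs, the catalog format and the matching tolerance are invented for the example, not drawn from any real survey.

```python
# Sketch: compare a new set of observed sky points against a previous
# catalog and keep only the unmatched ones -- candidates for "change."
# Coordinates (right ascension, declination) and tolerance are invented.

def find_new_points(previous, current, tol=0.01):
    """Return points in `current` with no match in `previous`
    within `tol` (a crude nearest-neighbor test on each axis)."""
    novel = []
    for (x, y) in current:
        matched = any(abs(x - px) <= tol and abs(y - py) <= tol
                      for (px, py) in previous)
        if not matched:
            novel.append((x, y))
    return novel

# Two "snapshots" of the same sky segment, taken some time apart:
previous = [(10.000, 41.269), (10.684, 41.300)]
current = [(10.001, 41.269), (10.684, 41.300), (12.500, 40.000)]

print(find_new_points(previous, current))  # only the unmatched point
```

Each surviving point is then a proposition to be tested: is it a genuine novelty, or can we show the assumption “this is new” leads to a contradiction (a known object, an instrument artifact) and discard it?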
If we take the case where we are in search of a new particle, the process is similar though the toolkit will obviously vary. We may be bombarding an isotope with certain particle beams in order to record the result of the collisions in the paths of the subatomic byproducts as they stream through a heavy water pool. Ultimately, these data points will be captured in computers and we will proceed to the long and protracted process of analysis in order to produce business intelligence. Have we found a new particle or is it just another instance of one we already know? Or, is it an error in the experiment? Again, we may often rely on “reduction to the absurd” in order to prove or disprove the hypothesis.
This approach is often used today by scientists in the labs at NASA, NOAA (National Oceanic and Atmospheric Administration), the National Weather Service, many National Labs and other government agencies. It can be a powerful tool in other settings too. For example, as we look to determine what has changed in a very large database, or data warehouse, after a refresh, there might be some benefit in seeing what we can learn through the negative discovery process.
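The database-refresh example lends itself to the same treatment: discard every row that is unchanged, and what remains is the discovery. A minimal sketch, with an invented table of customer keys and values standing in for the warehouse snapshots:

```python
# Sketch of negative discovery on a refreshed table: compare the key sets
# of the before/after snapshots and discard everything that is unchanged.
# The keys and row values below are invented for illustration.

before = {101: ("Acme", 5000), 102: ("Globex", 7500), 103: ("Initech", 1200)}
after = {101: ("Acme", 5000), 103: ("Initech", 1900), 104: ("Umbrella", 300)}

inserted = sorted(after.keys() - before.keys())   # keys new in this refresh
deleted = sorted(before.keys() - after.keys())    # keys that disappeared
updated = sorted(k for k in before.keys() & after.keys()
                 if before[k] != after[k])        # keys whose values changed

print(inserted, deleted, updated)  # [104] [102] [103]
```

By eliminating the rows that cannot be interesting (the unchanged ones), the analyst is left with a small residue of inserts, deletes and updates worth interpreting.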
Editor's note: More government articles, resources, news and events are available in the Business Intelligence Network's Government Channel. Be sure to visit today!
Dr. Barquin has been President of Barquin International, a consulting firm, since 1994. He specializes in developing information systems strategies, particularly data warehousing, customer relationship management, business intelligence and knowledge management, for public and private sector enterprises. He has consulted for the U.S. military, many government agencies, and international governments and corporations.
Dr. Barquin is a member of the E-Gov (Electronic Government) Advisory Board, and chair of its knowledge management conference series; member of the Digital Government Institute Advisory Board; and has been the Program Chair for E-Government and Knowledge Management programs at the Brookings Institution. He was also the co-founder and first president of The Data Warehousing Institute, and president of the Computer Ethics Institute. His PhD is from MIT. Dr. Barquin can be reached at email@example.com.