What's the best way to guarantee our enterprise data mining is accurate and
that the trends/associations it picks up are reliable?
With data mining, I am not sure what guarantees you would be looking for. There is the exact
science part of data mining, which is looking back into historical data and determining, for
example, that 20% of customers who bought x also bought y. Backward-looking correlation analysis
and the fitting of trends to a regression line are exact sciences, and their results are only as
good as the data they are computed from. The more interesting, future-facing part of data mining,
however smart the technique applied, is never any better than an educated guess. As they
say, in the future, anything can and will happen. By the very act of "knowing" what may happen
in the future, we actually change it. I've also seen a few end clients determine data mining was
useless after a small, flawed "test." Data mining is really a mindset and should be adopted once
its merits are determined… and given a chance. It is unlikely to be deemed successful unless the
results are incorporated into a holistic program geared to the desired business result. In
other words, data mining does not stand alone on its own merits within an organization.
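
To make the "exact science" part concrete, here is a minimal sketch of the backward-looking calculation behind a statement like "20% of customers who bought x also bought y." The function name rule_confidence and the sample baskets are hypothetical, made up purely for illustration; they do not come from any particular data mining product.

```python
def rule_confidence(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all baskets that contain both items
    confidence = share of baskets with the antecedent that also contain
                 the consequent
    """
    total = len(baskets)
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequent in b]

    support = len(with_both) / total if total else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical purchase history: one set of items per customer.
baskets = [
    {"x", "y"}, {"x"}, {"x", "z"}, {"x"}, {"x"},
    {"y"}, {"z"}, {"y", "z"}, {"z"}, {"y"},
]

support, confidence = rule_confidence(baskets, "x", "y")
print(f"support: {support:.0%}, confidence: {confidence:.0%}")
# prints "support: 10%, confidence: 20%"
```

The arithmetic is exact for the history it is given, which is the point made above: the number is only as reliable as the baskets behind it, and it says nothing by itself about what customers will do next.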
We do want the guesses our data mining makes to be as educated as possible, and that's
where due diligence over the software selected comes in.
This was first published in February 2007