
What They Don't Tell You About Data Mining

Learn why successful data mining projects are led by marketers, not IT staff, and how to avoid common pitfalls.

by Alan Weber & John Trewolla, Management Analytics Group
(Published in Ingram's magazine, 8/2002)
DATA MINING IS TYPICALLY APPROACHED AS A TECHNICAL CHALLENGE. Get a big enough data warehouse with a fast enough computer, load in all the data you can get and voilà! A successful data mining effort is born.

If the purpose of data mining were to build a big, impressive computer system that your vendors and IT department could be proud of and brag about to fellow techies, most data mining efforts would be a complete success. Unfortunately, most corporate data mines end up getting little or no use.

The purpose of data mining should be to build knowledge and understanding to make better decisions. The idea is that decisions based upon hard, factual data are more reliable. Technical knowledge contributes to the availability of data, but does not necessarily contribute to the usefulness or understanding of the data.

Most data warehouses are full of space -- and spaces. Literally, most data warehouses (which are where data mining is done) are overdesigned in an attempt to capture everything about everybody. The business challenge of understanding the data is ignored by the technical specialists, who instead focus upon the technical challenge of storing data nobody understands.

As a result, the burden of using the data is so great that users quickly find reasons to avoid the new system. The burden of capturing and storing data that gets little use leads support people to conclude that they have better things to do, so they skip the work of loading current, potentially useful data. So, the "data warehouse" spins happily away, storing spaces instead of data.

THE BETTER APPROACH

Data mining is best done in the same way that other mining is done.

  1. First, know what you're looking for before you start.
  2. Second, do a high-altitude "fly by" to see if any territory looks like it might hold something interesting.
  3. Third, get the permission of the territory owner to do some preliminary exploration. (You won't get far if you're trespassing on someone's property in your search!)
  4. Fourth, extract some "ore samples" and test their quality to confirm that the "ore" is good enough to mine further (You might be dealing with bad data!)
  5. Finally, establish a consistent method for extracting the data so that each time you tap into the ore, you get consistent results. (Don't shoot down your conclusions with inconsistent data extraction procedures.)
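
The "ore sample" step can be tried with very little tooling. Below is a minimal sketch in Python, not a prescribed method: it pulls a random sample of records and reports what share of each field is blank. The function names, the field names and the sample records are all hypothetical, chosen only for illustration.

```python
import random


def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def blank_rate_by_field(records, sample_size=100, seed=0):
    """Return the share of blank values per field in a random sample.

    `records` is a list of dicts (one per row). A fixed seed keeps the
    sample the same from run to run, so results stay consistent.
    """
    rng = random.Random(seed)
    if len(records) <= sample_size:
        sample = list(records)
    else:
        sample = rng.sample(records, sample_size)
    if not sample:
        return {}
    fields = sorted({name for row in sample for name in row})
    return {
        name: sum(1 for row in sample if is_blank(row.get(name))) / len(sample)
        for name in fields
    }


# Hypothetical "ore sample": four customer records with some empty fields.
records = [
    {"customer_id": "C001", "region": "MO", "fax": ""},
    {"customer_id": "C002", "region": "  ", "fax": ""},
    {"customer_id": "C003", "region": "KS", "fax": None},
    {"customer_id": "C004", "region": "KS", "fax": "555-0100"},
]

rates = blank_rate_by_field(records)
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} blank")
```

A field that turns out to be mostly blank is a good candidate for a conversation with the domain expert: either it is not worth storing, or it matters and nobody is loading it.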

The best data mining is done by end-user domain experts, not the IT staff or programmers. Domain experts are specialists in different areas like marketing, research, production and so on. Too often, the IT staff does not understand the business relationships between the data elements. This kind of understanding is crucial to making sense out of the data. Technical skills are important, but no data mining effort can succeed without involving people who understand the business and the business objectives.

How much more cost-effective and agreeable it would be to design a data warehouse around an understanding of what data might prove useful! It would save on (1) the expense of buying data storage, (2) the frustration of entering useless data, (3) the resources required to provide and protect data privacy and (4) the time required to learn how to use the information.

Involving domain experts at an early stage of a data mining project is essential for success. For example, there is much information in corporate data systems that is incorrect or incomplete. Many systems used for sales support have a place for recording product cost. However, costs are usually managed by another department and are tracked on another system. As a result, the sales support system may not contain accurate cost-by-product information. Determining this early on, before decisions are made based upon faulty cost data, is crucial to making good marketing decisions.
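
The cost example above lends itself to an early, simple check. The following sketch, in Python, compares product costs recorded in a sales support system against those in the system that actually manages costs. It assumes both can be exported as product-to-cost tables; the function name, product names, figures and tolerance are hypothetical.

```python
def find_cost_discrepancies(sales_costs, official_costs, tolerance=0.01):
    """Compare costs in the sales system with the cost system of record.

    Both arguments map product name -> unit cost. Returns a dict of
    product name -> description of the problem found.
    """
    issues = {}
    for product, official in official_costs.items():
        recorded = sales_costs.get(product)
        if recorded is None:
            issues[product] = "missing in sales system"
        elif abs(recorded - official) > tolerance:
            issues[product] = (
                f"sales system has {recorded:.2f}, "
                f"cost system has {official:.2f}"
            )
    return issues


# Hypothetical exports from the two systems.
sales_costs = {"widget": 4.50, "gadget": 0.0}
official_costs = {"widget": 4.50, "gadget": 7.25, "gizmo": 2.00}

issues = find_cost_discrepancies(sales_costs, official_costs)
for product, problem in issues.items():
    print(f"{product}: {problem}")
```

Deciding which system counts as the one of record is itself a business question, which is one more reason to involve the people who manage costs before running a check like this.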

Before undertaking a major data mining project, ask two important questions.

  1. First, what problems are we trying to solve?
  2. Second, who understands the problem most clearly?

As you move ahead, stay focused upon the problems, involve the people who understand the problems and view technology as a means, not an end.
