Data Mining Business Intelligence

Data Hygiene - Database Bullets to Dodge

Learn how good data hygiene saves money and boosts campaign response rates, ROI, sales and profits.

by John Trewolla, Principal Advisor, Management Analytics Group
CLEAN DATA IS THE FOUNDATION of marketing success. You can learn more about how data hygiene (getting "clean data") fits into the "big picture" of data-driven marketing by reviewing "What Goes Into a Management Analytics Database?"

WHAT IS DATA HYGIENE?

Data Hygiene refers to maintaining your data in a way that is complete, consistent and correct. In the technical language of databases and computers, this is called "data integrity". It's simple to understand even if it may take some work to achieve.

"Complete" means that each record in your database contains each piece of data that is essential. For example, each customer or patron record must have a name and address information before it is useful.
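As a rough illustration, a completeness check can be sketched in a few lines of Python. The field names and sample records below are made up for the example and are not taken from any particular database; substitute the fields that are essential for your own mailings.

```python
# Hypothetical field names; adjust them to match your own database layout.
REQUIRED_FIELDS = ("first_name", "last_name", "address1", "city", "state", "zip")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a record."""
    return [
        field for field in required
        if not str(record.get(field) or "").strip()
    ]


# Made-up sample records, for illustration only.
customers = [
    {"first_name": "John", "last_name": "Smith", "address1": "12 Elm Street",
     "city": "Springfield", "state": "IL", "zip": "62704"},
    {"first_name": "", "last_name": "Jones", "address1": "4 Oak Avenue",
     "city": "Springfield", "state": "IL", "zip": None},
]

# Map each incomplete record's position to the fields it is missing.
incomplete = {
    index: missing_fields(record)
    for index, record in enumerate(customers)
    if missing_fields(record)
}
```

Here `incomplete` lists only the second record, which lacks a first name and a zipcode. A report like this is easiest to act on when it is run at data entry time rather than just before a mailing.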

"Consistent" means that all of the data of a certain type are stored in the same way. For example, each customer's name is stored as first-name-first ("FNF") or last-name-first ("LNF") -- but not intermixed. Another example would be that no comments or notes are stored in a field reserved for ADDRESS.
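One simple way to spot intermixed name formats is sketched below. It relies on a heuristic that is only an assumption: a comma in the name field ("Jones, Mary") suggests last-name-first, and anything else is treated as first-name-first. Real name data will contain exceptions (for example "John Smith, Jr."), so treat the result as a prompt for review, not a verdict.

```python
from collections import Counter


def name_order(name):
    """Guess the storage order of a name: a comma suggests last-name-first."""
    return "LNF" if "," in name else "FNF"


def order_counts(names):
    """Count how many non-blank names appear to use each convention."""
    return Counter(name_order(name) for name in names if name.strip())


# Made-up sample names, for illustration only.
names = ["John Smith", "Jones, Mary", "Ann Lee"]

counts = order_counts(names)
mixed = len(counts) > 1  # True when both conventions appear in the same field
```

If `mixed` is true, the minority format is usually the one to convert.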

"Correct" means that the data are accurate and free from typos and wrong information. For example, it means that each customer's name is spelled correctly and that their zipcode does not have any missing or reversed digits.
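A format check catches part of this problem. The sketch below assumes U.S. zipcodes in either five-digit or ZIP+4 form. It flags missing digits and stray letters, but it cannot detect reversed digits, because a zipcode with two digits swapped still looks valid. Catching that kind of error requires comparing the address against a reference source.

```python
import re

# Assumption: U.S. zipcodes only, as five digits or ZIP+4 (12345-6789).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def zip_format_ok(zipcode):
    """Return True when the value has the shape of a U.S. zipcode."""
    return bool(ZIP_PATTERN.fullmatch(zipcode.strip()))


# Made-up sample values, for illustration only.
samples = ["62704", "6270", "62704-1234", "62O04"]
bad_zips = [value for value in samples if not zip_format_ok(value)]
```

In this sample, `bad_zips` holds the four-digit value and the one where the letter "O" was typed in place of a zero.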

Errors like these are unavoidable, of course. No database is completely free from errors. Nevertheless, each instance of a data hygiene problem reduces the value of a database. If there are many such problems, it can require quite a lot of work, money and time to clean up the database enough to be able to profitably use it for marketing purposes.

HERE ARE EXAMPLES OF COMMON DATA HYGIENE PROBLEMS

Here is a list of some of the common data hygiene errors ("Database Bullets to Dodge") that we have encountered. See if any of them look familiar. The bad news is that every database -- even a small one -- has some of these problems. The good news is that Management Analytics Group has developed several tools and procedures for fixing most of the problems listed below.

Dodge this Bullet...

that shows up this way...

to avoid this problem.

Two unrelated people in one name field.
“Family” names.
Blank Lastname fields.
Company name in “Last Name” field.
No apartment or suite number.
Not tracking first name.
Not updating for NCOA (National Change of Address) at least every year.
Mailing to a nonexistent address.
Not tracking address suffix (Street, Avenue, Terrace, Place, Circle, etc.)
Not tracking directionals (West, East, etc.)
Having both a WORK address and a RESIDENTIAL address for the same person as two separate records in the database.
Different street numbers for the same person.
Missing or Incorrect zipcode.
Misspelled Street Name.
Using “Address1” or “Address2” fields for non-essential information.
Using Address1 or Address2 fields for COMPANY name.
Mixing foreign addresses in with domestic addresses.
Using any address field for DO NOT CONTACT or DECEASED.
Using any address field for notes and comments.

Same patron in system as J. Smith and John Smith.

Names longer than 35 characters.
Address names longer than 35 characters.
City names longer than 25 characters.
Non-printing (hidden or invisible) characters in any field.
Sending data in formats other than ASCII CSV, ACCESS or DBF tables.
Expecting 100% accuracy in mailing and/or telemarketing lists.
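Several of the items above lend themselves to automated screening. The sketch below checks the length limits given in the list (35 characters for names and addresses, 25 for city names), looks for non-printing characters, and flags DO NOT CONTACT or DECEASED markers stored in address fields. The field names are hypothetical, and applying the 35-character limit to both address lines is an assumption made for the example.

```python
# Length limits from the list above; the field names are hypothetical.
MAX_LENGTHS = {"name": 35, "address1": 35, "address2": 35, "city": 25}
ADDRESS_FIELDS = ("address1", "address2")
STATUS_FLAGS = ("DO NOT CONTACT", "DECEASED")


def field_problems(record):
    """Return a list of hygiene problems found in one record."""
    problems = []
    for field, value in record.items():
        text = str(value or "")
        limit = MAX_LENGTHS.get(field)
        if limit is not None and len(text) > limit:
            problems.append(f"{field}: longer than {limit} characters")
        # str.isprintable() is False for tabs, line breaks and control codes.
        if not text.isprintable():
            problems.append(f"{field}: contains non-printing characters")
        if field in ADDRESS_FIELDS and any(
            flag in text.upper() for flag in STATUS_FLAGS
        ):
            problems.append(f"{field}: holds a status flag instead of an address")
    return problems


# Made-up sample record, for illustration only.
record = {
    "name": "John Smith",
    "address1": "12 Elm Street\t",  # trailing tab is invisible on screen
    "address2": "Deceased",
    "city": "Springfield",
}
problems = field_problems(record)
```

For this record, `problems` reports the hidden tab in `address1` and the status flag in `address2`. Status information such as DECEASED is better kept in a dedicated field so the address fields contain only address data.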


WHAT'S THE BOTTOM LINE?

The best time to take care of data hygiene is during data entry – before “dirty data” contaminate your database. That is, "dodge" these "database bullets" before they hit you.  Investing in the effort to train your data entry team about these common problems always pays off in avoided costs and problems.

And, when using lists obtained from list brokers or other organizations, be extra careful about merging in "dirty" data. As with other kinds of hygiene problems, prevention is a lot easier and less costly than a cure!

© Copyright 2009 Management Analytics Group LLC. All rights reserved.