The Data Exec Series: A less painful way to clean up your data
By Rob Kall, CEO & Co-founder Cien.ai
“I love what you guys are doing with AI, but I don’t think our data is ready for it. We’re working on fixing it now, let’s talk in 6 months”.
– Virtually Every CXO
The Bad Data Curse
After talking to hundreds of executives about data-driven transformations, two things are virtually always true:
1) They are extremely eager to adopt data-driven processes to improve their business.
2) They have serious doubts about whether the quality of their data will support it.
When we started Cien.ai we pitched our new AI-driven sales rep analysis, and we got this answer almost every time: “I love what you guys are doing with AI, but I don’t think our data is ready for it. We’re working on fixing it now, let’s talk in 6 months”. Guess what they said when we called back six months later? The same thing, of course! Data quality is a constant struggle for modern businesses.
Why Is It Hard To Fix Bad Data?
If you are experiencing some or all of the typical data quality issues (duplicated, incomplete, inconsistent, incorrect, or lagging data), there are multiple hurdles you need to overcome to address them:
1. Identify problematic fields and records & quantify the problem
2. Identify correct values (e.g., new master lists for inconsistent categories)
3. Fix the records through either manual or automatic methods outside your database.
4. Apply those changes to your primary database.
5. Ensure that no workflows and reports are broken due to these changes (this can be a big issue in complex organizations).
6. Measure the new fixed dataset to ensure that you actually solved at least some of the problems.
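Step #1 amounts to profiling your data and putting numbers on the problem. Here is a minimal sketch in Python using pandas, assuming a CRM export loaded as a DataFrame; the column names and example values are hypothetical, purely for illustration:

```python
import pandas as pd

# Hypothetical CRM export; the field names are assumptions for illustration.
records = pd.DataFrame({
    "email":    ["a@x.com", "a@x.com", None,   "b@y.com"],
    "industry": ["SaaS",    "SaaS",    "SaaS", None],
    "revenue":  [100,       100,       -5,     None],
})

def quality_report(df: pd.DataFrame) -> dict:
    """Quantify common data quality issues per field."""
    return {
        "duplicate_rows": int(df.duplicated().sum()),      # exact duplicates
        "missing_by_field": df.isna().sum().to_dict(),     # incomplete records
        "negative_revenue": int((df["revenue"] < 0).sum()) # an example rule check
    }

print(quality_report(records))
```

A report like this, run per field and per rule, gives you the baseline you will later compare against in step #6.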
No wonder it takes a long time… And because #5 above is a constant fear, there is always hesitation to actually pull the trigger. The result is the “We’ll have it fixed in 6 months” plan…
What Does Success Look Like?
So, to break down the problems above: steps #1, #3, and #6 can be completely automated (e.g., Cien.ai’s Automatic Data Enhancement tech performs those steps).
That cuts down enormously on time. When it comes to #2, the best idea is usually not to rock the boat: use existing categories and rules, and just eliminate the noise from old, inconsistent ones. If the categories and rules themselves don’t make sense, make that change separately from the actual cleaning.
That leaves you with #4 and #5. Anyone who has done a big data migration on a “live system” knows the stress when, all of a sudden, hundreds or thousands of users report problems caused by a change you just implemented. The way to avoid this is to make those changes one at a time. If you have cleaned-up data, don’t push it all in one day; you will have 3 weeks of hell afterward. Instead, fix one field at a time. Most fields will have no consequences, and for the ones that do, you can simply pause your updates and address the problem until it is solved.
Taking this approach, it’s reasonable to have a completely “cleaned up” primary database in less than a month. And if you do #6 continuously, you will even be able to bring the receipts showing that what you did had a positive impact (and that you are not slipping back because new data is still being entered with poor quality)!
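Doing #6 continuously can be as simple as computing a quality score after every cleaning pass and alerting when it drops. A minimal sketch, where the completeness metric is an assumption (any per-field score works the same way):

```python
# Fraction of required cells that are actually filled, from 0.0 to 1.0.
# Tracking this over time shows both the cleanup's impact and any slippage
# from new records being entered with poor quality.

def completeness_score(rows, required_fields):
    total = len(rows) * len(required_fields)
    if total == 0:
        return 1.0  # vacuously complete
    filled = sum(
        1
        for row in rows
        for field in required_fields
        if row.get(field) not in (None, "")
    )
    return filled / total

# Append one score per run; a drop between runs signals a regression.
history = []
history.append(completeness_score(
    [{"email": "a@x.com", "industry": ""}], ["email", "industry"]))
if len(history) >= 2 and history[-1] < history[-2]:
    print("Data quality regression detected")
```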
About the Cien.ai Data Exec Series
This article is part of our Data Exec Series, inspired by our work with B2B business leaders, growth consultants, and PE operating partners. These articles focus on the aspects of becoming a data-driven executive, ready for the AI revolution. If you are interested in RevOps analytics and Sales Performance content, please check out our Growth Essentials and Practical RevOps Series as well.