USA National Data Quality Pandemic – Part 1 of 2


I am rarely impressed these days by all the hyperbole in the data industry, but a recent article raised my eyebrows. Well-respected Data Quality expert Tom Redman wrote an article for Harvard Business Review which restated an IBM claim that bad data cost the US economy a gobsmacking $3T in 2016. That is an estimated 22 times the size of the global big data market, which IDC put at $136b for the same year.

After a little digging, I found that the mysterious number quoted by IBM cannot be validated from their infographic, which cites a cluster of respected but ambiguous sources such as McKinsey Global Institute, Twitter, Cisco, Gartner, EMC, SAS, MEPTEC and QAS.

But for the sake of a good discussion, let’s assume IBM, or their little cadre of sources, did their sums well, and let’s compare their calculations to a ClearDQ estimate based on production examples.

The big difference I will highlight is that the IBM number would be a theoretical number, with no tools or process to collect the exact figure. On the other hand, as unequivocally proven to Gartner staff in 2016, the ClearDQ platform, method and software can calculate the precise damage, and quickly, if someone were brave enough to write the vendor a check.

People have asked whether we can calculate the damage that poor Data Quality causes to different business units, departments, cities, government departments, industries and nations.

Can ClearDQ calculate the US DQ annual Damage Bill?

The answer is yes, it can, and we can do it quickly and accurately.

Operating our tool over 323m people is a slight challenge, but we’ve solved that by mounting ClearDQ in the cloud. So it’s feasible.

It’s official. According to our ClearDQ estimate, the USA economy is carrying Data Quality damage of between $1.262T and $3.108T per annum.

Read on to see how we do this for real (this is what we do for a living), and how we used our “real” case studies to estimate the damage to the USA Economy.

To calculate the damage quickly, we use our own proven, detailed client benchmarks as examples, simply to give us a guide.

Each benchmark example below carries deep economic and data engineering evidence of the commercial data damage and the root cause of that damage.

We define this as “baseline drag”, which represents the minimum data damage and excludes other data damage we find along the way, such as Fraud, Regulatory Fines, Customer Damage, and other revenue losses. These are calculated separately and are extra (and NOT counted in the calculations below).

We define workforce damage as the “inability for a staff member to perform their primary job function directly due to identifiable poor quality or damaged data”.

Some previous validated, evidence-based examples are:

  • Capital Markets Bank 23% Workforce Damage
  • Energy Retailer 13% Workforce Damage
  • Health Insurer 24% Workforce Damage – 2015 Measure
  • Health Insurer 10% Workforce Damage – 2016 Measure (same client who adopted the ClearDQ recommendations)
  • Higher Education Institution 32% Workforce Damage

USA Economic figures to calculate the damage:

  1. We will use a population of 323,995,528 (source 2016)
  2. We will use an average salary of $29,979 (U.S. 2015)
  3. We will use our previous case study results as our ranges of 13% (low), 20% (midpoint, average) and 32% (high) to give us a guide on what Data Quality damage is costing the USA economy


  • Low (13%) $1,262,698,051,409 – $1.262T per annum
  • Med (20%) $1,942,612,386,782 – $1.942T per annum
  • High (32%) $3,108,179,818,852 – $3.108T per annum
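For readers who want to check the sums, here is a minimal sketch of the arithmetic behind the three figures above. It assumes, as the published totals imply, that each figure is simply population × average salary × damage rate, applied to the whole population rather than to the workforce alone. The function and constant names are hypothetical and are not part of the ClearDQ tool.

```python
from statistics import mean

# Figures quoted in the article
POPULATION = 323_995_528   # USA population (2016)
AVERAGE_SALARY = 29_979    # USA average salary in USD (2015)

# Workforce damage benchmarks from the case studies, in percent
BENCHMARKS_PCT = [23, 13, 24, 10, 32]


def annual_damage(rate_pct: int) -> int:
    """Return population x salary x rate, rounded to the nearest dollar.

    Assumption: the rate is applied to the total population figure,
    which is what reproduces the totals quoted in the article.
    """
    # Integer arithmetic keeps the result exact at trillion-dollar scale.
    scaled = POPULATION * AVERAGE_SALARY * rate_pct
    return (scaled + 50) // 100  # divide by 100 with round-half-up


if __name__ == "__main__":
    # The 20% midpoint is roughly the mean of the five benchmarks.
    print(f"Benchmark mean: {mean(BENCHMARKS_PCT):.1f}%")
    for label, rate in [("Low", 13), ("Med", 20), ("High", 32)]:
        print(f"{label} ({rate}%): ${annual_damage(rate):,}")
```

Swapping in a workforce headcount for `POPULATION`, or a different salary figure, shows how sensitive the estimate is to those two inputs.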


If you are interested in calculating the “actual & total” data quality damage circulating through your organisation today, along with identifying the root causes and how to claw back the damage, you may need our automated tool and our algorithms to do that quickly, accurately and at scale.

Our framework, method (algorithms) and software tool will satisfy your accounting, operational, risk and data experts.

Author: Martin Spratt, 30 March 2017. Martin Spratt is a data value guru, author and CDO advisor, held hostage in Melbourne by 4 women and a cat, and survives on cappuccinos. This article first appeared on





Data Quality for Financial Performance