Outside-In DQ Robotics

What is the difference between the OUTSIDE-IN and INSIDE-OUT approaches to Data Quality?
Jack Olson explored the idea in 2003:
Data quality assurance groups fundamentally operate by identifying quality problems and then fabricating remedies. They can do the first part through either an outside-in approach or an inside-out approach. The outside-in approach looks within the business for evidence of negative impacts on the corporation that may be a derivative of data quality problems. The types of evidence sought are returned merchandise, modified orders, customer complaints, lost customers, delayed reports, and rejected reports. The outside-in approach then takes the problems to the data to determine if the data caused the problems, and the scope of the problems.
Jack Olson 2003
Data Quality: The Accuracy Dimension
Werner Engelen expanded on this in 2009:
“EVIDENCE then ISSUES then DATA” – Here we look within the business for evidence of negative impacts on the organization that may originate in bad data quality, e.g. returned goods, modified orders, complaints, rejected reports, lost customers, missed opportunities, incorrect decisions, etc. Finally the problems are taken to the data to determine if the data or data entry processes caused the problems or not.
“DATA then ISSUES then IMPACT” – Here we start with the data itself. Inaccurate data is studied to determine the impacts on the business that either already have occurred or have the potential to occur in the future. This methodology depends heavily on analysis of data through a data profiling process.
Inside-out starts with a complete and correct set of rules that define data accuracy for the data. This is additional metadata (element descriptions, permitted values, relationships, etc.). Based on this metadata, inaccurate data evidence can be produced.
The inside-out approach is generally easier to accomplish, requires less time, and uses the team's own data quality analysts with minimal disruption to other departments. At the same time, it will catch many problems that the outside-in approach will not catch.
A data quality program should use both approaches so as not to miss important issues.
Werner Engelen, 2009
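The inside-out idea described above (rules as metadata, applied to the data to produce evidence of inaccuracy) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rule` class, the `find_violations` helper, the field names and the permitted values are all hypothetical and do not come from Olson, Engelen or any particular profiling tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One piece of accuracy metadata: a named check on a single field (hypothetical)."""
    name: str
    field: str
    is_valid: Callable[[object], bool]


def find_violations(records, rules):
    """Apply every rule to every record and collect the failures as evidence."""
    violations = []
    for index, record in enumerate(records):
        for rule in rules:
            value = record.get(rule.field)
            if not rule.is_valid(value):
                violations.append((index, rule.name, value))
    return violations


# Assumed example rules: a permitted-values check and a simple range check.
rules = [
    Rule("country is a permitted value", "country",
         lambda v: v in {"AU", "BE", "US"}),
    Rule("order quantity is positive", "quantity",
         lambda v: isinstance(v, int) and v > 0),
]

# Assumed example data, invented for illustration only.
records = [
    {"country": "AU", "quantity": 3},
    {"country": "XX", "quantity": 1},
    {"country": "US", "quantity": 0},
]

violations = find_violations(records, rules)
for index, rule_name, value in violations:
    print(f"record {index}: failed '{rule_name}' (value: {value!r})")
```

The list of violations is the "inaccurate data evidence" of the inside-out approach. It shows which records break which rules, but not what those failures cost the business, which is the question the outside-in approach starts from.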
Martin Spratt 2016
In a recent conversation with a major bank about its Data Quality program, I learned that the OUTSIDE-IN Data Quality improvement opportunity is in excess of $700m per annum. The INSIDE-OUT opportunity is $20m. A vast intellectual, economic and political difference. The DQ team (the 4th DQ team to be formed in that company in 30 years) will spend $10m to return $20m and, I predict, will be dissolved within 3 years, like its 3 predecessors.
The OUTSIDE-IN approach is too politically difficult for the IT-centric DQ teams to pursue, hence they go for the temporary and smaller benefits, which results in DQ teams being dissolved and rebooted in this cyclical fashion. This bank’s competitors are exactly the same, in their 4th major DQ investment iteration in 30 years.
I predict a DQ industry shift to big game hunting and focus on more mature, more permanent, self-funding DQ models that are owned and driven by business leaders, not IT teams. OUTSIDE-IN will drive the DQ investment commitment and the business improvement focus-areas, and then fund (or validate the funding of) the INSIDE-OUT (the more traditional DQ technical) work.
ClearDQ automates the OUTSIDE-IN DQ diagnostics, discovery and economics, and drives business ownership of the damage recovery (and other data economic insights). The INSIDE-OUT DQ market is already littered with proven data element discovery, profiling, cleansing, enrichment and triage tools in proprietary and open-source forms.
In the case of the bank mentioned above, ClearDQ would take less than 8 weeks to discover and account for $700m in damage, accurate to within $1,000 for all 55,000 employees, accounting for every staff KPI that relates to any data activity, and would guide the $2bn per annum IT spend to unlock the $700m in damage per “in flight” projects. In contrast, the DQ team will take 12-24 months to recover $20m in damage, and when the team is shut down in 36 months, the DQ problems will return within 12 months. Without OUTSIDE-IN supervision of DQ benefits, INSIDE-OUT efforts are slow, painful and often only temporary fixes with nullified return on investment over the long haul.

Data Quality for Financial Performance