EIMI Archives, Volume 1, Issue 4 - June 2007 Edition

Information Quality Characteristics (part 1), guest author: James Funk

By Richard Wang Ph.D.

Information quality can be defined in many different ways. If you listen carefully to people describing issues with the data they use, you hear them talk about inaccurate data, data that is not relevant, data that is not timely, and having too much information. Research on data quality conducted at the Massachusetts Institute of Technology (MIT) by Richard Wang, Yang Lee, Diane Strong, and Leo Pipino indicates that one can identify 16 characteristics that affect the overall quality of the information people are expected to use in fulfilling their job and task responsibilities. That does not mean that each information quality situation involves all of the characteristics. It does mean that one has to listen carefully to the person who is describing the situation and identify which of these 16 characteristics are present in the specific context being explained. We will examine each of these characteristics in future columns. A list of the 16 characteristics can be found at the end of this month’s column.

But to set the stage for future discussions, I would like to share with you recent newspaper and magazine articles dealing with aspects of information quality. I do this because such articles are appearing more often, and they help start one thinking about different data quality issues and characteristics, as well as what individuals and organizations can do to address those issues. Although some people equate information quality with “accuracy”, our experience has shown that concerns about information quality extend far beyond that single characteristic.

Betsy Burton from Deloitte, in an “INFOWORLD” article about business intelligence (BI) and associated data issues, states that “Data quality and data integrity are not going away. There’s no easy way to solve them.” The article correctly mentions that BI software vendors have tried to address data quality and integration issues with Master Data Management (MDM) solutions but that they have had limited success in trying to cleanse and reconcile data. One of the main reasons BI efforts fall short in this regard is that they usually deal with only a part of the organization, and the data quality issues arise when they attempt to integrate and use information from across the entire organization. Within each particular context the information is perceived and measured to have a high level of quality. It is when one tries to integrate the information within an overall organizational context that troubles begin to appear, and they are not easy to overcome. You should realize that for some sources the data will always be dirty. In such instances you should try to determine whether the data is really needed. Is it relevant for the purpose at hand? If it is, you need to think about the best approach to handling the data to make it meaningful. The article mentions a situation where two different people using the same information run reports using two different tools and get different results. It reminds one of a recent advertisement by a BI vendor that has three different people walking into a meeting with the CEO, each with a different answer to a question. The question could be as simple as “What is the firm’s gross profit for the current fiscal period?” Depending upon the specific context used to calculate the answers, all three could be correct or none of them could be correct. An all too frequent experience is attending a meeting and spending too much time determining whose numbers are correct. The time should have been spent analyzing the numbers and determining the proper action that should be taken by the organization. Consistency and usefulness of the information are important to any organization.
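To make the "three different answers" scenario concrete, here is a minimal sketch in Python. The ledger rows, the amounts, and the three definitions of gross profit are all invented for illustration; they are not taken from the article or from any particular BI tool. The point is only that the same source data yields different, individually defensible numbers once each report applies its own definition.

```python
from decimal import Decimal

# Hypothetical ledger rows shared by all three reports (invented amounts).
LEDGER = [
    {"kind": "sale", "amount": Decimal("1000.00")},
    {"kind": "sale", "amount": Decimal("500.00")},
    {"kind": "return", "amount": Decimal("100.00")},
    {"kind": "cogs", "amount": Decimal("600.00")},
    {"kind": "freight", "amount": Decimal("50.00")},
]


def total(kind):
    """Sum the amounts of all ledger rows of one kind."""
    return sum((row["amount"] for row in LEDGER if row["kind"] == kind), Decimal("0"))


def gross_profit_report_a():
    # Assumed definition A: sales minus cost of goods sold.
    return total("sale") - total("cogs")


def gross_profit_report_b():
    # Assumed definition B: sales net of returns, minus cost of goods sold.
    return total("sale") - total("return") - total("cogs")


def gross_profit_report_c():
    # Assumed definition C: as B, but freight is also treated as a cost of goods.
    return total("sale") - total("return") - total("cogs") - total("freight")


if __name__ == "__main__":
    for name, report in [
        ("A", gross_profit_report_a),
        ("B", gross_profit_report_b),
        ("C", gross_profit_report_c),
    ]:
        print(f"Report {name}: {report()}")
```

None of the three functions contains a bug; they disagree because the definition of the measure was never agreed upon, which is the consistency problem described above.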

In a “Baseline Magazine” article it was reported that a recent Gartner Group study indicated that more than 25% of critical data held by Fortune 1000 companies was flawed. The data was inaccurate, incomplete or duplicated. Think about the implications for an organization! These data issues are addressed in some fashion when the financial statements for the firm are prepared. Unfortunately, when someone tries to build aggregates of the information from the original source data, complications arise and issues of inconsistency and misunderstanding occur.
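Two of the three flaw types named above, incomplete and duplicated data, can be checked mechanically; accuracy usually cannot be checked without a trusted reference. The following is a minimal sketch under stated assumptions: the field names, the sample records, and the rule that two records are duplicates when name and country match after trimming and lowercasing are all hypothetical choices made for this example.

```python
# Hypothetical required fields for a customer record.
REQUIRED_FIELDS = ("customer_id", "name", "country")


def normalized(record, field):
    """Return a field as a trimmed, lowercased string ('' if missing or None)."""
    return str(record.get(field) or "").strip().lower()


def profile(records):
    """Flag incomplete and duplicated records and report the flawed share."""
    seen_keys = set()
    incomplete, duplicated = [], []
    for index, record in enumerate(records):
        # Incomplete: any required field is missing or blank.
        if any(not normalized(record, field) for field in REQUIRED_FIELDS):
            incomplete.append(index)
        # Duplicated (assumed rule): same name and country as an earlier record.
        key = (normalized(record, "name"), normalized(record, "country"))
        if key in seen_keys:
            duplicated.append(index)
        else:
            seen_keys.add(key)
    flawed = set(incomplete) | set(duplicated)
    share = len(flawed) / len(records) if records else 0.0
    return {"incomplete": incomplete, "duplicated": duplicated, "flawed_share": share}


# Invented sample data for illustration only.
SAMPLE = [
    {"customer_id": "C1", "name": "Acme Ltd", "country": "US"},
    {"customer_id": "C2", "name": "acme ltd ", "country": "us"},
    {"customer_id": "C3", "name": "Globex", "country": ""},
    {"customer_id": "C4", "name": "Initech", "country": "DE"},
]

if __name__ == "__main__":
    print(profile(SAMPLE))
```

A profile like this only measures the problem within one source. It says nothing about whether the records agree with other systems, which is where the aggregation difficulties described above come from.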

A personal experience involved the development of an initial data warehouse for global financial information. The initial effort was to build a new source of global information that would be more available and would allow senior management to monitor the current month’s progress toward budget goals for gross revenue and other profit and loss (P&L) items. I will be referring to this effort in more detail in future columns, as it is a good example for discussing different information quality issues. The effort was to build the information from the source systems that feed the process used to develop the P&L statements. To deliver information that would be believable to the senior executives, a stated goal was to match the published P&L information. After a great deal of effort, the initial goal was changed to deliver the capability for gross revenue only. This change was necessitated because there was no consistent source data for the other P&L items. Even the new goal proved elusive, as the definition of gross revenue varied among the more than 75 corporate subsidiaries. Initial attempts to aggregate sales for a subsidiary so that they matched reported amounts proved to be extremely challenging. We found that we had to develop a different process to aggregate sales for each subsidiary. Even then we were not always successful in matching the published revenue amounts. It took almost two years to successfully match the revenue numbers for all but one of the “major subsidiaries”. The only one that we were not able to match was the United States. Needless to say, the senior executives were not thrilled with the progress or results. Fortunately, they understood we were just the messengers about the current situation. The result was major global efforts - including a new definition for “gross revenue” - which we will discuss in future columns as part of possible solutions for information quality issues.
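The reconciliation described above, aggregating source data with a different rule for each subsidiary and comparing the result with the published figure, can be sketched as follows. This is an illustration under assumptions, not the process the project actually used: the subsidiary codes "A", "B", and "C", the two aggregation rules, the amounts, and the tolerance are all hypothetical.

```python
from decimal import Decimal


def gross_invoiced(lines):
    # Assumed rule 1: gross revenue is the sum of invoiced amounts.
    return sum((line["invoice"] for line in lines), Decimal("0"))


def net_of_discounts(lines):
    # Assumed rule 2: gross revenue is invoiced amounts minus discounts.
    return sum((line["invoice"] - line["discount"] for line in lines), Decimal("0"))


# Hypothetical mapping of each subsidiary to the rule believed to apply to it.
RULES = {"A": gross_invoiced, "B": net_of_discounts, "C": gross_invoiced}

# Invented source-system lines and published revenue figures.
SOURCE_LINES = [
    {"subsidiary": "A", "invoice": Decimal("200"), "discount": Decimal("20")},
    {"subsidiary": "A", "invoice": Decimal("300"), "discount": Decimal("0")},
    {"subsidiary": "B", "invoice": Decimal("400"), "discount": Decimal("40")},
    {"subsidiary": "C", "invoice": Decimal("100"), "discount": Decimal("5")},
]
PUBLISHED = {"A": Decimal("500"), "B": Decimal("360"), "C": Decimal("95")}


def reconcile(source_lines, published, rules, tolerance=Decimal("0.01")):
    """Compare each subsidiary's aggregated revenue with its published amount."""
    report = {}
    for subsidiary, published_amount in published.items():
        lines = [line for line in source_lines if line["subsidiary"] == subsidiary]
        computed = rules[subsidiary](lines)
        difference = computed - published_amount
        report[subsidiary] = {
            "computed": computed,
            "published": published_amount,
            "difference": difference,
            "matches": abs(difference) <= tolerance,
        }
    return report


if __name__ == "__main__":
    for subsidiary, row in reconcile(SOURCE_LINES, PUBLISHED, RULES).items():
        print(subsidiary, row)
```

In this invented data, subsidiary "C" fails to match because the rule assigned to it does not reflect how its published figure was produced. Finding the right rule for each subsidiary is the slow part, and no amount of code removes the need to learn each local definition.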

My hope is that you begin to understand the complexity surrounding this concept of information quality and that there usually are no quick and simple solutions within most organizations. That is why we use the metaphor of a “journey” as well as continuous improvement when talking about solving information quality problems. It took a long time to create the problems and it will take time to correct them. We will have more to say about these journeys in future columns.

 

We look forward to our continuing conversations about information quality and wish you success in your information quality journey. If you have questions about what we have discussed, want more clarity about what we have said, or have suggestions for what you want discussed, contact us at either jimfunk@mit.edu or rwang@mit.edu, http://mitiq.mit.edu.

About the Author

James Funk is currently president of Beyond Accuracy LLC, a firm that focuses on helping organizations improve their information management and data quality. He has spent 30 years working to establish and develop information management and data quality within several global corporations. During that time he developed the initial data administration functions at a large utility and at a global consumer products manufacturer. He also was responsible for the development of the initial data warehouse efforts at those organizations. At the consumer products manufacturer the successful data warehouse implementation led to a cost reduction in ongoing operations that was over 50 times the initial cost of implementing the data warehouse.

Mr. Funk has been focusing on information quality for over 10 years. He has been associated with the MITIQ program since 1996. He has been a presenter at their annual conference and has been an instructor at the MIT summer institute program on data quality. In the past several years he has led data quality workshops in Asia, Australia, Europe and the United States.

Mr. Funk was his company’s representative on an industry standards committee which developed global standards for item and party models that are used today to drive increased use of consistent e-commerce transactions across the world.  He was the North American co-chair of that standards group.  He also is a co-author of the recently published MIT Press book Journey to Data Quality.

Mr. Funk’s current focus is on developing the proper techniques for presenting information that will help organizations overcome data quality issues and successfully compete in today’s fast-changing business environments.

He can be reached at jdfunk@wi.rr.com
