Is The Information Objective, Believable and Reputable? – Guest Author James Funk

By Richard Wang

We have been discussing accuracy as it pertains to the quality of the information being used. Many books and articles have focused on that characteristic. I would now like to begin discussing the other characteristics that researchers at MIT and other universities associated with MIT's Total Data Quality Management program have identified as contributing to the overall quality of the information used within organizations.

As mentioned before, there have been 15 other characteristics found to impact information quality. This month we will discuss three other characteristics which are intrinsic to the information itself.

The first of these is the objectivity of the data. This characteristic usually is a problem for information that contains codes representing an event or condition. If the data consists of the number of items that have been shipped to a customer, it is a count that can easily be checked to determine how well it represents an accurate depiction of the actual event that has occurred. If the information also contains a code or piece of interpreted data, the process used to produce the information has some subjective aspect which will impact the quality of the data.

The second characteristic is the believability of the information. Information that is believable will be considered to have higher overall quality. I have experienced situations in which data that was highly believable was inaccurate. It was extremely difficult for the consumers of the information to overcome concerns that arose when the accuracy of the data was improved and the changes caused discomfort. The believability of information represents the feeling of the information consumer as to how well it fits their view of the world and supports the actions taken or decisions made. When the information supports their beliefs and understanding it is thought to be believable and have good data quality. It takes a long time for an information consumer to develop a sense that the information they are using is believable. It only takes one incident for that believability to be shaken and for the information consumer to lose confidence in the information provided. If this occurs, the process of rebuilding believability again takes time and effort.

The third characteristic is reputation. If information consumers think that the data is accurate, that it objectively represents the events or conditions in which they are interested, and that it supports their view of that external environment, they will consider the data fit to be used confidently for its intended purpose. As with believability, it takes a long time and great effort to build and maintain a good reputation for the information being used. Similarly, this reputation can be damaged quickly, and information consumers can lose confidence in using the information.

We will talk in more detail about patterns that help organizations identify and solve information quality issues in future columns, but let's use a quick example to show how these characteristics can impact that quality.

Mismatches among different sources of the “same” data are a common cause of intrinsic information quality concerns. Initially information consumers are not able to identify the source to which quality problems should be attributed. They do know that data conflict. These concerns initially appear as believability problems. Over time, information about causes of mismatches accumulates from evaluations of the accuracy of the different sources, which leads to a poor reputation for less accurate sources. As a reputation for poor quality data becomes common knowledge, those data sources are viewed as adding little value for the organization resulting in reduced use of the data.

Organizations can have a history of mismatches between their inventory system data and physical warehouse counts. In these instances the warehouse counts serve as a standard against which to measure the accuracy of the system data.

The system data source is thought to be inaccurate and not believable, and it is adjusted periodically to match actual warehouse counts. The system data then gradually develops mismatches again, which results in a steadily worsening reputation until the data is no longer used for decision making. If this occurs, how does one defend the production and maintenance of such data if it has so little value for the organization?
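The comparison described above, with physical warehouse counts serving as the standard for the system data, can be illustrated with a minimal Python sketch. The item names, the counts, and the function names `find_mismatches` and `match_rate` are all hypothetical and chosen only for illustration; treating an item that is missing from the system data as a recorded count of zero is likewise an assumption, not a rule from the research discussed here.

```python
from dataclasses import dataclass


@dataclass
class Mismatch:
    """One item whose system count disagrees with the physical count."""
    item: str
    system_count: int
    warehouse_count: int

    @property
    def difference(self) -> int:
        # Positive: the system overstates stock. Negative: it understates it.
        return self.system_count - self.warehouse_count


def find_mismatches(system_counts, warehouse_counts):
    """List items where the system data differs from the warehouse count.

    The warehouse count is treated as the standard. An item absent from
    the system data is assumed to have a recorded count of 0.
    """
    mismatches = []
    for item, physical in sorted(warehouse_counts.items()):
        recorded = system_counts.get(item, 0)
        if recorded != physical:
            mismatches.append(Mismatch(item, recorded, physical))
    return mismatches


def match_rate(system_counts, warehouse_counts):
    """Fraction of physically counted items whose system count agrees."""
    if not warehouse_counts:
        raise ValueError("no warehouse counts to compare against")
    mismatched = len(find_mismatches(system_counts, warehouse_counts))
    return 1 - mismatched / len(warehouse_counts)


# Hypothetical data for illustration only.
system = {"widget": 120, "gasket": 45, "valve": 30}
warehouse = {"widget": 120, "gasket": 40, "valve": 30, "bolt": 12}

for m in find_mismatches(system, warehouse):
    print(f"{m.item}: system={m.system_count}, "
          f"warehouse={m.warehouse_count}, difference={m.difference}")
print(f"match rate: {match_rate(system, warehouse):.0%}")
```

Tracking a figure such as this match rate from one physical count to the next would show whether the system data is drifting again after each adjustment, which is the pattern that erodes its reputation.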

Judgment or subjectivity in the information production process is another common cause of quality concerns. Initially, only those with knowledge of the information production processes are aware of the potential problems, which usually surface as concerns about data objectivity. Over time, information about the subjective nature of the production process accumulates, which calls the believability and reputation of the data into question. The overall result is again reduced use of the suspect data.

Next month we will talk about how one can analyze these issues and implement potential solutions for these problems.

I have one last note for this month. Recently there has been an increase in the number of books dealing with the information issues that organizations and society face and what can be done to address them. The first is Competing on Analytics by Thomas Davenport and Jeanne Harris. The book talks about how successful organizations use the increasing amounts of information they have to improve their business processes and extract maximum value from that information. The second book is Super Crunchers by Ian Ayres. It discusses how the best organizations analyze massive databases to gain greater insight into human behavior and deliver very accurate results that have a profound impact on those organizations. The last is Microtrends by Mark Penn. It uses the best available data to identify over 70 microtrends that are changing the way we live. Organizations that take advantage of that data are in the best position to be successful.

All these books deal with the increasing value that organizations are placing upon data and the analysis of that data to improve their processes and to better meet the needs of their customers. They provide useful insights into how one can successfully manage the mountains of information most organizations produce and maintain. They also provide a basis for thinking about how information quality initiatives can best support the increasing reliance on factual information to support business processes and decisions.

We look forward to our continuing conversations about information quality and wish you success in your information quality journey. If you have questions about what we have discussed or want more clarity about what we have said, contact us at either jimfunk@mit.edu or rwang@mit.edu, or visit http://mitiq.mit.edu.

About the Author

Richard Y. Wang is Director of the MIT Information Quality (MITIQ) Program at the Massachusetts Institute of Technology. He also holds an appointment as University Professor of Information Quality, University of Arkansas at Little Rock. Before heading the MITIQ program, Dr. Wang served as a professor at MIT for a decade. He also served on the faculty of the University of Arizona and Boston University. Dr. Wang received a Ph.D. in Information Technology from MIT. Wang has put the term Information Quality on the intellectual map with myriad publications. In 1996, Prof. Wang organized the premier International Conference on Information Quality, for which he has served as general conference chair and currently serves as Chairman of the Board. Wang’s books on information quality include Quality Information and Knowledge (Prentice Hall, 1999), Data Quality (Kluwer Academic, 2001), Introduction to Information Quality (MITIQ Publications, 2005), and Journey to Data Quality (MIT Press, 2006). Prof. Wang has been instrumental in the establishment of the Master of Science in Information Quality degree program at the University of Arkansas at Little Rock (25 students enrolled in the first offering in September 2005), the Stuart Madnick IQ Best Paper Award for the International Conference on Information Quality (the first award was made in 2006), the comprehensive IQ Ph.D. dissertations website, and the Donald Ballou & Harry Pazer IQ Ph.D. Dissertation Award. Wang’s current research focuses on extending information quality to enterprise issues such as architecture, governance, and data sharing. Additionally, he heads a U.S. Government project on Leadership in Enterprise Architecture Deployment (LEAD). The MITIQ program offers certificate programs and executive courses on information quality. Dr. Wang is the recipient of the 2005 DAMA International Academic Achievement Award (previous recipients of this award include Ted Codd for the Relational Data model, Peter Chen for the Entity Relationship model, and Bill Inmon for data warehouse contributions to the data management field). He has given numerous speeches in the public and private sectors internationally, including a thought-leader presentation to some 25 CIOs at a gathering of the Advanced Practices Council of the Society of Information Management (SIM APC) in 2007. Dr. Wang can be reached at rwang@mit.edu, http://mitiq.mit.edu