By Larissa Moss
When I signed up for a bi-monthly column with EIMI, I was asked to provide a name for my column. Based on my background I thought of names like Method Madness, pointing to my affinity for methodologies; XS: Extreme Scoping, exposing my keenness for XP (extreme programming) methods; Enterprise Information Architecture, affirming my dedication to data and information; and Data Warehousing on Steroids, reflecting my career path. But as I was tossing around these names, I realized that, no matter what topic I write about, there is one underlying theme I want to emphasize: We need to get Back to Basics.
Take, for example, data strategy. The topic is almost as old as our profession itself. Yet, it is amazing how many companies, young and old, have no cohesive, enforceable data strategy. These companies get involved with data warehousing, business intelligence, customer relationship management, master data management, and other new technology initiatives without the framework of a data strategy. Years into their efforts, many business executives are still frustrated over their inability to trust their company’s data. They have spent millions of dollars on new technologies and “silver bullet” solutions, only to find that the state of their data assets has in many cases deteriorated rather than improved.
The explosion of tools and capabilities to create, manipulate, and access data is frightening without a data strategy. If we thought we had redundancy problems in the past, wait until we track items by RFID and incorporate unstructured data into our information portfolio. And at the same time as we are drowning in data, we glorify technology that automates our decision-making. As a result, our decisions are often plain wrong. To make things worse, we are reducing, in some cases disabling, direct contact with our customers who are affected by our bad decisions, and who are frustrated that they have lost the personal connection to their service representative who could straighten things out. What makes business executives believe that this [non-]strategy is working?
A data strategy can be seen as a survival guide for the information age. Like other core company assets, such as financial assets, real estate assets, fixed assets, and so on, data assets should be controlled based on a strategy. A data strategy is a strategic plan for enterprise-wide data governance that spells out a company’s policies, procedures, roles, and responsibilities for standardizing its data, ratifying its business rules, controlling data redundancy, managing its master data, integrating structured with unstructured data, storing and using its data, as well as protecting its data. Thus, in order to put the “I” back into BI, companies need to create and implement a data strategy. Here are some of the basic components of a data strategy.
Data standardization and integration
Data standardization and integration go hand in hand – you cannot have one without the other. This means that data redundancy has to be addressed. Redundant data must be identified, cataloged, resolved, and ultimately reduced. This is much easier said than done because it requires time-consuming human analysis. While there are data profiling tools that can help identify potential duplicates and master data management products that help manage core business data, it still takes the knowledge of a business person to understand the semantics of each data element. It also takes a willingness and commitment from the business people to negotiate and agree on how to standardize the data.
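To make the division of labor concrete, here is a minimal sketch of what a profiling step might do: it flags pairs of records whose names look alike and leaves the decision to a person. The record ids, the sample names, the `normalize` and `candidate_duplicates` helpers, and the 0.85 threshold are all hypothetical choices for illustration, not part of any particular profiling product.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def candidate_duplicates(records: dict, threshold: float = 0.85) -> list:
    """Return (id_a, id_b, score) for pairs whose normalized names look similar.

    The result is only a list of candidates. Whether two records really
    describe the same business entity is a question for a business person.
    """
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(records.items(), 2):
        score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Hypothetical customer names drawn from two different source systems.
customers = {
    "CRM-001": "Acme Corp.",
    "ERP-417": "ACME Corp",
    "CRM-002": "Globex Industries",
}

for id_a, id_b, score in candidate_duplicates(customers):
    print(f"Review needed: {id_a} and {id_b} (similarity {score})")
```

The tool can surface the pair, but it cannot tell whether the two entries are one customer entered twice or two legal entities that happen to share a name. That judgment, and the agreement on which spelling becomes the standard, stays with the business.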
Data quality
Everyone seems to be getting on the data quality bandwagon. The vendors talk about it, and much is written about it. Yet, in many companies, a lack of management awareness and support for data quality continues to be a problem. Once again, there is no “silver bullet” solution that can automagically turn your dirty data into good quality data. While there are data cleansing tools that can help, it still takes the knowledge of a business person to define the intended meaning and the intended content of each data element. These definitions have to be documented as business rules before they can be fed to any data cleansing tool or business rules engine.
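As a rough illustration of documenting business rules before handing them to a tool, the sketch below records each rule as a plain-language description paired with a check. The `BusinessRule` class, the element names, and the valid values are invented for this example; in practice the business side supplies them, and a real cleansing tool or rules engine would have its own format.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class BusinessRule:
    element: str                   # data element the rule applies to
    description: str               # the rule as agreed with the business
    check: Callable[[Any], bool]   # returns True when a value satisfies the rule


# Hypothetical rules; the valid values are assumptions made for this sketch.
RULES = [
    BusinessRule(
        "customer_status",
        "must be one of A (active), I (inactive), P (prospect)",
        lambda v: v in {"A", "I", "P"},
    ),
    BusinessRule(
        "credit_limit",
        "must be a non-negative number",
        lambda v: isinstance(v, (int, float)) and v >= 0,
    ),
]


def violations(record: dict) -> list:
    """List every documented rule that the record breaks (missing values count)."""
    found = []
    for rule in RULES:
        if not rule.check(record.get(rule.element)):
            found.append(f"{rule.element}: {rule.description}")
    return found


print(violations({"customer_status": "X", "credit_limit": -1}))
```

The code is the easy part. The descriptions are where the business knowledge lives, and nothing can be checked until someone with authority over the data has written them down.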
Meta data management
Meta data is one of the vehicles for achieving data standardization and integration. Meta data is contextual information about IT assets, such as data, processes, programs, and so on. Meta data components for data assets include business definitions, domains (valid values), data formats (type and length), business rules for creating the data, transformation and aggregation rules, security requirements, ownership, sources (operational files and databases), timeliness, and applicability, just to name a few. Not many companies capture all of these meta data components, and those that do often don’t make effective use of them. Meta data is no longer the dirty “D” word: documentation. It is now the nice “N” word: navigation. Documentation is often considered to be an IT overhead; after all, programmers can read the code, which is usually more reliable than any documentation anyway. But navigation cannot be dismissed so easily because it is an essential tool for business people to navigate through their BI/DW environment.
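One way to picture the difference between documentation and navigation is a small catalog that can be queried. The sketch below is only an assumption about how such a record might be laid out: the `DataElementMetadata` class, its field names, and the sample entries are hypothetical and simply mirror some of the components listed above.

```python
from dataclasses import dataclass, field


@dataclass
class DataElementMetadata:
    name: str
    business_definition: str
    domain: list                   # valid values
    data_type: str                 # data format: type
    length: int                    # data format: length
    owner: str                     # business owner of the element
    sources: list = field(default_factory=list)               # operational files and databases
    transformation_rules: list = field(default_factory=list)
    security: str = "internal"


# Hypothetical catalog entries, invented for this sketch.
CATALOG = [
    DataElementMetadata(
        name="customer_status",
        business_definition="Current standing of the customer relationship",
        domain=["A", "I", "P"],
        data_type="char",
        length=1,
        owner="VP Sales",
        sources=["billing_db.customers", "crm.accounts"],
    ),
    DataElementMetadata(
        name="order_total",
        business_definition="Order amount including tax, in US dollars",
        domain=[],
        data_type="decimal",
        length=12,
        owner="Controller",
        sources=["orders_db.order_header"],
        transformation_rules=["sum of line amounts plus tax"],
    ),
]


def find_by_source(catalog: list, source: str) -> list:
    """Navigation query: which data elements are fed by a given source?"""
    return sorted(entry.name for entry in catalog if source in entry.sources)


print(find_by_source(CATALOG, "crm.accounts"))  # ['customer_status']
```

A business person asking "where does this number come from?" or "what else depends on this source?" is navigating, and a query like this is what makes the captured meta data worth the effort.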
Data modeling
To most IT technicians, data modeling is synonymous with database design. When I ask “Who invented entity-relationship modeling?” I get the answer: Dr. Codd. Wrong. Entity-relationship modeling, the first data modeling technique, was invented in the mid-1970s by Dr. Peter Chen. Dr. Codd formalized the six Normalization rules and published the famous 12 rules of relational databases (actually 13 rules when counting rule Zero), but he did not invent the entity-relationship modeling technique. The point I am making here is that data modeling originated as a business modeling technique before it became a database design technique. In the early days of relational databases (early 1980s), the business data model, known as the logical data model, and the database data model, known as the physical data model, looked very similar because we did not denormalize heavily for operational systems. In the BI/DW world, those two models are quite dissimilar, especially when we store the data multi-dimensionally. Regardless of what database design schema you ultimately choose for storing the data, you should still create business data models first, to understand the semantics and the business rules of the data. Without understanding the semantics and the business rules of the data, it is impossible to standardize the data.
Data ownership and stewardship
As I said before, a data strategy is a strategic plan for enterprise-wide data governance. Who in the organization is, or should be, most interested in enterprise-wide data governance? IT staff or business people? Clearly, the business people. Thus, instituting enterprise-wide data governance is a business responsibility. This responsibility starts with data ownership and extends to data stewardship. The data owners are usually the originators of the data, or they are the primary users of the data. In either case, data owners are senior business people who have the authority to set policies and to create business rules for the data elements under their control. Data stewards also come from the business side, not from IT. They do not have authority to set policies or to create business rules, only to communicate and to enforce those policies and business rules. Data stewards also perform data audits and help resolve data disputes. Without data stewards, it would be very difficult to fully implement a data strategy.
In summary, a data strategy is an essential and fundamental building block for all IT initiatives, regardless of whether they are operational or decision-support initiatives. I would go even further to say that adding more and more applications to our IT portfolios without a data strategy is like building a skyscraper without first pouring a foundation. The risks are obvious. If you would like to read more about Data Strategy, here are two references:
“Data Strategy” by Sid Adelman, Larissa Moss, and Majid Abai, Addison-Wesley, 2005. ISBN 0-321-24099-5
“Data Strategy: Survival Guide for the Information Age” by Larissa T. Moss, Cutter IT Journal, Vol. 19, No. 8, August 2006
About the Author
Larissa Moss is president of Method Focus Inc., and a senior consultant for the BI Practice at the Cutter Consortium. She has 27 years of IT experience, focused on information management. She frequently speaks at conferences worldwide on the topics of data warehousing, business intelligence, master data management, project management, development methodologies, enterprise architecture, data integration, and information quality. She is widely published and has co-authored the books Data Warehouse Project Management, Impossible Data Warehouse Situations, Business Intelligence Roadmap, and Data Strategy. Her present and past associations include Friends of NCR-Teradata, the IBM Gold Group, the Cutter Consortium, DAMA Los Angeles Chapter, the Relational Institute, and Codd & Date Consulting Group. She was a part-time faculty member at the Extended University of California Polytechnic University Pomona, and has been lecturing for TDWI, the Cutter Consortium, MIS Training Institute, Digital Consulting Inc. and Professional Education Strategies Group, Inc. She can be reached at firstname.lastname@example.org