The Building Blocks of Enterprise Information Management (part 3) – Data Governance
By David Marco
In last month’s column I discussed the importance of data governance for the Enterprise Information Management (EIM) program. In this article I will discuss the first tasks that the data governance team must address and the typical data stewardship activities that they will be involved in.
Preparing for Data Stewardship
The data stewardship committee must complete the following tasks before they can capture and define business and technical meta data:
Form a charter
Define and prioritize the committee’s activities
Create committee rules of order
Establish roles for committee members
Design standard documents and forms
Form a Charter
The first task of the data stewardship committee is to form a documented charter for their activities. This charter states the business purposes that necessitated the data stewardship committee’s formation. The data stewardship charter should not be a voluminous document, as its goal is to provide clear direction on the committee’s strategic business goals. This charter needs to target the specific concerns and opportunities of the company.
Best Practice: I like the data stewardship charter to fit on one single-spaced page. Anything longer is likely too long.
For example, pharmaceutical companies tend to have extensive and elaborate data stewardship committees. Their data stewardship charters traditionally focus on clinical trials, the process a pharmaceutical company goes through to research, develop, and attain government approval for new compounds (drugs). Developing a new drug costs, on average, between $150 million and $250 million and takes more than 10 years before the drug can be brought to market. Every day that a new compound is delayed from reaching the market costs the company $1 million in lost revenue. This includes the extra time it will take to recoup sunk expenses (ever see the interest expense on $150 million?) and the possibility of a competitor creating a competing compound.
During these trials, government agencies like the FDA have rigorous standards that must be met before a new drug can gain approval. Passing these FDA (and other agency) tests is not easy. These organizations and the corresponding legislation require that a pharmaceutical company have precise definitions for its data elements. Clearly, a pharmaceutical company’s data stewardship charter will focus very heavily on how the company can expedite the passing of the FDA audits.
Define and Prioritize Committee Activities
Once the charter has been created, the data stewardship committee needs to define the specific activities that they will be performing. It is vital that these activities support the strategic objectives of the data stewardship charter.
Once these activities have been defined, they must be prioritized so the data stewardship team knows what to tackle first. At this point in the process we suggest using a matrix to show the possible activities. On the vertical axis, list which of these activities will be most beneficial to the organization (see Figure 1).
Figure 1: Prioritization Matrix
Create Committee Rules of Order
Once the activities of the data stewardship committee have been identified, the committee will have to create rules of order for their organization. Following is a sample of the types of rules of order that will need to be defined:
Regular meeting schedule
Meeting structure or agenda
Meeting notes capture and dissemination
Establish Roles for Committee Members
After the data stewardship committee has defined their rules of order, it will be important for this team to formally define their different data stewardship roles and responsibilities. Earlier we defined four data stewardship roles: executive sponsor, chief steward, business steward and technical steward. These roles are a good beginning set for any new data stewardship committee; however, if you are like most companies, you will tailor these roles, titles and descriptions to suit your company’s specific needs.
Design Standard Documents and Forms
Now it is time for the data stewardship committee to create any standard documents or forms that will support the data stewardship activities. This activity is important, as you do not want to have each steward creating their own document for each activity.
One of the most common documents that will be required is a change control document. Members of your company use it to formally document their data stewardship tasks. For example, suppose that a key task of your data stewardship committee is to define business meta data definitions. Certainly you will have business stewards working on these definitions; however, some people may not formally be part of the data stewardship committee. These people may want to recommend changes to the business definitions that your business stewards defined. Clearly you would need a form (optimally Web-based, tied to a managed meta data environment (MME)) that would allow them to provide feedback.
Another common form is a data stewardship feedback mechanism. It is important that the data stewardship committee is not viewed as a group that sits in its own ivory tower. Allowing feedback on the things that your data stewardship committee is doing well, as well as recommendations on what they can do better, helps to ensure that you are meeting the needs of your constituency.
Data Stewardship Activities
The specific activities of the data stewardship committee will vary from one organization to another and from industry to industry. However, there are some common activities, listed here:
Define data domain values
Establish data quality rules, validate and resolve them
Set up business rules and security requirements
Create business meta data definitions
Create technical meta data definitions
The data stewards who work on these different activities will be primarily working with meta data; however, there will be occasions when they may need to work with actual data. The following sections walk through these activities and provide a set of guidelines and best practices for performing them. We discuss the typical data stewards who perform each task. It is important to note that sometimes there are people who are highly knowledgeable on the data and the business policies around the data, even though they do not belong to the particular stewardship group that I mention for the activity. For example, there may be some technical stewards who are as knowledgeable on the business policies and data values as any of the “official” business stewards. So even though I state in my guidelines that the business stewards should be creating the business meta data definitions, obviously you would want these technical stewards working with the business stewards to define the business meta data definitions.
For all of these activities, the chief steward will play a critical role in ensuring they are properly completed. A good chief steward ensures that the technical and business stewards work thoroughly and expediently, understanding how easy it is for a group to fall into “analysis paralysis”. The chief steward will act as a project manager or guide in each of these activities. Most importantly, the chief steward aids committee members in resolving any conflicts, and there will be conflicts.
Define Data Domain Values
Once the business stewards define the key data attributes, they need to define the domain values for those attributes. For example, if one of the attributes is state code, then the valid domain values would be the two-character abbreviations of the states (e.g., CA, FL, IL, NY).
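As a minimal sketch of how a domain value check could look, assume the domain values are held in a simple lookup keyed by attribute name. The names DOMAIN_VALUES and is_valid_domain_value are hypothetical, and only the four state codes mentioned above are listed; a real list would cover every state.

```python
# Hypothetical domain-value lookup: attribute name -> set of valid values.
# Only the state codes used as examples in the text are listed here.
DOMAIN_VALUES = {
    "STATE_CODE": {"CA", "FL", "IL", "NY"},
}


def is_valid_domain_value(attribute: str, value: str) -> bool:
    """Return True if value belongs to the domain defined for attribute."""
    allowed = DOMAIN_VALUES.get(attribute)
    if allowed is None:
        # No steward has defined a domain for this attribute yet.
        raise KeyError(f"No domain values defined for attribute {attribute!r}")
    return value in allowed


print(is_valid_domain_value("STATE_CODE", "CA"))
print(is_valid_domain_value("STATE_CODE", "ZZ"))
```

In practice the lookup would be populated from the meta data the business stewards enter, not hard-coded.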
As with all data stewardship tasks, this meta data will be stored in the MME. It is highly recommended that a Web-based front-end be developed so that the business stewards can easily key in this vital meta data.
In many cases data modelers input attribute domain values into their modeling tool. If this process has occurred in your company, you can create a process to export that meta data from the modeling tool into the MME. This gives the business stewards a good starting point for entering domain values.
Establish Data Quality Rules, Validate and Resolve Them
Data quality is the responsibility of both the business and technical stewards. It is the responsibility of the business steward to define data quality thresholds and data error criteria. For example, the data quality threshold for customer records that error during a data warehouse load process may be 2%. Therefore, if the percentage of customer records in error is greater than 2%, then the data warehouse load run is automatically stopped. An additional rule can be included that states that if the percentage of records in error is 1% or greater but less than 2%, then a warning message is sent to the data warehouse staff; however, the data warehouse run is allowed to proceed. An example of data error criteria would be a rule defined for the “HOME_LOAN_AMT” field. This rule would state that the allowable values for the “HOME_LOAN_AMT” field are any numeric values between $0 and $3,000,000.
It is the responsibility of the technical stewards to make sure the data quality rules are implemented and adhered to. In addition, the technical stewards will work with the business stewards on the specific data quality thresholds and data error criteria.
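The threshold and error criteria examples above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the names LoadAction, evaluate_load and home_loan_amt_is_valid are hypothetical, and because the example rules do not say what happens at exactly 2%, the sketch assumes that case only triggers a warning.

```python
from enum import Enum


class LoadAction(Enum):
    PROCEED = "proceed"
    WARN = "warn"
    STOP = "stop"


# Thresholds from the example rules, as whole percentages.
WARN_PERCENT = 1
STOP_PERCENT = 2


def evaluate_load(total_records: int, error_records: int) -> LoadAction:
    """Decide what the load run should do, given its error count."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if error_records * 100 > STOP_PERCENT * total_records:
        return LoadAction.STOP
    if error_records * 100 >= WARN_PERCENT * total_records:
        # Assumption: exactly 2% falls here, since the run stops only above 2%.
        return LoadAction.WARN
    return LoadAction.PROCEED


def home_loan_amt_is_valid(amount: float) -> bool:
    """Data error criterion: HOME_LOAN_AMT must be between $0 and $3,000,000."""
    return 0 <= amount <= 3_000_000


print(evaluate_load(total_records=10_000, error_records=250))
print(evaluate_load(total_records=10_000, error_records=150))
print(home_loan_amt_is_valid(3_500_000))
```

Keeping the thresholds as named constants means the business stewards' numbers live in one place, separate from the load logic the technical stewards maintain.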
Set Up Business Rules and Security Requirements
Business rules are some of the most critical meta data within an organization. Business rules describe how the business operates with its data. A business rule describes how the data values were derived and calculated, how the field relates (cardinality) to other fields, data usage rules and regulations, and any security requirements around a particular entity or attribute.
For example, a healthcare insurance company may have a field called “POLICY_TREATMENTS”. This field may list the specific medical treatments that a policy holder has undergone. The business rule for this field states that it is an alphanumeric, 20-byte field whose “system of record” is “System A”. In addition, there may be security requirements on this field. Most health insurance companies provide coverage to their own employees, so the security requirement for this field is that the IT department cannot view this field or associate it with any fields that would identify the policy holder. When security rules like these are broken, the corporation is vulnerable to legal exposure.
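One way to picture the “POLICY_TREATMENTS” rule is as a small record that carries both the descriptive meta data and the security requirement. The sketch below assumes a hypothetical BusinessRule class and a department-based restriction; the “Claims” department is made up for contrast and is not part of the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessRule:
    """Hypothetical record for one attribute's business rule meta data."""
    attribute: str
    data_type: str
    length_bytes: int
    system_of_record: str
    # Departments that must not view the field (assumed representation).
    restricted_departments: frozenset = frozenset()

    def may_view(self, department: str) -> bool:
        """Return True if the department is allowed to view the field."""
        return department not in self.restricted_departments


policy_treatments_rule = BusinessRule(
    attribute="POLICY_TREATMENTS",
    data_type="alphanumeric",
    length_bytes=20,
    system_of_record="System A",
    restricted_departments=frozenset({"IT"}),
)

print(policy_treatments_rule.may_view("IT"))
print(policy_treatments_rule.may_view("Claims"))  # "Claims" is a made-up department
```

A check like this covers only the viewing restriction. The rule that the field must not be associated with identifying fields would need separate enforcement.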
Create Business Meta Data Definitions
One of the key tasks for the business stewards is to define the business meta data definitions for the attributes of a company. It is wise to begin by having the business stewards define the main subject areas of their company. Subject areas are the “nouns” of the corporation: customer, product, sale, policy, logistics, manufacturing, finance, marketing, and sales. Typically companies have 25-30 subject areas, depending on their industry. Once the business stewards define the subject areas, each of these areas can be further drilled down. For example, a company may distinguish between the different lines of business or by subsidiary.
Some data elements require calculation formulas of some sort. Your company may have a data attribute called “NET_REVENUE”. “NET_REVENUE” may be calculated by subtracting “gross costs” from “gross revenues”. Any calculation formulas should be included in the business meta data definitions.
Once the key data elements are identified, the business stewards can begin writing meta data definitions for the attributes. The process for capturing these definitions needs to be supported by an MME. The MME includes meta data tables with attributes to hold the business meta data definitions. In addition, a Web-based front-end would be given to the business stewards to key in the business meta data definitions. The MME captures and tracks these meta data definitions historically, using “from” and “to” dates on each of the meta data records. A meta data status code is also needed on each row of meta data. This status code shows if the business meta data definition is approved, deleted or pending approval.
When the first business meta data definitions are entered, it is common to mark them as “pending”. This allows the business data stewards to gain consensus on these elements before changing their status to “approved”.
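The historical tracking described above, with “from” and “to” dates and a status code on each row, can be sketched in a few lines. This is an in-memory illustration only: the class names, field names and dates are hypothetical, and an actual MME would hold these rows in its meta data tables.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending approval"
    APPROVED = "approved"
    DELETED = "deleted"


@dataclass
class DefinitionRow:
    attribute: str
    definition: str
    status: Status
    from_date: date
    to_date: Optional[date] = None  # None marks the current row


class DefinitionHistory:
    """Keeps every version of a definition instead of overwriting it."""

    def __init__(self) -> None:
        self.rows: List[DefinitionRow] = []

    def current(self, attribute: str) -> Optional[DefinitionRow]:
        for row in self.rows:
            if row.attribute == attribute and row.to_date is None:
                return row
        return None

    def record(self, attribute: str, definition: str,
               status: Status, as_of: date) -> DefinitionRow:
        # Close the previous row, then add the new one as current.
        previous = self.current(attribute)
        if previous is not None:
            previous.to_date = as_of
        row = DefinitionRow(attribute, definition, status, as_of)
        self.rows.append(row)
        return row

    def approve(self, attribute: str, as_of: date) -> DefinitionRow:
        pending = self.current(attribute)
        if pending is None:
            raise KeyError(f"No definition recorded for {attribute!r}")
        return self.record(attribute, pending.definition, Status.APPROVED, as_of)


history = DefinitionHistory()
# The dates below are made up for illustration.
history.record("NET_REVENUE", "Gross revenues minus gross costs.",
               Status.PENDING, date(2024, 1, 5))
history.approve("NET_REVENUE", date(2024, 1, 19))

for row in history.rows:
    print(row.attribute, row.status.value, row.from_date, row.to_date)
```

Because approval adds a new row instead of editing the pending one, the history shows both when the definition was first proposed and when consensus was reached.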
Create Technical Meta Data Definitions
The technical stewards are responsible for creating the technical meta data definitions for the attributes of a company. It is important to understand that technical meta data definitions will fundamentally differ in form from business meta data definitions. Whereas business meta data definitions are targeted to business users, technical meta data definitions are targeted to an organization’s IT staff. Therefore it is perfectly acceptable to have SQL code and physical file and database locations included in the technical meta data definitions.
Usually it is too much work to have the technical stewards list all of the physical attributes within the company. Instead, begin with the technical stewards listing their key data attributes. By identifying the core data attributes, the IT department can focus technical meta data definitions on only the most important data attributes. Once your technical stewards have defined these initial physical attributes, they can start working on the remaining attributes.
The process for capturing these technical data definitions is a mirror image of the process to capture business meta data; in fact, the Web-based user screens should look very similar. The same functionality described in the business meta data definitions above (from and to dates, status codes, and so on) should also be included.
Once both the business and technical stewards define their meta data definitions, any discrepancies will almost immediately come to light, and there will be discrepancies. For example, the business stewards may define “product” as any product that a customer has purchased. The technical stewards may define “product” as a product that is marked as active. These two definitions are clearly different. In the business stewards’ definition, any product (active or inactive) that is currently on an open order for a customer would be valid. The IT staff will want to work with the business users to repair these hidden system defects.
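To make the “product” discrepancy concrete, the two definitions can be written as predicates and compared over a handful of records. Everything here is an assumption for illustration: the Product fields, the sample products and the function names do not come from any real system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Product:
    name: str
    is_active: bool
    has_been_purchased: bool


def business_definition(product: Product) -> bool:
    """Business stewards: any product that a customer has purchased."""
    return product.has_been_purchased


def technical_definition(product: Product) -> bool:
    """Technical stewards: any product that is marked as active."""
    return product.is_active


def find_discrepancies(products: List[Product]) -> List[Product]:
    """Return the products on which the two definitions disagree."""
    return [p for p in products
            if business_definition(p) != technical_definition(p)]


# Made-up sample data.
products = [
    Product("Widget", is_active=True, has_been_purchased=True),
    Product("Legacy Gadget", is_active=False, has_been_purchased=True),
    Product("New Gizmo", is_active=True, has_been_purchased=False),
]

for product in find_discrepancies(products):
    print(product.name)
```

The inactive product that customers have purchased counts as a “product” for the business but not for IT, which is exactly the kind of gap the stewards need to reconcile.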
Next month I will present a structure for the EIM organization and describe how its different components need to interact with one another.
About the Author
Mr. Marco is an internationally recognized expert in the fields of enterprise information management, data warehousing and business intelligence, and is the world’s foremost authority on meta data management. He is the author of several widely acclaimed books including “Universal Meta Data Models” and “Building and Managing the Meta Data Repository: A Full Life-Cycle Guide”. Mr. Marco has taught at the University of Chicago and DePaul University. In 2004 he was selected to the prestigious Crain’s Chicago Business “Top 40 Under 40”, and he is the chairman of the Enterprise Information Management Institute (www.EIMInstitute.org). He is the founder and President of EWSolutions, a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class solutions using data warehousing, enterprise architecture, data governance and managed meta data environment technologies (www.EWSolutions.com). He may be reached directly via email at DMarco@EWSolutions.com.