Evaluating Repository Technology
By Anne Marie Smith, Ph.D.
The mission of a repository is to provide an efficient method for controlling the definition, access and use of metadata so this information can be used to provide meaning to corporate data. “Metadata” is “data about data” – information that makes the actual instance data in files or databases understandable both to technical staff (the system of record, the table or file where a data element is located, the element’s data type and size) and to business staff (the business name behind the cryptic system name, the source of the data, the algorithm or calculation upon which a derived data element is based, etc.). All this information can be stored in a repository, providing a central point of control for the management of metadata throughout an organization.
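As an illustrative sketch only, one repository entry pairing the technical and business metadata described above might be modeled like this; the field names and the sample element are hypothetical, not any particular product’s schema:

```python
from dataclasses import dataclass

@dataclass
class MetadataEntry:
    """One repository entry: technical and business metadata for a data element."""
    # Technical metadata -- useful to systems staff
    element_name: str       # cryptic system name, e.g. "CUST_NM"
    system_of_record: str   # table or file where the element is located
    data_type: str
    size: int
    # Business metadata -- useful to business staff
    business_name: str      # readable name behind the cryptic system name
    source: str             # where the data originates
    derivation: str = ""    # algorithm or calculation, for derived elements

entry = MetadataEntry(
    element_name="CUST_NM",
    system_of_record="CUSTOMER table",
    data_type="CHAR",
    size=30,
    business_name="Customer Name",
    source="Order entry system",
)
```

Collecting both views of an element in one record is what lets a single repository serve programmers and business analysts alike.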
To evaluate something is “to place value upon or appraise an object”. To do this, you determine the worth of the object against stated criteria. This report will outline a method for determining the relative worth of repository technology in an organization. No two evaluations will be conducted identically, but each evaluation can use a similar methodology successfully.
Historically, there has been little or no sharing of metadata across (or sometimes within) languages, applications, databases, file structures, etc. How many times has a systems analyst or programmer had to pore through documentation to discover some piece of metadata from one file, program or database in order to use the same element in another file, program or database? Since the answer to that question is probably “a zillion times”, the effective and efficient management of that metadata would make technical staff much more productive. Giving business users of data access to the business metadata would enable them to make better use of the data used by so many analysts and executives. A repository can provide the central storage, consolidation, refinement and access to that important metadata.
All major projects have Critical Success Factors, and evaluating a repository is no different. Some common critical success factors for conducting a thorough repository evaluation would be:
- Active managerial support from both IS and business communities
- Established corporate information policies and standards
- Established development methodology
- Formal change management and configuration management policies
- Technological advocacy within the organization
- Organization culture of information accessibility
The presence or absence of these factors can determine the likelihood of successfully instituting a culture of enterprise metadata management that a repository is designed to support.
Evaluating a repository is a labor-intensive task, one that requires a team of specialists. A representative team would consist of a technical architect, a database administrator for the underlying repository database, a few programmer/integrators who are familiar with the application programs whose metadata will be scanned into the repository, a network security analyst, some data administration staff members and some business users who are familiar with the metadata requirements of the organization’s business community. The entire team will not always operate as a single entity, but each member’s function is necessary to successfully and fully evaluate such a complex product.
The evaluation team will develop a project plan, which may include the following major steps:
- Review the organization’s information architecture
- Establish basic technological criteria
- Determine organizational information goals and objectives
- Develop a Request for Information questionnaire to be sent to appropriate repository vendors
- Match vendors’ responses to the RFI’s questions against the organization’s goals and objectives
- Evaluate the vendors’ responses in the context of the goals and objectives
- Recommend products for detailed testing
Each of these major steps may have a list of tasks associated with it, and the organization’s needs and schedule will influence the project deadlines. Based upon the organization’s technical architecture and its technology criteria, certain vendors/products may be eliminated from consideration. For example, if the organization runs mainframe technology with DB2 and/or IMS and COBOL, products that are based on a UNIX or NT platform and use Oracle as the RDBMS for the repository would probably not be included in the formal evaluation. Likewise, a company whose information architecture favors standard, common platforms would not choose a repository with a vendor-proprietary underlying database management system.
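The platform screening described above can be thought of as a simple filter over the candidate list. The sketch below is purely illustrative; the vendor names and product attributes are hypothetical:

```python
def shortlist(products, required_platforms, required_dbms):
    """Keep only products compatible with the organization's architecture."""
    return [
        p for p in products
        if p["platform"] in required_platforms
        and p["repository_dbms"] in required_dbms
    ]

candidates = [
    {"name": "Vendor A Repo", "platform": "MVS", "repository_dbms": "DB2"},
    {"name": "Vendor B Repo", "platform": "UNIX", "repository_dbms": "Oracle"},
]

# A mainframe shop standardized on DB2/IMS would screen out the
# UNIX/Oracle-based product before the formal evaluation begins.
finalists = shortlist(candidates, {"MVS"}, {"DB2", "IMS"})
```

Applying hard architectural criteria first keeps the labor-intensive parts of the evaluation focused on products that could actually be deployed.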
Establishing the organization’s information goals and objectives may be the hardest step in the process. Some commonly used information goals and objectives are:
- Integrate storage for data definition
- Enforce naming and definition standards
- Enterprise data rationalization and data reusability
- Inventory and manage source and programming code metadata
- Facilitate user interpretation of logical and physical metadata
- Enable enterprise metadata management
These goals and objectives will form the foundation for the questions that become the Request for Information sent to the appropriate vendors. Identifying the organization’s goals and objectives up front ensures that the organization’s repository evaluation team drives the evaluation, not the vendors, whose goals and objectives differ from those of the evaluating organization.
Many Requests for Information are structured into sections, with each section having its own purpose and associated goals, objectives and questions. Some common questionnaire sections for evaluating a repository are:
- Basic Functionality
- Access Mechanism
- Product Architecture
- Scanner Tools
- Repository Population Mechanism
- Report Generation and User Interface
- Vendor Philosophy, Company History and Financial Status
The questionnaire should be written with detailed questions in each section that drive the vendor to respond in sufficient detail for the team to evaluate the product’s suitability in the organization. To track and compare vendors’ responses against the appropriate criteria, develop a Questionnaire Response Matrix that pairs each question with the goals it supports and records each vendor’s response.
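As an illustrative sketch, such a matrix can be kept as a simple structure that ties each RFI question to the goals it supports and scores each vendor’s answer; the questions, vendors, and scores below are hypothetical:

```python
# Hypothetical Questionnaire Response Matrix: answers are scored
# 0 = no support, 1 = partial support, 2 = full support.
matrix = {
    "Q1: Does the product enforce naming and definition standards?": {
        "goals": ["Enforce naming and definition standards"],
        "scores": {"Vendor A": 2, "Vendor B": 1},
    },
    "Q2: Can scanners load COBOL source and copybook metadata?": {
        "goals": ["Inventory and manage source and programming code metadata"],
        "scores": {"Vendor A": 1, "Vendor B": 2},
    },
}

# Total each vendor's score across all questions for a side-by-side view.
totals = {}
for row in matrix.values():
    for vendor, score in row["scores"].items():
        totals[vendor] = totals.get(vendor, 0) + score
```

Because each question is mapped to its goals, the same structure also shows which goals a given vendor’s answers leave uncovered.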
Some questions may apply to more than one goal and/or objective, and measuring vendors’ answers for consistency and applicability will be one of the tests the evaluation team uses to determine the suitability of a product and vendor. Some repository implementation consultants can provide generic questionnaires that can be customized to fit an organization’s needs, but many organizations insist on drafting their own questionnaires to personalize the RFI.
The completion of the matrix should assist the team in choosing one or more products to actually test in the organization’s environment. When that decision has been made, it will be time to start writing an evaluation test plan. Each functional area of each repository should be tested using the same test plan, to provide consistent and measurable results. Each organization should develop its own test plan to conform to the organization’s goals, objectives, testing methodology and culture. Relying on purchased test plans or generic plans from the vendors does not allow the organization to establish its own criteria and can result in an inappropriate decision.
Most organizations will require the development of a cost/benefit analysis for a purchase as substantial as a repository. Representatives from the accounting or financial analysis departments usually perform such analysis, with active assistance from the repository evaluation team. Some repository vendors provide generic cost justifications for a repository, and these models may be useful to organizations that do not have a strong facility with cost-justifying IS activities. Since much of the benefit of a repository comes from the replacement of manual human labor with automated metadata management, this analysis may result in astronomical benefit numbers. Caution should be exercised in presenting such results to management, since many benefits may be realized only after several years of full repository implementation.
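A simple payback calculation illustrates why the benefits can take several years to materialize. The figures below are entirely hypothetical and stand in for whatever license, upkeep, and labor-savings estimates the financial analysts produce:

```python
def payback_year(upfront_cost, annual_cost, annual_benefit, horizon=5):
    """Return the first year cumulative benefit covers cumulative cost,
    or None if it never does within the horizon."""
    cum_cost, cum_benefit = upfront_cost, 0.0
    for year in range(1, horizon + 1):
        cum_cost += annual_cost
        cum_benefit += annual_benefit
        if cum_benefit >= cum_cost:
            return year
    return None

# Hypothetical figures: $500k purchase and setup, $100k/yr upkeep,
# $250k/yr in saved analyst and programmer effort.
result = payback_year(500_000, 100_000, 250_000)
```

With these assumed numbers the repository does not pay for itself until the fourth year, which is exactly the kind of multi-year horizon management should be warned about.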
Some tangible benefits of implementing a corporate repository include:
- Structured, centralized approach to defining metadata
- Detection of redundant or inconsistent data
- Support for programming impact analysis
- Documentation of logical to physical data element mapping
- Support for data warehouse development for technical staff and business users
There are also intangible benefits to implementing a corporate repository. They are difficult or impossible to quantify, but should be mentioned as important benefits when deciding to embark upon the management of metadata using a repository:
- Faster implementation of applications, based upon accurate and non-redundant metadata and data
- Faster response to changes in existing applications, due to improved impact analysis
- Metadata consistency within and across applications, leading to better understanding of the data used by the organization
In conclusion, repository technology can address many information goals involving consistency and reusability of metadata, and can support an integrated technical architecture. A repository can be evaluated and justified successfully, although doing so requires significant effort from the evaluation team and managerial support for the information goals and objectives. Finally, technical and business needs and skills should be combined in the evaluation of a repository, since both technical and business users will use the organization’s metadata.