Designing the Optimal Meta Data Tool (part 1 of 3)

By David Marco

Many government agencies and corporations are currently examining the meta data tools on the market to decide which of these tools, if any, meet the requirements for their meta data management solutions. Often these same organizations want to know what types of functionality and features they should be looking for in this tool category. Unfortunately, this question becomes very complicated because each tool vendor has its own “marketing spin” as to which functions and features are really the most advantageous. This leaves the consumer with a very difficult task indeed, especially when it seems that none of the vendors’ tools fully fit the requirements of your meta data management solution. At EWSolutions we have several clients that have these exact same concerns about the tools in the market.

Although I have no plans to start a software company, I would like to take this opportunity to play software designer and present the key functionality of my optimal meta data tool. One of the challenges with this exercise is that meta data functionality has a great deal of depth and breadth. Therefore, to properly categorize our tool’s functionality, I will use the six major components of a managed meta data environment (MME):

  • Meta Data Sourcing & Meta Data Integration Layers
  • Meta Data Repository
  • Meta Data Management Layer
  • Meta Data Marts
  • Meta Data Delivery Layer

I will now walk through each of these MME components and describe the key functionality that my optimal meta data tool would contain.

Meta Data Sourcing & Meta Data Integration Layers

For simplicity’s sake I will discuss this “dream” tool’s functionality for the meta data sourcing and the meta data integration layers together. The goal of the meta data sourcing and integration layers is to extract the meta data from its source, integrate it where necessary, and bring it into the Meta Data Repository.

Platform Flexibility

It is important for the meta data sourcing technology to be able to work on mainframe applications, on distributed systems and with sources (databases, files, spreadsheets, etc.) located on a network. These functions would have to run in each of these environments so that the meta data could be brought into the repository. I did not include AS/400 environments in my list of platforms because of their fairly sparse use; however, if your information technology (IT) shop’s preferred application platform is AS/400, clearly your optimal meta data tool would have to work on that platform.

Prebuilt Bridges

Many of the current meta data integration tools come with a series of prebuilt meta data integration bridges. The optimal meta data tool would also have these prebuilt bridges. Where our optimal tool would differ from the vendor tools is that it would have bridges to all of the major relational database management systems (e.g. Oracle, DB2, SQL Server, Informix, Sybase and Teradata), the most common vendor packages (e.g. Siebel, SAP, PeopleSoft, Oracle), several code parsers (COBOL, JCL, C++, SQL, XML, etc.), key data modeling tools (ERWin, Designer, Rational Rose, etc.), top ETL (extraction, transformation and load) tools (e.g. Informatica, Ascential) and the major front-end tools (e.g. Business Objects, Cognos, Hyperion).

As much as possible, I would want my meta data tool to use XML (extensible markup language) as the transport mechanism for the meta data. While XML cannot directly interface with all meta data sources, it would cover a great number of them.
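As a rough illustration of XML as a transport mechanism, the sketch below serializes one table's column meta data to XML and reads it back, using Python's standard library. The element and attribute names (`table`, `column`, `name`, `type`) and the function names are assumptions made for this example only; they do not come from any vendor tool or meta data standard.

```python
import xml.etree.ElementTree as ET


def to_xml(table_name, columns):
    """Serialize a table's column meta data to an XML string.

    `columns` is a list of (column_name, data_type) pairs.
    The element/attribute vocabulary is hypothetical.
    """
    root = ET.Element("table", name=table_name)
    for col_name, col_type in columns:
        ET.SubElement(root, "column", name=col_name, type=col_type)
    return ET.tostring(root, encoding="unicode")


def from_xml(payload):
    """Parse the XML produced by to_xml back into (table_name, columns)."""
    root = ET.fromstring(payload)
    columns = [(c.get("name"), c.get("type")) for c in root.findall("column")]
    return root.get("name"), columns


# Round trip: what a sourcing bridge sends is what the repository receives.
payload = to_xml("CUSTOMER", [("CUST_ID", "INTEGER"), ("CUST_NAME", "VARCHAR(60)")])
name, columns = from_xml(payload)
```

Because the same payload can be produced by the repository and consumed by the source tool, a format like this would also suit a bridge that moves meta data in both directions.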

These meta data bridges would not just bring meta data from its source and load it into the repository. They would be bi-directional, allowing meta data to be extracted from the meta data repository and brought back into the source tool.

Lastly, these meta data bridges wouldn’t just be extraction processes; they would also have the ability to act as “pointers” to where the meta data is located. This distributed meta data capability is very important for a repository to have.

Error Checking & Restart

Any high-quality meta data tool would have an extensive error checking capability built into the sourcing and integration layers. Meta data in an MME, like data in a data warehouse, must be of high quality or it will have little value. This error checking facility would check the meta data as it is read and capture statistics on the errors that the process encounters (meta meta data). In addition, the tool would support error levels for the meta data: it would give the tool administrator the ability to configure the action taken based on the error that occurred in the process. For example, should the meta data be:

  1. flagged with an informational/error message; or
  2. flagged as an error and then not loaded into the repository; or
  3. flagged as a critical error and the entire meta data integration process is stopped.
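The three error levels above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of any existing tool: the names `Severity`, `LoadResult` and `run_load`, the rule format, and the sample rules are all hypothetical. Each rule is a (name, test, severity) triple that the administrator would configure, and the error counts stand in for the statistics the process captures.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    INFO = 1      # level 1: flag with a message, still load
    REJECT = 2    # level 2: flag as an error, do not load this record
    CRITICAL = 3  # level 3: stop the entire integration process


@dataclass
class LoadResult:
    loaded: list = field(default_factory=list)
    messages: list = field(default_factory=list)
    error_counts: dict = field(default_factory=dict)  # statistics per rule
    halted: bool = False


def run_load(records, rules):
    """Apply administrator-configured rules to each meta data record.

    `rules` is a list of (rule_name, has_problem, severity), where
    has_problem(record) returns True when the record fails the rule.
    """
    result = LoadResult()
    for record in records:
        reject = False
        for rule_name, has_problem, severity in rules:
            if not has_problem(record):
                continue
            result.error_counts[rule_name] = result.error_counts.get(rule_name, 0) + 1
            result.messages.append((severity.name, rule_name, record.get("name")))
            if severity is Severity.CRITICAL:
                result.halted = True
                return result  # stop the whole run immediately
            if severity is Severity.REJECT:
                reject = True
        if not reject:
            result.loaded.append(record)
    return result


# Hypothetical configuration: the severity of each rule is the admin's choice.
rules = [
    ("missing_description", lambda r: not r.get("description"), Severity.INFO),
    ("missing_type", lambda r: not r.get("type"), Severity.REJECT),
    ("missing_name", lambda r: not r.get("name"), Severity.CRITICAL),
]
records = [
    {"name": "CUST_ID", "type": "INTEGER", "description": "Customer key"},
    {"name": "CUST_NAME", "type": "VARCHAR", "description": ""},
    {"name": "CUST_DOB", "type": "", "description": "Birth date"},
    {"name": "", "type": "DATE", "description": "Unnamed column"},
    {"name": "CUST_ZIP", "type": "CHAR(5)", "description": "Postal code"},
]
result = run_load(records, rules)
```

In this run the first two records are loaded (the second with an informational message), the third is rejected, and the fourth halts the process before the fifth is read.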

This process would also have “check points” that would allow the tool administrator to restart it. These check points would be placed in the proper locations to ensure that the process could be restarted with the least degree of impact on the meta data itself and on its sourcing locations.
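One simple way to picture check points and restart is sketched below, assuming the process works through an ordered list of items and records its position in a small JSON file. The function names, the file format, and the `every` parameter are assumptions for this illustration. Note that items loaded after the last check point are loaded again on restart, so in this sketch the load step would need to be safe to repeat.

```python
import json
import os


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no check point)."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_index"]


def save_checkpoint(path, next_index):
    """Write the check point to a temp file, then rename it into place
    so a crash mid-write does not leave a corrupt check point."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)


def run(items, load_one, checkpoint_path, every=2):
    """Process items in order, resuming from the last saved check point.

    A check point is saved after every `every` items; placing them less
    often means less overhead but more rework after a restart.
    """
    start = load_checkpoint(checkpoint_path)
    for i in range(start, len(items)):
        load_one(items[i])
        if (i + 1) % every == 0:
            save_checkpoint(checkpoint_path, i + 1)
    save_checkpoint(checkpoint_path, len(items))
```

If the run fails partway through, calling `run` again with the same check point file picks up from the last saved position instead of re-reading every source from the beginning.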

Next month I will continue designing our optimal meta data management tool by presenting its key functionality in the Meta Data Repository and Meta Data Management layers of a managed meta data environment (MME).
