Meta Data Repository: Key to Re-Usability
By Larissa Moss
The cornerstone of every commercial business is profitability. If market forces do not permit a great deal of leeway in the pricing of goods and services, other ways must be found to generate adequate returns. Reducing expenses is one of the most obvious ways to expand profit margins, and positioning the organization to react to new business opportunities quickly with flexibility and accuracy is another.
Continuing to build automated business solutions, whether operational systems or BI applications, using the current development approach is both expensive and time-consuming. Therefore, a mechanism that can cut costs and shorten the delivery cycle must be viewed as a strategic tool and a competitive weapon. This is where a well-supported and fully operational meta data repository environment can help.
One of the best ways to reduce expenses, and shorten the delivery cycle at the same time, is to reuse what has already been built. This has been recognized in the maxim, “Reuse before you buy and buy before you build.” For IT to follow this maxim and to be able to quickly react to changing business demands, IT cannot continue to remain primarily a builder of new applications or continue to buy “silver bullet technology solutions.” IT will have to evolve into an assembler of reusable components!
Reusing existing data and processes makes such good sense that it is difficult to find anyone with serious objections to it; yet in practice hardly any organization mandates a reusability policy for its IT group. One reason for the lack of reusable components is probably that reuse is nearly impossible to accomplish unless a central facility like a meta data repository is developed, populated, and used. Therefore, a meta data repository must be recognized as making a significant contribution to the bottom line of the business as a whole.
Reusability of data and processes
We need to consider what makes something reusable. There are two criteria that must be satisfied.
First, what is built must be designed and constructed with reusability in mind. This applies as much to data as it does to processes. Second, the components to be reused must be accessible and identifiable as being reusable. This means that having a list or catalog of relevant information about reusable data and processes is imperative. A meta data repository is designed specifically for that purpose. It functions in much the same way as the card catalog does in a library. The card catalog does not contain any of the books in the library, but contains relevant information about those books, including a pointer (Dewey Decimal Number) that allows the user to go to the place in the library where any particular book is physically stored.
All librarians recognize the value of the card catalog and the need to maintain it. Since the collection of books is always changing, the catalog must be continuously updated to reflect these changes. This is an integral part of the natural processes of the library. It would be difficult to imagine using a library that did not perform this maintenance, yet most IT organizations are run without a similar catalog of their objects, namely the data items, data structures, programs, modules, and applications in their inventory. The maintenance of the meta data repository must become as integral to the natural process of managing the IT component inventory of an organization as the maintenance of the card catalog is to managing the book inventory of a library. Both are essential if the respective endeavors are to succeed.
Software Factory Model
The word inventory suggests business activity. Items in inventory can be purchased or manufactured. Either way, a business that does not know, with a high degree of precision, what it has on hand at any given moment does not have the information it needs to control its processes effectively and efficiently. Having something on hand and not knowing it is worse than not having it and knowing that it is not available, because buying or building something that is already “in house” wastes time and money, and in the end produces redundancy as well. This redundancy must be maintained, which consumes even more time, money, and resources, all of which increases expenses and lengthens the delivery cycle.
If, on the other hand, a fully operational meta data repository were available, a search for an object (data or process) would produce a “hit”. The returned information would confirm that the cataloged object is the required one. Confidence in the meta data repository as a labor- and money-saving venture would be reinforced. Time and money that would have been spent reconstructing the object would be saved. In addition, a lengthy search for information that may or may not exist about the needed object would be avoided, because the authoritative source about such things, the meta data repository, was consulted. Imagine how much easier Y2K impact analysis would have been with a complete IT component inventory in a well-maintained meta data repository.
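The search-and-hit behavior described above can be illustrated with a minimal sketch. It assumes a toy in-memory catalog; the names CatalogEntry, MetaDataRepository, register, and search, as well as the sample entries and locations, are hypothetical and do not describe any particular repository product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One 'catalog card': describes an object and points to where it lives."""
    name: str
    kind: str          # "data" or "process"
    description: str
    location: str      # pointer to the physical object (hypothetical paths)
    reusable: bool = True


class MetaDataRepository:
    """Toy in-memory catalog; holds descriptions, not the objects themselves."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Key on kind + name so the same object is never cataloged twice.
        self._entries[(entry.kind, entry.name.lower())] = entry

    def search(self, keyword, kind=None):
        """Return reusable entries whose name or description mentions keyword."""
        keyword = keyword.lower()
        return [
            e for e in self._entries.values()
            if e.reusable
            and (kind is None or e.kind == kind)
            and (keyword in e.name.lower() or keyword in e.description.lower())
        ]


repo = MetaDataRepository()
repo.register(CatalogEntry(
    "customer_address", "data",
    "Postal address of a customer", "crm_db.customer.address"))
repo.register(CatalogEntry(
    "validate_postal_code", "process",
    "Checks a postal code against country rules", "lib/address/validate.py"))

hits = repo.search("postal", kind="process")   # a "hit": reuse instead of rebuild
misses = repo.search("invoice")                # nothing cataloged: build or buy
for entry in hits:
    print(entry.name, "->", entry.location)
```

An empty result is as useful as a hit: it tells the developer, without a lengthy search, that nothing suitable is on hand.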
It is instructive to view the IT organization as a software factory. Just as the word inventory suggests business, the word factory suggests dynamic processes and productivity. No senior executive or plant manager would tolerate the risk and uncertainty associated with a lack of information about factory processes that IT organizations seem to accept as normal in the course of constructing applications. Using the meta data repository in conjunction with a software factory model or mindset can produce the control information that can put the processes of application construction and maintenance on a more sound and business-like basis.
Every technique that a factory uses to control its operation has a counterpart in IT systems. A factory uses predefined processes for executing work orders depending on their nature, scope and degree of completeness. These processes correspond to the development steps, activities, tasks and deliverable definitions of an application development life cycle. A factory also records data about work-in-progress as well as information on completed work orders, inventories of finished goods, shipped products, levels and use rates of raw materials, etc. This information is indispensable to running a factory effectively and efficiently. The entire collection of data and information that a factory uses in managing its business has its IT counterpart in the contents of the meta data repository.
Additional Benefits of Reusability
Beyond maximizing the value of available resources, using the meta data repository to reuse data and processes has additional benefits. Reuse promotes consistency and reduces redundancy, which in turn has a major effect on an organization’s ability to be “quick, flexible, and accurate”. When a single source of data or a single process must be changed, it is “quicker” to make required changes in one place than in many. Making a change in one place to a data structure or process also assures consistency and higher quality, which directly affects being “accurate”. By creating structures that are designed from the beginning for reuse, the “flexible” term in the equation is positively affected, since an approach of “mix and match reusable components” becomes feasible.
When the meta data repository is used to control the business of application development and maintenance, it becomes possible for the organization to make the transition to an information-based environment. Meta data is instrumental for this transition because of the very simple relationship that exists between data and information: data + meta data = information. Information is data within context, and meta data provides that context.
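As a small illustration of data + meta data = information, the sketch below pairs a bare value with the context that makes it meaningful. The MetaData fields, the to_information function, and the sample values are assumptions chosen for the example, not a prescribed meta data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaData:
    """Context for a data value (hypothetical, deliberately minimal fields)."""
    business_name: str
    unit: str
    source_system: str


def to_information(value, meta):
    """Combine a raw value with its meta data to yield readable information."""
    return f"{meta.business_name}: {value} {meta.unit} (source: {meta.source_system})"


raw_value = 1250.0   # data alone: a number with no meaning attached
meta = MetaData("Monthly net revenue", "USD", "billing system")

information = to_information(raw_value, meta)
print(information)
```

The value 1250.0 says nothing on its own; only with its business name, unit, and source does it become something a manager can act on.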
Many companies today find themselves data rich, but information poor. This condition can be changed by using a meta data repository to define the information structures a company needs. Capturing the data and information structures a company has will permit analysts to determine what must be done to transform existing data into required information. The transformation processes (algorithms, programs) should be captured as meta data as well. Once this has been done, the meta data repository will be able to assist in influencing the future as well as in documenting the past and controlling the present. The meta data repository cannot by itself make the needed changes, but it is a powerful enabler if the required contents are available and management decides to use them to that end.
A critical success factor for an effective meta data repository is to create an infrastructure in which it is used. This infrastructure is more appropriately called an environment, because infrastructure implies something that can be physically built, whereas an environment goes beyond that and speaks to ways of operating and thinking as well. A successful meta data repository environment is one in which current processes have been reengineered and attitudes reshaped to support a more efficient and effective use of resources throughout the application development life cycle.
Reshaping attitudes and reengineering the processes currently used in application construction are the responsibility of the organization’s executive management. The role that executive management must play is to communicate the importance of using the meta data repository as a way of managing the organization in general, and the delivery and maintenance of operational systems and BI applications in particular. The importance of this executive sponsorship cannot be overemphasized.
There are several important parts to this sponsorship. The most important is the communication of management expectations that meta data will be a deliverable of every application. Management must explain that it requires an accurate and complete knowledge base of the IT component inventory, as well as the business definitions upon which design decisions are based. In other words, management must establish a policy for mandatory meta data repository creation, maintenance, and usage.
Another important aspect of meta data repository success is to adopt a “pay-as-you-go” approach. This means updating the meta data repository as part of the process of application development and not as an afterthought. Documenting an application as an afterthought is usually considered “extra work” by the technicians. But using the meta data repository to document the work as it is being done, in conjunction with a rigorously used application development methodology, is vital to creating robust applications and the information necessary to maintain them quickly and at low cost. Instituting active quality control procedures for development projects that include inspection of meta data repository entries, as proof that the required work has been done, should be a high priority.
Meta data repositories have long been seen as tools primarily of interest to developers and technicians. Their value to knowledge workers and business analysts is now being recognized. The strategic value of a meta data repository to an organization is that of being an enabler for the transition to an information-based environment, which gives the organization the ability to be “quick, flexible, and accurate”. Managers will be able to manage more effectively and technicians can be more productive when the business meaning of data and processes is more widely understood. Once the inventory of data and processes has been cataloged, the opportunities for reuse will increase, which in turn will promote consistency and enhance quality. Lead times and the costs of development and maintenance will shrink as more of the inventory can be identified as reusable.
For the meta data repository to provide all of the benefits mentioned, an environment must be created in which its knowledgeable use and careful maintenance are not just encouraged but required by executive management. It is management’s role to create an environment in which everyone understands that the meta data repository must be complete, accurate, and continuously maintained, because it is a repository of information about the corporate assets of business data and business processes, and the IT constructs that automate them. To be without a full accounting of such vital assets is to forgo the opportunity of maximizing the value of those assets. It also poses the risk that the competitive position of an organization could be harmed through unnecessarily high costs, long implementation lead times, and low quality levels in its applications.
About the Authors
Mike Brodie is a veteran in the community of meta data repository administrators. His first exposure to rudimentary data dictionaries dates back more than 20 years. Since then he has evaluated, selected, installed, and enhanced numerous data dictionary and meta data repository products. He also designed and developed a number of customized meta data repositories. He is committed to educating the business community on how to maximize the business impact of meta data on the financial performance of their companies. Mike Brodie can be reached at email@example.com
Larissa Moss is founder and president of Method Focus Inc. She has over 20 years of IT experience with information asset management. She frequently presents at conferences worldwide on the topics of Data Warehousing, Business Intelligence, CRM, Information Quality, and other information asset management topics, such as data integration and cross-organizational development. Ms. Moss is co-author of the books: Data Warehouse Project Management, Addison Wesley 2000; Impossible Data Warehouse Situations, Addison Wesley 2002; and Business Intelligence Roadmap: The Complete Project Lifecycle for Decision Support Applications, Addison Wesley 2003. She can be reached at firstname.lastname@example.org