Which Should Come First, the Chicken or the Egg (Meta Data Repository or the Data Warehouse)?
By David Marco
Which application did most companies build first, the data warehouse/data marts or the meta data repository? The obvious answer is the data warehouse. Most Global 2000 companies have a data warehouse (typically several); conversely, many companies still do not have a meta data repository. A much more interesting question is: "If a company only has the time, money, and resources to build one of these applications, which SHOULD it build first, the meta data repository or the data warehouse?" Before we address this question, I want to make it clear that both a meta data repository and a data warehouse are critical applications that almost every company NEEDS to have. Those companies that neglect these applications, or do not build them properly, will be replaced by competitors that do.
Which Should We Build First?
Over the years I've given more than a hundred keynotes and seminars on data warehousing and meta data. During these talks I've been asked many times which of the two you should build first. After giving it careful thought, I have come to the conclusion that a corporation's optimal approach is to FIRST build its meta data repository. Let's examine the reasons why.
IT Applications Failure
When a corporation looks to undertake a major IT (information technology) initiative, like a CRM (customer relationship management), ERP (enterprise resource planning), data warehouse, or e-commerce solution, its likelihood of project failure is between 65% and 80%, depending on the study that you are looking at. This is especially alarming when we consider that these same initiatives traditionally have executive management support and cost many millions of dollars. For example, I have one large client that is looking to roll out a CRM system (e.g. Siebel, Oracle) and an ERP system (e.g. SAP, PeopleSoft) globally in the next four years. Their initial project budget is over $125 million! Consider this: when was the last time that you saw an ERP or CRM initiative delivered on time or on budget?
Meta Data Repositories Enable ALL IT Applications
When we examine the causes of these project failures, several themes become apparent. First, the projects did not address a definable and measurable business need. This is the number one reason for project failure, whether the project is a data warehouse, a CRM system, a meta data repository, or anything else. As IT professionals we must always be looking to solve business problems or capture business opportunities. Second, the projects that fail have a very difficult time understanding their company's existing IT environment. This includes custom applications, vendor applications, data elements, entities, data flows, data heritage, and data lineage. A meta data repository (and specifically technical meta data) allows a corporation to decipher its IT environment and shorten the systems development life cycle for ERP, CRM, data warehouse, and e-commerce applications.
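To make the idea of technical meta data more concrete, here is a minimal sketch in Python of how a repository might record data flows and answer a data lineage question ("where does this data element come from?"). It is an illustration only, not a description of any particular repository product: the class name LineageRepository, its methods, and the element names such as crm.customer.email are all hypothetical and chosen for this example.

```python
from collections import defaultdict


class LineageRepository:
    """Hypothetical, in-memory store of data flows between data elements."""

    def __init__(self):
        # Maps each target element to the set of elements that feed it directly.
        self._sources = defaultdict(set)

    def record_flow(self, source, target):
        """Record that data moves from `source` into `target`."""
        self._sources[target].add(source)

    def upstream(self, element):
        """Return every element that directly or indirectly feeds `element`."""
        seen = set()
        stack = [element]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries for elements with no sources.
            for src in self._sources.get(current, ()):
                if src not in seen:
                    seen.add(src)
                    stack.append(src)
        # If the flows contain a cycle, do not report the element as its own source.
        seen.discard(element)
        return seen


# Example with made-up element names (assumptions for illustration only).
repo = LineageRepository()
repo.record_flow("crm.customer.email", "staging.customer.email")
repo.record_flow("erp.account.email", "staging.customer.email")
repo.record_flow("staging.customer.email", "warehouse.dim_customer.email")

for name in sorted(repo.upstream("warehouse.dim_customer.email")):
    print(name)
```

Even a toy model like this shows why the repository helps a project team: before changing or replacing a source system, they can ask which downstream elements depend on it instead of rediscovering those dependencies by hand.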
For most of these systems (data warehouses specifically) a meta data repository is a critical project enabler and long-term sustainer of the application. However, many companies, in their enthusiasm to build a data warehouse, did so at the expense of architecture and quality, and without a meta data repository supporting it. Not surprisingly, most Global 2000 companies will spend the better part of this decade completely rebuilding these systems.
Vendor Tools
As I previously mentioned, most companies selected their data warehousing tools and built their data warehouse before implementing their meta data repository. While data warehousing tools have certainly matured over the years, the companies that selected their data warehousing tools without addressing their meta data repository requirements will most likely end up with tools that do not support their meta data repository. Conversely, the tools that are used to build the meta data repository typically do not hamper the development of the data warehouse (although an incorrectly built meta data repository does).
Oftentimes a corporation will not want to wait to attain the substantial benefits of a meta data repository and a data warehouse, and will look to build both of these applications in parallel. This approach makes sense, as a meta data repository is an absolute necessity for the success of the data warehouse. Conversely, data warehouses and the tools that build them typically provide some of the most valuable meta data for the repository.
The number of companies looking to build a meta data repository is growing more rapidly than ever before. While meta data repository initiatives are certainly not without their fair share of project failures, those companies that have worked hard and been methodical in their approach have built repositories that are providing them with a tremendous competitive advantage.