Focus on Data Definition

By Bruce Johnson

Most requests to use a data delivery tool (BI has become too broad a term to use clearly) to showcase data and deliver it to the appropriate users are simple in nature. Still, project failures, overruns, and solutions that the business says don’t give it value are common.

As technical and business resources are trained in how to deliver solutions, most recommendations and templates put the focus on the definition of requirements. One of my favorite tools, yet one of the hardest to navigate, is the fact/qualifier matrix. This tool is supposed to help IT and the business jointly identify the specific data and usage needs that will lead to a successful solution. If an enterprise data model exists and the appropriate meta data is available to explain it, populating such a matrix is quick and efficient. If not, it is oftentimes an exercise in futility: it takes a long time to finalize, and by the time the business has actually adopted the data and delivery, the fields and derivatives have often evolved significantly from the initial matrix.

The challenge comes from the fact that you are sitting down with someone from a specific part of the organization, trying to understand how they want to see data. Ultimately, you are getting their perspective. Other people may use the same terms with completely different meanings. When the resulting solutions cannot be brought together, the problem is covered up by the statement “our data is just too complex”. It isn’t the nature of the data that is complex, it is the nature of the usage.

Using Technology To Solve Terminology Problems

However much focus is put on the technology tools, they will only be effective if the data definition is clearly established. If your enterprise data is clearly defined, managed, and governed, applying BI-type toolsets to it is an easy and extremely effective task. Think of the number and variety of front-end BI tools in the marketplace today. They range from simple reporting tools to analytics, dashboards/scorecards, and even heavy-duty data mining. All of them have loyal customers who swear by, and sometimes at, their specific tools. Thus, you must ask whether there is really only one good tool that will fit your needs, or whether the tool is just a method of delivery.

I have sat through presentations from several of the largest technology vendors promising that they can build a solution to instantly notify us when a specific biohazard event occurs. It is easy to capture our imagination with the value of meeting this need. However, if the data were clearly defined and available on the appropriate platform, this would be an elementary solution. First, we define the specifics of the rule that needs to be executed. Then the “if / then” logic that we have been taught since programming’s inception is all it takes to begin monitoring that rule (not to mention that there are several good rules engines that work extremely well against sound data). Yet few organizations are able to realize this, because they haven’t captured the essence of their information.
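To make the point concrete, here is a minimal sketch of such a rule in Python. Everything specific in it is an assumption for illustration: the Reading fields, the agent name "agent_x", and the 5.0 ppm threshold are hypothetical and do not come from any real system or vendor product. The point is only that once the data is defined, the rule itself is a few lines of “if / then” logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """One clearly defined measurement (all fields are hypothetical)."""
    site: str
    agent: str
    concentration_ppm: float


# Hypothetical rule definition: the alert threshold per agent,
# agreed on and owned by the business.
THRESHOLDS_PPM = {"agent_x": 5.0}


def check_reading(reading: Reading) -> Optional[str]:
    """Return an alert message if the reading breaks the rule, else None."""
    limit = THRESHOLDS_PPM.get(reading.agent)
    # The "if / then" rule: only agents with a defined threshold can alert.
    if limit is not None and reading.concentration_ppm > limit:
        return (
            f"ALERT: {reading.agent} at {reading.site} measured "
            f"{reading.concentration_ppm} ppm (limit {limit} ppm)"
        )
    return None
```

Calling check_reading(Reading("lab-1", "agent_x", 7.5)) returns an alert string, while a reading below the threshold, or for an agent with no agreed threshold, returns None. The hard part is not this function; it is agreeing on what a reading, an agent, and a threshold mean across the organization.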

A specific example I encountered in the finance industry centered on the definition of net sales. While sales viewed this as <gross – SGA expenses>, marketing looked at it as <gross – SGA – commissions> (you can understand why the sales force didn’t want to use this), and finance viewed it as <gross – SGA – commissions – rebates>. Through talking with various business leaders, it became apparent that this was an issue they had been dealing with for years. Since each area referred to its figure as net sales and they didn’t come together to resolve the differences, the term further distanced these areas from each other. The truth was merely that there were 3 different outcomes that they were looking for, and all 3 had value worth measuring.
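The three formulas above can be sketched as three separately named measures rather than one contested “net sales”. This is only an illustration: the measure names, the Period type, and the sample amounts are assumptions, not the names or figures the organization in question used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """Amounts for one reporting period, in whole currency units."""
    gross: int
    sga: int
    commissions: int
    rebates: int


# Each department's formula gets its own unambiguous (hypothetical) name.
MEASURES = {
    "net_sales_after_sga": lambda p: p.gross - p.sga,
    "net_sales_after_commissions": lambda p: p.gross - p.sga - p.commissions,
    "net_sales_after_rebates": (
        lambda p: p.gross - p.sga - p.commissions - p.rebates
    ),
}


def report(period: Period) -> dict:
    """Compute every defined measure for the period."""
    return {name: formula(period) for name, formula in MEASURES.items()}


if __name__ == "__main__":
    sample = Period(gross=1000, sga=200, commissions=50, rebates=30)
    for name, value in report(sample).items():
        print(f"{name}: {value}")
```

With the sample amounts, the three measures come to 800, 750, and 720. All three are valid numbers; the conflict only exists while they share one name.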

Are You Building A Data Dump(s)?

An enterprise data model, with sound meta data, and strong business governance of the data takes significant time to establish. Business leaders often command specific solutions, yet they are very strong thinkers and generally very willing to discuss options. When IT and the business really sit down and collaborate on how to satisfy the business needs both now and in the future, the reality of “doing more with less” can be realized. One measure of our success as IT leadership should be our ability to properly inform the business of the complexities, challenges, and importance of a sound, scalable design. Yet, most often our solutions are to grab what we can to satisfy the immediate need and promise that all of these “one-off” solutions will magically fit together in the future. We tend to highlight the tools and technologies, not the design. Most large companies have re-architected their data warehousing solutions many times for this exact reason. When they try to understand why they spend so much every year on maintaining all of their various solutions, they end up going back to the drawing board and ironically, the enterprise data management initiative is the first one cut. It is equivalent to IT acknowledging to the business that we will continue to deliver data solutions held together with duct tape and band aids. We have all heard that the definition of “insanity” is doing something over again the same way and expecting different results.

All of that being said, there are cases where this may be the right approach for your needs.

Focus On Data Definition

By starting at the beginning and clearly understanding all of the information around your organization you can establish a holistic baseline that all analytics are designed off of. This usually results in significant business impact and changes to improve the quality of the data captured in your transactional systems.

By combining an Enterprise Data Governance program run by the business and comprehensive meta data that is incorporated into your delivery solutions, you now have the clarity to unlock the analytics and measurements required to impact strategy.

When questions of the definition, timeliness, and validity of your data arise, the tools and business authority will be there to provide assurance. If you want to have a successful SOA architecture, you first must make sure your data house is in order. SOA becomes an exercise in development, not a battle to get agreement on terms and valid values that is never ending.

One last key is to avoid data model / terminology perfection – this is the epitome of the “boiling the ocean” concept. Getting everyone to agree on every aspect of data terminology is a black hole that will eventually swallow your projects. Getting executive business support and keeping this risk at the forefront of these efforts will keep you focused on the task at hand.

There are many factors involved in the creation of an enterprise data model, most importantly the approach. How can you plan and coordinate the work so that you deliver solutions while building the model, instead of spending years on the model before you can use it? When teams do not understand how to build an enterprise data model incrementally, the size of the effort becomes a crutch used to explain why it won’t work. This is a topic to explore in depth in another article.

About the Author

Bruce has over 20 years of IT experience focused on data / application architecture, and IT management, mostly relating to Data Warehousing. His work spans the industries of healthcare, finance, travel, transportation, retailing, and other areas working formally as an IT architect, manager/director, and consultant. Bruce has successfully engaged business leadership in understanding the value of enterprise data management and establishing the backing and funding to build enterprise data architecture programs for large companies. He has taught classes to business and IT resources ranging from data modeling and ETL architecture to specific BI/ETL tools and subjects like “getting business value from BI tools”. He enjoys speaking at conferences and seminars on data delivery and data architectures. Bruce D. Johnson is the Managing director of Data Architecture, Strategy, and Governance for Recombinant Data (a healthcare solutions provider) and can be reached at
