Business Warehouse Oversight – Part I
Data warehouses, marts and all analytical solutions are often thought of as technical structures that are defined, built, and managed by the IT department. By now I think we all know that the solutions must be business driven, having a real business problem to solve, and IT needs to have some documented requirements of what the business wants to do with the data. While IT needs to be responsible for the architecture, applications, and technologies used to populate and deliver these solutions, the business needs to fulfill a crucial role in oversight of the content and usage of any data warehouse/analytics solution.
Governance of solutions is a very broad topic. In this case, we are focused on the management of the data going into the solution, who is able to use it, who is actually using it, and how it is being used. In order to better understand the various roles and functions, let’s examine the crucial aspects of business involvement required to have a solution that realizes value and provides return on investment.
To explore this focus area in depth, this article is the first in a two-part series that will examine both data content oversight and usage oversight.
Business resources are the real experts behind the content of data in their capture systems and thus should be the ones providing the oversight of the data population. The business needs to be active in the data and meta data used to populate analytical solutions, as well as the management of data and meta data as it is updated for the solution. The key areas to map for business involvement include:
Data Mapping And Profiling: Data in our sources regularly changes in both content and quality. This could be due to source system changes, or to business process changes that cause data capture or content to be altered. While mapping and profiling is often thought of as only a project assignment, it is also an ongoing activity that needs to be maintained and frequently validated. In both cases, this activity should be a joint effort between the business and the IT resources that manage the tools, technologies, and database. Establishing the processes and frameworks that support this will enable a repeatable structure that can be used across all analytics efforts in your organization. This area also provides an opportunity to work with the sponsors, owners, and custodians of the source systems to educate them on the purpose and design of your analytical system. Where required, you may have data usage agreements and security processes that help those key resources understand the level of protection and auditing that you provide for any data integrated into your solution.
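To make the idea of repeatable profiling concrete, here is a minimal sketch of what a recurring profile check might look like. It assumes source rows arrive as simple dictionaries; the column name `gender`, the function names, and the 5% tolerance are all hypothetical choices for illustration, not part of any particular tool.

```python
def profile_column(rows, column):
    """Return simple profile statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    total = len(values)
    populated = [v for v in values if v is not None and v != ""]
    return {
        "rows": total,
        "null_rate": (total - len(populated)) / total if total else 0.0,
        "distinct": len(set(populated)),
    }


def profile_drift(baseline, current, null_rate_tolerance=0.05):
    """List findings where the current profile departs from the agreed baseline."""
    findings = []
    if current["null_rate"] - baseline["null_rate"] > null_rate_tolerance:
        findings.append("null rate rose beyond tolerance")
    if current["distinct"] > baseline["distinct"]:
        findings.append("new distinct values appeared")
    return findings


# Hypothetical sample data: the baseline was profiled during the project,
# the current extract comes from a later source system release.
baseline_rows = [{"gender": "M"}, {"gender": "F"}, {"gender": "F"}, {"gender": "M"}]
current_rows = [{"gender": "M"}, {"gender": "U"}, {"gender": None}, {"gender": "F"}]

baseline = profile_column(baseline_rows, "gender")
current = profile_column(current_rows, "gender")
findings = profile_drift(baseline, current)
```

In a sketch like this, IT would own running the check, while the business would own the baseline and decide whether each finding reflects a legitimate process change or a quality problem.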
Meta Data Content: There are several areas of meta data that are valuable to analytical solutions. From business terminologies/definitions to data heritage/lineage, to operational meta data, all three areas need to be controlled. Some of this meta data will be automatically generated and thus needs to be validated as new meta data becomes available. Other meta data is updated and maintained by the business. Typically IT provides an interface for the business to maintain this information, but the business also has a responsibility to publish said changes to users so the new information is proactively shared. Users tend to rely on meta data to navigate solutions. As users get comfortable with the context of the data, any meta data changes should be communicated before they are made available, so that users understand the change and its effect on the meaning of the data. Connecting these changes directly to your analytical applications, so that users are reminded of them upon login, can help to cement the bond between the user and the content.
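As one illustration of reminding users of meta data changes upon login, the sketch below selects the changes a user has not yet seen. The record layout (`term`, `published`, `effective`, `summary`), the function name, and the sample terms are assumptions made for this example only.

```python
from datetime import date


def pending_notices(changes, last_login):
    """Return changes published since the user's last login, soonest effective date first."""
    unseen = [c for c in changes if c["published"] > last_login]
    return sorted(unseen, key=lambda c: c["effective"])


# Hypothetical change log maintained by the business through an IT-provided interface.
changes = [
    {"term": "Length of Stay", "published": date(2024, 3, 1),
     "effective": date(2024, 4, 1), "summary": "Now excludes observation hours."},
    {"term": "Readmission", "published": date(2024, 1, 10),
     "effective": date(2024, 2, 1), "summary": "Window changed to 30 days."},
    {"term": "Encounter", "published": date(2024, 3, 5),
     "effective": date(2024, 3, 20), "summary": "Telehealth visits now included."},
]

notices = pending_notices(changes, last_login=date(2024, 2, 15))
notice_terms = [n["term"] for n in notices]
```

Keeping a publication date separate from an effective date is one way to support announcing a change before it reaches users.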
Data Acceptance: While this seems to be a simple validation and rubber-stamp certification of data, it is complicated by the fact that most data warehouses are loaded both with initial one-time loads of historical data and with recurring periodic updates. Building on the process your organization establishes for the mapping and profiling of data, a process for data acceptance is the next key step. Expanding the business and IT relationships, this area is also a joint effort where results are managed, tracked, and made available to the business and IT leadership overseeing the solution. Typically most of the same resources are included to keep a consistency in knowledge of the data, mapping, and quality.
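A data acceptance step can be sketched as a reconciliation between what the source sent and what the warehouse loaded, applied the same way to historical and recurring loads. The `LoadSummary` structure, the load type labels, and the tolerance below are hypothetical and stand in for whatever your organization agrees on.

```python
from dataclasses import dataclass


@dataclass
class LoadSummary:
    load_type: str        # "historical" or "incremental" (hypothetical labels)
    row_count: int        # number of rows extracted or loaded
    control_total: float  # sum of an agreed numeric column, e.g. billed amount


def accept_load(source, target, tolerance=0.01):
    """Compare source and target summaries and return a trackable acceptance result."""
    issues = []
    if source.row_count != target.row_count:
        issues.append(
            f"row count mismatch: source={source.row_count}, target={target.row_count}"
        )
    if abs(source.control_total - target.control_total) > tolerance:
        issues.append("control total mismatch")
    return {"load_type": target.load_type, "accepted": not issues, "issues": issues}


# Hypothetical nightly update that lost two rows in transit.
nightly = accept_load(
    LoadSummary("incremental", 1000, 52340.50),
    LoadSummary("incremental", 998, 52110.25),
)

# Hypothetical one-time historical load that reconciles cleanly.
history = accept_load(
    LoadSummary("historical", 250000, 1000000.00),
    LoadSummary("historical", 250000, 1000000.00),
)
```

Because each result is a plain record, it can be stored and reported to the business and IT leadership overseeing the solution.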
Meta Data Acceptance: Meta data is the secret sauce in describing the data and solution to users of analytical systems. Having the context around the data they are using enables them to most easily adopt and leverage the resulting solution. Meta data acceptance is in theory very similar to data acceptance. It can thus be handled by following the same processes and tools you have developed for data acceptance. The business and IT resources may be the same or, depending on the size and complexity of the solution, may be separate, focused areas. The only real difference is that now you are dealing with the various components of meta data that are attached to the data and resulting front-end solution.
Data Quality Assurance: Once the solution is up and running, data quality becomes even more important. As users get comfortable with their analytics and tools, any data changes that impact the meaning and derivatives/metrics can have a strong negative impact on users if not understood. Leveraging the same resources (both IT and business) and processes that conduct the data acceptance, you should enhance that process to ensure the continuing quality of the data in your solution. Publishing data quality issues will help users avoid challenges or concerns that require significant research on their end only to find out the data in the database has changed for whatever reason. One best practice is to have business and IT resources responsible for managing the quality of your analytic data sit on the committees that oversee all source system releases. Hearing ahead of time what changes are coming, they will be better prepared to know what to look for and when to have special data quality audits. For large or significant changes, this can help facilitate getting the analytics resources included in system releases so they can see the data as it goes through requirements and system/integration testing. Once a system is in production, this joint collaboration enables you to have business resources that best understand the data be the first line of support for data related questions from the users. The more technical and complex the data you are dealing with, the more valuable this support structure is. In healthcare, the users of these systems are often providers of care. Having a care provider talk to an IT resource, commonly an ETL developer, when they have a data question often results in frustration, misinterpretation, and lots of extra effort in fixing data that may just be interpreted or displayed incorrectly.
Organizational Alignment: Depending on the size of your company, there are many ways to organize the various resources and teams in support of data content. One best practice is to connect the business and IT resources involved in content oversight. This helps to create a shared vision, frequent interaction, and joint ownership of results. This doesn’t have to mean they all report to one manager/director. While one organization may report to IT and the other to a specific business area, it is important that there are dotted line relationships, joint meetings, and joint leadership interaction/supervision.
About the Author
Bruce has over 20 years of IT experience focused on data / application architecture, and IT management, mostly relating to Data Warehousing. His work spans the industries of healthcare, finance, travel, transportation, retailing, and other areas working formally as an IT architect, manager/director, and consultant. Bruce has successfully engaged business leadership in understanding the value of enterprise data management and establishing the backing and funding to build enterprise data architecture programs for large companies. He has taught classes to business and IT resources ranging from data modeling and ETL architecture to specific BI/ETL tools and subjects like “getting business value from BI tools”. He enjoys speaking at conferences and seminars on data delivery and data architectures. Bruce D. Johnson is the Managing director of Data Architecture, Strategy, and Governance for Recombinant Data (a healthcare solutions provider) and can be reached at firstname.lastname@example.org