Planning for the Evolution of the Data Warehouse
By Larissa Moss
There are several things to consider when planning the evolution of a data warehouse. Some of these are obvious, while others are often overlooked.
The most obvious area, and the one that receives the most attention, is the scalability of the technical platform. Because a data warehouse cannot be developed in one big bang, growth in database size, number of users, network size and complexity, and hardware capacity is usually anticipated and planned for. Even though the growth factor is often underestimated, some exercise in capacity planning is still performed.
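As a rough illustration of the kind of arithmetic a capacity planning exercise involves, the sketch below projects database size under a constant monthly growth rate. It assumes simple compound growth, and the function names and sample figures are hypothetical, chosen only for illustration; real growth is rarely this regular and, as noted above, is often underestimated.

```python
def project_size_gb(current_gb, monthly_growth_rate, months):
    """Project database size assuming constant compound monthly growth.

    All parameters are illustrative assumptions, not measured values.
    """
    return current_gb * (1 + monthly_growth_rate) ** months


def months_until_full(current_gb, monthly_growth_rate, capacity_gb):
    """Count the months until the projected size reaches the given capacity."""
    if monthly_growth_rate <= 0:
        raise ValueError("growth rate must be positive for the size to reach capacity")
    months = 0
    size = current_gb
    while size < capacity_gb:
        size *= 1 + monthly_growth_rate
        months += 1
    return months


if __name__ == "__main__":
    # Hypothetical figures: a 500 GB warehouse growing 8% per month.
    print(round(project_size_gb(500, 0.08, 12), 1))
    print(months_until_full(500, 0.08, 2000))
```

Running the projection with both an expected and a pessimistic growth rate gives a simple range to plan against, which helps guard against underestimating growth.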
But there is more involved with data warehouse sustainability than just scalability.
One consideration is human resources, which come in two flavors: the technical staff who develop the data warehouse and the users who use it. Can the momentum and the excitement be kept up for both?
The technical development staff do not like to become the maintenance staff. They want to do development work, and they will move to other projects if they get stuck maintaining old data warehouse iterations. Therefore, it is important to have a plan for maintaining the existing data warehouse. Maintenance, too, comes in two flavors.
The first is a programming staff that is familiar with the underlying code and can work on important “fixes,” which may involve code changes or simply rerunning jobs that failed during the night. It is important to remember that “enhancements,” which are usually treated as maintenance, should not be performed by this staff but should instead be folded into the next data warehouse development iteration.
The other flavor of maintaining the data warehouse is technical support for the end users. This could be provided by a separate group or by the same maintenance group that fixes problems. Technical support is key to sustaining user enthusiasm for the data warehouse. Users come in different flavors as well: the tourists, who use prewritten GUI applications; the data farmers, who may need to write sophisticated queries against the data warehouse; and the explorers, who mine the data warehouse. All of these must be supported with the same zest by the technical staff.
Another important aspect of keeping up user enthusiasm is marketing the data warehouse, that is, keeping its benefits known and publicized. This can be accomplished with a monthly electronic news bulletin sent out by e-mail or published on the intranet. Furthermore, it is a good idea to have users advertise the data warehouse to other users. This can be done in monthly user group meetings. Let the users share their experiences and their queries, and set aside “chat time” in which users can exchange information informally.
One area that is most often overlooked is the data architecture: the organization’s logical data model. Some questions to consider here are: What are the policy and the process for creating a robust and complete logical data model of the organization’s business data? Will this logical data architecture be developed over time as the data warehouse grows? Or does the company sponsor a separate effort that is tied into all systems development work, both operational and decision support? In either case, you need to establish a robust and detailed data architecture that the data warehouse development staff can use. Building a logical data model for the entire enterprise is a long and costly effort. Therefore, plans must be in place to fold this effort into the overall development strategy of the organization.
Just as important as the data architecture, and often just as overlooked, is the entire data warehouse development infrastructure: the methodology, the standards, the metadata policies, the repository, the data ownership and data stewardship, the review process, the facilitation process, the conflict resolution process.
Unless a company is very structured already, these things will probably be experimented with during the first few iterations of the data warehouse. Many decisions will work out, but many other decisions will have to be revisited during the post-implementation reviews. A process must be in place to prepare for these reviews, schedule them, facilitate them, analyze the review results, come up with recommendations, and ultimately fold these recommendations into future data warehouse iterations.
Often it turns out that the first iteration of the data warehouse is the easiest one, even though it does not seem that way when you develop it.
About the Author:
Larissa Moss, founder and president of Method Focus, Inc., has been consulting, publishing, speaking, and lecturing worldwide on the subjects of data management, data architecture, and data warehousing. She also co-authored the data-driven relational system development methodology RSDM-2000.