Proactive Performance For The Data Warehouse
By Sid Adelman
Diseases are named either after the doctor who discovers the disease (Alzheimer’s, Crohn’s) or after the unfortunate patient on whom the disease or disorder is diagnosed (there is an Eastern European curse, “May you have a disease named after you.”). Baron von Munchausen was unfortunate enough to have given his name to a debilitating disorder. His disorder was psychological. He believed he had a variety of diseases and in fact displayed the symptoms of those diseases but with no underlying illness. His legacy is the Munchausen Syndrome.
A more complex and unfortunate syndrome is Munchausen by Proxy in which a parent causes his or her child to become ill or injured and the manifestation can then be treated medically. The parent becomes the center of attention as the child is cured and all is set right again. A more benign version of Munchausen by Proxy is a form of neglect (malnutrition, no immunization) that eventually most probably requires medical care.
An extreme example of Munchausen by Proxy in IT was the case of a DBA who set database parameters so that when volumes exceeded an anticipated threshold, performance became intolerable. This DBA would then come in and fix the problem. He was then heaped with accolades – which he otherwise never received. He had set up the problem and was now getting the same attention the parent in Munchasen by Proxy was getting.
Gerald Hodge, the President of HLS Technologies in Sugar Land Texas, specializes in anticipating performance problems in the DB2 world. He points out that most organizations seem to have no interest in keeping performance problems from happening but only in fixing them once the problems occur. This is a form of benign neglect, somewhat akin to not immunizing your children.
Performance service level agreements, for example “two seconds or less for 90% of the transactions,” are the norm for large organizations that recognize the heavy costs of terrible response time. Hodge observed that “The main problem with service level agreements is that they have no economic support. If they are not adhered to, someone may lose his bonus, but no money is put in play to solve the problem.” This goes to the heart of the problem; few organizations are proactive about poor performance, but only wait until response time become unacceptable.
Hodge also addressed threshold management. “If the service level is subsecond, then tuning should start well before the [subsecond] threshold is reached.” There are early warning systems that can alert the installation to impending doom long before the users notice anything. Only some organizations have established service level agreements for data warehouse ad hoc queries as the access paths and the resources used can be highly variable. Some have established benchmarks for selected predefined queries and the response time on these queriesis monitored.
Data warehouses have their own set of performance problems, albeit somewhat different than those in operational systems. When left to fester through benign neglect, they can lead to the failure of the data warehouse just the same.
Let us take a closer look at some of these problems.
- Data warehouses are becoming far too large. In many shops with a mature data warehouse, their sizes exceed those of the operational systems.
- Queries and reports are not as evenly distributed as they are in operational systems. For example, for financial and sales data warehouse queries and reports, the activity is usually concentrated around month-end and the beginning of the month.
- Query volumes are difficult to predict. Users have a difficult time estimating how many queries they will run. In fact, if the data warehouse is successful, users will generate far more queries than they might have predicted. Interesting results will raise new questions that need to be answered and more queries will be generated.
- It may even be difficult to get a good estimate of the number of users. As the data warehouse becomes more popular and the word spreads, more and more users will log on and try it out with, at first, simple queries. These simple queries may quickly grow into long, complicated and resource-intensive queries. If access to the database and user activity is not monitored and analyzed, these users can remain incognito for quite awhile.
- The access patterns of ad-hoc queries are difficult to predict. A new query may cause an access for which no index has been created and the result may be a long-running sequential job with some expensive joins.
- The load/refresh/update time often exceeds the allowed window. This has been one of the major performance problems with the data warehouse as the number of source files increase, as volumes get larger, as the data cleansing becomes more extensive, and timeliness requirements become more stringent. This is also a reason why hardware capacity planning for a data warehouse is difficult and cannot be considered as definitive as capacity planning for an operational system.
- Poor performance does not appear all at once. The problems gradually increase in size and magnify their impact on the system. The timing of when these problems surfaceis predictable, but only if the performance of the system is measured. A surprisingly small number of data warehouse installations bother to measure performance. The issue is usually one of lack of management awareness, concern and attention.
The following steps should be considered to keep your data warehouse from becoming ill.
- Sell management on the need to have good performance in the data warehouse. This may seem superfluous, as we assume that management would always recognize the need. Usually, management will give it lip service but will not allocate the right resources to make it happen. Sell them on the idea of being proactive to eliminate performance problems before they become apparent to the users. You may also consider what some organizations have done, which is to quantify the cost of poor performance and communicate that cost to management.
- Establish the role of a person responsible for data warehouse performance, usually a DBA. This person would work closely with people responsible for the hardware, the operating system and the network.
- Work with your performance management tool vendors, query/report tool vendors and DBMS vendors to provide a robust monitoring capability. You want to know who is generating the queries and reports, how many are they generating, CPU time, number of I/Os, time of day the query/report is executing, what data is being accessed, the access path, the size of the result set and the response time.
- Set some series of service level agreements (SLAs)– even if they are only internal IT agreements – for performance. These would include SLAs for benchmark queries and reports that are: 1. Simple, 2. Medium and 3. Complex. These benchmark queries would have known characteristics for CPU time and number of I/Os. Do not try to establish an SLA for undefined ad-hoc queries, as there is no way to know if they will generate ten I/Os or ten million I/Os.
- In the class that trains the users responsible for generating ad-hoc queries, include a module on performance that explains the use of indexes, how to avoid Cartesian products, the impact of asking for more than what is needed, the use of metadata and the use of existing canned queries/reports.
- Establish a canned query/report library with a responsibility for its administration. This responsibility would include an acceptance and performance test prior to the query/report being added to the library. Be sure to communicate to all users that such a library is available, and describe all the queries/reports in it. This would significantly improve use of the library.
- Establish lines of communication between the power users and the person responsible for data warehouse performance. The power users would indicate the intention for access to new data or to data in a different access pattern. This would give the person assigned to data warehouse performance time to estimate the impact of the new query and determine if anything should be done even before it is run for the first time.
- Be sure the database subsystem (DB2, Oracle, Teradata) has the proper attention. DBMS systems have a variety of tuning knobs, choices for sizes and numbers of pools, initialization parameters, free space management, indexes, partitioning, disk configuration and data set placement that will have a marked impact on performance. There are ways to encourage efficient access paths – those paths being entirely dissimilar between data warehouse and OLTP. Since the data warehouse and OLTP environments have such different characteristics, well-tuned and well-configured DBMS subsystems will have significant differences between their data warehouse and OLTP versions.
Most large organizations have people or even whole departments devoted to controlling performance. They are responsible for estimating performance, capacity planning, monitoring performance and tuning and correcting the problems. They work closely with other members of the organization who are directly responsible for the system, the database and the network. Some of these organizations are now focusing on the performance of the data warehouse.
A healthy data warehouse can remain healthy, but only if IT takes proactive steps to monitor, measure and tune the data warehouse databases. In addition, there needs to be a mindset and standards (SLAs) for excellent performance where activity should be invoked prior to hearing from complaining users.
About the Author
Sid Adelman is a principal consultant with Sid Adelman & Associates, an organization specializing in planning and implementing data warehouses, performing data warehouse and BI assessments, and in establishing effective data strategies. He is a regular speaker at “The Data Warehouse Institute” and IBM’s “DB2 and Data Warehouse Conference”. Sid chairs the “Ask the Experts” column on www.dmreview.com, and has had a bi-monthly column in DMReview. He is a frequent contributor to journals that focus on data warehousing. He co-authored one of the initial works in data warehousing, Data Warehousing, Practical Advice from the Experts, and is co-author of Data Warehouse Project Management with Larissa Moss. He is the principal author of Impossible Data Warehouse Situations with Solutions from the Experts and his newest book, Data Strategy, was co-authored by Larissa Moss and Majid Abai. He can be reached at 818 783 9634and firstname.lastname@example.org. His web site is www.sidadelman.com.