
It’s that time again! This month’s T-SQL Tuesday is being hosted by Jes Borland (blog | twitter), and the topic is: Aggregation!
When I think of aggregation as a T-SQL programmer, my first thought is usually “Why aggregate in T-SQL at all? Why not use Analysis Services?” Now, understand I’m in no way saying that one should never use functions like SUM, AVG, COUNT, etc in T-SQL; they can be very useful and perform well when used appropriately. But when I’m in a conversation and people start talking in phrases such as “Average over time” or “Comparison with the previous period”, I will usually jump in and say “Why don’t we use SSAS for this?”
One of my biggest projects at work was the construction of a data warehouse to store collected PerfMon data across all our SQL servers. Here are two common scenarios for how that data would be used:
- Baseline analysis and comparison – how busy is the server compared to others in the farm? Compared to itself last month? Last year?
- Point-in-time troubleshooting – The users complained that the application was slow at 11:21 AM on Tuesday; was the database server slow or was it something else? Using the collected data we could easily compare one specific period to a rolling average of that same time frame, and see if there were any appreciable differences.
For each of these use cases, I constructed some MDX scripts to easily allow my fellow DBAs to drill into the details of the cubes and get meaningful information back. So, I thought I’d share some of the tricks I used in writing them!
A Little Background
So that you can best understand the scripts, I wanted to briefly summarize the structure of the PerfMon cubes:
The cube consists of one measure and three dimensions:
- The CounterData measure, which contains all the actual PerfMon data in decimal form.
- The Calendar dimension, used to tie a piece of counter data back to the time at which it was captured. Contains information such as Date, Day Of Week, Month, Quarter, etc.
- The Counter dimension, used to tie a piece of counter data back to a specific counter instance across all servers. Contains information such as Server Name, Object Name (e.g. Processor, Memory, Logical Disk, etc), Instance Name (e.g. ‘M:’ for logical drives), and Counter Name (e.g. “Avg Secs / Write”).
- The Time dimension, used to tie a counter fact back to the specific time of day it was captured at.
Now, with that out of the way, let’s write some MDX!
Compare Two Periods
In this scenario, what we’re doing is looking at averaged data across two distinct periods of time, and finding the differences. We used this both for baseline comparison and for troubleshooting (i.e. to see if a server really was performing worse than average at a given time).
First, we’re going to create two named members:
WITH MEMBER [CurrentPeriod] AS
ROUND
(
AVG
(
(
[dim Calendar].[Date Key].[2010-08-08 00:00:00],
{[dim Time].[Hour].[17]:[dim Time].[Hour].[19]},
[dim Calendar].[Week Day].[All],
[Measures].[Average Counter Value]
)
), 3
), FORMAT_STRING = "#,0.000"
MEMBER [ComparePeriod] AS
ROUND
(
AVG
(
(
{[dim Calendar].[Date Key].[2010-06-01 00:00:00]:[dim Calendar].[Date Key].[2010-08-09 00:00:00]},
{[dim Time].[Hour].[17]:[dim Time].[Hour].[19]},
[dim Calendar].[Week Day].[6],
[Measures].[Average Counter Value]
)
), 3
), FORMAT_STRING = "#,0.000"
These two members are used to find the two average counter values from the two periods desired for comparison. In this case, we’re comparing Friday August 8th from 5PM – 7PM with all Fridays from 5PM – 7PM ranging from June 1st to August 9th. This might be used to compare a perceived “trouble spot” with the average across a long time span of similar periods. You could easily modify this to compare, say, the last three months to the same three month period the year before:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
|
WITH MEMBER [CurrentPeriod] AS ROUND ( AVG ( ( {[dim Calendar].[Date Key].[2010-06-01 00:00:00]:[dim Calendar].[Date Key].[2010-08-31 00:00:00]}, {[dim Time].[Hour].[08]:[dim Time].[Hour].[17]}, [Measures].[Average Counter Value] ) ), 3 ), FORMAT_STRING = "#,0.000" MEMBER [ComparePeriod] AS ROUND ( AVG ( ( {[dim Calendar].[Date Key].[2009-06-01 00:00:00]:[dim Calendar].[Date Key].[2009-08-31 00:00:00]}, {[dim Time].[Hour].[08]:[dim Time].[Hour].[17]}, [Measures].[Average Counter Value] ) ), 3 ), FORMAT_STRING = "#,0.000" |
Next, we construct a member that compares the two time periods:
|
|
MEMBER [Difference] AS ( [CurrentPeriod] - [ComparePeriod] )/[CurrentPeriod], FORMAT_STRING = "#,0.00%" |
Finally, we can use these calculated members to construct a nice clean grid showing us a bunch of common PerfMon counters over both periods:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61
|
SELECT {[CurrentPeriod], [ComparePeriod], [Difference]} ON 0, ( { ( [dim Counter].[Object Name].[Processor], [dim Counter].[Counter Name].[% Processor Time] ), ( [dim Counter].[Object Name].[LogicalDisk], { [dim Counter].[Counter Name].[Avg. Disk sec/Read], [dim Counter].[Counter Name].[Avg. Disk sec/Write], [dim Counter].[Counter Name].[Disk Bytes/Sec] } ), ( [dim Counter].[Object Name].[PhysicalDisk], { [dim Counter].[Counter Name].[Avg. Disk sec/Read], [dim Counter].[Counter Name].[Avg. Disk sec/Write], [dim Counter].[Counter Name].[Disk Bytes/Sec] } ), ( [dim Counter].[Object Name].[Network Interface], [dim Counter].[Counter Name].Children ), ( [dim Counter].[Object Name].Children, { [dim Counter].[Counter Name].[Forwarded Records/sec], [dim Counter].[Counter Name].[Full Scans/sec], [dim Counter].[Counter Name].[Index Searches/sec], [dim Counter].[Counter Name].[Page Splits/sec], [dim Counter].[Counter Name].[Workfiles Created/sec], [dim Counter].[Counter Name].[Worktables Created/sec], [dim Counter].[Counter Name].[Buffer cache hit ratio], [dim Counter].[Counter Name].[Free list stalls/sec], [dim Counter].[Counter Name].[Free pages], [dim Counter].[Counter Name].[Lazy writes/sec], [dim Counter].[Counter Name].[Page life expectancy], [dim Counter].[Counter Name].[Page lookups/sec], [dim Counter].[Counter Name].[Page reads/sec], [dim Counter].[Counter Name].[Page writes/sec], [dim Counter].[Counter Name].[Batch Requests/sec], [dim Counter].[Counter Name].[SQL Compilations/sec], [dim Counter].[Counter Name].[SQL Re-Compilations/sec] } ), ( [dim Counter].[Object Name].[SQLServer:Buffer Manager], [dim Counter].[Counter Name].Children ), ( [dim Counter].[Object Name].[SQLServer:SQL Statistics], [dim Counter].[Counter Name].Children ) }, [dim Counter].[Instance Name].Children ) ON 1 FROM SQLDW |
I used this query many times to justify adding additional capacity to our farm, or to tell an application team to take a hike and stop blaming my database server for their app’s poor performance.
Comparison With The Farm
What about finding out how one server compares to all the others in the farm? We can do that too.
First, construct a series of calculated members that find the sum total of counter values across the farm, then the count of servers, then calculates the average counter value:
|
|
WITH MEMBER [measures].[farm total] AS ( [Dim Counter].[Machine Name].[All] ,[Measures].[Average Counter Value] ) MEMBER [measures].[count of machines] AS NonEmpty ( [dim Counter].[Machine Name].CurrentMember.Parent.Children ).Count MEMBER [measures].[farm average] AS [measures].[farm total] / [measures].[count of machines] |
Then, use that calculated member to look at how a particular counter value compares to the rest of the farm:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
|
SELECT NON EMPTY ( {[dim Counter].[% Processor Time]} ,{ [measures].[counter value] ,[measures].[farm average] } ) ON COLUMNS ,NON EMPTY { ( { [dim Time].[Hour].[9] ,[dim Time].[Hour].[10] } ) } ON ROWS FROM sqldw WHERE [dim Counter].[Machine Name].[MyDBServer] |
In this case, I was comparing processor use between the hours of 9AM and 10AM.
In Summary…
So there you have it: two great ways we can use the power of MDX and SSAS to get some very useful data out of an otherwise massive collection of numbers (the main fact table contains over 4 billion rows and counting). That, I think, is the essence of a good aggregation strategy: slim down an overwhelming amount of data into meaningful chunks.