Why do we use GTD?

I’ve blogged a bit over the life of this site about my use of a life-management framework called GTD and how profoundly it’s affected my life. Sometimes when I talk to people about it, a common response is “but why would I want to spend so much time keeping track of what I have to do instead of just doing it?”

The answer is that having everything in those magic lists gives you a kind of ultimate peace of mind that’s, well, hard to describe for those who haven’t experienced it. Now I know that makes it sounds like some kind of hippy-ish drug-induced trip; but really, the simple confidence of knowing all your responsibilities (“agreements” as David Allen, author of GTD calls them) will be there waiting for you when you come back is nothing short of priceless.

As I was reading my blogs today I found a great entry from fellow GTD’er and SQL community member Brent Ozar (blog | twitter). I think the passage below beautifully describes this “mind like water” (another David Allen phrase) state:

At around 5PM, when I’m not on the road, i leave my home office and my tasks behind.  I walk Ernie (our dog), get the house ready for Erika’s return from work, and leave the workday problems behind.  I’ll still check email from my ozone when we’re not doing anything, and I’ll respond to quick questions, but I won’t do work.

And I won’t care.

I won’t stress out about things I have coming tomorrow, won’t get worried about what a client’s server is doing, won’t work late trying to “get ahead” – because there’s no such thing.  As a knowledge worker, I’m going to be behind for the rest of my life.  The better I am at accomplishing stuff, the more work people will give me.  At 5PM, I have to change contexts because I won’t ever be caught up in my home life either.

via GTD: Why Things Have Been Quiet Around Here | Brent Ozar – Too Much Information.

Thanks for helping spread the word Brent. It’s always great to see others having the same experience and speaking eloquently about it to the world.

Now if you’ll excuse me, I’ve been neglecting that @home list of mine.

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

It’s Vacation Prep Time! – T-SQL Tuesday #009

T-SQL Tuesday
It’s that time again, when SQL bloggers from all over join forces to blog for a common theme. That’s right: T-SQL Tuesday!

Background: T-SQL Tuesday is the brainchild of Adam Machanic (Blog|Twitter) and is described as “the SQL Server blogosphere’s first recurring, revolving blog party. The idea is simple: Each month a blog will host the party, and about a week before the second Tuesday of the month a theme will be posted. Any blogger that wishes to participate is invited to write a post on the chosen topic.”

This month’s theme: “Beach Time”, hosted by Jason Brimhall (blog | twitter). Basically, we’re being asked to write about “What do you do prior to ‘Beach Time’ to ensure that the beach time will not involve work.”

This is actually a good topic for me, as last month I took the longest consecutive vacation of my working life at two full weeks. It was challenging to get everything in order before leaving, and I did get a phone call once (ironically on the first day off), but all in all my team did great work and kept all my projects in good order while I was gone (with one minor exception, but that was at least partially my fault as you’ll see).

So how did I pull this extended absence off?

I documented the status of all my projects, including what was outstanding from others and what needed to be done while I was gone.

We have a wonderful SharePoint site where we keep all our projects listed and have detailed status notes on all of them. This means that if any one of us ever had to suddenly be out for an extended period of time, someone else could just glance over their list and pick up where they left off.

This reminds me strongly of one of my GTD habits: keep things in lists and not in your head. By forcing us (and yes, I do sometimes neglect it a little, hence my first (but only) phone call) to keep this updated, we don’t have to rely on our memory to keep track of what’s on our plates.

I made sure my runbooks were up to date.

Just before I left, I closed out a major phase of a project to build a massive data warehouse containing PerfMon data for all the SQL servers we support. It’s a complex system with multiple inputs and moving pieces. Because of this, I made sure the write up a thorough document detailing the ins and outs of the system, including architecture and troubleshooting steps. No way was I being called when I was away because this thing broke.

Now as it turns out, the system did fail while I was gone, but was unnoticed! And while my teammates sheepishly admitted that they missed the alerts and should have at least dug into them a little further, I also took responsibility for not ensuring they (the alerts) were a) urgent enough to be noticed, b) clear enough to tell the reader where to get more information.

Once I was gone, I made damn sure I was very hard to get in touch with.

While I am the first to admit I have a pretty bad addiction to my Crackberry, when I’m on paid vacation I strictly enforce the “no e-mail rule”. I go so far as to actually turn off the mail synchronization on my phone and make it clear in my out of office that I will not be checking my mail at any time during my absence. Only my teammates know my phone number and they are told they can call me if absolutely necessary, but that I may not immediately respond. While I do tend to take my work laptop, that’s only because it’s also my playground for testing SQL related things, and I do not connect to the office.

Being selfish? Maybe a little, but hey, it’s my vacation time and I’m damn well going to enjoy it and not waste it working. That being said, on this last vacation I did end up working for about 3 hours, but that was wholly my fault for not properly documenting something before I left.

I have a great, strong team around me.

While this may not have much directly to do with me, I consider it a crucial factor in vacation success. If your team can’t hold down the fort, even given good documentation, then you’re pretty much sunk. I’ve been in those situations before, and let’s be honest: it sucks. There’s few things in the world more maddening than getting called on vacation because “Mr. So and So is here and he says you told him X, but that doesn’t make sense?”, especially when your conversations with Mr. So and So are explicitly detailed in writing, and readily accessible to your teammates. That’s a real example by the way, from my time as a retail store manager. Let’s just say that individual got politely redirected to where I kept a log of all my conversations, along with a gentle reminder that I was not to be disturbed. Thankfully, this is not the case where I am now, and I am eternally grateful for the hard work and dedication of my fellow DBAs.

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

Covering Index = Epic Win, Part 3

In part 2 of this series, we looked at the lengths SQL Server will go just to avoid the dreaded bookmark lookup, including ignoring an index altogether and choosing instead to scan the entire clustered index. But, we also showed how that decision was, in fact, a smart one, due to it resulting in fewer logical IOs.

But let’s face it, neither path (bookmark lookup or clustered index scan) is very efficient or scalable. So, what can we do to give SQL some better tools?

The answer, it seems, is surprisingly simple: create a covering index. That is, we need to create (or modify an existing one) an index that contains all the columns required for the query. But how do we determine what those columns are?

The easiest way I’ve found is to examine our execution plan, specifically the Bookmark Lookup node. Right click on the icon and choose “Properties”. Then, in the properties window, look for a line called “Output List”. Note: if your plan includes a scan, table or index, then use that operator as your root node instead.

If you click on that little “…” icon, what you’ll see is a nice list of the columns being requested by the query:

In this case, we see that the query is requesting the AccountID and Amount columns. So, how do we add these to the index?

Well, there are two options: 1) add them to the index key itself, 2) add them as included columns. For reasons I’ll illustrate in a moment we’ll see that number 2 is the better option, at least in this specific case.

First, adding them to the key:


DROP INDEX NCI_tempA_TransDate ON dbo.tmpA;
GO
CREATE INDEX NCI_tempA_TransDate ON dbo.tmpA (TransDate, AccountID, Amount);

Now, let’s run the same SELECT query:


SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/1/2010';

And the resulting execution plan:

…and STATISTICS IO:

Table 'tmpA'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Both clearly an improvement over the previous results seen in part two (lookup and 217 logical reads).

“Great!” you say, “I’ll go start adding columns to my indexes right now!”

Hold on there… this isn’t all rainbows yet. You see, by adding those columns to the index key itself, we could be incurring additional overhead in terms of index maintenance. Plus, I’m pretty sure if index gurus like Paul Randal (blog | twitter) catch you adding things like large VARCHAR columns to an index key, without them actually being used in filter operations, they just might hunt you down and… well, let’s just say I doubt it’d be pretty. Note: if anyone out there can tell me how to show the difference in behavior via an execution plan or STATISTICS IO I will update this page and give them due props, as I was not able to get any difference between the two plans.

So now let’s look at option two: add the columns using the INCLUDE functionality in SQL 2005+:


DROP INDEX NCI_tempA_TransDate ON dbo.tmpA;
GO
CREATE INDEX NCI_tempA_TransDate_inc ON dbo.tmpA (TransDate)
INCLUDE (AccountID, Amount);

Running our SELECT statement gives an identical execution plan and number of logical IOs as when the columns are placed in the index key. But now we’ve skipped making our index key wider than necessary. Truly a great feature.

Before I close out the series, I should mention that even this second path doesn’t appear to be completely devoid of peril. By adding these INCLUDEd columns, you will be adding to the amount of storage required by the index. For example, the following information was returned by running a query against the sys.dm_db_index_physical_stats DMV, using the page_count column for indexes on our tmpA table:

For explanatory purposes, the indexes are named as follows:

  • NCI_tempA_TransDate_inc – key = TransDate, included = AccountID, Amount
  • NCI_tempA_TransDate_inckey – key = TransDate, AccountID, Amount
  • NCI_tmpA_TransDate – key = TransDate, no included columns

So as you can see, adding INCLUDED columns to the key does indeed increase the storage required by the index. So do keep that in mind when you perform this kind of tuning work, especially on larger databases.

And so we’ve come to the end of our little series. Hopefully you will find this information useful the next time you are tuning a query and notice our little friend, the bookmark lookup (or even a index / tables scans). If anyone has any great stories to share on how this functionality helped them solve a production issue (or prevent one in development), I’d love to hear it in the comments. Props to anyone who can beat my best of reducing a query’s run time from 27 minutes to 44 seconds!

INSERT dbo.tmpA (AccountID, Amount, TransDate)
SELECT  TOP 100 name, CAST(object_id as MONEY), ‘1/5/2010′
FROM sys.columns
Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

Covering Index = Epic Win, Part 2

In part one of this series, we looked at how a non-clustered index can be made into a real performance bottleneck because of a nefarious little fellow called a “bookmark lookup”. We showed how this operation, which is how SQL Server retrieves the requested columns not part of an index, can result in excess IOs and generally poorly scaling performance.

In this part, we’ll look at how SQL Server will even go so far as to ignore an index completely, just to avoid our little friend the lookup.

First off, let’s look again at our query execution plan from our SELECT query shown in part one:

We see how the most expensive part of the query is the “Key Lookup” operation, by a margin on 99 to 1. Pretty obvious that it would be the bottleneck in this case.

Now this particular query returned exactly one-hundred rows. Now let’s see what happens if we increase the number of returned rows a little bit.


SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/2/2010';

This query returns around one thousand results. Let’s look at the execution plan here:


Whoa, wait a second. Why is SQL Server using a clustered index scan operation? Don’t we have an index on the column being filtered on?

Well, yes we do. But see, SQL is generally pretty smart about choosing how to execute it’s queries (though at times I might debate that statement). And while I can’t speak for how it’s making these decisions, I would hedge a bet that one of the deciding factors in constructing an execution plan is IO cost. So, let’s look at the output of STATISTICS IO for this query.

Table 'tmpA'. Scan count 1, logical reads 549, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

549 logical reads total. Now, let’s force SQL to use our index by putting an index hint in the query:


SELECT AccountID, Amount FROM dbo.tmpA WITH (INDEX = NCI_tmpA_TransDate) WHERE TransDate = '1/2/2010';

Now you’ll have to take my word for it, but the execution plan now looks identical to the one we saw in part one. However, let’s look at the STATISTICS IO output:

Table 'tmpA'. Scan count 1, logical reads 2076, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Well that looks pretty bad. The number of logical IOs increased from 549 to over two-thousand. Now we see why SQL might choose to simply scan the clustered index.

So, why would it choose to use the index in the first case? Well, let’s look at STATISTICS IO for our original query:

SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/1/2010';
Table 'tmpA'. Scan count 1, logical reads 217, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Ok, so a total of 217 logical reads, and the index is being used. Now, let’s force it to use the clustered index and see what we get.

SELECT AccountID, Amount FROM dbo.tmpA WITH (INDEX = PK_CI_tmpA) WHERE TransDate = '1/1/2010';
Table 'tmpA'. Scan count 1, logical reads 549, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

See? In the case of our first query, SQL again made the right choice, choosing the operation which resulted in only 249 logical IOs, versus 549 for the clustered index scan.

Somewhere, a SQL developer is probably yelling “See?! Why do you doubt me?!”

OK SQL, we’ve now shown that you make the right call on what operation will be the least expensive to service the query we gave you. But, that doesn’t mean we can’t give you better tools to do the job. That’s what we’ll be covering in part three of our series. I promise we’ll get to the “epic win” part there too.

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

Covering Index = Epic Win, Part I

As if I needed yet another reason to push all of the products I support on to SQL 2005 and up, but in the last few days I’ve been saved at least three times by the wonder that is the INCLUDE columns feature.

From BOL:

INCLUDE (column [ ,... n ] )
Specifies the non-key columns to be added to the leaf level of the nonclustered index. The nonclustered index can be unique or non-unique.

Column names cannot be repeated in the INCLUDE list and cannot be used simultaneously as both key and non-key columns. Nonclustered indexes always contain the clustered index columns if a clustered index is defined on the table. For more information, see Index with Included Columns.

All data types are allowed except text, ntext, and image. The index must be created or rebuilt offline (ONLINE = OFF) if any one of the specified non-key columns are varchar(max), nvarchar(max), or varbinary(max) data types.

Computed columns that are deterministic and either precise or imprecise can be included columns. Computed columns derived from image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml data types can be included in non-key columns as long as the computed column data types is allowable as an included column.

In all three cases that I worked, the issue at hand was that some older, rather dusty stored procedures were performing badly. In one case, the procedure took almost thirty minutes to return a mere six thousand rows to be used in a report. Needless to say, because we all know how important reports are to those who use them, this was suddenly deemed totally unacceptable and had to be investigated at once. Luckily, I do love performance tuning.

So upon examining the execution plan of the query in question, the main points of note were:

  1. The majority of the total query cost was taken up by clustered index scan and bookmark lookup operations against large tables.
  2. In all cases the predicate of those operations was a date range (these are tables that hold various kinds of transactions).
  3. In one case, the query executed using a loop join with the transactional table being both the lower (multiple execution) and upper part of the join, courtesy of an index scan and clustered index scan. Notably, the estimated number of rows from the top portion was only one, whereas the actual number was something in the thousands.

In this post (and probably the next one or two), I’m going to go about reproducing the issues described above and show how with some simple index tuning, problems were all but eliminated. That’s right, no code changes. While I am certainly a capable T-SQL coder, I am just as loathe as anyone to crack open these ancient stored procs, whose owners are long gone and yet still are highly critical in daily operations. I don’t know about other DBAs, but my work is in production support, not development and testing code.

Setup Work

First we’re going to create a test table, put an index on it, and put some data in it:


IF OBJECT_ID('dbo.tmpA') IS NOT NULL
 DROP TABLE dbo.tmpA;
GO

CREATE TABLE tmpA
(
 ID INT IDENTITY(1,1),
 TransDate DATETIME,
 AccountID VARCHAR(255),
 Amount MONEY,
 CONSTRAINT PK_CI_tmpA PRIMARY KEY CLUSTERED (ID)
);
GO

CREATE NONCLUSTERED INDEX NCI_tmpA_TransDate ON dbo.tmpA (TransDate);

-- Insert some records --
INSERT  tmpA
(
 TransDate,
 AccountID,
 Amount
)
SELECT  TOP 100
 '1-1-2010',
 b.name,
 CAST(a.object_id AS MONEY)
FROM    sys.columns a, sys.columns b
ORDER BY b.object_id;

INSERT  tmpA
(
 TransDate,
 AccountID,
 Amount
)
SELECT  TOP 1000
 '1-2-2010',
 b.name,
 CAST(a.object_id AS MONEY)
FROM    sys.columns a, sys.columns b
ORDER BY b.object_id;

INSERT  tmpA
(
 TransDate,
 AccountID,
 Amount
)
SELECT  TOP 10000
 '1-3-2010',
 b.name,
 CAST(a.object_id AS MONEY)
FROM    sys.columns a, sys.columns b
ORDER BY b.object_id;

INSERT  tmpA
(
 TransDate,
 AccountID,
 Amount
)
SELECT  TOP 100000
 '1-4-2010',
 b.name,
 CAST(a.object_id AS MONEY)
FROM    sys.columns a, sys.columns b
ORDER BY b.object_id;
GO

Now, let’s execute a little query. Make sure you have the “Display actual execution plan” option in SSMS turned on.


SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/1/2010';

Now, take a look at the execution plan. What you’ll see is probably something similar to this:

See that little “Key Lookup” operator? Notice how it is consuming a whopping 99% of the total query cost? That my friends, is a bad little son of a…. well, you get my drift.

From Grant Fritchey (twitter | blog) and Sajal Dam’s (anyone know if Sajal is on Twitter or has a blog? I couldn’t find one but will gladly put it here) outstanding book SQL 2008 Query Performance Tuning Distilled (p. 163 to be exact and give proper credit):

A major overhead associated with nonclustered indexes is the cost of excessive key lookups, commonly known as bookmark lookups, which are a mechanism to navigate from a nonclustered index row to the corresponding data row in the clustered index or the base table.

When a SQL query returns a small number of rows, the optimizer can use the nonclustered index, if available, on the column(s) in the WHERE or JOIN clause to retrieve the data. If the query refers to columns that are not part of the nonclustered index used to retrieve the data, then navigation is required from the index row to the data row in the table to access these columns.

Now this behavior was only seen in a few of the execution cases I was studying, namely those with smaller result sets (that makes sense I suppose given what was stated above about “small number of rows”). In most of the other examples, instead of this bookmark lookup, I was seeing a simple clustered index scan. Nonetheless, I did want to point out this (commonly seen in my somewhat small experience) kind of behavior, since we know it can be a major performance bottleneck. It’s also a good example of how you cannot just through an index on a oft-queried column and assume that everything will be fine from there out.

Next time we’ll see how even more useless indexes can be, but also begin talking about how we can make them worth their weight in gold.

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

Narrow Focus: Blessing Or Curse?

I think pretty much everyone who knows me would say that one of my defining characteristics is my singular focus. Once I put my mind to solving some problem or accomplishing something, I damn well poke at it until it’s finished. Now, in many cases this trait has served me quite well; it’s what allowed me to learn programming and SQL Server with no formal training or even so much as a seminar (thank you Internet and amazon.com), and why I consider myself an above average problem solver.

One of my favorite teachers once told me that at times I “missed the forest for the trees.” He went on to explain that at times my work tended to find it’s conclusion too early, then ignore all other points of view. Narrow focus is often good in writing, but not when it excludes other angles of approach completely.

The same could be said of how I function as a DBA. Take a project I am working on: the design and build of an enterprise data warehouse, storing performance data for every SQL server in our environment. It’s been fascinating and a tremendous learning experience, but also at time a real challenge.

Early on one of the requirements was to collect performance data on a second-by-second basis, essentially forming a “baseline” of data to review and use for later comparison. Now being the aggressive lad I am, I expanded upon this as follows: the warehouse should now store second-by-sec0nd data for every server at all times, with a retention of 2 years. Yes, you read that correctly. Two years of data, with points at every second for fifty plus servers. That’s a lot of data folks (think 1,800 rows per minute, per server, or around 129,600,000 rows per day).

So, I set out to prove it could be done. Here’s a brief list of some of the challenges I’ve hit along the way:

  • PerfMon’s built-in direct-to-SQL functionality scales downright awfully. The structure of the tables would make any decent DBA shudder (the primary key of the main, and largest, table has a GUID as it’s first column, as an example), the use of bulk inserts (not using MS’s own optimization schemes), and statistics issues to name a few.
  • After abandoning that approach we are going with dumping the performance data into flat files, which are then picked up by a SSIS package at intervals, which required a custom script component acting as a data flow source. This is due to the fact that PerfMon records its data in .csv files with a dynamic and unpredictable number and order of columns. Custom logic had to be written to essentially perform a reverse pivot of the data into a more normalized form.
  • Even fully optimized, the package simply could not keep up with the incoming flood of files. They would eventually pile up and consume disk space, not to mention having stale data in the warehouse. This was attacked by custom coding a script task in a parent SSIS package that would dynamically spawn multiple threads running a child package to import one file each, up to ten at once (I’m shocked by the way that this isn’t either a standard part of SSIS or that a suitable aftermarket component could not be found). Naturally this promptly brought the server to its knees, being a poor little VM with only four CPUs. But the files did import faster!

Now, throughout all of this, keep in mind that I was going far beyond the original and still stated requirements given to me. Why? Because dammit, I was out to prove that such a huge amount of data could be handled by a little VM running SQL 2008 Standard Edition.

Well, today I cried Uncle. In an effort to actually get some usable data into the system, we’re (read, I’m) reducing our goals to only baselining one server at a time, with others collecting a mere four data points per minute. And damn, I am disappointed we haven’t gotten there yet. But you can believe I’m not giving up. But at the same time, I look back and consider all the time I spent chasing what, pride?

My old teacher was right: I need to stop staring at the one big tree in front of me and enjoy the woods a bit more. Luckily, there’s this little methodology I follow called GTD, and something tells me it’s going to be very useful in this upward movement of focus.

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

MemToLeave In Wonderland

So today I spent basically the entire day working on troubleshooting an issue that is related to a little SQL Server technology called MemToLeave. Now I’m not going to go into extreme technical detail on what exactly this is; instead I’ll direct you towards some other excellent resources:

http://sqlblogcasts.com/blogs/christian/archive/2008/01/07/sql-server-memtoleave-vas-and-64-bit.aspx

http://sqlblog.com/blogs/jonathan_kehayias/archive/2009/07/07/understanding-the-vas-reservation-aka-memtoleave-in-sql-server.aspx

http://blogs.msdn.com/khen1234/archive/2005/05/08/415501.aspx

Basically MemToLeave is an area of reserved memory outside of the buffer pool, which is commonly used for such operations as extended stored procedures, CLR, and linked servers. It only affects 32 bit systems, is limited to a maximum amount of 512MB, and can be set using the “-g” startup parameter.

In my short experience as a DBA, this is far and away the most complex and mind-bending piece of investigative work I’ve done by a landslide. After spending nearly three hours Googling  and digging into various DMVs, I had made very little headway. This is real down-the-rabbit-hole stuff; we’re talking about understanding and trying to determine what is happening deep within the internals of SQL Server. Even my senior DBA, who has a tremendous wealth of knowledge and more than a decade of experience, was largely at a loss to explain what was going on. I even went so far as to use the great Twitter #sqlhelp hashtag (and got some good information, thanks David Levy (twitter | blog)) to try my best to get some insight. As it turned out, the Twitter community helped to push us in what we believe to be the right direction.

While I’ll leave the technical details for another post, our basic conclusion after nearly a day’s worth of work is this: due to a tremendous amount of statements in the procedure cache (the application using this DB makes a maddening number of adhoc calls, in such a way that each is stored as its own plan, even with forced parameterization on), and the fact that a large number of these plans were stored in multi-page arrangements (see Jonathan’s blog post above), MTL became suddenly and rapidly depleted. Normally we’d be forced to stop and start SQL in order to clear this up, but in this case, all we ended up having to do is to a) wait for the application to cease its (adhoc) insert heavy behavior, then run a DBCC FREEPROCCACHE. Before running the command, the return result of xp_memory_size was around 3MB; after it immediately jumped to ~21MB. Also, we saw that the CACHESTORE_SQLCP memory clerk type went from consuming around 180MB to around 1/100th of that (we used the queries on Brent McCraken’s blog here, plus some home-brewed ones of various DMVs). The server stayed quiet the rest of the day.

While I’m far from convinced, the evidence is pretty strong that we’ve found at least a contributing factor. The next bulk series of imports into the system happens overnight, so in the morning we’ll be checking in to see how we are doing.

That’s all for now, but once we have more data I’ll probably be posting a followup with the exact queries and results. What a day!

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

Blackberries & EOD

More often than I’d like, I get e-mails from various folks late in the evening (the record is about 11pm) somewhere along the line of: “This is really urgent and needs to be done EOD today.” Now, I have to assume that either one of two things is going on here:

  1. Their End Of Day is different than mine. Not in the time zone sense (though I do occasionally deal with folks offshore), but just in the sense of they work hours beyond the normal workday.
  2. They’re being clever and trying to be the first mail in my inbox the next morning.

Now in the case of #1, obviously they’re not readers of my blog. Otherwise, they’d see my prior post where I was pretty clear about my policy with regards to working after hours: high value and emergency work only. That’s not to say there have not been some cases where the work met one or both of those criteria; just merely that most of the time it doesn’t.

In the case of #2, for one I’d laugh because with the sheer volume of e-mail I get, no matter where you end up it’s most likely not at the top of anything. That, and even if you do somehow manage to be the first thing I read in the morning, that in no way affects where in the queue of actions you’ll end up. Guess they’re not subscribers to the GTD style of organization, huh?

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

What Is The Role Of A DBA?

Lately this question has been a bit of a hot discussion topic at work. Lots of folks have differing opinions, many of which have merit. So I’m curious, what do folks think the true primary role of a DBA is?

For those of you who might answer “it depends on what the client wants”; that’s a perfectly true answer.  If that’s your opinion, I’d ask that you pick the poll answer closest to what is true for you and your clients.

And if anyone has suggestions on other roles, please leave them in the comments and I’ll gladly look at adding them.

What is the primary role of a DBA?

View Results

Loading ... Loading ...
Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter

On Working Outside Of Monday – Friday

Brent Ozar (blog | @BrentO) wrote a great post the other day that I just read that touched on the oft-seen differences between management’s priorities and our own. Even having only been a DBA for a mere month (damn, has it been that long?) I’ve already encountered situations like this all too frequently.

Troubleshooting and digging into data are two of my favorite things to do when it comes to DBA work. I love pulling numbers out and slicing-and-dicing them every which way until a pattern emerges. In this light, I’ve been really excited lately about the prospect of building a kind of DBA data warehouse: a place where aggregated performance statistics from all our servers are kept and can be analyzed in detail. It’s not a novel concept by any means, just one that’s never been implemented at work. With this in place we could engage in detailed benchmarking of our environment, and proactively find developing “hot spots” before they become real fires.

The trouble is, we can never seem to find time to work on it. There’s always other things to work on, whether it be dealing with issues or building more servers to add capacity. This is really tough to handle because as a team, we pride ourselves on keeping things running at peak efficiency, like a well-trained pit crew for a championship racing team. We can clearly see the value in this project and how much it would help us.

But all that being the case, I won’t be spending my free time working on that. If I do spend time outside of business hours working (something I try to avoid whenever possible), it’ll be on those other, more important items.

Yes, you read correctly: “more important”. “But Josh”, you say, “you just said that stability is your team’s top priority, and to boot that you find your biggest intiative in that area incredibly exciting. Why the heck are you calling the other things ‘more important’?”

The answer is simple: because that’s what my clients want from me. Like it or not, right now the word is that the primary focus must be on building for new initiatives and responding rapidly to the changing needs of our industry. Granted that stability was ranked right behind that, and I am certainly going to stake a case to management about the need for some time to be allocated to our bechmarking project. But as I commented on Brent’s post (modified slightly):

I think in this case you have two ways to go: you can continue to play the role of the a-hole that is always arguing about priorities and complaining that he has to work on weekends to get his “important” work done, or you can accept that your client’s priorities are what they are, and try and work within them. Well, I guess there’s a third choice: quit and start blogging about SQL Server full time, but most of us probably aren’t that brave in this economy.

So thanks for the reminder Brent; while I sure do love building warehouses and crunching numbers, I won’t be doing any of that outside of 8AM – 5PM Monday – Friday anytime soon.

Share and Enjoy:
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • LinkedIn
  • Google Bookmarks
  • Technorati
  • FriendFeed
  • Slashdot
  • Twitter