Goals for 2011

Posted by Josh | Posted in GTD, The Rookie DBA | Posted on 02-02-2011

Tags: , , , ,

0

Lately I’ve seen several folks on Twitter talking about “updating their 30, 40, and 50,000 foot GTD goals.”. This made me realize that while I’ve really been cruising along in the day-to-day arena, I’ve pretty much neglected some higher level thinking. When you’re working 9 hour days and the to-do lists are always full, it’s pretty easy to lose sight of considering larger goals and visions. Or at least it is for my ADD riddled brain.

50,000 Feet – Life’s Purpose

If I talk about my big time life goals, I would say at the moment they reside in three categories:

Be a wonderful father and husband (and son) for my family

I want to provide us with enough to live comfortably, but not work so hard as to interfere with spending time with my son and wife. Family time is extremely important to me.

Become a well known and respected member of the SQL Server community

After knocking around for years in various technical roles, I really feel that I’ve finally found my niche. I love working with SQL Server; the internals, the various moving pieces, the never ending amount of information to learn. And as I’ve continued to learn and grow, I’ve really become enamored with the SQL community; not in a weird man-crush kind of way, but in a “wow the combined amount of knowledge and experience present here is amazing” kind of way. I really want to become more engaged in being a part of that.

Become a published author

Ever since I was a kid I’ve loved writing. It’s always been a dream of mine to write a best selling novel. Not one of those cheap drug store romance pieces, but something really deep and dark, that gives its readers a real emotional reaction. I’ve picked up more books this year than any year since high school, and the end result has been a real return of that hunger to create. When I read stories like Ender’s Game and The Deathly Hallows, and watch some phenomenal visual narratives like Battlestar Gallactica and Lost (except for the last season, which I think overall was pretty bad), I’m continuously left with a feeling of “I want to write that.” By no means an easy thing to do, but I’m keeping it on the list.

40,000 feet – 3-5 year goals

Present at least once at a SQL community event

I still consider myself very much a rookie in the SQL Server realm, so I’m giving myself a couple of years yet before I start expecting to be at a level where I can present. I’m constantly coming up with ideas, which are getting filed away in a safe place (i.e. anywhere but my brain) for later use. That way, when I feel I’m ready, I’ll have plenty of material.

MCITP Certification

I’ve only recently begun studying for my 70-433 and 70-432 exams, and while in a lot of cases I’m finding that I already know the material, there’s also a lot to learn. As you’ll see later I’m hoping to pass those two exams within a year, but to get the full MCITP certs I’m giving myself a little longer, both for experience and financial reasons (at least for the moment it looks like I’ll be paying my own way).

Take on some side consulting work

Make no mistake: I love my job and the company I work for, and I’m not looking to quit and move into full time consulting. But I would like to start entering into that world, both for experience and, to be totally honest, the extra money. I don’t know how feasible this will be, since I won’t be able to commit to anything more than 8-12 hours per week. I’m hoping I can find some “Remote DBA” style opportunities, and maybe some performance analysis / tuning ones.

30,000 Feet – 1 year goals

Servers at 100% Patch Level

When I inherited my current environment of around 35 SQL Servers, one of the first things I did was to assess their health. Overall I found them in pretty poor condition, especially in the realm of patch level. Many were at least one major service pack behind, which is completely unacceptable. This year, that will change.

Build a complete replica of production

Right now our development environment is scattered and disjointed; some servers have many components (i.e. Analysis Services, Reporting Services), while others have only the Database Engine components. This does not make for valid testing, since it looks nothing like production. By the end of the year, that will change, and I will have a complete setup (including network topology) mirroring our production systems.

Start a SQL community at work

Because I’m very much a “teach a man to fish, don’t catch the fish for him” kind of guy, I want to start helping to empower the folks around me to be better SQL Server users. I’ve decided to do this via series of learn-over-lunch sessions where we’ll all get together and study up on the latest trends in SQL. Sometimes we’ll just be viewing webinars, others we might jointly look over some troublesome code and decide how to fix it. Having already had one such session the response so far has been very positive, and I’m excited to keep on learning.


That’s about it for now. I’m sure things will come on and move off over the course of the year, but I’m really going to try and tie my actions to these higher goals as much as possible. From time to time I think I’ll check in and see where I stand, maybe once a quarter.

What are your big goals for the year?

It’s The Business’s Way Or The Highway – TSQL Tuesday #13

Posted by Josh | Posted in SQL Server, The Rookie DBA | Posted on 13-12-2010

Tags: ,

1


It’s that time again! When SQL Server professionals across the internet gather together to wax poetic about their craft and provide some good reading material. That’s right, it’s T-SQL Tuesday!

This month is being hosted by Steve Jones (blog | twitter), and the topic is the challenges we face as database professionals in our dealings with the businesses we support.

You can have two, and they’re not picking quality…

There’s a common saying I’ve heard in software development: you can have it good, cheap, and take a long time, or you can have it good and expensive and take a short time.” Unfortunately there’s usually a third option, which is “fast and cheap”. It seems all too common that businesses want to rush a product out the door in an attempt to be first on the market, in the process neglecting such items as proper QA testing and performance evaluation. To the salesperson promising things to clients, these things don’t matter; like the mortgage brokers that brought our economy to its knees, once the contract is signed they’re off the hook.

What they don’t see is the tremendous hidden costs associated with throwing bad software on the table. I’m talking about hours spent deploying without good instructions (hell, I’d settle for any instructions half the time), more time spent troubleshooting countless issues discovered after the fact, and even more time rolling out patch after patch. Ultimately the cost may even be the support of the customers, as they take their business elsewhere or simply refuse to use this “brilliant” new product because of the bad taste left in their mouths.

How do we prevent this from happening? Here’s where I’d love to hear from others, because I have by no means found any kind of reliable solution for this. Certainly there are things to do that will help prevent it: thorough code reviews, tracking all the time spent fixing issues to use as justification to management, getting involved early on in the development cycle, even engaging directly with the business (something I think DBAs should do more of personally). But in the end, if it’s in the culture of the place, it seems that it’s going to be a real uphill battle to fix.

All I can suggest is that we try not to say the word “No” too often. If we continually shoot down every release in an effort to achieve perfection, we end up being perceived as roadblocks and naysayers, not helpers and teachers. Mind you, I’m not saying we should cave in completely either; I’ve nailed fellow technology people to the wall before because they’ve been far too enabling when dealing with businesses. But we need to learn to negotiate and compromise, instead of yell and complain. Maybe that crap code gets out the door now, but only if the Dev team agrees (in writing) to fix a list of major issues in the next month. That way, both sides get some of what they want, and everybody wins.

Be A Teacher, Not A Punisher

Posted by Josh | Posted in SQL Server, The Rookie DBA | Posted on 05-12-2010

Tags: , ,

1

Here’s a situation I’m sure we’ve all been in:

You receive a request to run some code against a production system. Upon opening the build folder, you realize that:

  • There is no README file explaining what you’re supposed to do.
  • All the stored procedures are named “sp_xxx” and have no header blocks explaining what they are. (BTW, read here why that “sp_xxx” naming convention is bad.)
  • Multiple procedures are using dynamic SQL.
  • The script is creating a user with ‘sysadmin’ rights.
  • etc…

I could go on for hours about all the mistakes and bad practices I’ve seen in my still short time as a DBA (and I’d love to hear more funny ones from my readers), but that’s not my point.

It’s All About Choice

At this point you have two paths you can go:

  1. You stomp angrily down to the developer’s desk, demand to know how they could dare to waste your time with such garbage, and end up in a Wild West style shootout in the courtyard of your building.
  2. You take the teaching approach.

So what, you may ask, is this “teaching approach”? Quite simple, actually.

Try To See Their Side

I’ve found a lot of times that the folks writing the code really don’t understand what it is they’re doing. Maybe they assume all database users must have sysadmin rights. Perhaps they are new and their manager neglected to point them to your standards document (you do have one, right?), so they don’t know they need to include a header in all their procedures. And I’m sure they don’t understand the deep down internals that make using “sp_” as a procedure name a bad idea; all they know is the rest of the procedures in that database are all named that.

Instead of accusing them of being lazy, first see if they were given the tools they needed to be successful. By listening to them and trying to understand their perspective, you’ll also be gaining their trust.

Show Them The Error Of Their Ways

If they didn’t know that you had a standards document, take a moment to point it out to them and go over some relevant sections. Give them links to Books OnLine and recommend a few good books on T-SQL tuning. If you want to be really helpful, you might even offer to sit with them for an hour or so and do some pair-coding.

I’ve always said that I’d rather teach someone to fish on their own, rather than catch the fish for them. If I blindly enforce my rules on those handing me code, the code might get fixed, but we’ll run into the same problems the next time. Instead, by teaching them how to be better T-SQL thinkers, I’m making my job easier going forward.

Will this approach work all the time? Absolutely not. I’m not going to sit here and tell you that this magical technique will transform all C# developers into instant DBAs. Some people won’t want to learn, and you might have to go the route of drawing a line in the sand from time to time to safeguard your systems.

But if you take the time to be a teacher, and reach out to those you work with, you just might see this in revision notes some day:

  1. Removed cursor functionality as it was not needed and will perform poorly.
  2. Corrected JOIN clause to use indexed columns.
  3. Added DBA standard header and updated revision history.

Fighting SQL Server Sprawl

Posted by Josh | Posted in The Rookie DBA | Posted on 28-11-2010

Tags: ,

1

Photo credit to henrymatt


It seems like every week or so I’ve been getting another request to build out a SQL Server installation for a Development team. Usually these are pretty simple: install database engine, patch to current levels, create a database or two, and grant the developers db_owner / ALTER ANY LOGIN rights (why not securityadmin you might ask? Read this and you’ll understand why), and off you go. Seems easy enough, right?

Well, yes, but you see there’s a bit of hidden cost associated with all that. For one, there’s the overhead of maintaining the server: patches, taking care of failed maintenance (integrity checks, backups, etc), adding more drive space when needed, etc. While it’s not an overwhelming amount of work, it does increase every time a new server is added to the thirty plus ones I already support. This of course includes folks besides me such as the infrastructure (server admin) and storage management (SAN) teams.

The Urge To Consolidate

As a production DBA I was constantly working on projects to consolidate SQL instances. In this day of shrinking IT budgets, the words “leveraged” and “shared” seem to fly around quite a bit, and rightfully so. With the kind of big, multi-core, 64GB+ RAM setups one can aquire these days, there’s just no excuse for having a server with a handful of databases on it (unless, of course, you have some quantitative numbers to prove that those few databases really will take all the resources on that server).

Why should it be any different in development systems? Sure, it’s pretty doubtful that you could get the budget for any big iron pieces of hardware, but then again you don’t really need that. Development systems in general don’t seem to be used for benchmarking or performance testing (as much as I’d like them to be, but that’s another story), so lightning response times aren’t generally a priority. I would say a more realistic goal would be for the system to be manageable in terms of maintenance (since that’s probably when it’s getting hit the hardest), and have enough juice that one rogue query won’t take it out completely.

<rant>While we’re on that topic, can I just tell you how crazy it makes me when I discover a VM Dev server with 50+ databases (over 3 instances of SQL) on it, with only 2GB of RAM and all iSCSI attached NAS storage (over the same network as other traffic no less)? I mean, I understand that yes, this is development, and we’re going to be getting cheap goods, but come one people… my pocket calculator probably has more horsepower than that. Maintenance takes 12+ hours to complete for s***s sake!</rant>

Oh, sorry, got a little emotional there. Where were we? Oh yes…

The Master Plan… at least for now

So here is my initial take on this: the vast majority of my servers could easily be consolidated down to, let’s say, five total instances.

  1. DEV SQL 2005 instance 1
  2. DEV SQL 2005 instance 2
  3. QA SQL 2005
  4. DEV SQL 2008
  5. QA SQL 2008

The idea here is that Dev and QA should not be in the same sandbox. Development is for, well, development. It can be a little dirtier and looser in security, for the sake of convenience (but no, Mr. Developer, you still can’t have sysadmin rights. Do you know what Max DOP is? Didn’t think so, k thx bye!). QA, on the other hand, should be locked down similar to how production is, in the interests of ensuring as close to a real pre-release test as possible.

In terms of setup, I’m picturing a total of 8 VMs: 2 for each instance, configured as either an Active/Passive cluster, or (in the case of the two SQL 2005 DEV instances) and Active/Active cluster. Each would have between 4 and 8GB of RAM depending upon workload. This means that, assuming we don’t want two nodes on the same physical host, I’d need only four physical boxes, with ~12GB of RAM each. Hopefully I can pull these out of the “Free Server” pile, where unused boxes go before being shipped off to the scrap heap. They don’t need to be fancy; just 64 bit capable and compliant with VMWare ESX.

Why clusters, you might say? Well, for one, I’ve never built a cluster before, and I’m always looking for excuses to learn new things. Secondly, because I do want to have some level of HA in my leveraged system. It’s one thing if one Dev team gets hung up for a day because their server went down, quite another if a whole group is up the river.

And yes, there probably will be some cases, just as in production database support, where a leveraged system is not an option. Maybe the software requires some component to be installed on the database server itself (hey vendors, I really hate it when you do that, especially when you can’t justify why). And there will probably be some cases where the Dev teams just don’t want to give up their control, and that’s their right. But they won’t be getting my full support if that’s the route they want to take.

So that’s the plan, in a nutshell. Naturally I’ve got to get both management and the developers to buy in, but hey, that’s what my psychology degree is for, right? :)

P.S. I’d love to hear from other folks who support (or have supported) development SQL infrastructures on how they handled situations like this. Leave your thoughts in the comments.

Today I Used Access… And Liked It

Posted by Josh | Posted in Project Management, SQL Server, The Rookie DBA | Posted on 16-11-2010

Tags: , , ,

0

Yes, you heard me right. I have committed the ultimate DBA sin: I used Microsoft Access. And dammit, I enjoyed it. Somewhere, Brent Ozar, the king of Access, is plotting my violent and painful death.

Here’s the story: I’ve been using SharePoint extensively for tracking project work in my new role. It’s critical that I’m methodical and completely transparent here, because demand for my time is high and people aren’t going to take “I don’t have time to work on this until March” at face value unless I can prove it.

Overall it’s worked really well, but one frustration is the lack of things like sub-forms in SharePoint. That is, it’s really hard to be looking at an item in the Project list and add a related item in the Task list without a lot of clicking and copy / pasting.

And this is where (in the tradition of the great Emeril Lagasse), I said to myself, “Self! Remember when you were a fledgling programmer and used Access? Remember those nifty and very easily created subforms? You could use them here!” And use them I did. I set up a couple forms and nested subforms for easy addition and querying of related items, with some filtered drop-downs as well (another lacking feature in SharePoint). All these were bound to SharePoint linked lists.

And here’s why I feel mostly OK about it: it’s a tool for me to use, and it makes entering and tracking my time a lot simpler. That is going to benefit me and my clients. Yes, it’s not perfectly architect-ed, super robust, and well documented. But for now, since I’m pretty much a team of one, I’m good with that.

But still, like the shrimp in Finding Nemo says: “I am ashamed.” And I need to go take a shower now. Because I am a dirty, dirty boy.

Why Being An EMT Made Me A Better DBA

Posted by Josh | Posted in SQL Server, The Rookie DBA | Posted on 14-11-2010

Tags: , , , , ,

4

Pennsylvania Emergency Medical Technician patch

Update: Mike sent me a link to an article he wrote for SQL Server Central on this same subject, and I wanted to include it here: http://www.sqlservercentral.com/articles/Troubleshooting/66134/

When I was sixteen I became one of the youngest EMTs in the state of Pennsylvania. I was proud to volunteer my time, and found I greatly enjoyed helping people. It was both incredibly humbling and empowering at the same time: on one hand, there were times when you were utterly helpless and could do nothing to help your patients, but other times you clearly saw the immediate impact you were able to have. I can say without question my experiences, both good and bad, changed me as a person.

Fast forward fifteen years, and I find that fellow SQL tweeter Mike Walsh (blog | twitter) also has some background in emergency services. We had a nice little chat about our experiences and how they ultimately help us in our daily work. Thus, I became inspired to write about how my time riding around in an ambulance helps me be a better DBA.

Checklists And Procedures Are Essential

When we arrived on the scene of a medical emergency, we were drilled ruthlessly about a series of steps to assess our patient’s condition in rapid fashion. One of the first tasks was what we called the ABCs- Airway, Breathing, and Circulation. Basically, is the patient have a clear airway (choking, etc), are they breathing, and do they have a pulse. It was simple and easy to remember; two essential things when you’re in an adrenaline charged situation with someone’s life on your hands.

Once we were past the initial basics, we had a written list of additional information to gather: current medications, allergies, existing conditions, injuries, etc. Personally I carried that pad of paper with me in my pocket at all times while on duty, and used it without fail on every call. I don’t know about you but I would not trust someone’s life to my memory.

The same should be true about dealing with SQL Server emergencies (I use that term loosely as you’ll see later). Do you have a documented, step-by-step process to perform an assessment of the issue at hand, and take corrective measures? Is it readily available, clearly written, and has it been practiced? What about recovery procedures? If you get paged at 3AM (after a night at the bar of course) and have to perform a restore of a corrupted page, will you be going off memory or will you be walking down your checklist? Heck, forget you, what about that junior DBA who joined your team a month ago and is taking his first on-call rotation? Do you really want him flying by the seat of his pants?

Sometimes A Calm Voice Is The Best Medicine

I once was called out to a house where we found a young boy who had fallen and broken his arm. Immediately it was apparent that the injury was fairly minor (at least as minor as a broken limb could be), and that the situation was being made far more chaotic by the boy’s hysterical parents. They were yelling and screaming at each other, blaming one another for letting this happen, yelling at us for not taking proper care of their son, etc.

That night it happened that my driver was also my supervisor, a man with over ten years of experience in EMS. He quickly brought the two parents together off to the side, and said in a quiet but authoritative voice, “Ma’am, sir, I promise you we are doing everything we can to help your boy out. What I need from you two right now is to listen carefully to my medic’s questions so we can get all the information the doctors will need. Can you do that for me please?”

That simple request, combined with his soothing tone of voice, settled them instantly and brought them out of their hysteria. Once they had something to focus on they were cooperative, and we were able to get the information we needed and transport the boy to the hospital for treatment.

Similarly, the next time someone (be it a developer, a business user, or whomever) runs up to your desk in a panic, try speaking to them in a calming manner. Assure them that everything will be done to fix their situation, then ask them to help you by answering some simple questions. Is the application totally down, or is there just some functionality that is not working? How many users are affected? Are there any financial or other risks present due to the ongoing presence of the issue? In a lot of cases, those answers may even prove that the situation really isn’t as urgent as they thought.

It’s Not Life And Death

Mike and I both agreed strongly on this one: our time as volunteers gave us a healthy dose of perspective when it came to dealing with IT “Emergencies”. After all, it’s not like we’re dealing with broken bones, blood, or the prospect of a patient dying.

I don’t say this to belittle the problems we deal with as DBAs. We are paid to take care of our users and our servers, and we need to take our work seriously. But at the same time, I think we need to make sure we don’t lose track of the bigger picture of life happening around us. A server going down is not worthy of raising your blood pressure, nor is dealing with “that annoying developer” again worth losing your cool. Ultimately, life will go on, the sun will come up, and you can go home to hug your loved ones at the end of the day.

Note: I do want to single out and commend those of us who truly support mission-critical platforms, such as those in hospitals, power plants, etc, where lives may really be at stake. You have my utmost respect as a fellow data professional.

Maybe some day I’ll return to volunteering my time to help those who have become victims of life. In the mean time, I take solace knowing that the lessons I learned in my time in EMS are with me today, helping me to truly be a better DBA.

Week One Impressions

Posted by Josh | Posted in The Rookie DBA | Posted on 25-10-2010

Tags: , ,

0

Now that I’ve been through a full week in my new development-centered position, I thought I should take a few minutes to note some observations. It was overall a very pleasing first week, with already some lessons and changes made.

Developers (In General) Do Not Make Good Database Stewards

To me one of the most important roles of a DBA is that of data custodian. It’s your job to make sure data is kept safe, both from a security / access perspective, and also from an integrity / availability perspective. So far at least, my experience has shown that developers either a) don’t care or b) don’t know how to accomplish this goal. Now having been a developer myself, I do understand somewhat that their focus has to be on putting out code, not securing and stabilizing their environment. But at the same time, it was downright stunning to see some of the states these machines had gotten into. Even simple backups or integrity checks were mostly missing; granted a lost database wouldn’t mean immediate outages for clients, but it would mean lost work and project time.

People Are Remarkably Welcoming

Even though I’ve gone into this with the attitude that I am not a traffic cop here to lay down the law, I was concerned that any suggestions or attempts at change would be met with resistance. And while that has been the case for a few teams, largely I’ve gotten a very positive reception and folks seem genuinely eager to have me on board. A lot appear relieved that now they will have someone who will be watching over their systems, so they can focus on churning out the code.

The Three D’s Will Be Key

Discipline
I am going to have a lot of competing demands for my time, so it is going to be crucial that I make good use of what I have. Already projects are piling up and, while people are generally understanding that I am only one mere mortal, they also want theirs done first and fast. I’m going to have to make how I spend my time brutally efficient and wholly transparent. I think that last one is extremely important; if I’m going to be able to say with credibility that I can’t take something on until March, I’d better be able to show clearly what’s on my plate from now until then.
Documentation
Even though I’m essentially a team of one, I’m going to have to be ruthless about documenting procedure and policy. Because my backups (I technically have two of them) will not be in the midst of my daily work, documentation will be needed to ensure requests are handled appropriately and in a consistent manner when I’m not around. That’s why my rule will be that if I do something more than once, it must be written down somewhere. Period. I hate writing documentation as much as anyone, but it’s a necessary evil.
Diplomacy
If I’m going to have success getting these teams to change the way they interact with SQL Server, it’s going to take some serious cajoling, prodding, and yes, occasionally some pointed and sharp yelling. If I come charging into the room acting like some holy warrior who is here to save us all from the perils of bad T-SQL… well, I think the reaction would be as bad as that last image is ridiculous.

This is where my old psychology degree will probably come in handy. You see, I’m going to have to get into these folks heads; not in some Machiavellian, manipulative way (at least not too much), but in a genuine “let me understand your view so I can help you”, shrink-like kind of way. That means I’m going to have to listen to some degree of whining and complaining about how hard life is, how the business rules all and how they don’t have time to worry about defining a proper security model. I should seriously get one of those Freudian style couches for my office!

Ultimately, I had a great week and am still very excited for the future. Yes I’ll be busy, and yes I’m going to have an uphill battle at times. But I really do feel that I can make a difference and help those around me. Maybe I’m naive; only time will tell.

I’m Not A Production DBA Anymore

Posted by Josh | Posted in SQL Server, The Rookie DBA | Posted on 13-10-2010

Tags: , ,

0

Well, almost. As of this coming Monday at least.

That’s right kids. Starting next week, I’ll be moving into a (gasp) Development centered role, acting as the database guru for a bunch of internal development teams.

What’s this mean in a nutshell? Well, among other things…

No More Pager

Yes, I know, as Tom LaRock (blog | twitter) very aptly puts in his book DBA Survivor, “A Development Server is A Production Server To A Developer.” But, at least for now, that only extends within business hours. No more wakeup calls at 3AM to deal with slow-moving servers.

Lots More Code Reviews

Up until now, code review was an activity that was handled by the production DBA team, including myself. And while I think all of us agreed it was a vital role to have, we were never really able to give it the kind of attention it needed. After all, as production DBAs, our main focus had to be on keeping production systems up and running.

“But Josh, doesn’t reviewing code help keep production systems running smoothly by preventing bad code from getting there in the first place?”

Right you are, and that’s why a dedicated part of my new job will be performing code reviews and ensuring everything follows standards. I’d be lying if I said I’m thrilled about the prospect of filtering through all those lines of code, but I do understand its importance and I’m honored to be given that responsibility.

It’s Cleanup Time

Another part of my role is taking ownership of a good number of Development owned database servers. This process includes auditing for configuration and setup, patch level, security (can you say “Powershell” people?), and planning for remediation to bring the servers into compliance with how our production systems are setup.

This to me is just as important as reviewing code in terms of helping our teams (both Development and Production) to have success in releasing code. Think about it: if you are running all your tests under a highly privileged account, on a server which is left wide open in terms of permissions, you could have a rock-solid test-case plan and still run into major issues when you try and move into a live system. While I’m sure that I’ll encounter some resistance (“What do you mean we can’t have the ‘sa’ password any more?!”), I’m confident that in the long run everyone will benefit from this inoculation of best practice methodology. Plus, the development teams can rest easy, knowing that their servers are being backed up and DBCC’d regularly, as opposed to hoping months of work doesn’t just go down the tubes courtesy of our friend Mr. Corruption.

Overall, I’m truly excited about the change. I will miss the adrenaline of dealing with production issues and the constant problem solving. But at the same time, I’m looking forward to attacking the kinds of problems we encounter as DBAs from a proactive approach, and building, dare I say, a strong rapport with our friends who pound out code on a daily basis.

On a closing note, I’d be remiss if I didn’t thank the members of my team for the wonderful ride over the last 6 months as a production DBA (has it really only been that long?). You’ve been (and will continue to be, since I’m still going to be working with you) exceptional teammates and I can’t wait to continue contributing to our team, just from a different angle.

Covering Index = Epic Win, Part 3

Posted by Josh | Posted in SQL Server, The Rookie DBA | Posted on 23-06-2010

Tags: , ,

0

In part 2 of this series, we looked at the lengths SQL Server will go just to avoid the dreaded bookmark lookup, including ignoring an index altogether and choosing instead to scan the entire clustered index. But, we also showed how that decision was, in fact, a smart one, due to it resulting in fewer logical IOs.

But let’s face it, neither path (bookmark lookup or clustered index scan) is very efficient or scalable. So, what can we do to give SQL some better tools?

The answer, it seems, is surprisingly simple: create a covering index. That is, we need to create (or modify an existing one) an index that contains all the columns required for the query. But how do we determine what those columns are?

The easiest way I’ve found is to examine our execution plan, specifically the Bookmark Lookup node. Right click on the icon and choose “Properties”. Then, in the properties window, look for a line called “Output List”. Note: if your plan includes a scan, table or index, then use that operator as your root node instead.

If you click on that little “…” icon, what you’ll see is a nice list of the columns being requested by the query:

In this case, we see that the query is requesting the AccountID and Amount columns. So, how do we add these to the index?

Well, there are two options: 1) add them to the index key itself, 2) add them as included columns. For reasons I’ll illustrate in a moment we’ll see that number 2 is the better option, at least in this specific case.

First, adding them to the key:


DROP INDEX NCI_tempA_TransDate ON dbo.tmpA;
GO
CREATE INDEX NCI_tempA_TransDate ON dbo.tmpA (TransDate, AccountID, Amount);

Now, let’s run the same SELECT query:


SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/1/2010';

And the resulting execution plan:

…and STATISTICS IO:

Table 'tmpA'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Both clearly an improvement over the previous results seen in part two (lookup and 217 logical reads).

“Great!” you say, “I’ll go start adding columns to my indexes right now!”

Hold on there… this isn’t all rainbows yet. You see, by adding those columns to the index key itself, we could be incurring additional overhead in terms of index maintenance. Plus, I’m pretty sure if index gurus like Paul Randal (blog | twitter) catch you adding things like large VARCHAR columns to an index key, without them actually being used in filter operations, they just might hunt you down and… well, let’s just say I doubt it’d be pretty. Note: if anyone out there can tell me how to show the difference in behavior via an execution plan or STATISTICS IO I will update this page and give them due props, as I was not able to get any difference between the two plans.

So now let’s look at option two: add the columns using the INCLUDE functionality in SQL 2005+:


DROP INDEX NCI_tempA_TransDate ON dbo.tmpA;
GO
CREATE INDEX NCI_tempA_TransDate_inc ON dbo.tmpA (TransDate)
INCLUDE (AccountID, Amount);

Running our SELECT statement gives an identical execution plan and number of logical IOs as when the columns are placed in the index key. But now we’ve skipped making our index key wider than necessary. Truly a great feature.

Before I close out the series, I should mention that even this second path doesn’t appear to be completely devoid of peril. By adding these INCLUDEd columns, you will be adding to the amount of storage required by the index. For example, the following information was returned by running a query against the sys.dm_db_index_physical_stats DMV, using the page_count column for indexes on our tmpA table:

For explanatory purposes, the indexes are named as follows:

  • NCI_tempA_TransDate_inc – key = TransDate, included = AccountID, Amount
  • NCI_tempA_TransDate_inckey – key = TransDate, AccountID, Amount
  • NCI_tmpA_TransDate – key = TransDate, no included columns

So as you can see, adding INCLUDED columns to the key does indeed increase the storage required by the index. So do keep that in mind when you perform this kind of tuning work, especially on larger databases.

And so we’ve come to the end of our little series. Hopefully you will find this information useful the next time you are tuning a query and notice our little friend, the bookmark lookup (or even a index / tables scans). If anyone has any great stories to share on how this functionality helped them solve a production issue (or prevent one in development), I’d love to hear it in the comments. Props to anyone who can beat my best of reducing a query’s run time from 27 minutes to 44 seconds!

INSERT dbo.tmpA (AccountID, Amount, TransDate)
SELECT  TOP 100 name, CAST(object_id as MONEY), ’1/5/2010′
FROM sys.columns

Covering Index = Epic Win, Part 2

Posted by Josh | Posted in SQL Server, The Rookie DBA | Posted on 15-06-2010

Tags: , , , ,

1

In part one of this series, we looked at how a non-clustered index can be made into a real performance bottleneck because of a nefarious little fellow called a “bookmark lookup”. We showed how this operation, which is how SQL Server retrieves the requested columns not part of an index, can result in excess IOs and generally poorly scaling performance.

In this part, we’ll look at how SQL Server will even go so far as to ignore an index completely, just to avoid our little friend the lookup.

First off, let’s look again at our query execution plan from our SELECT query shown in part one:

We see how the most expensive part of the query is the “Key Lookup” operation, by a margin on 99 to 1. Pretty obvious that it would be the bottleneck in this case.

Now this particular query returned exactly one-hundred rows. Now let’s see what happens if we increase the number of returned rows a little bit.


SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/2/2010';

This query returns around one thousand results. Let’s look at the execution plan here:


Whoa, wait a second. Why is SQL Server using a clustered index scan operation? Don’t we have an index on the column being filtered on?

Well, yes we do. But see, SQL is generally pretty smart about choosing how to execute it’s queries (though at times I might debate that statement). And while I can’t speak for how it’s making these decisions, I would hedge a bet that one of the deciding factors in constructing an execution plan is IO cost. So, let’s look at the output of STATISTICS IO for this query.

Table 'tmpA'. Scan count 1, logical reads 549, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

549 logical reads total. Now, let’s force SQL to use our index by putting an index hint in the query:


SELECT AccountID, Amount FROM dbo.tmpA WITH (INDEX = NCI_tmpA_TransDate) WHERE TransDate = '1/2/2010';

Now you’ll have to take my word for it, but the execution plan now looks identical to the one we saw in part one. However, let’s look at the STATISTICS IO output:

Table 'tmpA'. Scan count 1, logical reads 2076, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Well that looks pretty bad. The number of logical IOs increased from 549 to over two-thousand. Now we see why SQL might choose to simply scan the clustered index.

So, why would it choose to use the index in the first case? Well, let’s look at STATISTICS IO for our original query:

SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/1/2010';
Table 'tmpA'. Scan count 1, logical reads 217, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Ok, so a total of 217 logical reads, and the index is being used. Now, let’s force it to use the clustered index and see what we get.

SELECT AccountID, Amount FROM dbo.tmpA WITH (INDEX = PK_CI_tmpA) WHERE TransDate = '1/1/2010';
Table 'tmpA'. Scan count 1, logical reads 549, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

See? In the case of our first query, SQL again made the right choice, choosing the operation which resulted in only 249 logical IOs, versus 549 for the clustered index scan.

Somewhere, a SQL developer is probably yelling “See?! Why do you doubt me?!”

OK SQL, we’ve now shown that you make the right call on what operation will be the least expensive to service the query we gave you. But, that doesn’t mean we can’t give you better tools to do the job. That’s what we’ll be covering in part three of our series. I promise we’ll get to the “epic win” part there too.