The Lone DBA – Are Your Processes Repeatable?

Posted by Josh | Posted in Life As A DBA, SQL Server, The Lone DBA | Posted on 29-08-2011

2

This post is the second in an ongoing series about how to survive as the only DBA in your organization. Since October of last year, I’ve been assigned to a team that is responsible for owning and maintaining the development infrastructure. It’s a great team of seasoned professionals, but not a single other DBA. As a result, I’ve had to think very carefully about how I go about my daily work, so as to give our customers consistently good service, while still allowing those without a lot of SQL Server related knowledge to pick up my work when I’m not available.

Design for next time, not this time

When you get a request to do something, say, restore a database and give a team some access, do you just click through the GUI quickly to get the job done, then move on? Or do you consider how you can design the process so that it can be repeated over and over again with the same result?

Before you answer that question, let’s consider why we’d want a process to be repeatable, especially as a lone DBA:

  • It minimizes the work needed to take care of similar requests in the future. Instead of starting from scratch, you can re-use existing code or knowledge. You therefore have more time to work on other, more important (but perhaps less urgent) items.
  • It results in consistent service to the client. The same input (“Please restore this database and grant me access.”) should always result in the same output (the database is restored to the right server, and the right permissions are given). The client always gets what they want, and it is delivered in the same manner.
  • It allows us to delegate effectively.It’s much easier for a junior team member to walk through a well thought process than to have a request thrown at them and told to figure it out (which usually results in a stream of questions, followed by a gruff “Oh jeez I’ll just do it myself, it’ll be faster”). Now you can focus on the work that really requires a DBA toolset, instead of boring administrative tasks.

Still think the processes you put in place don’t need to be repeatable? Perhaps you should read the first post in this series, and remind yourself of the ramifications of blindly doing things your way.

Moving on, let’s examine the characteristics of a good, repeatable process:

  • Simple – the process should accomplish what the intended result is, with a minimum of fuss and effort, with no unecessary complexity in terms of how the process works.
  • Automated – the process should require as little manual intervention as possible. If there are no steps requiring human decision making, then it should only require our help when something goes wrong.
  • Documented – the process should be clearly described, both in terms of how to complete it, as well as how it accomplishes our goals (from a technical perspective).
  • Generic – the process should be designed such that it can be tweaked at run time (to use programming lingo) to meet varying needs, as opposed to requiring lots of manual intervention.

In the next few posts in this series we’ll examine each of these in more detail. Please let me know in the comments if you can think of any other required traits of a repeatable process!

The Lone DBA – Why Your Way May Not Be The Best

Posted by Josh | Posted in Life As A DBA, SQL Server | Posted on 22-08-2011

Tags: , ,

1

This post is the first in an ongoing series about how to survive as the only DBA in your organization. Since October of last year, I’ve been assigned to a team that is responsible for owning and maintaining the development infrastructure. It’s a great team of seasoned professionals, but not a single other DBA. As a result, I’ve had to think very carefully about how I go about my daily work, so as to give our customers consistently good service, while still allowing those without a lot of SQL Server related knowledge to pick up my work when I’m not available.

We don’t need no stinkin’ team…

Sure, who needs teammates? I mean, we don’t need to take vacations, have sick days, or even take breaks to go workout at the gym or go for a run. We can just work all day, handling everything ourselves. And if we’re on vacation, we can always take our laptops with us, and tether our Crackberry to dial in to the network when things need to get done.

If you’re not laughing out loud, or at least cringing at what the last paragraph describes, I’d like to recommend a website for you to browse.

Hopefully we can all agree that we all need time off now and then. And perhaps the occasional unplanned absence will happen, despite our best attempts to be the Cal Ripken Jr. of our office and keep our no-sick-day streak alive. And when that time comes, would you rather have your customers needs smoothly handled without pause, or would you like them passed around from team to team like the proverbial hot potato? Or even better, given that wonderful “Well, can it wait until so-and-so is back in the office?” question. I don’t know about you, but I hate it when I hear that question.

You can be the man, without being the man all the time

We all like to feel important and desired; after all, it’s just a natural human need. Early in my career I craved the spotlight and guarded my knowledge carefully, lest someone else take my spot as “go to guy”. It felt great to always be the one to solve issues and get things done. Well, at least it felt great until I came home three hours late and faced a cold dinner plate, or spent the day after Christmas troubleshooting over a dialup connection, while my wife and relatives opened presents in the other room.

At some point in our careers, I believe strongly we have to choose one of two paths:

  1. We can keep being the lone wolf who does things our way all the time, however we please, because we’re the only one who ever knows how to do them.
  2. We can compromise our rigid principles, stop being stubborn, and learn that sometimes we have to consider if our teammates could do things how we’d do them.

Allow me to clarify that last option a bit: Just because you’re a whiz with T-SQL and could whip up a one-time script to update ten thousandmillion records for that cute girl / guy in Operations doesn’t mean that you should. Because like any reasonable customer, if it’s done once, they’ll expect it can be done again, whether you’re there or not. And if this person / group is an important consumer of your services, chances are you’ll be getting a phone call from your teammate (or your boss) when you’re at your kid’s Scout camp, asking how to modify your query (because you at least saved that and let the team know where it is, right?).

If your whole team is full of crack T-SQL programmers, good for you, go wild (and you are one lucky SOB); otherwise, don’t do something unless you’ve made sure your backup can do it just as well. Maybe you feel comfortable running a series of scripts to setup replication every time you restore a database, but how about the guy sitting next to you who spent most of his career as a Windows sysadmin? I’m not suggesting that you can’t do complex things that can’t be repeated with a few clicks, but do consider how to make the processes easier for those without your experience. Talk to your teammates and make sure you build a process they’ll be able to support. Maybe they’re not comfortable running scripts in Management Studio, but perhaps a wrapper Posh script that prompts them for the necessary information and takes care of things under the covers would be fine.

Believe me, you’ll still be appreciated when you return from that much-needed vacation, perhaps even more so because you made sure your customer would experience the same great service when you’re not there. And there will always be new challenges to tackle where you can flex your muscles to remind everyone just how good you are. Just remember to bring your teammates along too.

A Tale of Coding Horror – T-SQL Tuesday #21

Posted by Josh | Posted in SQL Server | Posted on 10-08-2011

Tags: , ,

0


It’s time for T-SQL Tuesday again, that great community event started by Adam Machanic (Blog | Twitter) where bloggers all over write about the wonders of SQL. This month’s topic: Crap code, and boy do I have a lot to choose from. It was hard to pick one example (my early days as a programmer were riddled with bad decisions), but I finally decided on writing about my first ever software release.

A long time ago, in a building far far away…

…there was a young man who had just joined the ranks of an operations group at a financial firm. He had quit his job as a retail store manager (a job he only took because nothing was available in his chosen field of social services) and taken a temp assignment as a means to pay the bills. His job was simple: run reports from various systems (a mainframe green-screen system, a SQL database), carry some checks around, and various other menial tasks. But still, it was nice to be out of the 12 hour workdays of retail.

Soon after beginning work, the man noticed that a great deal of his work amounted to punching in a bunch of repeat keystrokes and waiting for things to print. As a result, he quickly learned the ways of ExtraBASIC, and wrote macros to tab around the terminal screens. He also learned how to use the query designer in Access, so that he could quickly get data out of a database when requested by various teams. Within a few months, word had spread of the reporting wizardry, and he was offered a permanent job within the group.

The First Fail

While not officially a programmer, he was given several tasks around building utilities for the group he worked with. Abandoning Access for greener pastures, he downloaded Visual Basic Express and began whipping together tools. Requirements shifted on a daily basis; after all, the ops groups weren’t used to having their own private coder at their beck and call. The young man enjoyed his work, if only because it kept him out of the daily grind of operational processing. And what with the simple drag-and-drop designer, the actual coding was minimal. A click here, a quick line or two of code there, and voila, a functioning application. Naturally Access / Jet was still used as the back end database, since that was what was available.

When it came time to roll out the application, the team was gathered and a demo given. Everything worked fine, and the decision was made to start using the new app right away. Everything was in place, and the go-ahead given. But soon after the use started, problems began appearing. Data wasn’t being consistantly updated, operations took a long time to complete, results weren’t as expected. After several hours, the decision was made to go back to the previous method of operation.

When at first you don’t succeed, try, and fail, again…

The young man was discouraged by this early setback, but he learned from his mistakes. At least until the next release, which included such wonderful upgrades as using MySQL as the back-end (on a dusty desktop box) and optimized in-line SQL (as opposed to GUI generated code). It fared much better, lasting several days before major issues were uncovered.

Within several months, he was offered a job as a full time programmer. He quickly churned out code, and his teammates loved his work ethic. What they didn’t love was his propensity for writing overly complex, convoluted code, which also conveniently was bereft of any comments. He also had a knack for finding and leveraging new technologies, such as an open source job scheduler, SQL Server Express (installed on that same dusty little box), and random .NET classes pulled off the internet. This was great when it came to functionality, and the business units were pleased. That is, until he was on vacation and something went bump, and they were forced into manual processing while the man’s teammates frantically dug through .vb files.

Since then…

I’ve learned quite a bit about writing better code since that first debacle, with many, many more missteps along the way. I’ve written code that was far, far too complex for what it needed to be (think a whole class for a simple calculation), scaled terribly (a linked server as a source for an 8+ hour ETL process that involved three layers of nested cursors), and was completely undocumented. But I’ve also grown tremendously as a programmer, learning lessons such as:

  1. Write simple, reusable code. Don’t try and be clever; just write what is needed and nothing more.
  2. Don’t rely on GUI generated code. Use it as a basis, but understand what it does and tweak it to better suit your needs. I yell at developers constantly because they try and do something via the Management Studio interface instead of writing the code. Writing encourages understanding, which in turn encourages better code.
  3. Have solid test plans in place, and be methodical about following them.
  4. Code for future scale, not current. I still refer people to the paper written by the Google founders years ago, where they explicitly acknowledge and plan for greatly increasing activity.
  5. Document your code to the teeth. Some day in the future, long after you’ve moved on and requirements documents are long lost, the code will break, or need to be updated, and your comments will be the only information available about why things were done the way they were. I don’t know about you but I’d rather have people singing my praises after I’m gone, as opposed to throwing strings of four letter words (as well as various hard objects) at my picture.

I still cringe when I look back on those early days. I’m actually good friends with the guy who was my first development manager, and every time he mentions the name of some application I wrote, I scowl and apologize for having given birth to such junk. Everyone writes code that makes their skin curl years later, but what’s really important is that we strive to learn and improve our skills on a continuous basis.

T-SQL Tuesday #20 – Best Practices (And Those Who Don’t Follow Them)

Posted by Josh | Posted in SQL Server | Posted on 12-07-2011

Tags: , ,

1


It’s T-SQL Tuesday time again! This month is being hosted by Amit Banerjee (Blog | Twitter), and the topic is “Best Practices.”

I could talk all day about the various best practices I require in my environments, but I thought a more novel topic might be to write about why people don’t follow them. I’ve only been in the business a little over two years (four in IT), and I’ve already heard enough to fill a small volume. But, we’re all busy people so I’ve decided just to post what I’d consider the top two most frequent whines (and they definitely make up the lion’s share).

Without further ado…

We Don’t Have The <Time, Money, Manpower>

I’m a firm believer in the triangle of project management; you can have something cheap and fast (but poorly written), you can have it good (as in high quality) and fast (at a high cost), or you can have something good and cheap (but it will take a long time). All too often the people holding the cash choose the first one: fast and cheap. I would bet this is especially true in startups or new business ventures, where being first to market at 80% quality is better than being fourth to market at 100%.

Honestly, this is one that we, as SQL professionals, usually can’t do much about. There have been many cases in my career where I reviewed some SQL code, immediately flagged it as bad for various reasons, only to be told “We don’t have time to fix it before it goes live,” or “It’s not in our scope to make the changes you’ve requested.” Earlier on I would kick and scream, telling my boss there was no way I’d let this code in to my systems. Almost inevitably I was overruled and the code was released, and my tantrums only served to increase the bad blood between my group and the development teams. Some times the code would blow up and we’d be forced to fix it ourselves (because we’re the only ones who knew how, more on that later), or release a hastily constructed patch that probably only made things worse half the time. It is incredibly frustrating to say the least. Eventually I’ve just learned that this is largely out of my control, and all I can do is explicitly document my findings and make it known that if the code is released, it is done in a limited support, if-it-breaks-you’re-SOL manner. We’ve also tried to engage our groups earlier in the process, before the project scope is set, so that we can ensure enough time / money / manpower is thrown at the problem to solve it correctly.

We Don’t Know How To Do That

I’ve often heard the excuse of “We don’t know how to follow your standard,” or “We don’t know how to follow that best practice.” Many times when I ask why not, the answer is “We’re not SQL developers.”

Yes, you read correctly. They’re not SQL developers, so they don’t know how to write SQL properly. I’m sure you can guess my next question: “So then why are you writing T-SQL code? You wouldn’t expect me to write C# multi-threaded applications, would you? So why should you write T-SQL when you don’t know the difference between an index scan and a seek?”

Inevitably this is met with some level of blank stare, followed by more excuses of the “We can’t afford a SQL developer” or “SQL developers are hard to come by” variety.

To be honest I don’t know why this is the case, but it’s been my experience (and from what I’ve read I’m not alone) that SQL Server knowledge is often a huge gap on most development teams. The assumption is that someone who is a good .NET programmer and happens to have written a few queries can suddenly become an enterprise data architect, designing a scalable, secure, and high-performing system to process hundreds of thousands of transactions an hour. One would think that the database layer would be held sacred, since it is responsible for ensuring the enterprise’s data is kept safe; instead it often seems to end up more like an experimental lab, with structures designed on Etch-A-Sketches and napkins, not ER modeling tools.

So what’s a DBA to do? In our case, we’re attacking this on three levels:

  1. Bring in one or more true SQL Server developers, and have them handle the really complex stuff.

    Can’t afford to have a SQL guru on every group? At least bring in one or two and let them handle the real low level tasks up front, such as designing the database structure and complex retrieval patterns. Maybe the script-kiddie in the next cubicle can handle some really simple CRUD procedures, but leave the fancy stuff to the pros.
  2. Have regular sessions where staff are invited to get free in-house training on SQL Server.

    We have bi-weekly Lunch ‘n Learns, where members of development teams are invited to come and participate in sessions that include watching webinars, conference lectures, presentations by in-house SQL gurus, or round-table discussions. The participation waxes and wanes, but overall the response has been positive and at the very least, we can begin to see who really wants to learn.
  3. Grade developers on their code, and make the results public.

    We’ve begun collecting detailed grades on submitted T-SQL code, on dimensions such as security, efficiency, readability (because who wants to read a single line 5000 word query – yes, I’ve seen it), and documentation. These grades are collected in (of all things) a data warehouse, where they are reviewed and published to the development managers. This helps us in two ways: 1) to spot diamonds in the rough – those folks who, given some encouragement and guidance, could develop into good SQL developers; 2) finding repeat offenders, who, despite assistance, are not improving and are costing the company money and time in fixing their poor code. The system’s only been in place for a brief amount of time, but we’re excited about the transparency and intelligence this will offer.

There are many others…

Certainly there are many other reasons teams don’t follow standards. I’d love to hear what others think are the top reasons why standards don’t get followed in their company (or in the SQL world in general)!

Averaging PerfMon With Powershell

Posted by Josh | Posted in Powershell | Posted on 10-07-2011

Tags: ,

0

I just wanted to post this quickly before I somehow lost the syntax I developed. What I was trying to do is gather some PerfMon counters for a set period of time and return some aggregated (averaged) results. Here’s what I ended up with:

$counters = (Get-Counter -ListSet "LogicalDisk");
$counterValues = ($counters.PathsWithInstances | Where {$_ -like "*Sec/*"} | get-counter -MaxSamples 10 | select -ExpandProperty CounterSamples);
$groupedValues = $counterValues | select Path,InstanceName,CookedValue,Timestamp | Group-Object Path;
$avgValues = $groupedValues | % {New-Object PSObject -Property @{CounterName = $_.Name; AvgValue = ($_.Group | Measure-Object CookedValue -Average).Average}};
$avgValues | Out-GridView;

This ends up working really well for what I needed. Now I just need to wrap it in a function, parameterize it, and enable it to interrogate remote machines via PowerShell remoting (less firewall hassles that way).

Invoke-Sqlcmd + Invoke-Command = Posh Awesomness

Posted by Josh | Posted in Powershell | Posted on 05-07-2011

0

I have a database that contains metadata for all the servers I support, such as server and domain names, instance names, SQL versions, etc. It proves very handy when trying to determine who owns what, or find out which servers need to be upgraded, etc.

Unfortunately, it’s also extremely frustrating to maintain. Right now all data is entered manually via T-SQL commands (I do at least have procedures that encapsulate some check logic) and it’s up to the operator to enter the data correctly. Given that we humans are prone to mistakes and slip-ups (not to mention laziness) that means that inevitably stuff gets out of date.

To try and mitigate this problem (and lighten my workload in the mean time), I’m in the process of writing some Powershell scripts that will iterate over the list of servers, attack them with some Posh remoting commands, and update anything in the database that is stale.

Let’s say we want to get back a list of servers and get some information from each of them. At first I thought this would be as simple as something like this:

Invoke-Sqlcmd -ServerInstance $ServerName -Query $SqlQuery -Database dbarepos | Invoke-Command -Cred $Cred -ScriptBlock $ScriptBlock

But then when I tried this, I got a rather vague error:

Invoke-Command : Parameter set cannot be resolved using the specified named parameters.

So apparently the Invoke-Command doesn’t immediately like getting that piped input. I have a hunch it was related to something Don Jones pointed out here, about how Invoke-Command binds piped input to it’s parameters.

In any case, my workaround was to just rewrite the code a little differently, which also gave me some more verbose progress output:


$servers = Invoke-Sqlcmd -ServerInstance $server -Query $sqlcmd -Database $db;
$servers | Foreach-Object {
Write-Host Querying $Server.Name;
Invoke-Command -ComputerName $Server.Name -Cred $Cred -ScriptBlock $ScriptBlock;
}

This ends up accomplishing the same thing, albeit in a bit more code (but not much, so I’m fine). I tested this and in about 2 minutes I could pull back a tremendous amount of information on all the servers, ranging from memory to PerfMon stats to even SQL Server specific information like database names and sizes. Truly a timesaver; I really should use Powershell more often!

A Good DBA Is What The Customer Needs Them To Be, Not What They Want To Be

Posted by Josh | Posted in Life As A DBA, SQL Server | Posted on 30-04-2011

0

In my work there are things I greatly enjoy doing: performance tuning, design “think tanks”, occasional bouts of code writing, and looking at new technologies to name a few. I enjoy these because they allow me to flex my creative muscles a bit. I’ve always thought of myself as a fairly imaginative person, so I do like to let the ideas fly and see where we land.

On the other hand, there are things that, well, let’s just say I’d rather hammer nails through my fingernails than do. Building servers, for instance. I mean, does anyone really like clicking next a half a dozen times, typing in drive paths, selecting features, then watching pretty little progress bars inch their way across the screen? Only to have to re-do the same thing again when it fails 95% through (a story for another post perhaps). Or code reviews: I don’t think the English language is descriptive enough to accurately portray how awesomely boring it is scrolling through hundreds (in some cases thousands) of lines of code, looking for formatting errors, table hints, horribly named objects, and other big bad coding horrors. Every now and then I do have some interesting T-SQL to examine, but most of it is either garden variety, or so utterly incomprehensible as to make my brain twist into one giant Gordian knot. And let’s not get started on the ongoing (never-ending) work of cleaning up and remediating existing installs that haven’t been touched by a DBAs gentle hands in years.

But here’s the thing… those things I hate so much? They’re a very necessary part of being a good DBA. And they’re a major part of what my organization needs from me right now.

As much as I’d love to spend my days whiteboarding ideas for new products and playing the hero fixing bad SQL code with a flick of my wrist, right now the focus has to be on lower level chores. The environment’s still messy, with unpatched, un-backed-up servers floating about all over. There’s no consistentcy in how they’re all set up, from things like server configurations to agent jobs and storage layouts. I’ve got months of work ahead of me just to get above water.

I know what you’re thinking: all this being the case, why do I keep working like this, when surely there are positions that receive more praise, are more engaging, and do work more visibly of value to others. Yes, that’s very true, and if I wanted to, I’m sure I could get there in a short span of time. I think the same thing myself from time to time when work gets particularly frustrating or stressful. But then I remember a realization I came to some time ago: a good DBA knows that sometimes you have to be what your customer needs, not what you want to be.

The most value I can be to my employer is by ensuring everything is stable, patches, backed up, and there for them when they come in the next day. I can also ensure that the releases going out are properly done and won’t cause issues (which can in turn cost money) when rolled out. There are other people who can be heavily involved in database / application architecture and design, solve challenging T-SQL coding problems, and be the hero tuning badly performing queries. But right now, I’m the only one who can make sure that backups have run, databases aren’t corrupt, and there’s enough memory to answer queries. It’s not glitzy work, but it’s necessary. It’s the work I need to do, not the work I want to do. And I’m OK with that, because it’s what I signed up for when I took this job.

APPLYing Your SQL Knowledge With Cached Plans And XPath – TSQL Tuesday #17

Posted by Josh | Posted in SQL Server | Posted on 12-04-2011

Tags: , , ,

4

T-SQL TuesdayIt’s the second Tuesday of the month, which means it’s time for T-SQL Tuesday again! This month is hosted by Matt Velic, a fellow former accidental DBA who I had the pleasure of meeting at SQL Saturday in Philadelphia this year. The topic is the APPLY operator, and I can think of no better way to illustrate its use than with XPath and some of the absolutely awesome DMV’s available for interrogating the plan cache.

One of the big challenges I’ve faced as a DBA is proactively finding and fixing bad code before it reaches a critical stage and causes issues. One way I’ve found to effectively do this is to monitor and interrogate the plan cache on a server, looking for problem children and areas of concern across several dimensions. Fortunately for us, execution plans are stored as XML, which means we can use XPath to seek out specific elements which could indicate a poor query.

For example, here’s a query that will look at the top 10 most expensive query plans in terms of average logical IOs:

			SELECT  TOP 10
			DB_NAME(CAST(qpa.value AS INT)) DBName,
			qs.total_logical_reads / qs.execution_count avg_logical_reads,
			SUBSTRING(st.text, CASE
								WHEN qs.statement_start_offset IN (0,NULL) THEN 1
								ELSE qs.statement_start_offset/2 + 1
							   END,
							   CASE
								WHEN qs.statement_end_offset IN (0,-1,NULL) THEN LEN(st.text)
								ELSE qs.statement_end_offset/2
							   END - CASE
										WHEN qs.statement_start_offset IN (0, NULL) THEN 1
										ELSE qs.statement_start_offset/2 +1
									 END
					 ) query_text,
					 qp.query_plan
			FROM	sys.dm_exec_query_stats qs
				CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
				CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
				CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) qpa
			WHERE	qpa.attribute = 'dbid'
					AND qpa.value > 4
			ORDER BY qs.total_logical_reads / qs.execution_count DESC

Now this information in and of itself is fairly useful, but using a little more APPLY magic and some XPATH, we can get even more information. Let’s say, for instance, we’d like to pull out all clustered index scans. In XML execution plans, every one of these little nasties is represented by a “” element, with an attribute value of @PhysicalOp="Clustered Index Scan". Easy enough to find using the nodes() XPATH function, at which point we can also gather some useful numbers such as estimated IO and CPU cost.

WITH	XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
		TopIOQuery AS
		(
			SELECT  TOP 10
			DB_NAME(CAST(qpa.value AS INT)) DBName,
			qs.total_logical_reads / qs.execution_count avg_logical_reads,
			SUBSTRING(st.text, CASE
								WHEN qs.statement_start_offset IN (0,NULL) THEN 1
								ELSE qs.statement_start_offset/2 + 1
							   END,
							   CASE
								WHEN qs.statement_end_offset IN (0,-1,NULL) THEN LEN(st.text)
								ELSE qs.statement_end_offset/2
							   END - CASE
										WHEN qs.statement_start_offset IN (0, NULL) THEN 1
										ELSE qs.statement_start_offset/2 +1
									 END
					 ) query_text,
					 qp.query_plan
			FROM	sys.dm_exec_query_stats qs
				CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
				CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
				CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) qpa
			WHERE	qpa.attribute = 'dbid'
					AND qpa.value > 4
			ORDER BY qs.total_logical_reads / qs.execution_count DESC
		)
SELECT	t.DBName,
		t.avg_logical_reads,
		CAST('<q> ' + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(t.query_text,' OR ',' OR ' + CHAR(10)),' AND ',' AND ' + CHAR(10)),'&','&amp;'),'>','&gt;'),'<','&lt;') + ' </q>' AS XML) Query,
		t.query_text,
		RelOp.Col.value('(./@PhysicalOp)[1]','VARCHAR(200)') Operation,
		ISNULL(RelOp.Col.value('(.//Object[1]/@Schema)[1]','SYSNAME'),'tempdb.') + '.' + RelOp.Col.value('(.//Object[1]/@Table)[1]','SYSNAME') TableName,
		RelOp.Col.value('(./@EstimateRows)[1]','FLOAT') EstimatedRows,
		RelOp.Col.value('(./@EstimateCPU)[1]','FLOAT') EstimatedCPU,
		RelOp.Col.value('(./@EstimateIO)[1]','FLOAT') EstimatedIO,
		RelOp.Col.value('(./@EstimatedTotalSubtreeCost)[1]','FLOAT') EstimatedCost,
		RelOp.Col.query('./OutputList') OutputList,
		CAST('<p> ' + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(RelOp.Col.value('(./IndexScan/Predicate/ScalarOperator/@ScalarString)[1]','VARCHAR(MAX)'),' OR ',' OR ' + CHAR(10)),' AND ',' AND ' + CHAR(10)),'&','&amp;'),'>','&gt;'),'<','&lt;') + ' </p>' AS XML) Predicate,
		RelOp.Col.query('.') RelOpXML,
		t.query_plan QueryPlan
FROM	TopIOQuery t
			CROSS APPLY t.query_plan.nodes('//RelOp[@PhysicalOp="Clustered Index Scan"]') RelOp(col)
WHERE	RelOp.col.value('(.//Object/@Schema)[1]','SYSNAME') <> '[sys]'
ORDER BY avg_logical_reads DESC,query_text,EstimatedIO DESC

You could modify this query to return just about set of information you like. Execution plans are a veritable gold mine of information. If you’re liking what you see, I’d highly recommend you read the excellent series “Can You Dig It” by Jason Strate (blog | twitter).

One final note: I would be careful about running these on a live instance during busy hours, they can be quite resource hungry. Happy plan querying!

HA? We don’t need no stinkin’ HA (Part 4)

Posted by Josh | Posted in SQL Server | Posted on 11-04-2011

Tags: , ,

0

This is part four of a series cataloging my search for the ideal High Availability solution for my development and QA environments. The other parts:

  1. Part 1 – an introduction to why dismissing the need for HA in development / test systems is not necessarily the right decision.
  2. Part 2 – a quick poll of the community to see what others are doing.
  3. Part 3 – Clustering

In part four, we’re going to talk about database mirroring.

From Microsoft Books OnLine:

Database mirroring maintains two copies of a single database that must reside on different server instances of SQL Server Database Engine. Typically, these server instances reside on computers in different locations. One server instance serves the database to clients (the principal server). The other instance acts as a hot or warm standby server (the mirror server), depending on the configuration and state of the mirroring session. When a database mirroring session is synchronized, database mirroring provides a hot standby server that supports rapid failover without a loss of data from committed transactions. When the session is not synchronized, the mirror server is typically available as a warm standby server (with possible data loss).

I won’t dig into the internals of database mirroring much here. For a more detailed explanation, I would refer you to the excellent book Pro SQL Server 2008 Mirroring by Ken Simmons (blog | twitter) and Robert Davis (blog | twitter).

There are two basic modes for database mirroring: high-safety and high-performance. In high-safety mode, transactions are not considered committed on the principal until they have been fully sent to and processed by the mirror server. In high-performance mode, contrastingly, transactions commit immediately, then are queued up to be sent to and processed by the mirror server. This means the potential of a much higher throughput on the principal, but it also means you’re not guaranteed zero data loss if your principal goes offline. Note that high-performance mode is only available in Enterprise Edition.

Database mirroring can be setup in two configurations, architecture-wise:

  1. Two-server
    1. Principal
    2. Mirror
  2. Three-server
    1. Principal
    2. Mirror
    3. Witness

In a two server setup, the principal is the server answering queries, while the mirror sits in a warm-standby capacity. The three server setup is the same, with the addition of the Witness server, which acts as a kind of third-party watchdog. If both the mirror and the witness agree that the principal is down, then an automatic failover is initiated, with the mirror server becoming the principal and being fully able to answer queries. If, however, the witness still considers the principal to be online, no failover is initiated. You can, however, manually failover a mirroring setup through Transact-SQL or the SSMS GUI.

When I initially read about database mirroring, and saw the word “failover”, my first thought was that this would work similarly to cluster supported failovers, in that it would be largely transparent to the end user, except for a brief outage. However, this is not necessarily so. If you are using SQL Native Client or the .NET SQL Client provider, you can specify the “FailoverPartner” (OLEDB) or “Failover_partner” (ODBC) attribute in your connection string, referencing the mirror server. The driver is smart enough to try and connect to the specified partner name if the connection to the primary server fails. See this BOL article for more details. However, if you’re dealing with legacy applications that don’t use .NET or SQL Native Client, you’re stuck with a manual re-point of connections. I don’t know about you, but I have a ton of those kinds of apps lying around.

Another downside is that you cannot mirror the system databases (master, msdb, model). This means that things like SQL Agent jobs, logins, etc, will not be automatically transferred upon failover. Granted, there are plenty of ways to handle these in other fashions, but it’s definitely something to keep in mind.

Also, I would recommend you read these post (number one and two) from Paul Randal (blog | twitter) that debunk some myths around database mirroring.

Conclusion

I think database mirroring has some real promise. It doesn’t require expensive or identical hardware configurations, is fairly easy to setup (I did it in my lab in about an hour, and most of the work could probably be scripted), and is available in both Standard and Enterprise versions of SQL Server (albeit only in High-Safety mode for Standard, but since I’m not mirroring across any WANs I don’t think network throughput would be an issue).

However, the idea of a manual re-point of clients and synchronization of logins is not one I cherish. Yes, this is development, and we’re not talking about a huge impact to hundreds of users, but it would still be painful and I’d get a lot of complaints. That being said, I’m fairly certain that given some lab time, I could automate the login synchronization using something like Powershell or SSIS, and perhaps dispatch with the manual re-point using something like Network Load Balancing.

So, for now, mirroring is definitely on the “possibility” list.

DDL in Stored Procedures and The Magic Of WITH EXECUTE AS

Posted by Josh | Posted in Notes From The Lab, SQL Server | Posted on 05-04-2011

Tags: ,

1

Twice over the last week I’ve run into a scenario where a user is trying to execute a stored procedure and gets a message similar to the following:

Msg 1088, Level 16, State 7, Procedure p_TruncateTest, Line 5
Cannot find the object "TestTable" because it does not exist or you do not have permissions.

At first this was a bit puzzling, because the user in question had both EXECUTE rights on the stored procedure, along with VIEW DEFINITION rights across the schema. I’m used to the idea that as long as a user is granted rights to execute a procedure, and the procedure is owned by the same principal as the underlying object being manipulated, no other permissions are checked. It’s one of the fundamental reasons why stored procedures are such a good practice.

In this case though, things were a little more complex. It turns out that the line in question was issuing a TRUNCATE statement. Now, if this were executed via dynamic SQL, I could understand this being an issue. But it was hardcoded; nothing dynamic about it. That being the case, I wanted to see if the behavior was reproducible.

To the lab!

CREATE TABLE dbo.TestTable (c CHAR(1));
CREATE USER TruncateUser WITHOUT LOGIN;
GO

CREATE PROCEDURE dbo.p_TruncateTest
AS
BEGIN
	TRUNCATE TABLE TestDB;
END
GO
GRANT EXECUTE ON dbo.p_TruncateTest TO [TruncateUser];
GO

EXECUTE AS USER = 'TruncateUser';
EXEC dbo.p_TruncateTest;

Sure enough, the same message popped up. So it seems the TRUNCATE statement does not behave the same way as other DML style statements, such as SELECT,INSERT,UPDATE,DELETE. So that got me thinking: was this the same for other DDL statements?

REVERT;
GO

CREATE PROCEDURE dbo.p_AlterTest
AS
BEGIN
	ALTER TABLE TestTable ADD c2 CHAR(1);
END
GO
GRANT EXECUTE ON dbo.p_AlterTest TO TruncateUser;

EXECUTE AS USER = 'TruncateUser';
EXEC dbo.p_AlterTest;
GO

After executing this I was greeted with the same exact error message.

Ok, so the behavior is consistent with other forms of DDL. That being the case, I needed to find some alternatives to actually granting the user the full ALTER rights on the objects in question, since that would allow them to do all sorts of other nasty things.

Enter EXECUTE AS.

From this BOL article, we see that we can use this nifty clause to effectively elevate a calling user’s rights for the duration of the stored procedure execution. In this case, we’re going to have the procedure execute as the ‘dbo’ user. Normally I’d say it’s a better practice to have the execution be under a less privileged user, such as one that owns the application schema, and not one in the db_owner role.

REVERT;
GO

ALTER PROCEDURE dbo.p_TruncateTest
WITH EXECUTE AS 'dbo'
AS
BEGIN
	TRUNCATE TABLE dbo.TestTable;
END
GO

EXECUTE AS USER = 'TruncateUser';
EXEC dbo.p_TruncateTest;
GO

Voila! The procedure executes without issue.

Now there is one catch here: when using that WITH EXECUTE AS clause you have be careful about how you determine the name of the user performing the operation. If, for example, you use the CURRENT_USER function to log who issued the command, what you’ll see is the user named in the WITH EXECUTE AS clause, not the calling user. Instead, make sure you use the ORIGINAL_LOGIN function. Thanks to Kimberly Tripp for pointing this out.