One (Very Important) Goal For 2012 – AUTOMATE EVERYTHING

Posted by Josh | Posted in Life As A DBA, Powershell, SQL Server | Posted on 02-01-2012

Tags: , , ,

2

Yep, you heard me right. I am going to automate everything I do. Or, at least, try to.

I’m not going to talk in detail about why I’m doing this, other than to say:

  1. I have better things to do with my time than perform mundane, repeatable tasks.
  2. I recently re-read John Sansom’s excellent blog post on the subject, and, as is usually the case with John’s work, found it rather inspiring. Bravo John!

Instead, I’m going to talk a little about two kinds of “automation”, and why I generally pick one that might not, at first glance, be the right choice.

Full versus Partial Automation

I will define something that is fully automated as a task or operation which requires absolutely zero manual effort on my part. That is, I don’t have to push a button, flip a switch, type a command, or even know that it is happening, unless something goes wrong. An example of something that is already fully automated in my world is the weekly and daily database maintenance that I have installed, using Ola Hallengren’s excellent (and freely available) scripts. These run every night and ensure that my databases are backed up, corruption free, and optimized. Unless something goes *bump* and the jobs fail, I don’t get so much as an e-mail upon completion.

By contrast, a process that is partially automated requires some manual effort, but is still optimized such as to require as little work on my part as possible. I may need to open a script, enter a few parameters, or run a Powershell command, but outside of that, all the logic and processing is done behind the scenes, with perhaps a pretty little progress bar showing that work is, well, progressing. An example of this might be setting up of a server side SQL trace; I have a series of scripts ready that, given a database name and a path at which to place the trace files, will setup a standardized trace, grant rights on some signed stored procedures to allow non-DBAs to read the trace data, and even generate a README file to be sent to the requesting team with a series of instructions.

In an ideal world, I suppose that all of the mundane daily work I have to take care of would be in the former (fully automated) category. But in reality, I find that there’s a bit of a diminishing return on putting more and more effort into fully automating tasks. Consider this scenario / requirement: the request and creation of new databases.

At a basic level, the process we follow is something like this:

  1. A developer requests a new database via our in-house ticketing system.
  2. We (my team) ensure that the request at least contains the following details:
    1. The name of the database.
    2. The server on which to create it (or at least a version of SQL Server).
    3. The initial size of the database.
    4. The domain users or group that requires access to the database.
  3. Assuming we’ve got all the information required, we will go out to the database server where the databases will be placed, and determine a suitable location to place the data files. This is usually determined by simple the simple rule of “whichever volume has the most space”, with a few exceptions. If there is not enough space on the server, we’ll inform the requestor that there will be a delay, and start the process of ordering more storage.
  4. Once we know where the data and log files will be placed, we will create the database, take a full backup (to initialize the differential backup chain), and grant the access requested.

Now, I could certainly envision writing some kind of application that would allow developers to submit requests for new databases and process everything straight through based on the rules outlines above, similar to what I see on my web host’s control panel for MySQL databases. But, at the same time, constructing such as system would take a lot of time and effort. Contrast that with a partial automation solution, such as:

  • Provide an InfoPath form for the requests, forcing people to enter the necessary details in before submitting the request, thereby eliminating back-and-forth chatter.
  • Use a Powershell script that, when given a server name, database, and log file size, will connect via WMI and determine a list of suitable locations for placing the database files, then prompt the user to select one. When selections are made, the database is created and a full backup taken. This eliminates logging into the computer (to look at drives) and stepping through a bunch of GUI screens to create the database.
  • Have a second Powershell script that can accept a text file containing domain groups / users and role names, which will then grant the access required on a specified database. The file could be generated from the InfoPath form. Again, this eliminates some GUI clicks, and minimizes the chance of fat-fingering group / user names.

While this will not prevent someone from having to manually process the request, it will significantly cut down on the pain to do so, without a great deal of effort / complexity. Especially considering that these requests are fairly infrequent (perhaps one to three a month), I see no reason to spend additional time on providing a fully automated solution, when the partial one provides almost as much value.

Over the course of the year, I’ll be blogging about the various processes / means I use to accomplish this goal.

How do you automate your processes, and how do you determine a “break-even” point, if you will?

Are your processes generic?

Posted by Josh | Posted in SQL Server, The Lone DBA | Posted on 11-12-2011

Tags: , ,

0

This post is the sixth in an ongoing series about how to survive as the only DBA in your organization. Since October of last year, I’ve been assigned to a team that is responsible for owning and maintaining the development infrastructure. It’s a great team of seasoned professionals, but not a single other DBA. As a result, I’ve had to think very carefully about how I go about my daily work, so as to give our customers consistently good service, while still allowing those without a lot of SQL Server related knowledge to pick up my work when I’m not available.

Why is being generic a good thing?

When I think of the word “generic”, I usually picture those off-brand foods at most grocery stores, with their simplistic labels and lackluster colors. But in the case of processes, being “generic” really means “standardized, yet flexible”. This is a good thing, because it means your processes can answer many different (but ultimately) related needs.

Let’s take an example: a development team needs to trace activity in their database. We’re going to assume that just granting this group rights to run the trace is not an option, since, let’s say, they’ve taken the server down with a poorly done client side trace in the past (don’t laugh, it happened to me). In any case, let’s look at three options to answer this request:

Have the developers sit with you while you run a Profiler trace

I don’t like this for several reasons, not the least of which is that it is going to take a good chunk of my time. Because of course, the developer will probably have no idea what they are looking for, and may not be able to product the condition they are trying to capture on demand. It also still uses Profiler, which, as far as I’m concerned, should be banned.

Script out a one-time trace and have it run on the server

This is better, because it takes a lot less of my time to simply setup a server-side trace and let it run. I can then let the developer’s read in the trace files via something like a signed stored procedure (a topic for another day perhaps). But there’s still the one-off aspect: who’s to say that the next time the developers need this I’ll be around, have saved the trace definition, etc?

Write a templatized script that accepts a database name and a path for trace files, and use it going forward

This, to me at least, is the best option. After a slightly longer initial setup (one-time to write and document the script, plus test it), setting up subsequent traces will take very little time. In addition, the use of a template will mean a consistent experience / process for my customers, even when I’m not around. Even when I am around, it will also make it easier for DBA Junior to handle the request, leaving me to look at more interesting things. And by making the script flexible enough to handle different servers / databases, it becomes much more useful.

I go so far as to have a “no one-off” policy at work. That is, if I do something, I script it, put some parameters in, and save it off to source control. Then I publish it in our procedures manual, so that if a similar request comes in the team can handle it right away. It leads to a lot of scripts that are not highly used, but it also means less work in the long term, and a great bag of tricks in the process.

But, can something be too generic?

Sure it can. I’ve fallen into the trap many times of trying to have one process fit way too many needs, only to end up with a monstrous, un-followable mess. If you’ve got a ton of “if this is true, do this, otherwise do this” type of logic in your process, you might want to consider if you’re really answering related needs. This is kind of like the process equivalent of that awful stored procedure we’ve all seen; you know, the one that has twenty plus input parameters, and has every one in the WHERE clause as WHERE ((some_field = @some_parameter) OR (@some_parameter = NULL)). It looks good, but in the end the execution is piss poor.

The Lone DBA – Why Automating Processes Is Good

Posted by Josh | Posted in Life As A DBA, SQL Server, The Lone DBA | Posted on 17-10-2011

Tags: , ,

1

This post is the fourth in an ongoing series about how to survive as the only DBA in your organization. Since October of last year, I’ve been assigned to a team that is responsible for owning and maintaining the development infrastructure. It’s a great team of seasoned professionals, but not a single other DBA. As a result, I’ve had to think very carefully about how I go about my daily work, so as to give our customers consistently good service, while still allowing those without a lot of SQL Server related knowledge to pick up my work when I’m not available.

Automation Is A (Lone) DBA’s Best Friend

I subscribe wholly to the sentiment expressed in John Sansom’s excellent post, titled “The Best Database Administrators Automate Everything”. Time is something we cannot make more of for ourselves, and as such is highly precious. If you are spending time doing repetitive work that could be either partially or fully automated, that is time wasted.

But there’s another benefit as well to automating processes: it makes them easier to hand-off. Let’s consider the example of restoring a copy of a production database to a test server. The typical steps I would complete would be:

  1. Determine the location that the backup file was placed at by the production DBA group (I do not have access to production backups, so they must copy them down to my server).
  2. Determine where the database files should be placed on the server using this logic:
    1. If the database already exists on the server, the files should be placed at the same location as they currently reside.
    2. If the database does not exist, then the files should be placed at the default data and log location as specified in the server configuration.
  3. If the database exists, set the database to single user mode using the “WITH ROLLBACK IMMEDIATE” option to effectively kill any open connections to the database.
  4. Issue the restore command, using native commands or the Litespeed equivalent if the database came from a server using Litespeed.
  5. Execute the databases specific post restore script from the source code repository. These scripts reset permissions for developers, synchronize users, etc.

While none of these steps are particularly hard, documenting them and teaching them to others would not be a small amount of work. Consider that step two involves querying system tables (or stepping through some screens in SSMS), and steps three and four require knowledge of T-SQL commands for restoring databases.

Instead, I chose to write a Powershell script which handles as much of the logic as possible. All it requires is a server name and the name of a backup file, and it handles the rest (except step five, but I’m working on that). This way, instead of teaching all that logic, all I need to do is show someone how to find the backup file (further simplified by standardizing paths across all the servers) and run the script.

I don’t know about you, but I’d much rather write code to do a job than write a document about how to do the job manually.

The Lone DBA – What Makes A Process Simple?

Posted by Josh | Posted in Life As A DBA, SQL Server, The Lone DBA | Posted on 17-10-2011

Tags: ,

0

This post is the third in an ongoing series about how to survive as the only DBA in your organization. Since October of last year, I’ve been assigned to a team that is responsible for owning and maintaining the development infrastructure. It’s a great team of seasoned professionals, but not a single other DBA. As a result, I’ve had to think very carefully about how I go about my daily work, so as to give our customers consistently good service, while still allowing those without a lot of SQL Server related knowledge to pick up my work when I’m not available.

What Makes A Process Simple?

sim·ple   [sim-puhl] adjective

  1. easy to understand, deal with, use, etc.: a simple matter; simple tools.
  2. not elaborate or artificial; plain: a simple style
  3. not ornate or luxurious; unadorned: a simple gown
  4. unaffected; unassuming; modest: a simple manner
  5. not complicated: a simple design

When I design a process for my team to follow, I always do my best to follow the definition above, especially the first and last lines. Processes should be easy to follow and only as complex as they need to be, no more. Programmers can sometimes get carried away with complexity, because a) it’s fun, b) it lets us stretch our muscles a little. That’s all fine and good in your lab, but when you’re trying to design something to be used in the real world by actual people (other than yourself), complexity can be a huge problem.

Take this example: your team needs a way to locate information about a particular server, including what it’s used for, make and model, etc. Here’s two choices for how you fulfill this requirement.

Option A – Command line / SQL scripts and a SQL Database

Sounds easy enough, right? And a relational database seems like the perfect place to store this kind of structured information.

Well, that’s great, for someone who is comfortable writing T-SQL code to interrogate a database. And who understands the structure of the database. And, oh by the way, can even find where the database is.

I’m guilty as charged on this one. When I started on my team there was no repository for server inventory, so I made one. I’d call it a pretty well designed database, normalization and structure wise, but there’s one glaring flaw: no interface. Every operation, such as adding a server, removing one, looking up information about one, etc, is done via T-SQL scripts (hand-written mostly). Fine for me, but my teammates are going to quickly get lost and just call me when they need information.

Let’s assume I’ve even saved scripts in our code library for common operations, such as looking up a server’s information by its name (not always a valid assumption, certainly). Even with that shortcut, so the team doesn’t have to write JOIN statements, consider the list of steps required to look up a server:

  1. locate the correct script in the repository
  2. Open the script in SSMS (do they have that installed?)
  3. Connect to the server where the management database resides (do they know where that is?)
  4. Look through the script and find the “find/replace” tags (I would at least use SSMS template parameters, though it’s debatable if that is easier)
  5. Execute the script and interpret the resulting text output

Hmm… five steps and several questions. Not very simple.

Option B – A Small Web Application With A SQL Back End

With this choice, we still get the benefits of storing the data in relational format, but with a cleaner interface (even my crappy UI design skills can manage this one). Most competent admins can understand a web interface with fields and buttons, so they should have little problem using it once trained (and of course, its use would be documented).

Consider the steps required to perform the same operation here:

  1. Open a web browser and browse to the web site (do they know where it is?)
  2. Select the “Find Server By Name” option presented on the home page.
  3. Enter the server name and click the “Find” button.

We’re down to three steps and only one question (do they know what the site address / URL is). Now we’re getting into the realm of simple.

One could certainly argue that there’s an even easier solution, along the lines of a folder on a searchable network drive with a set of Word documents, one per server, in standard format. I wouldn’t discount that argument, but I just really had folders full of Word documents, especially when you have to go through all of them one at a time to change some common field.

You could also argue that the process of writing a web application (and supporting it) would outweigh the simplicity gained by having the information presented in that manner. After all, can anyone else on your team understand C# code? What happens when it breaks and you’re on vacation? That may be true, but I honestly believe that if you write your code properly (and, drumroll please, simply) then the application should be pretty much self-sufficient. Good code hums along and doesn’t require babysitting.

So the next time you’re designing a process for your team, remember to keep things as simple as you can. Your teammates will thank you.

What’s holding up my SQL Server replication?

Posted by Josh | Posted in Life As A DBA, SQL Server | Posted on 09-10-2011

Tags: , ,

0


In the last two weeks, my team at work has taken on support of a new application that makes heavy use of SQL Server replication technologies. This is one are I’ve never played around with much, so it has been a steep learning curve to say the least.

One of the more common scenarios I’ve run into is that the distribution agent encounters an error applying replicated transactions, causing the flow to stop dead in its tracks. From some research, it seems like in general this symptom indicates that the publisher and subscriber are out of sync, and usually the best option is to re-initialize the subscription. For various reasons, however, that is not an option in this case, so I’m left with resolving the issue causing the replication to fail. That means finding the actual command causing the failure, which was a little tricky.

I’ve found that the Replication Monitor GUI is not very helpful in this regard. What I tend to see is that it reports the transaction sequence number just fine, but the command text is less reliable. Mostly I’ve seen the last command in the batch, which (with my application) usually ends up being a simple “IF @@TRANCOUNT > 0 COMMIT” statement. Not very useful.

So instead, I’ve taken to using some good old T-SQL commands to get the information I want. The first is the sp_helpsubscriptionerrors stored procedure. This is executed at the Distributor and will return a list of errors for a specified subscription. The columns error_text, xact_seqno, and command_id are very useful, as they tell you the specific error message being encountered, as well as the transaction and command that are causing the problem.

Once we have this information, we can use the sp_browsereplcmds stored procedure to actually view the command(s) causing the issue. While you could feed it both the transaction sequence number (usually the same for @xact_seqno_start and @xact_seqno_end) and the command ID (@command_id), I find that it’s most helpful leaving the command ID out. This way you can see the entire transaction’s worth of commands (including the offender), giving you some good context to evaluate how to resolve the problem.

As for actually resolving the problem, well that of course depends upon what the root cause of the issue is. In our case, it seems that most times the problem lies with some developer deleting or adding a row at the subscriber, causing data validation errors. It’s a painful process, which is why our response is usually “Hey Mr. Developer, you deleted this row and now it’s causing problems. Please let us know when you’ve recreated the row and we’ll restart replication.” Let people fix their own problems.

I’m sure I’ll blog more on the subject as I encounter other problems and learn more about supporting replication.

A Tale of Coding Horror – T-SQL Tuesday #21

Posted by Josh | Posted in SQL Server | Posted on 10-08-2011

Tags: , ,

0


It’s time for T-SQL Tuesday again, that great community event started by Adam Machanic (Blog | Twitter) where bloggers all over write about the wonders of SQL. This month’s topic: Crap code, and boy do I have a lot to choose from. It was hard to pick one example (my early days as a programmer were riddled with bad decisions), but I finally decided on writing about my first ever software release.

A long time ago, in a building far far away…

…there was a young man who had just joined the ranks of an operations group at a financial firm. He had quit his job as a retail store manager (a job he only took because nothing was available in his chosen field of social services) and taken a temp assignment as a means to pay the bills. His job was simple: run reports from various systems (a mainframe green-screen system, a SQL database), carry some checks around, and various other menial tasks. But still, it was nice to be out of the 12 hour workdays of retail.

Soon after beginning work, the man noticed that a great deal of his work amounted to punching in a bunch of repeat keystrokes and waiting for things to print. As a result, he quickly learned the ways of ExtraBASIC, and wrote macros to tab around the terminal screens. He also learned how to use the query designer in Access, so that he could quickly get data out of a database when requested by various teams. Within a few months, word had spread of the reporting wizardry, and he was offered a permanent job within the group.

The First Fail

While not officially a programmer, he was given several tasks around building utilities for the group he worked with. Abandoning Access for greener pastures, he downloaded Visual Basic Express and began whipping together tools. Requirements shifted on a daily basis; after all, the ops groups weren’t used to having their own private coder at their beck and call. The young man enjoyed his work, if only because it kept him out of the daily grind of operational processing. And what with the simple drag-and-drop designer, the actual coding was minimal. A click here, a quick line or two of code there, and voila, a functioning application. Naturally Access / Jet was still used as the back end database, since that was what was available.

When it came time to roll out the application, the team was gathered and a demo given. Everything worked fine, and the decision was made to start using the new app right away. Everything was in place, and the go-ahead given. But soon after the use started, problems began appearing. Data wasn’t being consistantly updated, operations took a long time to complete, results weren’t as expected. After several hours, the decision was made to go back to the previous method of operation.

When at first you don’t succeed, try, and fail, again…

The young man was discouraged by this early setback, but he learned from his mistakes. At least until the next release, which included such wonderful upgrades as using MySQL as the back-end (on a dusty desktop box) and optimized in-line SQL (as opposed to GUI generated code). It fared much better, lasting several days before major issues were uncovered.

Within several months, he was offered a job as a full time programmer. He quickly churned out code, and his teammates loved his work ethic. What they didn’t love was his propensity for writing overly complex, convoluted code, which also conveniently was bereft of any comments. He also had a knack for finding and leveraging new technologies, such as an open source job scheduler, SQL Server Express (installed on that same dusty little box), and random .NET classes pulled off the internet. This was great when it came to functionality, and the business units were pleased. That is, until he was on vacation and something went bump, and they were forced into manual processing while the man’s teammates frantically dug through .vb files.

Since then…

I’ve learned quite a bit about writing better code since that first debacle, with many, many more missteps along the way. I’ve written code that was far, far too complex for what it needed to be (think a whole class for a simple calculation), scaled terribly (a linked server as a source for an 8+ hour ETL process that involved three layers of nested cursors), and was completely undocumented. But I’ve also grown tremendously as a programmer, learning lessons such as:

  1. Write simple, reusable code. Don’t try and be clever; just write what is needed and nothing more.
  2. Don’t rely on GUI generated code. Use it as a basis, but understand what it does and tweak it to better suit your needs. I yell at developers constantly because they try and do something via the Management Studio interface instead of writing the code. Writing encourages understanding, which in turn encourages better code.
  3. Have solid test plans in place, and be methodical about following them.
  4. Code for future scale, not current. I still refer people to the paper written by the Google founders years ago, where they explicitly acknowledge and plan for greatly increasing activity.
  5. Document your code to the teeth. Some day in the future, long after you’ve moved on and requirements documents are long lost, the code will break, or need to be updated, and your comments will be the only information available about why things were done the way they were. I don’t know about you but I’d rather have people singing my praises after I’m gone, as opposed to throwing strings of four letter words (as well as various hard objects) at my picture.

I still cringe when I look back on those early days. I’m actually good friends with the guy who was my first development manager, and every time he mentions the name of some application I wrote, I scowl and apologize for having given birth to such junk. Everyone writes code that makes their skin curl years later, but what’s really important is that we strive to learn and improve our skills on a continuous basis.

APPLYing Your SQL Knowledge With Cached Plans And XPath – TSQL Tuesday #17

Posted by Josh | Posted in SQL Server | Posted on 12-04-2011

Tags: , , ,

4

T-SQL TuesdayIt’s the second Tuesday of the month, which means it’s time for T-SQL Tuesday again! This month is hosted by Matt Velic, a fellow former accidental DBA who I had the pleasure of meeting at SQL Saturday in Philadelphia this year. The topic is the APPLY operator, and I can think of no better way to illustrate its use than with XPath and some of the absolutely awesome DMV’s available for interrogating the plan cache.

One of the big challenges I’ve faced as a DBA is proactively finding and fixing bad code before it reaches a critical stage and causes issues. One way I’ve found to effectively do this is to monitor and interrogate the plan cache on a server, looking for problem children and areas of concern across several dimensions. Fortunately for us, execution plans are stored as XML, which means we can use XPath to seek out specific elements which could indicate a poor query.

For example, here’s a query that will look at the top 10 most expensive query plans in terms of average logical IOs:

			SELECT  TOP 10
			DB_NAME(CAST(qpa.value AS INT)) DBName,
			qs.total_logical_reads / qs.execution_count avg_logical_reads,
			SUBSTRING(st.text, CASE
								WHEN qs.statement_start_offset IN (0,NULL) THEN 1
								ELSE qs.statement_start_offset/2 + 1
							   END,
							   CASE
								WHEN qs.statement_end_offset IN (0,-1,NULL) THEN LEN(st.text)
								ELSE qs.statement_end_offset/2
							   END - CASE
										WHEN qs.statement_start_offset IN (0, NULL) THEN 1
										ELSE qs.statement_start_offset/2 +1
									 END
					 ) query_text,
					 qp.query_plan
			FROM	sys.dm_exec_query_stats qs
				CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
				CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
				CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) qpa
			WHERE	qpa.attribute = 'dbid'
					AND qpa.value > 4
			ORDER BY qs.total_logical_reads / qs.execution_count DESC

Now this information in and of itself is fairly useful, but using a little more APPLY magic and some XPATH, we can get even more information. Let’s say, for instance, we’d like to pull out all clustered index scans. In XML execution plans, every one of these little nasties is represented by a “” element, with an attribute value of @PhysicalOp="Clustered Index Scan". Easy enough to find using the nodes() XPATH function, at which point we can also gather some useful numbers such as estimated IO and CPU cost.

WITH	XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
		TopIOQuery AS
		(
			SELECT  TOP 10
			DB_NAME(CAST(qpa.value AS INT)) DBName,
			qs.total_logical_reads / qs.execution_count avg_logical_reads,
			SUBSTRING(st.text, CASE
								WHEN qs.statement_start_offset IN (0,NULL) THEN 1
								ELSE qs.statement_start_offset/2 + 1
							   END,
							   CASE
								WHEN qs.statement_end_offset IN (0,-1,NULL) THEN LEN(st.text)
								ELSE qs.statement_end_offset/2
							   END - CASE
										WHEN qs.statement_start_offset IN (0, NULL) THEN 1
										ELSE qs.statement_start_offset/2 +1
									 END
					 ) query_text,
					 qp.query_plan
			FROM	sys.dm_exec_query_stats qs
				CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
				CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
				CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) qpa
			WHERE	qpa.attribute = 'dbid'
					AND qpa.value > 4
			ORDER BY qs.total_logical_reads / qs.execution_count DESC
		)
SELECT	t.DBName,
		t.avg_logical_reads,
		CAST('<q> ' + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(t.query_text,' OR ',' OR ' + CHAR(10)),' AND ',' AND ' + CHAR(10)),'&','&amp;'),'>','&gt;'),'<','&lt;') + ' </q>' AS XML) Query,
		t.query_text,
		RelOp.Col.value('(./@PhysicalOp)[1]','VARCHAR(200)') Operation,
		ISNULL(RelOp.Col.value('(.//Object[1]/@Schema)[1]','SYSNAME'),'tempdb.') + '.' + RelOp.Col.value('(.//Object[1]/@Table)[1]','SYSNAME') TableName,
		RelOp.Col.value('(./@EstimateRows)[1]','FLOAT') EstimatedRows,
		RelOp.Col.value('(./@EstimateCPU)[1]','FLOAT') EstimatedCPU,
		RelOp.Col.value('(./@EstimateIO)[1]','FLOAT') EstimatedIO,
		RelOp.Col.value('(./@EstimatedTotalSubtreeCost)[1]','FLOAT') EstimatedCost,
		RelOp.Col.query('./OutputList') OutputList,
		CAST('<p> ' + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(RelOp.Col.value('(./IndexScan/Predicate/ScalarOperator/@ScalarString)[1]','VARCHAR(MAX)'),' OR ',' OR ' + CHAR(10)),' AND ',' AND ' + CHAR(10)),'&','&amp;'),'>','&gt;'),'<','&lt;') + ' </p>' AS XML) Predicate,
		RelOp.Col.query('.') RelOpXML,
		t.query_plan QueryPlan
FROM	TopIOQuery t
			CROSS APPLY t.query_plan.nodes('//RelOp[@PhysicalOp="Clustered Index Scan"]') RelOp(col)
WHERE	RelOp.col.value('(.//Object/@Schema)[1]','SYSNAME') <> '[sys]'
ORDER BY avg_logical_reads DESC,query_text,EstimatedIO DESC

You could modify this query to return just about set of information you like. Execution plans are a veritable gold mine of information. If you’re liking what you see, I’d highly recommend you read the excellent series “Can You Dig It” by Jason Strate (blog | twitter).

One final note: I would be careful about running these on a live instance during busy hours, they can be quite resource hungry. Happy plan querying!

HA? We don’t need no stinkin’ HA (Part 4)

Posted by Josh | Posted in SQL Server | Posted on 11-04-2011

Tags: , ,

0

This is part four of a series cataloging my search for the ideal High Availability solution for my development and QA environments. The other parts:

  1. Part 1 – an introduction to why dismissing the need for HA in development / test systems is not necessarily the right decision.
  2. Part 2 – a quick poll of the community to see what others are doing.
  3. Part 3 – Clustering

In part four, we’re going to talk about database mirroring.

From Microsoft Books OnLine:

Database mirroring maintains two copies of a single database that must reside on different server instances of SQL Server Database Engine. Typically, these server instances reside on computers in different locations. One server instance serves the database to clients (the principal server). The other instance acts as a hot or warm standby server (the mirror server), depending on the configuration and state of the mirroring session. When a database mirroring session is synchronized, database mirroring provides a hot standby server that supports rapid failover without a loss of data from committed transactions. When the session is not synchronized, the mirror server is typically available as a warm standby server (with possible data loss).

I won’t dig into the internals of database mirroring much here. For a more detailed explanation, I would refer you to the excellent book Pro SQL Server 2008 Mirroring by Ken Simmons (blog | twitter) and Robert Davis (blog | twitter).

There are two basic modes for database mirroring: high-safety and high-performance. In high-safety mode, transactions are not considered committed on the principal until they have been fully sent to and processed by the mirror server. In high-performance mode, contrastingly, transactions commit immediately, then are queued up to be sent to and processed by the mirror server. This means the potential of a much higher throughput on the principal, but it also means you’re not guaranteed zero data loss if your principal goes offline. Note that high-performance mode is only available in Enterprise Edition.

Database mirroring can be setup in two configurations, architecture-wise:

  1. Two-server
    1. Principal
    2. Mirror
  2. Three-server
    1. Principal
    2. Mirror
    3. Witness

In a two server setup, the principal is the server answering queries, while the mirror sits in a warm-standby capacity. The three server setup is the same, with the addition of the Witness server, which acts as a kind of third-party watchdog. If both the mirror and the witness agree that the principal is down, then an automatic failover is initiated, with the mirror server becoming the principal and being fully able to answer queries. If, however, the witness still considers the principal to be online, no failover is initiated. You can, however, manually failover a mirroring setup through Transact-SQL or the SSMS GUI.

When I initially read about database mirroring, and saw the word “failover”, my first thought was that this would work similarly to cluster supported failovers, in that it would be largely transparent to the end user, except for a brief outage. However, this is not necessarily so. If you are using SQL Native Client or the .NET SQL Client provider, you can specify the “FailoverPartner” (OLEDB) or “Failover_partner” (ODBC) attribute in your connection string, referencing the mirror server. The driver is smart enough to try and connect to the specified partner name if the connection to the primary server fails. See this BOL article for more details. However, if you’re dealing with legacy applications that don’t use .NET or SQL Native Client, you’re stuck with a manual re-point of connections. I don’t know about you, but I have a ton of those kinds of apps lying around.

Another downside is that you cannot mirror the system databases (master, msdb, model). This means that things like SQL Agent jobs, logins, etc, will not be automatically transferred upon failover. Granted, there are plenty of ways to handle these in other fashions, but it’s definitely something to keep in mind.

Also, I would recommend you read these post (number one and two) from Paul Randal (blog | twitter) that debunk some myths around database mirroring.

Conclusion

I think database mirroring has some real promise. It doesn’t require expensive or identical hardware configurations, is fairly easy to setup (I did it in my lab in about an hour, and most of the work could probably be scripted), and is available in both Standard and Enterprise versions of SQL Server (albeit only in High-Safety mode for Standard, but since I’m not mirroring across any WANs I don’t think network throughput would be an issue).

However, the idea of a manual re-point of clients and synchronization of logins is not one I cherish. Yes, this is development, and we’re not talking about a huge impact to hundreds of users, but it would still be painful and I’d get a lot of complaints. That being said, I’m fairly certain that given some lab time, I could automate the login synchronization using something like Powershell or SSIS, and perhaps dispatch with the manual re-point using something like Network Load Balancing.

So, for now, mirroring is definitely on the “possibility” list.

DDL in Stored Procedures and The Magic Of WITH EXECUTE AS

Posted by Josh | Posted in Notes From The Lab, SQL Server | Posted on 05-04-2011

Tags: ,

0

Twice over the last week I’ve run into a scenario where a user is trying to execute a stored procedure and gets a message similar to the following:

Msg 1088, Level 16, State 7, Procedure p_TruncateTest, Line 5
Cannot find the object "TestTable" because it does not exist or you do not have permissions.

At first this was a bit puzzling, because the user in question had both EXECUTE rights on the stored procedure, along with VIEW DEFINITION rights across the schema. I’m used to the idea that as long as a user is granted rights to execute a procedure, and the procedure is owned by the same principal as the underlying object being manipulated, no other permissions are checked. It’s one of the fundamental reasons why stored procedures are such a good practice.

In this case though, things were a little more complex. It turns out that the line in question was issuing a TRUNCATE statement. Now, if this were executed via dynamic SQL, I could understand this being an issue. But it was hardcoded; nothing dynamic about it. That being the case, I wanted to see if the behavior was reproducible.

To the lab!

CREATE TABLE dbo.TestTable (c CHAR(1));
CREATE USER TruncateUser WITHOUT LOGIN;
GO

CREATE PROCEDURE dbo.p_TruncateTest
AS
BEGIN
	TRUNCATE TABLE TestDB;
END
GO
GRANT EXECUTE ON dbo.p_TruncateTest TO [TruncateUser];
GO

EXECUTE AS USER = 'TruncateUser';
EXEC dbo.p_TruncateTest;

Sure enough, the same message popped up. So it seems the TRUNCATE statement does not behave the same way as other DML style statements, such as SELECT,INSERT,UPDATE,DELETE. So that got me thinking: was this the same for other DDL statements?

REVERT;
GO

CREATE PROCEDURE dbo.p_AlterTest
AS
BEGIN
	ALTER TABLE TestTable ADD c2 CHAR(1);
END
GO
GRANT EXECUTE ON dbo.p_AlterTest TO TruncateUser;

EXECUTE AS USER = 'TruncateUser';
EXEC dbo.p_AlterTest;
GO

After executing this I was greeted with the same exact error message.

Ok, so the behavior is consistent with other forms of DDL. That being the case, I needed to find some alternatives to actually granting the user the full ALTER rights on the objects in question, since that would allow them to do all sorts of other nasty things.

Enter EXECUTE AS.

From this BOL article, we see that we can use this nifty clause to effectively elevate a calling user’s rights for the duration of the stored procedure execution. In this case, we’re going to have the procedure execute as the ‘dbo’ user. Normally I’d say it’s a better practice to have the execution be under a less privileged user, such as one that owns the application schema, and not one in the db_owner role.

REVERT;
GO

ALTER PROCEDURE dbo.p_TruncateTest
WITH EXECUTE AS 'dbo'
AS
BEGIN
	TRUNCATE TABLE dbo.TestTable;
END
GO

EXECUTE AS USER = 'TruncateUser';
EXEC dbo.p_TruncateTest;
GO

Voila! The procedure executes without issue.

Now there is one catch here: when using that WITH EXECUTE AS clause you have be careful about how you determine the name of the user performing the operation. If, for example, you use the CURRENT_USER function to log who issued the command, what you’ll see is the user named in the WITH EXECUTE AS clause, not the calling user. Instead, make sure you use the ORIGINAL_LOGIN function. Thanks to Kimberly Tripp for pointing this out.

HA? We don’t need no stinkin’ HA (Part 3)

Posted by Josh | Posted in SQL Server | Posted on 04-04-2011

Tags: , ,

0

This is part three of a series cataloging my search for the ideal High Availability solution for my development and QA environments. The other parts:

  1. Part 1 – an introduction to why dismissing the need for HA in development / test systems is not necessarily the right decision.
  2. Part 2 – a quick poll of the community to see what others are doing.

In part three, we’re going to take a look at the first option for providing HA: clustering.

So what exactly is a cluster? From MSDN:

A cluster is a group of independent computer systems, referred to as nodes, working together as a unified computing resource. A cluster provides a single name for clients to use and a single administrative interface, and it guarantees that data is consistent across nodes.

Windows Clustering encompasses two different clustering technologies. These technologies implement the following two types of clusters.

  • network load balancing cluster filters and distributes TCP/IP traffic across a range of nodes, regulating connection load according to administrator-defined port rules.
  • failover cluster provides high availability for services, applications, and other resources through an architecture that maintains a consistent image of the cluster on all nodes and that allows nodes to transfer resource ownership on demand.

In this post we’re going to look exclusively at failover clustering. Network load balancing will be looked at in a later post, combined with the log-shipping HA methodology.

Briefly, a SQL failover cluster is composed of one or more (yes, you can make a one node cluster, though it obviously negates the HA feature) nodes, each with SQL Server installed on them. The nodes are backed by a common set of shared storage, such as a series of SAN-provided LUNs. The shared storage is used to house all SQL related files (except binaries, which are stored locally on each cluster), as well as any other items that are part of the cluster, such as backup volumes or custom components. An example of the latter might include custom Powershell or other scripts used by Agent jobs or whatnot.

Prior to Windows Server 2008, Microsoft only supported clustering between identical hardware, and that hardware had to be in the Windows Server Catalog. Basically, this a huge list of hardware that has been certified by Microsoft as being a supported configuration. If your hardware isn’t in that list, and you call Microsoft for support of a cluster issue, they will “offer troubleshooting tips “(reference this KB), but a resolution is not guaranteed.

Now in my environments (and I would guess in many other development environments), the vast majority of systems are virtual machines, rather than physical servers. That being the case, we would refer to this KB article from VMWare, listing all the documents and links to information about clustering under VMWare. To me, the most useful one is the link to the Cluster team’s blog, wherein we find this tidbit:

Windows Server 2003

For a cluster solution to be supported by Microsoft it must be a tested solution which has been qualified and verified to function properly with the Failover Clustering feature.  The full Windows Server 2003 cluster support policy is documented here:  http://support.microsoft.com/kb/309395

When a cluster solution has been qualified it will receive a ‘Designed for Microsoft® Windows® Server 2003′ logo and be listed on the Windows Server Catalog under “Cluster Solutions”.

Two separate VMware configurations have received a logo and are supported in Windows Server 2003 with vSphere 4.0 and EMC storage.  One configuration is with EMC V-Max storage and the other with EMC CLARiiON CX4 storage.  Details can be found here:

These are the only two supported Windows Server 2003 guest clustering configurations.  The Windows Server 2003 cluster logo program stopped accepting new submissions as of 12/31/09, no additional configurations will be added in the future.

Basically, for Windows 2003 clustering, only the two configurations listed are supported. Outside of that, you’re on your own.

For Windows Server 2008, the outlook is slightly better:

Windows Server 2008 / Windows Server 2008 R2

The Microsoft support policy for Failover Clustering radically changed with Windows Server 2008 to become much more flexible.  The following criteria must be met for a solution to be supported by Microsoft:

1.       On a host Windows Server Failover Cluster all hardware and software components must meet the qualifications to receive the appropriate “Certified for Windows Server 2008” or “Certified for Windows Server 2008 R2” logo.

a.       If a guest cluster is running inside a virtual machine on non-Microsoft hardware virtualization software, the virtual machine must be hosted by a virtualization solution that is listed in the Server Virtualization Validation Program

2.       The solution must not fail any of the tests in the cluster Validation tool.  With virtualized servers in a cluster, run the cluster validation wizard as you would with any other new cluster. The requirement for running the wizard is the same regardless of whether you have a “host cluster” (where failover will occur between two physical computers), a “guest cluster” (where failover will occur between guest operating systems all on the same physical computer), or some other configuration that includes one or more virtualized servers.

I read this as “so long as your hosts and VM provider are certified, and the cluster validation wizard passes you, you’re supported.”

There is also an excellent whitepaper from VMWare on the subject, available here. Some points of note:

  • It includes a detailed, step-by-step set of instructions on how to configure VMs to allow for clustering.
  • Only Windows Server 2003 and Windows Server 2008 R2 are considered supported.
  • By configuring your VMs in this manner, you lose the following VMWare features:
    • HA
    • DRS
      Note: my VMWare admins tell me that these settings also break snapshots and vRanger backups, though that’s not mentioned here.

Conclusion

For SQL 2008 R2 / Windows Server 2008 R2, the cluster solution appears valid, albeit a bit of a complicated setup and at the cost of DRS and live vMotion capability. I can vouch that the configuration does work as advertised, once you get all the pieces right. If you’re looking to implement this, do yourself (and your VMWare admins) a big favor: read the [fantastic] manual.

For Windows 2008 / SQL 2008 and Windows 2003 / SQL 2005 (our standard builds) on the other hand, this doesn’t appear to be a viable solution. The two specific setups listed in the Windows Server Catalog (linked above) aren’t in use where I work, and outside of purchasing physical hardware or relying on hand-me-downs as setups are decommissioned (neither of which I would consider a good approach), I don’t see a way to get a supportable configuration.