Comparing SQL Result Sets With Powershell

Posted by Josh | Posted in Life As A SQL Developer, Powershell, SQL Server | Posted on 23-10-2012

Tags: , , ,

0

Lately I’ve been doing a lot of work re-writing some reporting code that was performing terribly. You know the old joke “Start it, then go get a cup of coffee, then come back”? Well in this case, that would be more like “Start it, get a cup of coffee, go for a run, read the morning news, eat lunch, take a nap, eat dinner, watch an hour of TV, then come back.” Ugh…

Obviously it’s important when modifying and tuning code to ensure that you don’t affect the results that come out of the procedure. Normally I like to do this using unit tests, but in this case the logic of the procedure was so complex (and relied on a bunch of underlying vendor code) that to create tests for it would have taken weeks. So instead, I opted for the strategy of simply comparing the old and new output, given a set of standard parameters. At first I did this manually, but after comparing 200+ columns a few times, I said to myself, “Self! This is silly, why not use Powershell to compare the results automatically?”

The result is the Compare-SQLResultSet function. It’s rough and currently doesn’t handle differently shaped results sets at all, but it has already been a huge time saver. I hope to improve it as time goes on, but wanted to get it out so others in the community could use it. Because if you’re doing this kind of comparison work manually… well, perhaps you should find a different line of work, because clearly you’re not getting it.

Unit Testing T-SQL – Some Opening Thoughts

Posted by Josh | Posted in Life As A SQL Developer, SQL Server, T-SQL Programming | Posted on 06-09-2012

Tags: , ,

1

One of the first goals I have in my new role as a SQL Server developer is to learn how to write unit tests for SQL Server database code. I’ve started using the open source TSQLT framework, and I thought as I go I’d record my thoughts in an ongoing series.

What have I learned so far (in about a week’s time)?

Retro-fitting Unit Testing Is A Lot Of Work

This isn’t TSQLT specific, but I think it’s a valid point. One of my first projects is to add units tests to a number of existing stored procedures. It’s a lot of work, for a number of reasons. At least several of them actually don’t have much to do with unit testing, per say, but are just indicative of other issues. For example, creating unit tests is very difficult without solid and documented business requirements. In their absence, all you can do is try and reverse-engineer the code. This is even more difficult if the code is complex and not well commented.

In addition, if the stored procedures involve multiple tables, the necessary setup code can be lengthy. Just making sure all the tables have correct values is a significant amount of trial and error work. This alone accounted for hours of getting my first unit test up and running. When you do this, make a list of all the tables and go through each one methodically. Depending on the framework you are using, and how you are organizing your tests, you may want to do this work in “setup” code (meaning, code that executes once at the beginning of testing). TSQLT has this functionality; you just include a procedure called “SetUp” in your test suite.

FakeTable Is Your Friend

TSQLT has a stored procedure called “tsqlt.FakeTable”, whose function is to “[create] an empty version of the table without the constraints in place of the specified table.” This makes isolated testing very easy, without dealing with foreign keys or constraints. Naturally, if you’re actually testing those constraints (which, apparently is also helped by more TSQLT functionality) this is a bad thing. But if you’re just testing data returns or calculations, this can be immensely helpful.

Perseverance Is Key

This is tiring work, especially when having to create all the tests after the fact. But the feeling you get when you get one working is well worth it. It’s amazingly calming to know that you can make any change you want to a piece of code and instantly know if it breaks key functionality. It’s taken a lot of the anxiety out of working on some very critical code, especially as a relative newcomer to the field of database development. And once I’m done, I’m thoroughly convinced that the amount of time this will save in the future is well worth the effort.

That’s all for now. I’ll continue to record my thoughts in this tag as I get further down the road. In the mean time, if anyone has tips for a newbie, please feel free to share!

Are You Developing SQL In A Sandbox?

Posted by Josh | Posted in Life As A SQL Developer, SQL Server | Posted on 01-09-2012

Tags: , ,

0

As a development DBA I often got requests from developers to elevate their rights to sysadmin for the purpose of “trying out some things for a proof of concept.” Examples would include things like Service Broker, replication, and CLR. And every time my answer was the same: No.

Now, before you stop reading and go off muttering “Man, what a typical DBA-hole”, you should understand why I took this view. The systems where the developers wanted this access were shared enterprise development and QA systems, which were often used by many different teams. If I had caved and given the developers this access, and they’d somehow managed to take the instance down or otherwise affect its stability, I would have had scores of people standing at my desk screaming about lost productivity. And rightfully so, because as a DBA it’s my job to make sure their systems are working.

On the other hand, there is clearly a need to allow folks to experiment with methods and architectures. Otherwise, how would we ever learn how to do new things, or know if a particular approach is going to work? But the enterprise environment isn’t the place for that. Instead, this is where one of a developer’s most important tools comes into play: a sandbox environment.

The basic concept is this: you have an area where you can play to your heart’s content without fear of affecting others. You can tear things down and rebuild them as you please (and hopefully in automated fashion, because good developers are always lazy). Personally I like to have this setup on my personal machine, using a series of virtual machines. With hard drives and memory being pretty cheap, SQL Server Evaluation Edition, and great free virtualization products like VirtualBox readily available, there really isn’t an excuse not to have something like this. Using some scripting and products like the excellent SQLSPADE automated SQL installation framework, you could spin up a couple SQL servers in a matter of an hour.

Once you’ve proven out your idea, and gotten the setup process streamlined (because no DBA ever wants to get a multi-page document full of screenshots of clicking “Next” buttons), that’s when you can move on to the shared environments, to make sure things are going to work properly in a leveraged setup.

Since moving into a full time development role, I’ve become even more convinced this is an essential tool for being a successful database developer. It makes my life so much easier not having to constantly go through channels when I need something done like setting up a login on a server. Mind you, I still agree with those processes, because they protect the shared environment from renegades like me.

Do What You Love – Introducing Josh v. Next

Posted by Josh | Posted in Life As A DBA, SQL Server | Posted on 19-08-2012

Tags: , ,

2

Several months back at SQL Saturday 121 in Philadelphia, I had a great conversation in the speaker’s room with Mike Hillwig (blog | twitter) about career paths. One point that he made which really stuck with me was (and I’m paraphrasing roughly here) that the best way to find the job you truly love is simply to figure out what it is you like about your current one, then find a job where that’s all you do. I remember he attributed the advice to someone else, but for the life of me I can’t remember who (Mike please feel free to jump in and jog my woeful memory if you read this); in any case, the words really resonated with me.

For the last three years and change I’ve been in some form of a DBA / support role. And while overall I really enjoy my job, there’s a lot of aspects of it that I don’t find terribly fun. For instance:

  • Doing routine administrative work, such as creating logins or databases on servers, fixing broken backups, etc.
  • Having an endless barrage of “this server is broke, please fix it” e-mails, when in nine out of ten cases, the problem is their code, not my server.
  • Troubleshooting hard to trace infrastructure issues. Don’t even get me started on my frustration with Kerberos and Network Load Balancing.

But, let’s face it; those tasks are part of a DBA’s role.

After Mike’s words had rattled around in my brain for a few weeks, I began thinking: “Self! What are the things that you really enjoy about your job (outside of the people on your team, who are great)?” So I pulled out my thinking notepad, and started scribbling. The top three that I came up with, in no particular order, are:

  • Solving problems – not to be confused with “troubleshooting”; perhaps a more apt description would be “determining solutions for business and technical problems”.
  • Designing data structures – I’m by no means a data modeler, but I do enjoy the process of normalizing data, and creating elegant, simple structures to store information.
  • Performance Tuning – In an ideal world this skill wouldn’t be necessary, since all code would be written properly from the start and scale well. Fortunately for me, that’s not the case. There’s just something incredibly satisfying about taking some god-awful Gordian knot of nested subqueries and views, untangling it, and seeing an application’s performance just soar through the roof.

After this exercise I began thinking about what kind of job would let me focus on these things. After some thought, I came to a startling conclusion: I had to join the dark side. I had to become a developer.

But not just any developer, mind you. I could never be one of those heads down, coding machines that are just handed specs and go off on their merry way. No, I needed to find a job where I would have a fair bit of interaction with people, while still writing the code and getting my hands dirty.

Then, almost as if by magic, an amazing offer came my way. I was given a chance to join an elite group of database architects / developers within my company. This team is essentially the SQL Server equivalent of the A-Team; If you have a problem with SQL Server, and no one else can help, you bring these guys in. They do not own any specific applications or databases themselves. Rather, they come in, do a targeted assessment of the situation, then perform surgical work to get things back on track. Need help determining how best to store your database in source control (because you are doing that, right?)? They’ll deliver tools to help script out and organize your code, plus make deploying new releases a painless process. Have a query that is running horribly and can’t figure out why? They’ll rip apart the code and rewire it behind the scenes, and show you how to do it yourself going forward. In many ways, they are a lot like an internal consulting group.

At the time the offer came down the team only had two members, but combined they have almost twice my lifetime in experience working with databases. With their workload getting larger and larger by the day, it became apparent some additional help was needed. And I’m truly honored to say I’ve been given the chance to join them.

It’s bittersweet, for sure. I have loved my time in production support, and will miss the people and fun times I’ve had there. But I’m also incredibly excited about this new opportunity and the challenges it will bring.

And to Mike (and whomever originally passed the advice down to him): thanks for the push.

The Value Of Logging–#TSQL2sDay 31

Posted by Josh | Posted in Powershell, SQL Server | Posted on 12-06-2012

Tags: , ,

0

TT150x150This month’s T-SQL Tuesday is being hosted by Aaron Nelson (blog | twitter), and is all about logging. As it happens my recent presentation at SQL Saturday 121 included a discussion around this subject in the context of keeping track of administrative activities. As a result, the topic is fairly fresh in my mind.

Rather than re-hash the whole thing, I’ll simply restate what I consider the most important point from that section of the presentation.

bart-simpson-generator.php

You can be logging everything and anything under the sun, but if you’re not reviewing and being alerted as needed, there’s really no point. How you do so is really up to you; use whatever tool you are most comfortable with. For some folks that might be SSRS, others Powershell (shameless T-SQL2sDay host brown-nose: this is my preference for collecting logs from multiple sources and displaying them all in one place), or even SSIS. It’s the act, not the method, that matters.

Personally, I like to review everything first thing in the morning, after I’ve filled my coffee cup but before I start reading over my e-mail. That way, any issues are immediately noticed and can be dealt with before I get distracted. Taking a tip I once heard from Sean McCown of MidnightDBA fame, I collect everything in one place and then send a single summary of problems, rather than dealing with individual e-mails. This makes the task seem much less daunting.

How do you collect and review your logs (of all sorts)? Because, of course, you are reviewing them. Right?

Adapting vCheck for SQL Server

Posted by Josh | Posted in Powershell, SQL Server | Posted on 11-06-2012

Tags: , ,

2

For those of you not already aware, VMware and PowerCLI guru Alan Renouf (blog | twitter) recently upgraded his excellent vCheck Powershell script framework to have a plugin friendly approach. (The framework was originally written to provide a daily report of issues identified throughout a VMWare environment.) As a result, numerous forks have been popping up for getting daily reports on all sorts of systems, from Exchange to System Center. I noticed that there didn’t appear to be anything for SQL Server, and pinged Alan to confirm if anyone had starting working on one. When he responded that no one had, I volunteered to take a stab at it. The framework is an impressive piece of work, with some robust HTML reporting features and extensibility. If you’ve got a need to do any kind of centralized, scheduled reporting on some infrastructure, I highly recommend you check out his work.

And here, dear reader, is where I need your help. I want to make this as useful and complete a daily health check as possible for all of us DBAs out there. I have my own list of items that I want to check on, but I want to get more input to make sure I’m covering all the bases.

The scripts will connect to a CMS server, and iterate over all the servers contained within while performing various health checks. At the end, a nicely formatted HTML report is delivered listing all the problems identified. Thanks to Alan’s work, all the thresholds will be completely configurable.

Here is a list of the checks I’ve thought of so far:

  1. Ping test (is the server responding to a ping)
  2. Last backup date (full, differential, log)
  3. Last DBCC date
  4. Disk space free percent
  5. Services running (SQL Server, SQL Agent)
  6. Database file space free percent
  7. High severity errors (17+)
  8. Failed logins (over a certain threshold of counts)
  9. Failed agent jobs

This is a short list, which is why I’d love to hear from you. What would you like to see in your mailbox every morning that will give you the best view of your SQL Server infrastructure?

Presenting At #SQLSat121 in Philly

Posted by Josh | Posted in SQL Server | Posted on 13-05-2012

Tags: , , ,

0

I’m excited to announce that I’ll be presenting at SQL Saturday 121 in Philadelphia on June 9th! My session is titled “Avoiding Monkey At The Monitor By Delegating“; I’ll be showing some ways to securely delegate menial DBA work so that you can focus on more important (less urgent) work. While the session is really geared towards DBAs, it could be useful for some developers as well, since we’ll be talking about things like permission chaining and certificate security as well.

I’m obviously psyched to be part of a great group of speakers. I’d be lying if I wasn’t a little nervous too, since this is my first time presenting at this level (or really any outside of my company, for that matter). So, I’ll be practicing quite a bit between now and then.

Hope to see you there!

Finding All Access Groups For A Windows Login With sys.login_token

Posted by Josh | Posted in SQL Server, T-SQL Programming | Posted on 09-03-2012

Tags: ,

0

Recently I had a requirement to determine how a certain Windows user gained access to an instance of SQL. Let’s say, for example, you want to audit certain DDL code executions for some users, but not for others. We could do this at the individual login level, but that would be tedious to keep up. Instead, we want to use Windows domain groups to selectively enable the audit process. The trouble is, when a user is logged in all we see is their individual user name, not the name of the groups that, but virtue of them being a member, give them access to the server. Or do we?

I was reading a post by Erland Sommarskog on the subject of permission granting through stored procedures, and noticed his use of a system view called sys.user_tokens. This view (according to BOL) “Returns one row for every database principal that is part of the user token.” In essence, it shows one row per role or group that the user is part of. For example, if a user is a member of a Windows group that has access to the database, a row will be present for that.

This got me thinking, was there an equivalent for the server level? Sure enough, there is the sys.login_tokens view. This shows one row per server level authentication token. For some reason it seems to return duplicate rows at times, but when joined on the sys.server_principals table it does indeed seem to work as I hoped.

This way I can tell what domain groups a user is a member of that gives them access to the server.

One (Very Important) Goal For 2012 – AUTOMATE EVERYTHING

Posted by Josh | Posted in Life As A DBA, Powershell, SQL Server | Posted on 02-01-2012

Tags: , , ,

2

Yep, you heard me right. I am going to automate everything I do. Or, at least, try to.

I’m not going to talk in detail about why I’m doing this, other than to say:

  1. I have better things to do with my time than perform mundane, repeatable tasks.
  2. I recently re-read John Sansom’s excellent blog post on the subject, and, as is usually the case with John’s work, found it rather inspiring. Bravo John!

Instead, I’m going to talk a little about two kinds of “automation”, and why I generally pick one that might not, at first glance, be the right choice.

Full versus Partial Automation

I will define something that is fully automated as a task or operation which requires absolutely zero manual effort on my part. That is, I don’t have to push a button, flip a switch, type a command, or even know that it is happening, unless something goes wrong. An example of something that is already fully automated in my world is the weekly and daily database maintenance that I have installed, using Ola Hallengren’s excellent (and freely available) scripts. These run every night and ensure that my databases are backed up, corruption free, and optimized. Unless something goes *bump* and the jobs fail, I don’t get so much as an e-mail upon completion.

By contrast, a process that is partially automated requires some manual effort, but is still optimized such as to require as little work on my part as possible. I may need to open a script, enter a few parameters, or run a Powershell command, but outside of that, all the logic and processing is done behind the scenes, with perhaps a pretty little progress bar showing that work is, well, progressing. An example of this might be setting up of a server side SQL trace; I have a series of scripts ready that, given a database name and a path at which to place the trace files, will setup a standardized trace, grant rights on some signed stored procedures to allow non-DBAs to read the trace data, and even generate a README file to be sent to the requesting team with a series of instructions.

In an ideal world, I suppose that all of the mundane daily work I have to take care of would be in the former (fully automated) category. But in reality, I find that there’s a bit of a diminishing return on putting more and more effort into fully automating tasks. Consider this scenario / requirement: the request and creation of new databases.

At a basic level, the process we follow is something like this:

  1. A developer requests a new database via our in-house ticketing system.
  2. We (my team) ensure that the request at least contains the following details:
    1. The name of the database.
    2. The server on which to create it (or at least a version of SQL Server).
    3. The initial size of the database.
    4. The domain users or group that requires access to the database.
  3. Assuming we’ve got all the information required, we will go out to the database server where the databases will be placed, and determine a suitable location to place the data files. This is usually determined by simple the simple rule of “whichever volume has the most space”, with a few exceptions. If there is not enough space on the server, we’ll inform the requestor that there will be a delay, and start the process of ordering more storage.
  4. Once we know where the data and log files will be placed, we will create the database, take a full backup (to initialize the differential backup chain), and grant the access requested.

Now, I could certainly envision writing some kind of application that would allow developers to submit requests for new databases and process everything straight through based on the rules outlines above, similar to what I see on my web host’s control panel for MySQL databases. But, at the same time, constructing such as system would take a lot of time and effort. Contrast that with a partial automation solution, such as:

  • Provide an InfoPath form for the requests, forcing people to enter the necessary details in before submitting the request, thereby eliminating back-and-forth chatter.
  • Use a Powershell script that, when given a server name, database, and log file size, will connect via WMI and determine a list of suitable locations for placing the database files, then prompt the user to select one. When selections are made, the database is created and a full backup taken. This eliminates logging into the computer (to look at drives) and stepping through a bunch of GUI screens to create the database.
  • Have a second Powershell script that can accept a text file containing domain groups / users and role names, which will then grant the access required on a specified database. The file could be generated from the InfoPath form. Again, this eliminates some GUI clicks, and minimizes the chance of fat-fingering group / user names.

While this will not prevent someone from having to manually process the request, it will significantly cut down on the pain to do so, without a great deal of effort / complexity. Especially considering that these requests are fairly infrequent (perhaps one to three a month), I see no reason to spend additional time on providing a fully automated solution, when the partial one provides almost as much value.

Over the course of the year, I’ll be blogging about the various processes / means I use to accomplish this goal.

How do you automate your processes, and how do you determine a “break-even” point, if you will?

Are your processes generic?

Posted by Josh | Posted in SQL Server, The Lone DBA | Posted on 11-12-2011

Tags: , ,

0

This post is the sixth in an ongoing series about how to survive as the only DBA in your organization. Since October of last year, I’ve been assigned to a team that is responsible for owning and maintaining the development infrastructure. It’s a great team of seasoned professionals, but not a single other DBA. As a result, I’ve had to think very carefully about how I go about my daily work, so as to give our customers consistently good service, while still allowing those without a lot of SQL Server related knowledge to pick up my work when I’m not available.

Why is being generic a good thing?

When I think of the word “generic”, I usually picture those off-brand foods at most grocery stores, with their simplistic labels and lackluster colors. But in the case of processes, being “generic” really means “standardized, yet flexible”. This is a good thing, because it means your processes can answer many different (but ultimately) related needs.

Let’s take an example: a development team needs to trace activity in their database. We’re going to assume that just granting this group rights to run the trace is not an option, since, let’s say, they’ve taken the server down with a poorly done client side trace in the past (don’t laugh, it happened to me). In any case, let’s look at three options to answer this request:

Have the developers sit with you while you run a Profiler trace

I don’t like this for several reasons, not the least of which is that it is going to take a good chunk of my time. Because of course, the developer will probably have no idea what they are looking for, and may not be able to product the condition they are trying to capture on demand. It also still uses Profiler, which, as far as I’m concerned, should be banned.

Script out a one-time trace and have it run on the server

This is better, because it takes a lot less of my time to simply setup a server-side trace and let it run. I can then let the developer’s read in the trace files via something like a signed stored procedure (a topic for another day perhaps). But there’s still the one-off aspect: who’s to say that the next time the developers need this I’ll be around, have saved the trace definition, etc?

Write a templatized script that accepts a database name and a path for trace files, and use it going forward

This, to me at least, is the best option. After a slightly longer initial setup (one-time to write and document the script, plus test it), setting up subsequent traces will take very little time. In addition, the use of a template will mean a consistent experience / process for my customers, even when I’m not around. Even when I am around, it will also make it easier for DBA Junior to handle the request, leaving me to look at more interesting things. And by making the script flexible enough to handle different servers / databases, it becomes much more useful.

I go so far as to have a “no one-off” policy at work. That is, if I do something, I script it, put some parameters in, and save it off to source control. Then I publish it in our procedures manual, so that if a similar request comes in the team can handle it right away. It leads to a lot of scripts that are not highly used, but it also means less work in the long term, and a great bag of tricks in the process.

But, can something be too generic?

Sure it can. I’ve fallen into the trap many times of trying to have one process fit way too many needs, only to end up with a monstrous, un-followable mess. If you’ve got a ton of “if this is true, do this, otherwise do this” type of logic in your process, you might want to consider if you’re really answering related needs. This is kind of like the process equivalent of that awful stored procedure we’ve all seen; you know, the one that has twenty plus input parameters, and has every one in the WHERE clause as WHERE ((some_field = @some_parameter) OR (@some_parameter = NULL)). It looks good, but in the end the execution is piss poor.