Comparing SQL Result Sets With Powershell

Posted by Josh | Posted in Life As A SQL Developer, Powershell, SQL Server | Posted on 23-10-2012

Lately I’ve been doing a lot of work re-writing some reporting code that was performing terribly. You know the old joke “Start it, then go get a cup of coffee, then come back”? Well, in this case, that would be more like “Start it, get a cup of coffee, go for a run, read the morning news, eat lunch, take a nap, eat dinner, watch an hour of TV, then come back.” Ugh…

Obviously it’s important when modifying and tuning code to ensure that you don’t affect the results that come out of the procedure. Normally I like to do this using unit tests, but in this case the logic of the procedure was so complex (and relied on a bunch of underlying vendor code) that to create tests for it would have taken weeks. So instead, I opted for the strategy of simply comparing the old and new output, given a set of standard parameters. At first I did this manually, but after comparing 200+ columns a few times, I said to myself, “Self! This is silly, why not use Powershell to compare the results automatically?”

The result is the Compare-SQLResultSet function. It’s rough and currently doesn’t handle differently shaped result sets at all, but it has already been a huge time saver. I hope to improve it as time goes on, but wanted to get it out so others in the community could use it. Because if you’re doing this kind of comparison work manually… well, perhaps you should find a different line of work, because clearly you’re not getting it.
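
To make the idea concrete, here is a minimal sketch of a cell-by-cell comparison. It is written in Python purely for illustration and is not the Compare-SQLResultSet code; the function name and the row format (a list of dictionaries keyed by column name) are assumptions. Like the function described above, it gives up on differently shaped result sets rather than trying to reconcile them.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Difference = Tuple[int, str, Any, Any]


def compare_result_sets(old_rows: List[Row], new_rows: List[Row]) -> List[Difference]:
    """Return (row_index, column, old_value, new_value) for every mismatched cell.

    Hypothetical helper: assumes both result sets are already sorted the same
    way and have the same shape.
    """
    if len(old_rows) != len(new_rows):
        raise ValueError("result sets have different row counts")

    differences: List[Difference] = []
    for index, (old, new) in enumerate(zip(old_rows, new_rows)):
        if old.keys() != new.keys():
            raise ValueError(f"row {index} has different columns")
        for column in old:
            if old[column] != new[column]:
                differences.append((index, column, old[column], new[column]))
    return differences


if __name__ == "__main__":
    # Made-up sample data standing in for the old and new procedure output.
    old_output = [{"id": 1, "total": 10.0}, {"id": 2, "total": 20.0}]
    new_output = [{"id": 1, "total": 10.0}, {"id": 2, "total": 25.0}]
    for row, column, before, after in compare_result_sets(old_output, new_output):
        print(f"row {row}, column {column}: {before} -> {after}")
```

An empty list means the two outputs match for that set of parameters, which is exactly the answer you want after a tuning change.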

Unit Testing T-SQL – Some Opening Thoughts

Posted by Josh | Posted in Life As A SQL Developer, SQL Server, T-SQL Programming | Posted on 06-09-2012

One of the first goals I have in my new role as a SQL Server developer is to learn how to write unit tests for SQL Server database code. I’ve started using the open source tSQLt framework, and I thought as I go I’d record my thoughts in an ongoing series.

What have I learned so far (in about a week’s time)?

Retro-fitting Unit Testing Is A Lot Of Work

This isn’t tSQLt specific, but I think it’s a valid point. One of my first projects is to add unit tests to a number of existing stored procedures. It’s a lot of work, for a number of reasons. Several of them actually don’t have much to do with unit testing, per se, but are just indicative of other issues. For example, creating unit tests is very difficult without solid and documented business requirements. In their absence, all you can do is try and reverse-engineer the code. This is even more difficult if the code is complex and not well commented.

In addition, if the stored procedures involve multiple tables, the necessary setup code can be lengthy. Just making sure all the tables have correct values is a significant amount of trial and error work. This alone accounted for hours of getting my first unit test up and running. When you do this, make a list of all the tables and go through each one methodically. Depending on the framework you are using, and how you are organizing your tests, you may want to do this work in “setup” code (meaning code that runs before your tests execute). tSQLt has this functionality; you just include a procedure called “SetUp” in your test class, and the framework runs it before each test in that class.

FakeTable Is Your Friend

tSQLt has a stored procedure called “tSQLt.FakeTable”, whose function is to “[create] an empty version of the table without the constraints in place of the specified table.” This makes isolated testing very easy, without dealing with foreign keys or constraints. Naturally, if you’re actually testing those constraints (which, apparently, is also helped by more tSQLt functionality) this is a bad thing. But if you’re just testing data returns or calculations, this can be immensely helpful.

Perseverance Is Key

This is tiring work, especially when having to create all the tests after the fact. But the feeling you get when you get one working is well worth it. It’s amazingly calming to know that you can make any change you want to a piece of code and instantly know if it breaks key functionality. It’s taken a lot of the anxiety out of working on some very critical code, especially as a relative newcomer to the field of database development. And once I’m done, I’m thoroughly convinced that the amount of time this will save in the future is well worth the effort.

That’s all for now. I’ll continue to record my thoughts in this tag as I get further down the road. In the meantime, if anyone has tips for a newbie, please feel free to share!

Are You Developing SQL In A Sandbox?

Posted by Josh | Posted in Life As A SQL Developer, SQL Server | Posted on 01-09-2012

As a development DBA I often got requests from developers to elevate their rights to sysadmin for the purpose of “trying out some things for a proof of concept.” Examples would include things like Service Broker, replication, and CLR. And every time my answer was the same: No.

Now, before you stop reading and go off muttering “Man, what a typical DBA-hole”, you should understand why I took this view. The systems where the developers wanted this access were shared enterprise development and QA systems, which were often used by many different teams. If I had caved and given the developers this access, and they’d somehow managed to take the instance down or otherwise affect its stability, I would have had scores of people standing at my desk screaming about lost productivity. And rightfully so, because as a DBA it’s my job to make sure their systems are working.

On the other hand, there is clearly a need to allow folks to experiment with methods and architectures. Otherwise, how would we ever learn how to do new things, or know if a particular approach is going to work? But the enterprise environment isn’t the place for that. Instead, this is where one of a developer’s most important tools comes into play: a sandbox environment.

The basic concept is this: you have an area where you can play to your heart’s content without fear of affecting others. You can tear things down and rebuild them as you please (and hopefully in automated fashion, because good developers are always lazy). Personally I like to have this set up on my personal machine, using a series of virtual machines. With hard drives and memory being pretty cheap, SQL Server Evaluation Edition, and great free virtualization products like VirtualBox readily available, there really isn’t an excuse not to have something like this. Using some scripting and products like the excellent SQLSPADE automated SQL installation framework, you could spin up a couple of SQL Servers in about an hour.

Once you’ve proven out your idea, and gotten the setup process streamlined (because no DBA ever wants to get a multi-page document full of screenshots of clicking “Next” buttons), that’s when you can move on to the shared environments, to make sure things are going to work properly in a leveraged setup.

Since moving into a full time development role, I’ve become even more convinced this is an essential tool for being a successful database developer. It makes my life so much easier not having to constantly go through channels when I need something done like setting up a login on a server. Mind you, I still agree with those processes, because they protect the shared environment from renegades like me.

Do What You Love – Introducing Josh v. Next

Posted by Josh | Posted in Life As A DBA, SQL Server | Posted on 19-08-2012

Several months back at SQL Saturday 121 in Philadelphia, I had a great conversation in the speaker’s room with Mike Hillwig (blog | twitter) about career paths. One point that he made which really stuck with me was (and I’m paraphrasing roughly here) that the best way to find the job you truly love is simply to figure out what it is you like about your current one, then find a job where that’s all you do. I remember he attributed the advice to someone else, but for the life of me I can’t remember who (Mike please feel free to jump in and jog my woeful memory if you read this); in any case, the words really resonated with me.

For the last three years and change I’ve been in some form of a DBA / support role. And while overall I really enjoy my job, there are a lot of aspects of it that I don’t find terribly fun. For instance:

  • Doing routine administrative work, such as creating logins or databases on servers, fixing broken backups, etc.
  • Having an endless barrage of “this server is broke, please fix it” e-mails, when in nine out of ten cases, the problem is their code, not my server.
  • Troubleshooting hard to trace infrastructure issues. Don’t even get me started on my frustration with Kerberos and Network Load Balancing.

But, let’s face it; those tasks are part of a DBA’s role.

After Mike’s words had rattled around in my brain for a few weeks, I began thinking: “Self! What are the things that you really enjoy about your job (outside of the people on your team, who are great)?” So I pulled out my thinking notepad, and started scribbling. The top three that I came up with, in no particular order, are:

  • Solving problems – not to be confused with “troubleshooting”; perhaps a more apt description would be “determining solutions for business and technical problems”.
  • Designing data structures – I’m by no means a data modeler, but I do enjoy the process of normalizing data, and creating elegant, simple structures to store information.
  • Performance Tuning – In an ideal world this skill wouldn’t be necessary, since all code would be written properly from the start and scale well. Fortunately for me, that’s not the case. There’s just something incredibly satisfying about taking some god-awful Gordian knot of nested subqueries and views, untangling it, and seeing an application’s performance just soar through the roof.

After this exercise I began thinking about what kind of job would let me focus on these things. After some thought, I came to a startling conclusion: I had to join the dark side. I had to become a developer.

But not just any developer, mind you. I could never be one of those heads down, coding machines that are just handed specs and go off on their merry way. No, I needed to find a job where I would have a fair bit of interaction with people, while still writing the code and getting my hands dirty.

Then, almost as if by magic, an amazing offer came my way. I was given a chance to join an elite group of database architects / developers within my company. This team is essentially the SQL Server equivalent of the A-Team: if you have a problem with SQL Server, and no one else can help, you bring these guys in. They do not own any specific applications or databases themselves. Rather, they come in, do a targeted assessment of the situation, then perform surgical work to get things back on track. Need help determining how best to store your database in source control (because you are doing that, right?)? They’ll deliver tools to help script out and organize your code, plus make deploying new releases a painless process. Have a query that is running horribly and can’t figure out why? They’ll rip apart the code and rewire it behind the scenes, and show you how to do it yourself going forward. In many ways, they are a lot like an internal consulting group.

At the time the offer came down the team only had two members, but combined they have almost twice my lifetime in experience working with databases. With their workload getting larger and larger by the day, it became apparent some additional help was needed. And I’m truly honored to say I’ve been given the chance to join them.

It’s bittersweet, for sure. I have loved my time in production support, and will miss the people and fun times I’ve had there. But I’m also incredibly excited about this new opportunity and the challenges it will bring.

And to Mike (and whoever originally passed the advice down to him): thanks for the push.

Introducing Execute-RunspaceJob

Posted by Josh | Posted in Powershell | Posted on 01-08-2012

Recently I began experimenting in earnest with Powershell runspaces as an alternative to background jobs. My interest was mostly sparked by an excellent blog post on the subject by Boe Prox (blog | twitter). I’ve found that, in general, runspaces work really well for multi-threaded workloads, and, once you get past some initial hiccups, are easier to use than background jobs.

During my experimentation I had a bit of a realization around how runspaces are generally used. I found that I was generally doing the same things over and over again:

  1. Creating a script block to do the background work, which had one or more parameters.
  2. Creating a series of parameter sets containing the information about the individual “entities” to be processed.
  3. Instantiating various objects and settings for the runspace configuration, such as the number of concurrent runspaces.
  4. Starting the background threads and waiting for them to finish.
  5. Getting the data back from the runspaces, and returning warnings for any errors that occurred during processing.

After copying and slightly modifying code a few times, I said to myself, “Self! This is not good practice copying and pasting code all over the place. Why not write a reusable and generic function implementing this work and simply re-use it?” Thus Execute-RunspaceJob was born.

The function basically encapsulates and makes generic the work required to set up and execute a parameterized script block within background runspaces. It also handles tricky items such as error handling and parsing return data. Let’s look at a quick example of how to use it.

Let’s say that you have a function called Get-DiskspaceInfo, which retrieves, well, information about disk space on remote machines using WMI calls. You have a series of servers, say, twenty, that you want to collect this information from. You could certainly simply get this list of servers and pipe them into the function (because you are writing functions that accept pipeline input, right?), but that approach would not scale very well since it operates in a one-at-a-time mode. Instead, using Execute-RunspaceJob, you could have any number (up to overwhelming your computer, naturally) of concurrent background threads collecting this information, then have it return the resulting data to you after it was finished.

First, you need to construct the script block which actually executes the work:

Next, we need to construct a hashtable of parameter values. The function expects a hashtable where the key is some unique identifier for the row to be processed (like a server name, database name, file name, etc), and the value is a nested hashtable of “parameter name”=”parameter value” pairs.

Finally, we execute the function, letting the results be placed into an array variable. We use the “-ThrottleLimit” parameter to specify the number of concurrent operations that are allowed.

If any of the background operations fail, a warning message will be printed out on the screen. Once all the data is collected you can treat the array just like you would the set of data returned by a normal pipeline-style operation.
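
For readers who want to see the overall shape of the pattern (keyed parameter sets, a throttle limit, collected results, and warnings for failures), here is a rough analogue sketched in Python with a thread pool. To be clear, this is not the Execute-RunspaceJob code and does not use runspaces; `run_parallel_jobs`, `get_disk_space_info`, and the server names are all hypothetical stand-ins.

```python
import warnings
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, List


def run_parallel_jobs(
    work: Callable[..., Any],
    parameter_sets: Dict[str, Dict[str, Any]],
    throttle_limit: int = 4,
) -> List[Any]:
    """Run work(**params) once per entry, with at most throttle_limit running at a time.

    parameter_sets maps a unique key (a server name, say) to a nested
    dictionary of parameter name -> parameter value.
    """
    results: List[Any] = []
    with ThreadPoolExecutor(max_workers=throttle_limit) as pool:
        # Remember which key each future belongs to so failures can be reported.
        futures = {
            pool.submit(work, **params): key
            for key, params in parameter_sets.items()
        }
        for future in as_completed(futures):
            key = futures[future]
            try:
                results.append(future.result())
            except Exception as error:
                # A failed item produces a warning instead of aborting the batch.
                warnings.warn(f"{key} failed: {error}")
    return results


def get_disk_space_info(server_name: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a real disk space query; returns made-up data."""
    if server_name == "server03":
        raise RuntimeError("server unreachable")
    return {"server": server_name, "free_percent": 42}


if __name__ == "__main__":
    servers = {name: {"server_name": name} for name in ("server01", "server02", "server03")}
    for row in run_parallel_jobs(get_disk_space_info, servers, throttle_limit=2):
        print(row)
```

Results come back in completion order rather than submission order, so each result carries its own server name instead of relying on position.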

I’ve started using this all over the place, and found that it greatly increases performance of most “collect and return” type operations. For example, it will be used in my forthcoming adaptation of Alan Renouf’s excellent vCheck framework for SQL Server.

To download the latest version of the function, go here. And please, tell me if you see something wrong or have suggestions for enhancements!

Money Isn’t Everything

Posted by Josh | Posted in Life As A DBA, SQL Server | Posted on 25-06-2012

A few days ago I read a post over on Simple Dollar titled “How Much Are Family Friendly Benefits Worth”. It made me think about the non-monetary things I appreciate in my job. I’m fortunate to work for a great company with a very open mindset. If you were to visit my offices, you’d probably think we were some kind of art studio, as opposed to a financial technology company. All our buildings are wide open spaces, with everyone mingling with each other. No one has an office; not even the CEO. It’s a wonderfully creative and collaborative workspace.

I love that I can work from home when needed, and that I’m given the freedom to innovate and be creative when it comes to problem solving. There would be nothing more frustrating than being stuck in antiquated or rigid rules, simply because that’s the way things have always been done. That’s not to say we are careless, far from it! We operate very cautiously and hold ourselves to very high standards. But that doesn’t come at the expense of an entrepreneurial spirit.

Most of all, I love the people I work with. We’re a nutty, humorous bunch of geeks for sure. We rib each other constantly, but we also push each other to be better. And we always support one another when we need it. The value of a good team cannot be overestimated. I, for one, have probably stayed in my position longer than I should have, solely because I feel a great sense of loyalty to those around me. Some might tell me that decision has hampered my advancement. So be it; I’d go to war with these folks any day.

The Value Of Logging–#TSQL2sDay 31

Posted by Josh | Posted in Powershell, SQL Server | Posted on 12-06-2012

This month’s T-SQL Tuesday is being hosted by Aaron Nelson (blog | twitter), and is all about logging. As it happens my recent presentation at SQL Saturday 121 included a discussion around this subject in the context of keeping track of administrative activities. As a result, the topic is fairly fresh in my mind.

Rather than re-hash the whole thing, I’ll simply restate what I consider the most important point from that section of the presentation.

You can be logging everything and anything under the sun, but if you’re not reviewing and being alerted as needed, there’s really no point. How you do so is really up to you; use whatever tool you are most comfortable with. For some folks that might be SSRS, others Powershell (shameless T-SQL2sDay host brown-nose: this is my preference for collecting logs from multiple sources and displaying them all in one place), or even SSIS. It’s the act, not the method, that matters.

Personally, I like to review everything first thing in the morning, after I’ve filled my coffee cup but before I start reading over my e-mail. That way, any issues are immediately noticed and can be dealt with before I get distracted. Taking a tip I once heard from Sean McCown of MidnightDBA fame, I collect everything in one place and then send a single summary of problems, rather than dealing with individual e-mails. This makes the task seem much less daunting.

How do you collect and review your logs (of all sorts)? Because, of course, you are reviewing them. Right?

Adapting vCheck for SQL Server

Posted by Josh | Posted in Powershell, SQL Server | Posted on 11-06-2012

For those of you not already aware, VMware and PowerCLI guru Alan Renouf (blog | twitter) recently upgraded his excellent vCheck Powershell script framework to have a plugin friendly approach. (The framework was originally written to provide a daily report of issues identified throughout a VMware environment.) As a result, numerous forks have been popping up for getting daily reports on all sorts of systems, from Exchange to System Center. I noticed that there didn’t appear to be anything for SQL Server, and pinged Alan to confirm if anyone had started working on one. When he responded that no one had, I volunteered to take a stab at it. The framework is an impressive piece of work, with some robust HTML reporting features and extensibility. If you’ve got a need to do any kind of centralized, scheduled reporting on some infrastructure, I highly recommend you check out his work.

And here, dear reader, is where I need your help. I want to make this as useful and complete a daily health check as possible for all of us DBAs out there. I have my own list of items that I want to check on, but I want to get more input to make sure I’m covering all the bases.

The scripts will connect to a CMS server, and iterate over all the servers contained within while performing various health checks. At the end, a nicely formatted HTML report is delivered listing all the problems identified. Thanks to Alan’s work, all the thresholds will be completely configurable.

Here is a list of the checks I’ve thought of so far:

  1. Ping test (is the server responding to a ping)
  2. Last backup date (full, differential, log)
  3. Last DBCC date
  4. Disk space free percent
  5. Services running (SQL Server, SQL Agent)
  6. Database file space free percent
  7. High severity errors (17+)
  8. Failed logins (over a certain threshold of counts)
  9. Failed agent jobs
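
As a rough illustration of what threshold-driven checks like these might look like, here is a small Python sketch covering three items from the list (disk space, last full backup, failed agent jobs). It is not vCheck plugin code; the threshold names, the default values, and the shape of the `facts` dictionary are all assumptions made up for the example.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

# Hypothetical defaults; a real report would read these from configuration.
DEFAULT_THRESHOLDS: Dict[str, Any] = {
    "disk_free_percent_min": 10.0,
    "full_backup_max_age_hours": 24,
    "failed_agent_jobs_max": 0,
}


def check_server(
    name: str,
    facts: Dict[str, Any],
    now: datetime,
    thresholds: Dict[str, Any] = DEFAULT_THRESHOLDS,
) -> List[str]:
    """Return a list of problem descriptions for one server (empty if healthy)."""
    problems: List[str] = []

    if facts["disk_free_percent"] < thresholds["disk_free_percent_min"]:
        problems.append(f"{name}: only {facts['disk_free_percent']}% disk space free")

    backup_age = now - facts["last_full_backup"]
    if backup_age > timedelta(hours=thresholds["full_backup_max_age_hours"]):
        problems.append(
            f"{name}: last full backup was {facts['last_full_backup']:%Y-%m-%d %H:%M}"
        )

    if facts["failed_agent_jobs"] > thresholds["failed_agent_jobs_max"]:
        problems.append(f"{name}: {facts['failed_agent_jobs']} failed agent job(s)")

    return problems


if __name__ == "__main__":
    now = datetime(2012, 6, 11, 8, 0)
    # Made-up facts for one server, as if already collected from the instance.
    facts = {
        "disk_free_percent": 4.5,
        "last_full_backup": now - timedelta(days=3),
        "failed_agent_jobs": 2,
    }
    for problem in check_server("server01", facts, now):
        print(problem)
```

Keeping every threshold in one dictionary means a check never hard-codes its own limit, which is the property that makes the report configurable.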

This is a short list, which is why I’d love to hear from you. What would you like to see in your mailbox every morning that will give you the best view of your SQL Server infrastructure?

Presenting At #SQLSat121 in Philly

Posted by Josh | Posted in SQL Server | Posted on 13-05-2012

I’m excited to announce that I’ll be presenting at SQL Saturday 121 in Philadelphia on June 9th! My session is titled “Avoiding Monkey At The Monitor By Delegating”; I’ll be showing some ways to securely delegate menial DBA work so that you can focus on more important (less urgent) work. While the session is really geared towards DBAs, it could be useful for some developers as well, since we’ll be talking about things like permission chaining and certificate security.

I’m obviously psyched to be part of a great group of speakers. I’d be lying if I said I wasn’t a little nervous too, since this is my first time presenting at this level (or really any outside of my company, for that matter). So, I’ll be practicing quite a bit between now and then.

Hope to see you there!

T-SQL Tuesday #28 – Jack of All Trades, Master Of None

Posted by Josh | Posted in SQL Server | Posted on 13-03-2012

It’s T-SQL Tuesday again! This month is being hosted by Argenis Fernandez, and the topic is specialization. I am late to the game because, go figure, I was held up at work troubleshooting issues. Shocking, I know, that a DBA would be kept late at work!

In my time working with SQL Server I’ve done my best to keep my focus fairly small (I would say I’m a performance / admin specialist – maybe that’s a future blog post to describe what that means). What with all the various features that are contained within the sphere of the overall SQL Server product offering, it’d be fairly easy for my ADD-riddled brain to jump completely off the deep end. The problem is, as we know, trying to be an expert in everything ends up causing you to be an expert at nothing.

But even with keeping my SQL focus narrow, my job has expanded greatly in the last year or so after I left the production DBA group. Being the only full-time DBA on the current team means that I’ve also had to pick up other skills, such as VMware and AD domain administration to name a few. Has this hurt my SQL Server skills? I’d have to say yes, as it’s taken away time and brainpower I could have devoted to learning / fine-tuning my SQL Server skillset. But at the same time, it is a necessary evil in today’s “do more with less” world, and I need to accept that.

So how do I try and balance out the lost time? In my spare time at home, of course. Naturally things like family time take precedence, but I do make a point to spend a few hours every week playing around in my home lab setup. This has helped keep me pretty sharp, though I certainly wish I could do more. Especially with SQL 2012 coming out, the “To Learn / Play” list just keeps growing and growing.

I’d love to hear from other folks who’ve found their roles at work shifting and expanding, and how you have tried to keep some relative priority on SQL Server as your “specialty”.