Introducing Execute-RunspaceJob

Posted by Josh | Posted in Powershell | Posted on 08-01-2012

Recently I began experimenting in earnest with Powershell runspaces as an alternative to background jobs. My interest was mostly sparked by an excellent blog post on the subject by Boe Prox. I’ve found that, in general, runspaces work really well for multi-threaded workloads and, once you get past some initial hiccups, are easier to use than background jobs.

During my experimentation I had a bit of a realization about how runspaces are typically used. I found that I was doing the same things over and over again:

  1. Creating a script block to do the background work, which had one or more parameters.
  2. Creating a series of parameter sets containing the information about the individual “entities” to be processed.
  3. Instantiating various objects and settings for the runspace configuration, such as the number of concurrent runspaces.
  4. Starting the background threads and waiting for them to finish.
  5. Getting the data back from the runspaces, and returning warnings for any errors that occurred during processing.
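As a rough illustration of those five steps, here is a minimal sketch of the raw runspace pattern. This is not the code from Execute-RunspaceJob itself; the script block, the entity names (alpha, beta, gamma) and the limit of two concurrent runspaces are placeholders chosen for the example.

```powershell
# 1. A script block with one parameter (the "work" here is a placeholder).
$ScriptBlock = {
    param ($Name)
    "Processed $Name"
}

# 2. One parameter set per entity to be processed (hypothetical names).
$ParameterSets = @{
    'alpha' = @{ Name = 'alpha' }
    'beta'  = @{ Name = 'beta' }
    'gamma' = @{ Name = 'gamma' }
}

# 3. A runspace pool that allows at most two concurrent runspaces.
$Pool = [runspacefactory]::CreateRunspacePool(1, 2)
$Pool.Open()

# 4. Start one background pipeline per entity.
$Jobs = foreach ($Key in $ParameterSets.Keys) {
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $Pool
    [void]$PowerShell.AddScript($ScriptBlock.ToString()).AddParameters($ParameterSets[$Key])
    [pscustomobject]@{
        Key        = $Key
        PowerShell = $PowerShell
        Handle     = $PowerShell.BeginInvoke()
    }
}

# 5. Wait for each pipeline, collect its output, and turn errors into warnings.
$Results = foreach ($Job in $Jobs) {
    try {
        $Job.PowerShell.EndInvoke($Job.Handle)
        foreach ($ErrorRecord in $Job.PowerShell.Streams.Error) {
            Write-Warning "[$($Job.Key)] $ErrorRecord"
        }
    }
    catch {
        Write-Warning "[$($Job.Key)] $_"
    }
    finally {
        $Job.PowerShell.Dispose()
    }
}

$Pool.Close()
$Pool.Dispose()
```

Nearly all of this is plumbing that stays the same from script to script; only the script block and the parameter sets change.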

After copying and slightly modifying code a few times, I said to myself, “Self! Copying and pasting code all over the place is not good practice. Why not write a reusable and generic function implementing this work and simply re-use it?” Thus Execute-RunspaceJob was born.

The function basically encapsulates and makes generic the work required to set up and execute a parameterized script block within background runspaces. It also handles tricky items such as error handling and parsing return data. Let’s look at a quick example of how to use it.

Let’s say that you have a function called Get-DiskspaceInfo, which retrieves, well, information about disk space on remote machines using WMI calls. You have a series of servers, say, twenty, that you want to collect this information from. You could certainly simply get this list of servers and pipe them into the function (because you are writing functions that accept pipeline input, right?), but that approach would not scale very well since it operates in a one-at-a-time mode. Instead, using Execute-RunspaceJob, you could have any number (up to overwhelming your computer, naturally) of concurrent background threads collecting this information, then have it return the resulting data to you after it was finished.

First, you need to construct the script block which actually executes the work:
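A sketch of what such a script block might look like. A new runspace starts with a fresh session, so a function defined in your console (such as Get-DiskspaceInfo) is not visible inside it unless it is loaded there. For that reason this example uses an inline WMI query as a stand-in; the parameter name $ServerName and the output columns are assumptions made for illustration.

```powershell
$ScriptBlock = {
    param (
        [string]$ServerName
    )

    # Stand-in for Get-DiskspaceInfo: query fixed local disks (DriveType 3) over WMI.
    # On newer PowerShell versions Get-CimInstance is the usual replacement.
    Get-WmiObject -Class Win32_LogicalDisk -ComputerName $ServerName -Filter 'DriveType = 3' |
        Select-Object @{ Name = 'ServerName'; Expression = { $ServerName } },
                      DeviceID,
                      @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 2) } },
                      @{ Name = 'FreeGB'; Expression = { [math]::Round($_.FreeSpace / 1GB, 2) } }
}
```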

Next, we need to construct a hashtable of parameter values. The function expects a hashtable where the key is some unique identifier for the row to be processed (like a server name, database name, file name, etc.), and the value is a nested hashtable of “parameter name”=“parameter value” pairs.
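For instance, the hashtable could be built with a simple loop. The server names here are made up, and the inner key ServerName is assumed to match the name of the script block's parameter.

```powershell
# Hypothetical server list; in practice this might come from a file or a query.
$Servers = 'SERVER01', 'SERVER02', 'SERVER03'

$Parameters = @{}
foreach ($Server in $Servers) {
    # Key: unique identifier for this unit of work.
    # Value: nested hashtable of parameter name = parameter value.
    $Parameters[$Server] = @{ ServerName = $Server }
}
```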

Finally, we execute the function, letting the results be placed into an array variable. We use the “-ThrottleLimit” parameter to specify the number of concurrent operations that are allowed.
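A hedged sketch of the call follows. Only -ThrottleLimit is named in this post; the parameter names -ScriptBlock and -ArgumentList are assumptions, so check them against the downloaded function. The script block and server names are trivial placeholders so that the snippet stands on its own, and the call is guarded so it does nothing harmful when the function has not been loaded.

```powershell
# Placeholder work and placeholder server names, just to make the example complete.
$ScriptBlock = { param ($ServerName) "Checked $ServerName" }
$Parameters = @{
    'SERVER01' = @{ ServerName = 'SERVER01' }
    'SERVER02' = @{ ServerName = 'SERVER02' }
}

if (Get-Command -Name Execute-RunspaceJob -ErrorAction SilentlyContinue) {
    # -ScriptBlock and -ArgumentList are ASSUMED parameter names; -ThrottleLimit is from the post.
    # Ten concurrent runspaces at most; results are gathered into an array.
    $Results = @(Execute-RunspaceJob -ScriptBlock $ScriptBlock -ArgumentList $Parameters -ThrottleLimit 10)
    $Results | Format-Table -AutoSize
}
else {
    Write-Warning 'Execute-RunspaceJob is not loaded; dot-source the downloaded script first.'
}
```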

If any of the background operations fail, a warning message will be printed out on the screen. Once all the data is collected you can treat the array just like you would the set of data returned by a normal pipeline-style operation.

I’ve started using this all over the place, and found that it greatly increases performance of most “collect and return” type operations. For example, it will be used in my forthcoming adaptation of Alan Renouf’s excellent vCheck framework for SQL Server.

To download the latest version of the function, go here. And please, tell me if you see something wrong or have suggestions for enhancements!

Comments (5)

This is good :)

I would like to be able to run a command like the one below inside the scriptblock:

ping -l 5000 -w 4000 -n 1 $ServerName

How would I modify the block to accept something like that?

Hey Chad,

I haven’t tested it, but you might be able to do something like this (please excuse the crappy formatting if my script plugin doesn’t work for comments):
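A sketch of the kind of script block and parameter table this would involve, wrapping the ping command from the question. The server names are placeholders and the parameter name $ServerName is an assumption; the result would then be passed to Execute-RunspaceJob in the same way as in the post.

```powershell
# Placeholder list; replace with your own array of server names.
$Servers = 'SERVER01', 'SERVER02'

# The script block simply wraps the ping command line from the question.
$ScriptBlock = {
    param ($ServerName)
    ping -l 5000 -w 4000 -n 1 $ServerName
}

# One entry per server: key = server name, value = parameters for the script block.
$Parameters = @{}
foreach ($Server in $Servers) {
    $Parameters[$Server] = @{ ServerName = $Server }
}
```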

*Note: this assumes you have a variable called “$Servers” which contains an array of all server names to be pinged.

The more I think about it, the less I think that will quite work unless you put in some code to check the return code of the “ping” command. Have you considered using the native “Test-Connection” cmdlet instead? I have an example I can e-mail you if you’re interested.
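Two untested sketches of those options, both assuming a $ServerName parameter. The first checks $LASTEXITCODE after ping and throws on a nonzero value so the failure surfaces as an error rather than as ordinary output. The second uses Test-Connection with -Quiet, which returns a simple true/false; it has no direct equivalent of ping's -w timeout switch in older PowerShell versions, so that option is left out.

```powershell
# Option 1: keep ping.exe, but treat a nonzero exit code as a failure.
$PingScriptBlock = {
    param ($ServerName)
    $Output = ping -l 5000 -w 4000 -n 1 $ServerName
    if ($LASTEXITCODE -ne 0) {
        throw "Ping to $ServerName failed with exit code $LASTEXITCODE"
    }
    $Output
}

# Option 2: use the native cmdlet and return an object per server.
$TestConnectionScriptBlock = {
    param ($ServerName)
    New-Object PSObject -Property @{
        ServerName = $ServerName
        Reachable  = Test-Connection -ComputerName $ServerName -Count 1 -BufferSize 5000 -Quiet
    }
}
```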

Love the function, don’t quite get the hashtable. Can you explain that a little more? What’s the expected input?

Hey Josh,

I really should do another post to properly explain this. But basically, the hashtable is a list of two things: 1. a unique identifier for each object to be processed, 2. a nested hashtable of arguments and values. Think of it as a collection of parameters and identifiers for work to be done.
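To make that concrete, here is a small sketch with two parameters per entry. The server and database names are invented, and the loop at the end only mimics, sequentially and without runspaces, how each nested hashtable lines up with the script block's parameters.

```powershell
$ScriptBlock = {
    param ($ServerName, $DatabaseName)
    "Checking $DatabaseName on $ServerName"
}

# Key = unique identifier for the work item; value = parameter names and values.
$Parameters = @{
    'SQL01.Sales' = @{ ServerName = 'SQL01'; DatabaseName = 'Sales' }
    'SQL02.HR'    = @{ ServerName = 'SQL02'; DatabaseName = 'HR' }
}

# Each nested hashtable is handed to the script block as named parameters (splatting).
$Output = foreach ($Key in $Parameters.Keys) {
    $Arguments = $Parameters[$Key]
    & $ScriptBlock @Arguments
}
```

The inner keys must match the script block's parameter names exactly; the outer key is only a label for the work item.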

I’ll try to get a post together with a coherent example. I can e-mail you a sample from my SQLCheck work if you like.
