Speeding Up the PowerShell Pipeline

PowerShell scripts can become very slow when you (a) need to process a lot of items and (b) use the PowerShell pipeline to do it. Today, let’s find out why that is, and what you can do about it.

To visualize the underlying problem, let’s first create a test case that shows how PowerShell gets slowed down considerably. For this, we need a lot of items, so the code below generates a list of all files in your Windows folder, which can take a couple of seconds to generate.

# get large data sets
$files = Get-ChildItem -Path c:\windows -File -Recurse -ErrorAction SilentlyContinue
$files.Count

Let’s send the files into a pipeline, and pick only files larger than 1MB. In the examples below, we pipe the content of $files solely to have reproducible input data. In real life, you’d of course skip the variable and stream results directly from commands.

Measure-Command {
    $largeFiles = $files | Where-Object { $_.Length -gt 1MB }
}
$largeFiles.Count

In our test, the code took between 3 and 4 seconds and produced 3485 “large” files. Results may vary on your machine.

Where-Object really is just a ForEach-Object with an if statement inside of it, so let’s try replacing Where-Object with ForEach-Object and an explicit if:

Measure-Command {
    $largeFiles = $files | ForEach-Object {
        if ($_.Length -gt 1MB)
        { $_ }
    }
}
$largeFiles.Count

The result is the same, yet the time is cut in half.

ForEach-Object is really just an anonymous script block with a process block, so next, try this:

Measure-Command {
    $largeFiles = $files | & {
        process
        {
            if ($_.Length -gt 1MB)
            { $_ }
        }
    }
}
$largeFiles.Count

Again, the result is the same, but the time dropped from the initial 4 seconds to roughly 100 milliseconds, which is faster by a factor of about 40.
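If you’d like to compare all three approaches in one go without scanning your Windows folder, the sketch below may help. It is only an illustration: the synthetic $items list, its Name and Length properties, and the variable names are assumptions made for this example and stand in for real files. Absolute timings will differ from the numbers above and from machine to machine.

```powershell
# Synthetic input (hypothetical stand-in for files): 100,000 objects
# with a Length property ranging from 20 to 2,000,000 bytes
$items = foreach ($i in 1..100000) {
    [PSCustomObject]@{ Name = "item$i"; Length = $i * 20 }
}

# 1) Where-Object
$timeWhere = Measure-Command {
    $viaWhere = $items | Where-Object { $_.Length -gt 1MB }
}

# 2) ForEach-Object with an explicit if
$timeForEach = Measure-Command {
    $viaForEach = $items | ForEach-Object { if ($_.Length -gt 1MB) { $_ } }
}

# 3) Anonymous script block with a process block
$timeBlock = Measure-Command {
    $viaBlock = $items | & { process { if ($_.Length -gt 1MB) { $_ } } }
}

# Show the elapsed milliseconds side by side
[PSCustomObject]@{
    WhereObjectMs   = [int]$timeWhere.TotalMilliseconds
    ForEachObjectMs = [int]$timeForEach.TotalMilliseconds
    ScriptBlockMs   = [int]$timeBlock.TotalMilliseconds
}
```

All three variables ($viaWhere, $viaForEach, $viaBlock) should hold the same number of items; only the elapsed time differs.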

As it turns out, when you feed data to commands via the pipeline, PowerShell invokes the parameter binder for each transmitted item, and this can add up to significant delays. Since both ForEach-Object and Where-Object use parameters, the binder is activated for every single item they receive.

When you instead use anonymous script blocks with a process block inside but with no parameters, you bypass all of the parameter binding and can speed up PowerShell pipeline operations to a degree that makes a difference even in the wild.
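The same pattern is not limited to filtering. As a minimal sketch, an anonymous script block can also carry begin and end blocks, so you can collect totals while items stream through. The sample sizes and the names $largeCount, $total, and $summary are made up for this illustration.

```powershell
# Hypothetical sample data: a handful of sizes in bytes
$sizes = 500KB, 2MB, 3MB, 100KB, 5MB

$summary = $sizes | & {
    # runs once before the first pipeline item arrives
    begin   { $largeCount = 0; $total = 0 }

    # runs once per pipeline item; no parameters are declared
    process {
        if ($_ -gt 1MB) { $largeCount++; $total += $_ }
    }

    # runs once after the last item and emits a single result object
    end     { [PSCustomObject]@{ LargeCount = $largeCount; TotalMB = $total / 1MB } }
}

$summary
```

With the sample data above, three values exceed 1MB and together add up to 10MB.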

