In a continuation of what is, sadly, becoming a series on how the PowerShell pipeline works: Karl Prosser brought to my attention that certain PowerShell commands which have an -InputObject parameter don’t actually work when you pass a collection to that parameter as an argument. So I thought I should create a cmdlet to show you how to correctly handle an InputObject parameter that has ValueFromPipeline set, so you can pass the input in either way.
To demonstrate the problem, try this:
$a = @("A","B","A","C")
$a | Select -First 3 -Unique
Select -First 3 -Unique -InputObject $a
This should expose two weirdnesses about how the Select-Object cmdlet works:
- The -First parameter affects the input before the -Unique parameter does.
- When you pass the input in via -InputObject, the whole array is treated as a single object, and the command basically doesn’t do anything.
The big problem with this behavior is that there’s essentially no hint that you’ve done something wrong — there’s actually no way to make Select-Object work properly except by passing the objects in via the pipeline. The bigger problem is that it would have been simple for the Microsoft team to catch this and alert you, but they didn’t — so you probably won’t even notice there’s a problem until you run it on a trivial data set like my example. The even bigger problem is that it doesn’t just affect Select-Object (try it with Where-Object, just for instance).
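If you’d rather not squint at the formatter output, counting the objects each form emits makes the difference hard to miss. This is just a quick sketch: the variable names are mine, and it only relies on Select-Object, Where-Object and Measure-Object.

```powershell
$a = @("A", "B", "A", "C")

# Count how many objects each form of Select-Object sends down the pipeline.
$selectViaPipeline = ($a | Select-Object -First 3 -Unique | Measure-Object).Count
$selectViaArgument = (Select-Object -First 3 -Unique -InputObject $a | Measure-Object).Count

# Same comparison for Where-Object: as an argument, $_ is the whole array.
$whereViaPipeline = ($a | Where-Object { $_ -eq "A" } | Measure-Object).Count
$whereViaArgument = (Where-Object { $_ -eq "A" } -InputObject $a | Measure-Object).Count

"Select-Object: $selectViaPipeline via pipeline, $selectViaArgument via -InputObject"
"Where-Object:  $whereViaPipeline via pipeline, $whereViaArgument via -InputObject"
```

In both cases the -InputObject form reports a single object, because the array travels through the cmdlet as one item.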
The simplest fix
When this came up in the #PowerShell IRC channel, Oisin initially defended this as an unavoidable side effect of the way the cmdlet system works. However, after playing with the idea for a bit, we found it’s actually trivial to prevent, although I found it hard to explain without actually demonstrating an alternative. The simplest possible alternative is just to throw an exception if the value is passed in as an argument instead of via the pipeline. That would preserve the same level of functionality you have now, but raise an error in those cases where it wouldn’t work anyway.
protected override void BeginProcessing()
{
    if (_input != null)
    {
        throw new ArgumentException("You must pass InputObject via the pipeline!");
    }
    base.BeginProcessing();
}
A better way to handle input
Of course, you can do better than that. So, I hereby present the first version of my PowerShell Pipeline Template Cmdlet. It’s pretty simple really (once you get past all the cmdlet overhead): in the BeginProcessing() method, you check whether the InputObject parameter has already been set to a collection, and if so you set a private flag. Then, in the ProcessRecord() method, there are two computation paths: the normal path, and a second path for when the collection was passed in as an argument. In that case, you loop over the collection and call ProcessRecord recursively, once for each item.
I’m sure some of you will have improvements to suggest. Feel free to continue the development on the PowerShell Central scripts page or by sending feedback in the form below, but for now, here’s the Test-Pipeline Cmdlet Binary and the source code.
The Code
// An improvement! Now we accept a single object (like Select-Object does)
// But, unlike Select-Object, if an array is passed into the argument -InputObject
// we still manage to process each item in the array, as we would in the pipeline
//
// Try it out: "a","b","c"| Test-Pipeline -verbose
// Versus this: Test-Pipeline -verbose -input @("a","b","c")
//
// If you don't set the -verbose flag, you shouldn't be able to tell them apart
// The first way, the "1" invocation hits ProcessRecord for "a"
// ... before the "2" invocation hits BeginProcessing()
//
// Version History
// 1.0 Just throws an exception
// 2.0 Finds a way to enumerate ProcessRecord from BeginProcessing
//     There is still a slight difference, which you can see if you test these:
//       Test-Pipeline 1 -input @("a","b","c") -verbose | Test-Pipeline 2 -verbose
//       "a","b","c" | Test-Pipeline 1 -verbose | Test-Pipeline 2 -verbose
// 2.3 Recursed from inside ProcessRecord instead of BeginProcessing
//     Makes the execution look identical in the test case from 2.0
////////////////////////////////////////////////////////////////////////////////
using System;
using System.Collections.Generic;
using System.Text;
using System.Management.Automation;
using System.Collections;
namespace Huddled.TestSnapin
{
    [Cmdlet(VerbsDiagnostic.Test, "Pipeline")]
    public class TestPipelineCommand : Cmdlet
    {
        #region Parameters
        /// <summary>
        /// This is just a name parameter for decorating test cases :)
        /// </summary>
        [Parameter(
            Position = 0,
            Mandatory = false,
            ValueFromPipelineByPropertyName = false,
            HelpMessage = "A Name for Verbose output"), ValidateNotNullOrEmpty]
        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }
        private string _name = "TestPipeline";

        [Parameter(
            Position = 1,
            Mandatory = true,
            ValueFromPipeline = true,
            HelpMessage = "Help Text"), ValidateNotNullOrEmpty]
        public object InputObject
        {
            get { return _input; }
            set { _input = value; }
        }
        private object _input;
        private bool _isArgument = false;
        #endregion
        protected override void BeginProcessing()
        {
            WriteVerbose(String.Format("Begin Processing {0}", Name));
            if (_input != null && _input is ICollection)
            {
                _isArgument = true;
                StringBuilder output = new StringBuilder("There's input: ");
                foreach (object _in in (ICollection)_input)
                {
                    output.AppendFormat("{0}, ", _in);
                }
                WriteVerbose(output.ToString());
            }
            base.BeginProcessing();
        }
        protected override void ProcessRecord()
        {
            if (!_isArgument)
            {
                // This is the normal ProcessRecord code
                WriteVerbose(String.Format("Process: {0}", _input));
                WriteObject(_input);
            }
            else
            {
                // This is what we have to do to unwrap -InputObject passed as an argument.
                // The explicit cast is required because _input is declared as object.
                ICollection _collection = (ICollection)_input;
                _isArgument = false; // unset _isArgument before recursing
                foreach (object _in in _collection)
                {
                    InputObject = _in;
                    ProcessRecord();
                }
            }
        }
        protected override void EndProcessing()
        {
            WriteVerbose(String.Format("End Processing {0}", Name));
            base.EndProcessing();
        }
    }
}
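For what it’s worth, the same idea can be sketched as a script function instead of a compiled cmdlet. Treat this as a rough illustration and not as the template itself: it assumes PowerShell 2.0 or later (for advanced functions), and Test-PipelineSketch is a made-up name that is not part of the download above. It leans on the fact that foreach enumerates a collection but runs exactly once for a single non-collection value.

```powershell
# Hypothetical script-function version of the Test-Pipeline idea (assumes PowerShell 2.0+).
function Test-PipelineSketch {
    [CmdletBinding()]
    param(
        [Parameter(Position = 0)]
        [string]$Name = "TestPipeline",

        [Parameter(Position = 1, Mandatory = $true, ValueFromPipeline = $true)]
        [object]$InputObject
    )
    begin { Write-Verbose "Begin Processing $Name" }
    process {
        # Piped input arrives one item per call; an argument arrives as the whole collection.
        # foreach unrolls the collection in the second case and runs once in the first.
        foreach ($item in $InputObject) {
            Write-Verbose "Process: $item"
            Write-Output $item
        }
    }
    end { Write-Verbose "End Processing $Name" }
}

$fromPipeline = @("a", "b", "c" | Test-PipelineSketch)
$fromArgument = @(Test-PipelineSketch -InputObject @("a", "b", "c"))
"Pipeline gave $($fromPipeline.Count) item(s); -InputObject gave $($fromArgument.Count) item(s)"
```

One difference from the C# version: this sketch also unrolls a piped item that is itself a collection, whereas the cmdlet only unwraps a collection that was passed as an argument.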
Hi Joel,
You know, I’ve looked at your articles about cmdlets/functions in the pipeline and I feel you’re missing something. The purpose of the InputObject parameter is to pass in a collection as a single object. This is as opposed to using the pipeline where a collection is passed along the pipeline one item at a time. There are cases where you want to pass in a collection as a collection.
In this particular article, you mention how certain cmdlets don’t actually work when you use the InputObject parameter. But if you look at your example (Select -First 3 -Unique -InputObject $a), this does in fact work. It receives one object, an array. It then selects the first 3 objects, but there is only 1 so that is moot. And lastly it selects unique objects, but again there is only 1 so that is moot as well and finally the object is output using the default formatter. In this case the default formatter is showing the contents of the array. Look at these two examples:
Select -First 3 -Unique -InputObject $a | % { $_.ToString() }
$a | Select -First 3 -Unique | % { $_.ToString() }
In the first case, the output is System.Object[], which is correct because that is the default string conversion of a collection of objects. In the second case you see the same output as if you didn’t put the foreach-object cmdlet in the pipeline, which is also correct because the items in the collection are strings and therefore when you call ToString() on them you get them output as strings.
My point in all of this is that InputObject is actually a very useful parameter, because there are cases where you really want to pass a collection as a collection into a cmdlet and then do something with it. By making InputObject instead split the collection passed in and pipeline it through, you’re forcing users to wrap collections in an array just to get them passed in as a collection, and personally I don’t feel they should have to do that.
The only thing that I can see wrong with InputObject is that it isn’t documented clearly enough.
I’d like to hear your thoughts on this, either here or via email because I’ve been working on cmdlets/functions that work in the pipeline lately and I have tried to follow the PowerShell core model of the InputObject parameter. I’d be interested in hearing arguments about how InputObject is broken so that I can make sure I’m making the right choice when it comes to that parameter.
—
Kirk Munro
Poshoholic
http://poshoholic.com
Quite simply, I disagree. The documentation for these parameters says quite clearly that inputObject “Specifies an object or objects to input to the cmdlet.” This clearly means that I should be able to pass multiple objects, and have them treated as multiple objects, not as a single array object. ... More over here
To clarify the input object behaviour try:
-inputobject $array0
SS
Yeah. The problem is … I don't want to clarify the input object behavior — I think the current behavior is wrong.