In response to Kirk Munro’s comment on my Writing Cmdlets for the PowerShell Pipeline post:
You know, I’ve looked at your articles about cmdlets/functions in the pipeline and I feel you’re missing something. The purpose of the InputObject parameter is to pass in a collection as a single object. This is as opposed to using the pipeline where a collection is passed along the pipeline one item at a time. There are cases where you want to pass in a collection as a collection.
Quite simply, I disagree. The documentation for these parameters says quite clearly that inputObject “Specifies an object or objects to input to the cmdlet.” This clearly means that I should be able to pass multiple objects, and have them treated as multiple objects, not as a single array object.
If you look at your example (Select -First 3 -Unique -InputObject $a), this does in fact work. It receives one object, an array. It then selects the first 3 objects, but there is only 1 so that is moot. And lastly it selects unique objects, but again there is only 1 so that is moot as well and finally the object is output using the default formatter. In this case the default formatter is showing the contents of the array.
In this example, Select-Object has no reason to take a single object as an input object, at all. The only time that it would be useful for Select-Object to take a single inputObject would be in combination with the property parameters. In fact, if you want to Select-Object from an array to get the first or last n objects, or to get a set of unique objects, you have to pass the objects in via the pipeline; there’s no other way to make it select from an array. If that was indeed the intent, it should have been written as a separate ParameterSet, and the documentation should be changed to reflect that only a single object can be passed in, and that you can’t use the inputObject parameter with the first, last, or unique parameters at all. That’s worse than useless, it’s misleading and confusing.
Kirk is absolutely right that if you assume that the InputObject argument is only allowed to take a single object, then the behavior is correct, but it’s not logical. In fact, the behavior you see in the output of this command is so useless as to be a bug, even if the documentation did not say the parameter accepts multiple objects as input:
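The original command is not reproduced here, so what follows is only a minimal sketch of the kind of comparison being discussed. The contents of $a are an assumption chosen for illustration.

```powershell
# Assumed sample data; the original post's $a is not shown.
$a = 1, 1, 2, 2, 3, 3, 4

# By name: Select-Object receives ONE object (the whole array),
# so -First and -Unique have nothing to choose between.
$byName = @(Select-Object -First 3 -Unique -InputObject $a)

# By pipeline: the array is unrolled and each item is considered in turn.
$byPipeline = @($a | Select-Object -First 3 -Unique)

"By name:     $($byName.Count) object(s) returned"
"By pipeline: $($byPipeline.Count) object(s) returned"
```

Comparing the two counts shows whether the cmdlet actually selected anything or simply handed the array back.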
But quite frankly, just because someone important wrote something useless is no reason to emulate the behavior. The inputObject parameter IS the same parameter which pipeline objects go into. There’s no logical explanation for us to get different results when we pass an array in via the parameter by name instead of via the pipeline: the PowerShell pipeline passes the things in the pipeline into the -inputObject parameter … it’s not using some mystical variable like it does in script functions.
Of course, we all know the PowerShell pipeline unwraps arrays. That’s convenient, and we can work around it when we really want to pass an array in:
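The original workaround is not shown here. One common way to do it, offered as a sketch with assumed sample data, is the unary comma operator, which wraps the array in a one-element array so that the pipeline only unrolls the outer wrapper.

```powershell
$a = 1, 2, 3   # assumed sample data

# Plain piping: the pipeline unrolls the array into three separate items.
$unrolled = ($a | Measure-Object).Count

# Unary comma: wraps $a in a one-element array, so the pipeline unrolls
# only the wrapper and passes the inner array along as a single item.
$wrapped = (,$a | Measure-Object).Count

"Unrolled: $unrolled item(s); wrapped: $wrapped item(s)"
```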
My point in all of this is that InputObject is actually a very useful parameter, because there are cases where you really want to pass a collection as a collection into a cmdlet and then do something with it. By making InputObject instead split the collection passed in and pipeline it through, you’re forcing users to wrap collections in an array just to get them passed in as a collection, and personally I don’t feel they should have to do that.
While it’s true that passing in an array is sometimes desirable, that’s not the reason the parameter exists, and I don’t believe it should be the default behavior here. It should be just as easy for me to use the cmdlet with the inputObject parameter directly as it is to input them via the pipeline. If I put in unwrapping for the inputObject parameter, you can work around it in the same way I did in the examples above. Incidentally, I think *PowerShell* should unwrap arrays to ValueFromPipeline parameters regardless of whether they’re on the pipeline, but I recognize it’s probably too late for that.
Basically, this is my argument: If inputObject unwraps arrays, the syntax for passing an array by wrapping it in @(,$array) is simple, for those rare occasions when that’s actually what you want. But if it does not unwrap arrays, you’re forced to call it via a separate pipeline, because unwrapping the array and passing it in one at a time in a foreach loop will almost certainly not do the same thing, and this is much uglier, and not compatible with use within the pipeline, particularly if you need to pass the pipeline output into a different parameter.
I guess my final word would be to agree with Kirk that “InputObject … isn’t documented clearly enough” … in fact, it’s clearly behaving incorrectly according to the documentation, and that’s why I originally proposed to unwrap the inputObject parameter when it’s passed as a parameter: to make it work the way the documentation suggests it would, which seems to me to be a better way than the way it actually works.
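To make the trade-off concrete, here is a minimal sketch of a script function that unwraps InputObject. The function name Select-FirstSketch, its -First parameter, and the sample data are all hypothetical; it is written with the advanced-function syntax of released PowerShell versions and is not the cmdlet from the earlier post.

```powershell
# Hypothetical function, for illustration only.
function Select-FirstSketch {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        [object[]]$InputObject,

        [int]$First = 1
    )
    begin { $taken = 0 }
    process {
        # By name, $InputObject is the whole array; from the pipeline it is
        # one item per call. Looping treats both the same way.
        foreach ($item in $InputObject) {
            if ($taken -lt $First) {
                $taken++
                # WriteObject emits $item as-is, without unrolling it further.
                $PSCmdlet.WriteObject($item)
            }
        }
    }
}

$array = 1, 2, 3, 4   # assumed sample data

$byName     = @(Select-FirstSketch -InputObject $array -First 2)
$byPipeline = @($array | Select-FirstSketch -First 2)
$asOneItem  = @(Select-FirstSketch -InputObject @(,$array) -First 2)

"By name: $($byName.Count); by pipeline: $($byPipeline.Count); wrapped: $($asOneItem.Count)"
```

With unwrapping in place, the by-name and pipeline calls behave alike, and the @(,$array) form is the opt-in for treating the collection as a single item.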
I’ll write up more information later, but a couple of people have asked for this in #PowerShell on irc.freenode.net, and I had it already written, so here you go … my ConvertFrom-Html cmdlet (in a Huddled.HtmlSnapin). It converts HTML to valid XML using the SGML Parser which was available on GotDotNet years ago. It only works with files (doesn’t do URL downloads yet). Use it like this:
The source code to my plugin may be considered public domain, and is included in the Huddled HTML SnapIn Zip.
However, the SgmlReader library is a Microsoft Sample which is licensed under the old MS Samples license, which doesn’t allow reuse with viral open source software. I’ve seen some work being done on an HtmlAgilityPack on CodePlex (using a Creative Commons ASA license) but I have not really looked at it except to see that it has several active issues related to entity encoding and dropping malformed tags which I haven’t encountered in SgmlReader …
Well, the first alpha CTP release of PowerShell 2.0 is out, and there’s a lot of new stuff in it … but I won’t repeat the list from the PowerShell blog, because I’m sure you’ve seen it five or six times already. Instead, let’s just skip straight to talking about one of the features we’ve been hearing about the longest: in PowerShell 2, you can create Cmdlets in script … bringing nearly full parity between what’s possible in a C# cmdlet and what’s possible in script.
There are a few caveats still (Parameter Sets aren’t working yet, and neither is help, really), and a few surprises … there are a few downsides to PowerShell script vs C# … but in this particular context one thing that stands out is that in C# the BeginProcessing, ProcessRecord, and EndProcessing blocks are actually methods which can call each other, and as demonstrated in my tutorial for writing cmdlets that work in the pipeline, they can be recursive without getting duplicate variables.
In the interests of being the first to publish an interesting script cmdlet, and to continue my recent trend of talking about writing for the PowerShell pipeline, I’ve merged the logic of my script function and my pipeline cmdlet into a single sample script cmdlet for PowerShell 2.0 and it works great!
A few observations from the process, in no particular order:
$CommandLineParameters.ContainsKey because parameter variables keep their values through recursion if you don’t explicitly pass a value.
$CommandLineParameters.ContainsKey works differently in the Begin block, where it will return $false for arguments which will get their values from the pipeline, than in the Process block, where it will treat values which were passed as CommandLineParameters the same as those which were passed via the pipeline.
Cmdlet which takes a name (which must have a - in it) and a couple of other parameters followed by a function script block.
Invoke-Expression.

In a continuation of what is, sadly, becoming a series on how the PowerShell Pipeline works … Karl Prosser brought to my attention that certain PowerShell commands which have an -InputObject parameter don’t actually work when you pass something into it … so I thought I should create a cmdlet to show you how to correctly handle the InputObject parameter with the ValueFromPipeline set so you can pass the input in either way.
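When a parameter can arrive either by name or from the pipeline, the Begin block only knows about the former, which is the difference noted in the observations on the script cmdlet. The sketch below uses the syntax of released PowerShell versions, where (as far as I know) the CTP’s $CommandLineParameters is exposed as $PSBoundParameters and the Cmdlet keyword is replaced by a function with [CmdletBinding()]. The function name Test-BoundSketch is hypothetical.

```powershell
# Hypothetical function, for illustration only.
function Test-BoundSketch {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject
    )
    # In begin, pipeline input has not been bound yet.
    begin   { "begin: "   + $PSBoundParameters.ContainsKey('InputObject') }
    # In process, the parameter is bound whichever way it arrived.
    process { "process: " + $PSBoundParameters.ContainsKey('InputObject') }
}

$fromPipeline = @(1 | Test-BoundSketch)
$fromName     = @(Test-BoundSketch -InputObject 1)

"Pipeline: $($fromPipeline -join '; ')"
"By name:  $($fromName -join '; ')"
```

Any setup that depends on the input therefore has to wait for the Process block, or check both places, if the function is meant to accept input either way.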
This should expose two weirdnesses about how the Select-Object cmdlet works:
The big problem with this behavior is that there’s essentially no hint that you’ve done something wrong; there’s actually no way to make Select-Object work properly except by passing the objects in via the pipeline. The bigger problem is that it would have been simple for the Microsoft team to catch this and alert you, but they didn’t, so you probably won’t even notice there’s a problem until you run it on a trivial data set like my example. The even bigger problem is that it doesn’t just affect Select-Object (try it with Where-Object, just for instance).
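For the Where-Object case, a minimal sketch with assumed sample data shows how quietly this goes wrong: the filter appears to run, but nothing is filtered.

```powershell
$a = 1, 2, 3, 4   # assumed sample data

# By name: $_ is the whole array. "$_ -gt 2" on an array returns the
# matching elements (3, 4), which is a non-empty and therefore "true"
# result, so the entire array passes the filter untouched.
$byName = Where-Object -InputObject $a -FilterScript { $_ -gt 2 }

# By pipeline: each element is tested on its own.
$byPipeline = $a | Where-Object -FilterScript { $_ -gt 2 }

"By name:     $($byName -join ', ')"
"By pipeline: $($byPipeline -join ', ')"
```

No error or warning is raised in the by-name case, which is why the mistake is easy to miss on real data.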