Archive for November, 2007

Well, Visual Studio 2008 launched while I was off in Houston, Texas … if you don’t have an MSDN subscription, you can check out Visual C# Express 2008 and the rest of the Visual Studio 2008 Express editions, or the trial versions, and of course the new .NET Framework 3.5.

I’ll write up more information later, but a couple of people have asked for this in #PowerShell on irc.freenode.net, and I had it already written, so here you go … my ConvertFrom-Html cmdlet (in a Huddled.HtmlSnapin). It converts HTML to valid XML using the SGML parser (SgmlReader), which was available on GotDotNet years ago. It only works with files (it doesn’t do URL downloads yet). Use it like this:


$url = "http://huddledmasses.org/"
$file = Join-Path $pwd "HuddledMasses.html"

$client = new-object System.Net.WebClient
$client.DownloadFile( $url, $file ) #NOTE: You need to use a full path here, not relative

$xml = ConvertFrom-Html $file

# Or even
(ConvertFrom-Html $file).Save($file)
 

The source code to my plugin may be considered public domain, and is included in the Huddled HTML SnapIn Zip.

However, the SgmlReader library is a Microsoft sample, licensed under the old MS Samples license, which doesn’t allow reuse with viral open source software. I’ve seen some work being done on an HtmlAgilityPack on CodePlex (using a Creative Commons ASA license), but I have not really looked at it except to see that it has several active issues related to entity encoding and dropping malformed tags, which I haven’t encountered in SgmlReader …

Well, the first alpha CTP release of PowerShell 2.0 is out, and there’s a lot of new stuff in it … but I won’t repeat the list from the PowerShell blog, because I’m sure you’ve seen it five or six times already. Instead, let’s just skip straight to talking about one of the features we’ve been hearing about the longest: in PowerShell 2, you can create Cmdlets in script … bringing nearly full parity between what’s possible in a C# cmdlet and what’s possible in script.

There are a few caveats still (Parameter Sets aren’t working yet, and neither is help, really), and a few surprises … there are a few downsides to PowerShell script vs C# ... but in this particular context one thing that stands out is that in C# the BeginProcessing, ProcessRecord, and EndProcessing blocks are actually methods which can call each other, and as demonstrated in my tutorial for writing cmdlets that work in the pipeline, they can be recursive without getting duplicate variables.

A sample Script Cmdlet

In the interests of being the first to publish an interesting script cmdlet ;) and to continue my recent trend of talking about writing for the PowerShell pipeline, I’ve merged the logic of my script function and my pipeline cmdlet into a single sample script cmdlet for PowerShell 2.0 and it works great!

A few observations from the process, in no particular order:

  • If you recursively call your cmdlet from within itself, you have to test for parameters using the new $CommandLineParameters.ContainsKey because parameter variables keep their values through recursion if you don’t explicitly pass a value.
  • $CommandLineParameters.ContainsKey works differently in the Begin block where it will return $false for arguments which will get their values from the pipeline, than in the Process block where it will treat values which were passed as CommandLineParameters the same as those which were passed via the pipeline.
  • If you want to see how your function behaves in a pipeline, you should make sure to test it at different points in the pipeline: at the front, in the middle, and at the end.
  • Cmdlets are functions: they show up in the Function provider.
  • Cmdlets are functions: they have to be dot-sourced before you can call them.
  • Cmdlets are not functions: they are defined with a single command, Cmdlet, which takes a name (which must have a - in it) and a couple of other parameters followed by a function script block.
  • When you recurse by executing &($MyInvocation.InvocationName), that second invocation has an InvocationName of “&” ... so you can’t go any further (this might be a good thing, if you want to stop recursion at one level no matter what). If you want to go further, you need to put your commands into a string, and use Invoke-Expression.

#requires -version 2.0
###################################################################################################
## A Template for Script Cmdlets which can _also_ be executed in the pipeline ....
##   by Joel Bennett, in hopes it will help...
## Version History
## v1.0 - First public release (after over 9 different versions in my various other functions)
## v1.2 - Show the use of Write-Output, and change "return" in the BEGIN to "Write-Output" to avoid
##      the pooling of the output from the process block when it's invoked as a function.
## v1.3 - Switched back to "break" instead of "return" so that if you pass via the pipeline AND via
##      the inputObject, only the inputObject gets processed (this is how cmdlets behave).
##      - Cleaned up the comments, and removed the confusing alternate method and $args handling
##
## v2.0 - First Version as a Script Cmdlet.
##      This is much easier with support for [ValueFromPipeline] and [ValueFromPipelineByName]
##
###################################################################################################
Cmdlet Test-PipelineV2 -ConfirmImpact low -snapin Huddled.Tests
{
   Param (
      [Position(0)] [ConsoleColor] $Color,
      [Position(1)] [Mandatory] [ValueFromPipeline] [String[]] $InputObject
   )
   BEGIN {
      if ($CommandLineParameters.ContainsKey("InputObject")) {
         # Don't do anything here, because we're about to get re-invoked...
         $FromArgs = $true
      } else {
         # Normal "run-once" BEGIN processing
         $FromArgs = $false
         Write-Verbose "Begin $Color"
      }
   }
   PROCESS {
      # We no longer have to test for $_ or even to see if the [ValueFromPipeline] param is set
      # It *HAS* to be set, because it's a [Mandatory] parameter :)
      if ($FromArgs) {
         # Don't do anything here except re-invoke ourselves.
         Write-Output $InputObject | &($MyInvocation.InvocationName) $Color
      } else {
         # Normal Pipeline-friendly per-item processing
         Write-Host "Process: $InputObject" -Fore $Color
         ## You should make a practice of explicitly calling Write-Output on things
         ## That's how you emit them into the pipeline instead of just printing them
         Write-Output $InputObject
      }
   }
   END {
      if ($FromArgs) { 
         # Don't do anything here ... it just confuses things
      } else {
         # Normal "run-once" END processing
         Write-Verbose "End $Color"
      }
   }
}

 

A test case


## Test Script:
##
## "a","b","c" | Test-PipelineV2 "Cyan" -verbose
## @("a","b","c") | Test-PipelineV2 "Cyan" -verbose
## Test-PipelineV2 "Cyan" @("a","b","c") -verbose
##
## "a","b","c" | Test-PipelineV2 "Cyan" -verbose | Test-PipelineV2 "Green" -verbose
## @("a","b","c") | Test-PipelineV2 "Cyan" -verbose | Test-PipelineV2 "Green" -verbose
## Test-PipelineV2 "Cyan" @("a","b","c") -verbose  | Test-PipelineV2 "Green" -verbose
###################################################################################################
## Expected Output (sorry, no color here...)

VERBOSE: Begin Cyan
Process: a
a
Process: b
b
Process: c
c
VERBOSE: End Cyan
VERBOSE: Begin Cyan
Process: a
a
Process: b
b
Process: c
c
VERBOSE: End Cyan
VERBOSE: Begin Cyan
Process: a
a
Process: b
b
Process: c
c
VERBOSE: End Cyan
VERBOSE: Begin Cyan
VERBOSE: Begin Green
Process: a
Process: a
a
Process: b
Process: b
b
Process: c
Process: c
c
VERBOSE: End Cyan
VERBOSE: End Green
VERBOSE: Begin Cyan
VERBOSE: Begin Green
Process: a
Process: a
a
Process: b
Process: b
b
Process: c
Process: c
c
VERBOSE: End Cyan
VERBOSE: End Green
VERBOSE: Begin Green
VERBOSE: Begin Cyan
Process: a
Process: a
a
Process: b
Process: b
b
Process: c
Process: c
c
VERBOSE: End Cyan
VERBOSE: End Green
 

In a continuation of what is, sadly, becoming a series on how the PowerShell pipeline works … Karl Prosser brought to my attention that certain PowerShell commands which have an -InputObject parameter don’t actually work when you pass something to it … so I thought I should create a cmdlet to show you how to correctly handle an InputObject parameter with ValueFromPipeline set, so you can pass the input in either way.

To demonstrate the problem, try this:


$a = @("A","B","A","C")
$a | Select -First 3 -Unique
Select -First 3 -Unique -InputObject $a
 

This should expose two weirdnesses about how the Select-Object cmdlet works:

  1. The -First parameter affects the input before the -Unique parameter does.
  2. When you pass the input in via -InputObject, the whole array is treated as a single object, and the command basically doesn’t do anything.
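
The second point is easy to confirm by counting what each call emits. This is a minimal sketch using only built-in cmdlets; the variable names $fromPipeline and $fromParameter are just for illustration, and the exact count for the pipeline call may depend on your PowerShell version:

```powershell
$a = @("A","B","A","C")

# Pipeline input: the array is unrolled, so Select-Object sees four strings.
$fromPipeline = @($a | Select-Object -First 3 -Unique)

# Parameter input: the array arrives as one object and is never unrolled.
$fromParameter = @(Select-Object -First 3 -Unique -InputObject $a)

"Pipeline:  {0} item(s)" -f $fromPipeline.Count
"Parameter: {0} item(s), the first of type {1}" -f $fromParameter.Count, $fromParameter[0].GetType().Name
```

If the parameter call reports a single item whose type is an array, then neither -First nor -Unique ever looked at the individual strings.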

The big problem with this behavior is that there’s essentially no hint that you’ve done something wrong: there’s actually no way to make Select-Object work properly except by passing the objects in via the pipeline. The bigger problem is that it would have been simple for the Microsoft team to catch this and alert you, but they didn’t, so you probably won’t even notice there’s a problem until you run it on a trivial data set like my example. The even bigger problem is that it doesn’t just affect Select-Object (try it with Where-Object, just for instance).
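
For comparison, here is a minimal sketch of a command that accepts its input either way. It assumes PowerShell 2.0 or later as released, where a script cmdlet is written as a function with a [Parameter()] attribute rather than with the CTP syntax. The name Show-Item is hypothetical, and looping over the parameter inside the process block is a swapped-in technique, not the re-invocation approach used elsewhere in this series:

```powershell
function Show-Item {
    [CmdletBinding()]
    param(
        # Declared as an array so that -InputObject $a binds the whole array,
        # while pipeline input binds one element at a time.
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string[]] $InputObject
    )
    process {
        # Looping covers both cases: one item per call from the pipeline,
        # or every item in a single call from the parameter.
        foreach ($item in $InputObject) {
            Write-Output "Item: $item"
        }
    }
}

$a = @("A","B","A","C")
$viaPipeline  = @($a | Show-Item)
$viaParameter = @(Show-Item -InputObject $a)
```

Both $viaPipeline and $viaParameter should end up holding the same four strings, which is exactly what the built-in cmdlets above fail to do.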