Well, just for a change, lets cut straight to the script ([new] 5pm EST Sunday):


Function Convert-Delimited([regex]$from,[string]$to)
{
  process
  {
    $_ = $_ -replace "(?:`"((?:(?:[^`"]|`"`"))+)(?:`"$from|`"`$))|(?:((?:.(?!$from))*.)(?:$from|`$))","Þ`$1`$2Þ$to"
    $_ = $_ -replace "Þ(?:$to|Þ)?`$","Þ"
    $_ = $_ -replace "`"`"","`"" -replace "`"","`"`""
    $_ = $_ -replace "Þ((?:[^Þ`"](?!$to))+)Þ($to|`$)","`$1`$2"
    $_ = $_ -replace "Þ","`"" -replace "Þ","`""
    $_
  }
}
 

Basically, this is a set of regular expressions that can be run in the pipeline to convert delimited text from one delimiter to another. If the new delimiter needs quotes added to make it work correctly, the converter will do so (actually, it adds a marker character Þ around every field, and then removes them where they are not needed, and replaces them with quotes where they are — long story, and I welcome any improvements).

Just as an example, assume a column-based text file like this Hosts.txt file (notice that the columns are separated by two or more spaces, since there are spaces within the text columns):


Host          Standout Features                   Languages                            Space/Bandwidth   Monthly
BlueHost      PostgreSQL + MySQL, Free Cert       PHP5,Perl,Python,Ruby                   300GB/3TB      .95/mo for 2yrs
DreamHost     Jabber,SVN,WebDAV                   PHP4,PHP5,Perl,Python,Ruby              500GB/5TB      .95/mo for 2yrs
NetFirms Adv  2 domains, Shared Cert              PHP4,PHP5,Perl,Ruby,Python              250GB/2TB      .95/yr with "995" code
NetFirms Bus  Windows + Linux, MsSQL, 5 domains   ASP,ASP.NET,PHP4,PHP5,Perl,Ruby,Python  300GB/3TB      .95/mo for 3mo

You can convert it to CSV by using this function, and write it back out to file like this: Get-Content Hosts.txt | Convert-Delimiter " + " "," | Set-Content Hosts.csv. Now, since my original problem was all about importing the files I also went ahead and wrote a slightly more magical function called Import-Delimited:


Function Import-Delimited([regex]$delimiter=",", [string]$PsPath="")
{
  begin{
    $script:tmp = [IO.Path]::GetTempFileName()
    write-debug "Using tempfile $($script:tmp)"
   
    Function Import-String([string]$inputString){
      if($inputString.Length -gt 0 ) {
      write-debug "Importing $inputString"
      if(($inputString -as [IO.FileInfo]).Exists) {
        Get-Content $inputString | Convert-Delimiters $delimiter "," | Add-Content $script:tmp
      } elseif( ((Join-Path $pwd $inputString) -as [IO.FileInfo]).Exists) {
        Get-Content (Join-Path $pwd $inputString) | Convert-Delimiters $delimiter "," | Add-Content $script:tmp
      } else {
        $inputString | Convert-Delimiters  $delimiter "," | Add-Content $script:tmp
      }
      } else {
        write-debug "Nothing to Import"
      }
    }
   
    Import-String $PsPath
  }
 
  process{
    Import-String $_
  }
  end
  {
    Import-Csv $script:tmp
    Remove-Item $script:tmp
  }
}
 

This will allow you to directly import most delimited text files. But not only that, it can import text from the pipeline! So you can actually create objects out of any legacy application that outputs delimited text! Just for example, with our previous text file, we could do any of the following one-liners:


$hosts = Import-Delimited " + " Hosts.txt

$hosts = Get-Content Hosts.txt | Import-Delimited " + "

$hosts = Get-Content Hosts.txt | Convert-Delimiters $delimiter "," | Import-Delimited # CSV is the default
 

Any questions? Here’s the pair of functions as a single script download. :)

Question 1: What’s with the “ + “?

It’s a regular expression meaning one or more spaces followed by a space (ie: two or more spaces).

Question 2: Regular expressions? How do I just use this to import a tab-delimited text file?

The regular expressions for separators should be pretty simple, really. For instance, a tab-delimited text file would just be “`t” and a comma-delimited file is just a comma: “,” ... if you wanted to make sure to get rid of any spaces around the commas, you could use “ +, +” but we should mostly be taking care of that already.

Question 3: I get weird errors about the value of argument “name” being invalid?

Import-Delimited actually uses Import-Csv in the background, and requires a header row which has a name for each column: You can’t have a two separators in a row, or a trailing separator (although the scripts try to clean that up). Incidentally, this is a really good point: you must have the first row be the delimited headers, otherwise your first object will be treated as the header row.