WGet 2 for PowerShell

Edit: Added -Passthru

Edit: I made a mistake … I wrote this a few weeks ago, and made an error in the script that kept it from working properly with most binary files … I’ve fixed the script below, and while I was at it, I went ahead and added a download progress report ;)

About a year ago, I wrote a script to download files from the web using PowerShell, but it was so simple that you had to specify both the URL to download and the file to save to. At the time I knew there was a way to find the file name of the download in the headers, but I couldn’t remember how to get to the headers, and it’s not possible using System.Net.WebClient, so I just dropped it.

Then the other day I saw an old post from Script Fanatic about querying HTTP status codes, and it reminded me of my annoyance with my wget script. So I fixed it, and I’m letting you have it.

The new version of wget for PowerShell (or Get-WebFile) is on the PowerShell Repository. It’s been completely rewritten to use System.Net.HttpWebRequest and HttpWebResponse, which give us access to the name of the file in those cases where the URL is something like .../scripts/?dl=417 (assuming the server puts the file name in the headers, as it should).
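
To make the header lookup a bit more concrete, here is a minimal sketch of pulling a file name out of a Content-Disposition value. The function name Get-DispositionFileName and the sample header strings are made up for this illustration; the script further down does its own parsing, and this sketch ignores the encoded `filename*=` form that some servers send.

```powershell
## Hypothetical helper (not part of Get-WebFile): extract the file name from a
## Content-Disposition header value such as: attachment; filename="report.pdf"
function Get-DispositionFileName {
   param([string]$disposition)

   ## Capture everything after "filename=" up to the next ";" or the end of the value
   $match = [regex]::Match($disposition, '(?i)filename=([^;]*)')
   if(!$match.Success) {
      return $null
   }
   ## Strip surrounding whitespace, then quotes and any stray slashes
   return $match.Groups[1].Value.Trim().Trim("\/""'")
}

## Sample header values (invented for this example)
Get-DispositionFileName 'attachment; filename="report.pdf"'
Get-DispositionFileName 'inline'
```

When the header carries no file name the helper returns nothing, which is the cue to fall back to something else, such as the last segment of the URL.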

It still doesn’t support authentication. The headers do include the size, and because I handle the streams myself I can read a few bytes at a time, which is what makes the progress reports mentioned in the edit above possible.


## Get-WebFile (aka wget for PowerShell)
##############################################################################################################
## Downloads a file or page from the web
## History:
## v3.6 - Add -Passthru switch to output TEXT files
## v3.5 - Add -Quiet switch to turn off the progress reports ...
## v3.4 - Add progress report for files which don't report size
## v3.3 - Add progress report for files which report their size
## v3.2 - Use the pure Stream object because StreamWriter is based on TextWriter:
##        it was messing up binary files, and making mistakes with extended characters in text
## v3.1 - Unwrap the filename when it has quotes around it
## v3   - rewritten completely using HttpWebRequest + HttpWebResponse to figure out the file name, if possible
## v2   - adds a ton of parsing to make the output pretty
##        added measuring the scripts involved in the command, (uses Tokenizer)
##############################################################################################################
function Get-WebFile {
   param(
      $url = (Read-Host "The URL to download"),
      $fileName = $null,
      [switch]$Passthru,
      [switch]$quiet
   )
   
   ## Ask the server for the file; the response headers are available before the body is read
   $req = [System.Net.HttpWebRequest]::Create($url);
   $res = $req.GetResponse();
 
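   ## No name given, or only an existing folder: take the file name from the server
   if((!$Passthru -and ($fileName -eq $null)) -or (($fileName -ne $null) -and (Test-Path -PathType "Container" $fileName)))
   {
      ## Save into the folder that was passed in, or into the current location
      $folder = Get-Location -PSProvider "FileSystem"
      if($fileName) {
         $folder = (Resolve-Path $fileName).ProviderPath
      }
      ## First choice: the name in the Content-Disposition header
      [string]$fileName = ([regex]'(?i)filename=(.*)$').Match( $res.Headers["Content-Disposition"] ).Groups[1].Value
      $fileName = $fileName.trim("\/""'")
      if(!$fileName) {
         ## Second choice: the last segment of the URL we ended up at
         $fileName = $res.ResponseUri.Segments[-1]
         $fileName = $fileName.trim("\/")
         if(!$fileName) {
            $fileName = Read-Host "Please provide a file name"
         }
         $fileName = $fileName.trim("\/")
         ## No extension? Guess one from the content type (e.g. text/html gives .html)
         if(!([IO.FileInfo]$fileName).Extension) {
            $fileName = $fileName + "." + $res.ContentType.Split(";")[0].Split("/")[1]
         }
      }
      $fileName = Join-Path $folder $fileName
   }
   elseif($fileName -and !(Split-Path $fileName)) {
      ## A bare file name (no folder part) is saved in the current location
      $fileName = Join-Path (Get-Location -PSProvider "FileSystem") $fileName
   }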
   if($Passthru) {
      $encoding = [System.Text.Encoding]::GetEncoding( $res.CharacterSet )
      [string]$output = ""
   }
 
   if($res.StatusCode -eq 200) {
      ## ContentLength is a 64-bit value (-1 when the server doesn't report a size)
      [long]$goal = $res.ContentLength
      $reader = $res.GetResponseStream()
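      if($fileName) {
         ## Resolve against PowerShell's current location (not the process working directory),
         ## so a relative path such as "downloads\file.zip" lands where you expect
         $fileName = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($fileName)
         $writer = new-object System.IO.FileStream $fileName, "Create"
      }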
      ## Copy the response in 4KB chunks; $total is a long so big downloads don't overflow
      [byte[]]$buffer = new-object byte[] 4096
      [long]$total = 0
      [int]$count = 0
      do
      {
         $count = $reader.Read($buffer, 0, $buffer.Length);
         if($fileName) {
            $writer.Write($buffer, 0, $count);
         }
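         if($Passthru){
            ## Note: each chunk is decoded on its own, so a multi-byte character that
            ## straddles two chunks will come out garbled
            $output += $encoding.GetString($buffer,0,$count)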
         } elseif(!$quiet) {
            $total += $count
            if($goal -gt 0) {
               Write-Progress "Downloading $url" "Saving $total of $goal" -id 0 -percentComplete (($total/$goal)*100)
            } else {
               Write-Progress "Downloading $url" "Saving $total bytes..." -id 0
            }
         }
      } while ($count -gt 0)
     
      $reader.Close()
      if($fileName) {
         $writer.Flush()
         $writer.Close()
      }
      if($Passthru){
         $output
      }
   }
   $res.Close();
   if($fileName) {
      ls $fileName
   }
}

2 Comments

  1. Niclas Lindgren wrote:

    I didn’t get it to work with FTP sites, but the “fix” is simple, merely change the line

    if($res.StatusCode -eq 200) {

    to

    if($res.StatusCode -eq 200 -or $res.StatusCode -eq "OpeningData") {

    Cheers,
    Niclas

    Sunday, July 27, 2008 at 9:42 am
  2. Paul wrote:

    I couldn’t get it to work specifying a path for the file (and the filename) – I kept getting ‘access denied’ or the path would be appended to the user home directory.

    I changed:

    if($fileName -and !(Split-Path $fileName))

    to:

    if($fileName -ne $null -and (Split-Path $fileName) -ne $null)

    which worked. I prefer specifying things in this way as I believe they are also easier to read.

    Comments in the script would also have been nice for those of us newer to PS than others!

    Friday, August 1, 2008 at 12:39 pm