
Detecting ZIP files with PowerShell

22 Dec 2013

Have you heard of magic numbers? Some file formats are designed such that files are always saved with a specific byte sequence in the header. JPEG, PDF, and ZIP are all such formats.

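You could look for ZIP files by searching for all files with a .zip extension, but not all ZIP files have that extension (.docx and .jar files are ZIP archives too, for example). A better way is to look for files whose first 4 bytes are 50 4b 03 04 (hex). Nearly every ZIP file starts with those bytes. The exceptions are empty archives, which start with 50 4b 05 06, spanned archives, which start with 50 4b 07 08, and self-extracting archives, which begin with an executable stub.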
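To make the idea concrete, here is a minimal sketch that maps a file's leading bytes to one of the three formats named above. The function name Get-MagicFileType and its -FilePath parameter are hypothetical and are not part of Poshato. It assumes PowerShell 3.0 or later (for [ordered]) and a full file path, since .NET file APIs ignore the current PowerShell location.

```powershell
# Hypothetical helper for illustration only: identifies a file by its magic number.
function Get-MagicFileType {
    param([Parameter(Mandatory = $true)][string]$FilePath)

    # Well-known signatures (hex): JPEG = FF D8 FF, PDF = "%PDF", ZIP = "PK" 03 04.
    $signatures = [ordered]@{
        JPEG = [byte[]](0xFF, 0xD8, 0xFF)
        PDF  = [byte[]](0x25, 0x50, 0x44, 0x46)
        ZIP  = [byte[]](0x50, 0x4B, 0x03, 0x04)
    }

    # Read at most the first four bytes of the file.
    $buffer = New-Object byte[] 4
    $stream = [System.IO.File]::OpenRead($FilePath)
    try {
        $count = $stream.Read($buffer, 0, $buffer.Length)
    }
    finally {
        $stream.Dispose()
    }

    foreach ($name in $signatures.Keys) {
        $magic = $signatures[$name]

        # A file shorter than the signature cannot match it.
        if ($count -lt $magic.Length) { continue }

        $match = $true
        for ($i = 0; $i -lt $magic.Length; $i++) {
            if ($buffer[$i] -ne $magic[$i]) { $match = $false; break }
        }
        if ($match) { return $name }
    }

    return "Unknown"
}

# Usage: write a scratch file that starts with the ZIP signature, then classify it.
$sample = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($sample, [byte[]](0x50, 0x4B, 0x03, 0x04, 0x14, 0x00))
$detected = Get-MagicFileType -FilePath $sample
Remove-Item -LiteralPath $sample
$detected
```

The scratch file is not a valid archive; it only carries the four signature bytes, which is all a magic-number check looks at.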

Here's a Test-ZipFile PowerShell function that returns $true or $false depending on whether the specified file has this magic header. It is also a good citizen: it accepts -Path and -LiteralPath arguments in the idiomatic way, including wildcard support for -Path.


function Test-ZipFile
{
<#
.SYNOPSIS
    Tests for the magic ZIP file header bytes.

.DESCRIPTION
    Inspired by http://stackoverflow.com/a/1887113/31308
#>
    [CmdletBinding(DefaultParameterSetName = "Path")]
    param(
        [Parameter(
            ParameterSetName = "Path",
            Mandatory = $true,
            ValueFromPipeline = $true,
            ValueFromPipelineByPropertyName = $true
        )]
        [string[]]$Path,

        # Bound by property name only (e.g. the PSPath of Get-ChildItem output),
        # so a piped string binds unambiguously to -Path.
        [Alias("PSPath")]
        [Parameter(
            ParameterSetName = "LiteralPath",
            Mandatory = $true,
            ValueFromPipelineByPropertyName = $true
        )]
        [string[]]$LiteralPath
    )

    process {
        $provider = $null
        $filePaths = @()

        # Only expand wildcards if the -Path parameter was used.
        # Both methods take a single path, so resolve each input separately.
        if ($PSCmdlet.ParameterSetName -eq "Path") {
            foreach ($item in $Path) {
                $filePaths += $PSCmdlet.GetResolvedProviderPathFromPSPath($item, [ref]$provider)
            }
        }
        elseif ($PSCmdlet.ParameterSetName -eq "LiteralPath") {
            foreach ($item in $LiteralPath) {
                $filePaths += $PSCmdlet.GetUnresolvedProviderPathFromPSPath($item)
            }
        }

        foreach ($filePath in $filePaths) {
            $isZip = $false
            $stream = $null
            $reader = $null
            try {
                # Open the file as a raw byte stream and read the first four bytes.
                $stream = [System.IO.File]::OpenRead($filePath)
                $reader = New-Object System.IO.BinaryReader -ArgumentList @($stream)
                $bytes = $reader.ReadBytes(4)
                if ($bytes.Length -eq 4) {
                    # 50 4b 03 04 is "PK" followed by 0x03 0x04.
                    if ($bytes[0] -eq 0x50 -and
                        $bytes[1] -eq 0x4B -and
                        $bytes[2] -eq 0x03 -and
                        $bytes[3] -eq 0x04) {
                        $isZip = $true
                    }
                }
            }
            finally {
                if ($reader) {
                    $reader.Dispose()
                }
                if ($stream) {
                    $stream.Dispose()
                }
            }

            Write-Output $isZip
        }
    }
}
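Why bother with both -Path and -LiteralPath? The sketch below shows the difference using only the built-in Get-Item cmdlet, which follows the same convention. It assumes a writable temp directory whose own path contains no wildcard characters. The file names are made up for illustration.

```powershell
# Work in a scratch directory so the sketch leaves nothing behind.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null

# Two files: one whose name contains wildcard characters, and one that the
# same name matches when it is read as a pattern.
$bracketed = Join-Path $dir "a[1].zip"
$plain     = Join-Path $dir "a1.zip"
[System.IO.File]::WriteAllBytes($bracketed, [byte[]](0x50, 0x4B, 0x03, 0x04))
[System.IO.File]::WriteAllBytes($plain,     [byte[]](0x50, 0x4B, 0x03, 0x04))

# -Path treats [1] as a character class, so the lookup resolves to a1.zip.
$viaPath = (Get-Item -Path $bracketed).Name

# -LiteralPath takes the name exactly as written and finds a[1].zip.
$viaLiteral = (Get-Item -LiteralPath $bracketed).Name

"{0} / {1}" -f $viaPath, $viaLiteral

Remove-Item -LiteralPath $dir -Recurse -Force
```

A function that only offered -Path could never reliably reach a file named a[1].zip, which is why the idiom exposes both parameters.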

Test-ZipFile is part of Poshato, my personal PowerShell module of miscellaneous goodness.

Want to learn PowerShell? Check out Windows PowerShell in Action by Bruce Payette. This is the book that got me started and it's one of the best tech books I've read.
