
Differential backups using 7-zip and PowerShell

10 Dec 2013

Backups are like changing the oil in your car. An oil change is the best thing you can do to extend the life of your engine, and a backup is the best thing you can do to save yourself when your hard drive crashes, gets stolen, or is otherwise corrupted.

I work remotely and needed a backup strategy for my work laptop. My company encourages you to keep critical client files and such in your personal share on the network. Most of our employees work in the office, so this is perfectly reasonable as the norm. (Though I bet a ton of stuff still ends up on local machines!) As a remotee, however, that isn't practical for me, for reasons that are hopefully obvious to everyone.

There are as many backup strategies, scripts, and apps as there are computers. Today though, it was PowerShell to the rescue with the help of trusty 7-zip (command line version) and robocopy.

Requirements

Implementation details

Results

Here's what it looks like.

I'll store my most recent backups locally in C:\backup. Timestamps in the names keep everything straight. Each backup has an accompanying log file.

Screenshot of backup folder
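Here's a minimal sketch of how names like these could be generated. The yyyyMMdd-HHmm timestamp format is an assumption inferred from the 8-digit and 4-digit groups in the file patterns used later in the script, and both function names are hypothetical.

```powershell
# Hypothetical helpers; the yyyyMMdd-HHmm timestamp format is an assumption.
function New-FullBackupName {
    param([datetime]$Now = (Get-Date))
    "backup-{0:yyyyMMdd-HHmm}.7z" -f $Now
}

function New-DiffBackupName {
    param(
        [Parameter(Mandatory)][string]$FullBackupName,  # e.g. backup-20131210-0900.7z
        [datetime]$Now = (Get-Date)
    )
    # Reuse the full backup's name so the differential is visibly tied to its base.
    $base = [System.IO.Path]::GetFileNameWithoutExtension($FullBackupName)
    "{0}-diff-{1:yyyyMMdd-HHmm}.7z" -f $base, $Now
}

$full = New-FullBackupName -Now ([datetime]'2013-12-10 09:00')
$diff = New-DiffBackupName -FullBackupName $full -Now ([datetime]'2013-12-10 14:00')
$full   # backup-20131210-0900.7z
$diff   # backup-20131210-0900-diff-20131210-1400.7z
```

Because the timestamps are zero-padded and ordered from year down to minute, sorting the names alphabetically also sorts them chronologically.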

My backup script is stored in Dropbox.

Screenshot of script folder

I have a Windows Scheduled Task that runs a differential backup (diff-backup.cmd) hourly. The differential backup is usually pretty fast, so running hourly is no problem.

I have another task that runs the upload (upload-backup.cmd) once a day when I am connected to the VPN. Task Scheduler is supposedly smart enough to only run the task if the VPN connection is available and to retry on a regular basis if it isn't. I've never used this feature before so we'll see how it goes. I can always fall back to manually running this via a desktop shortcut.
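I set my tasks up through the Task Scheduler UI, but the hourly one could probably be registered from a script as well. This is only a sketch using schtasks.exe and the same argument-array style I use elsewhere; the task name and script path are made up, and it doesn't cover the VPN condition on the upload task.

```powershell
# Hypothetical task name and script path; adjust both for your machine.
$taskName   = "Differential Backup"
$scriptPath = "C:\Users\me\Dropbox\scripts\diff-backup.cmd"

$schtasksArgs = @(
    "/Create";            # Create a new scheduled task.
    "/TN"; $taskName;     # Task name.
    "/TR"; $scriptPath;   # Program to run.
    "/SC"; "HOURLY";      # Schedule type.
    "/MO"; "1";           # Modifier: every 1 hour.
    "/F";                 # Overwrite the task if it already exists.
)

# Uncomment to actually register the task (Windows only).
# & schtasks.exe @schtasksArgs
# if ($LASTEXITCODE -ne 0) { throw "schtasks failed with exit code $LASTEXITCODE" }
```

If the script path contains spaces, the /TR value will likely need extra quoting, so test it before relying on it.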

The Script

My script isn't suited to sharing verbatim but I do want to share the good parts.

Selecting the most recent full backup

Differential backups have to be based on a full backup. The script is smart enough to find the most recent full backup file and use it as the base.

# Find the most recent full backup.
# Sort by name explicitly so the newest timestamp is last, rather than relying on Get-ChildItem's return order.
$fullBackup = Get-ChildItem -File -Path "$backupOutputPath\backup-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9].7z" | Sort-Object Name | select -Last 1 -ExpandProperty FullName
if (-not ($fullBackup) -or -not (Test-Path $fullBackup -PathType Leaf)) {
    throw "No full backup was found. Must have a full backup before performing a differential."
}

7-zip args for a full backup

$7zipArgs = @(
    "a";                          # Create an archive.
    "-t7z";                       # Use the 7z format.
    "-mx=7";                      # Use a level 7 "high" compression.
    "-xr!thumbs.db";              # Exclude thumbs.db files wherever they are found.
    "-xr!*.log";                  # Exclude all *.log files as well.
    "-xr-@`"`"$excludesFile`"`""; # Exclude all paths in my excludes.txt file.
    "-ir-@`"`"$includesFile`"`""; # Include all paths in my includes.txt file.
    "$outputFile";                # Output file path (a *.7z file).
)

Notice that I doubled some quotation marks for the excludes and includes arguments. This escaping is necessary to ensure the path I'm passing in works even if it has spaces in it.
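To see what that escaping actually produces, here's a tiny sketch that builds one of the arguments and prints it. The path is a made-up example containing a space.

```powershell
# Hypothetical path containing a space.
$excludesFile = 'C:\My Backups\excludes.txt'

# A backtick-escaped quote inside a double-quoted string yields a literal quote,
# so two of them on each side wrap the path in doubled quotes.
$arg = "-xr-@`"`"$excludesFile`"`""

Write-Host $arg   # -xr-@""C:\My Backups\excludes.txt""
```

How the receiving program ends up seeing those quotes depends on how your PowerShell version passes arguments to native executables, so check the result with an argument-echoing tool before trusting it.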

7-zip args for a differential backup

$7zipArgs = @(
    "u";                                    # Update an archive. Slightly confusing since we'll be saving those updates to a new archive file.
    "$fullBackupPath";                      # Path of the full backup we are creating a differential for.
    "-t7z";
    "-mx=7";
    "-xr!thumbs.db";
    "-xr!*.log";
    "-xr-@`"`"$excludesFile`"`"";
    "-ir-@`"`"$includesFile`"`"";
    "-u-";                                  # Don't update the original archive (the full backup).
    "-up0q3r2x2y2z0w2!`"`"$outputFile`"`""; # Flags to specify how the archive should be updated and the output file path (a *.7z file).
)

The last argument there is a doozy. Here's what those flags mean.

You can read more about 7-zip's options in their documentation.
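As a quick reference, here's a small sketch that decodes an update-flag string like the one above into plain English. The state and action descriptions are paraphrased from the 7-zip documentation for the -u switch, and the function name is hypothetical.

```powershell
# Hypothetical helper. Descriptions are paraphrased from the 7-zip
# documentation for the -u (update options) switch.
function ConvertFrom-7zipUpdateFlags {
    param([Parameter(Mandatory)][string]$Flags)

    $states = @{
        'p' = 'File exists in archive, but is not matched by the wildcard'
        'q' = 'File exists in archive, but not on disk'
        'r' = 'File exists on disk, but not in archive'
        'x' = 'File in archive is newer than the file on disk'
        'y' = 'File in archive is older than the file on disk'
        'z' = 'File in archive is the same as the file on disk'
        'w' = 'Cannot tell which file is newer (same time, different size)'
    }
    $actions = @{
        '0' = 'ignore (leave it out of the new archive)'
        '1' = 'copy from the old archive'
        '2' = 'compress from disk'
        '3' = 'create an anti-item (deletes the file on extraction)'
    }

    # Each flag is one state letter followed by one action digit.
    foreach ($m in [regex]::Matches($Flags, '([pqrxyzw])([0-3])')) {
        $state  = $m.Groups[1].Value
        $action = $m.Groups[2].Value
        "{0}{1}: {2} -> {3}" -f $state, $action, $states[$state], $actions[$action]
    }
}

ConvertFrom-7zipUpdateFlags 'p0q3r2x2y2z0w2'
```

Read that way, the differential archive leaves out unchanged files (z0), compresses new or changed files from disk (r2, x2, y2, w2), and records files that have since been deleted as anti-items (q3).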

Watching for exit codes

I want my backup to fail if 7-zip fails.

& $7zip @7zipArgs | Tee-Object -LiteralPath $logFile
if ($LASTEXITCODE -gt 1) # Ignores warnings which use exit code 1.
{
    throw "7zip failed with exit code $LASTEXITCODE"
}
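If you'd like a friendlier error message, the exit code can be translated first. This is a sketch with a hypothetical helper name; the meanings follow 7-zip's documented exit codes.

```powershell
# Hypothetical helper; the meanings follow 7-zip's documented exit codes.
function Get-7zipExitMessage {
    param([Parameter(Mandatory)][int]$ExitCode)
    switch ($ExitCode) {
        0       { "No error" }
        1       { "Warning (for example, some files were locked and skipped)" }
        2       { "Fatal error" }
        7       { "Command line error" }
        8       { "Not enough memory for the operation" }
        255     { "User stopped the process" }
        default { "Unknown exit code" }
    }
}

Get-7zipExitMessage -ExitCode 2   # Fatal error
```

The message could then be included in the throw, for example "7zip failed: $(Get-7zipExitMessage -ExitCode $LASTEXITCODE)".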

Log 7-zip output to both the console and log file

Did you notice Tee-Object in the above snippet? That's what handles this. The only downside is that your console output will be laggy due to PowerShell buffering the pipeline.
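Here's a self-contained sketch of that behavior, using a temp file as a stand-in for the real log file and a couple of made-up output lines in place of 7-zip.

```powershell
# Self-contained demo: the temp file stands in for the real $logFile.
$logFile = [System.IO.Path]::GetTempFileName()

# Tee-Object passes each line down the pipeline and also writes it to the file.
$seen = "Scanning...", "Everything is Ok" | Tee-Object -FilePath $logFile

# -Append adds to an existing log instead of overwriting it.
"Second run" | Tee-Object -FilePath $logFile -Append | Out-Null

Get-Content -LiteralPath $logFile
```

Without -Append, each run replaces the file, which is fine for my script since every backup gets its own log.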

Deleting old backups

Remember, I'm only keeping the most recent full backup and most recent differential backup files locally. My script automatically deletes old backup files after a new backup is successful. This is the magic used to delete the old differential backups when a new one finishes.

# Clean up old differential backup files.
# Only keep the most recent differential backup.
# Sort by name explicitly so the oldest backups come first, rather than relying on Get-ChildItem's return order.
$allDiffBackups = Get-ChildItem -File -Path "$backupOutputPath\backup-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-diff-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9].7z" | Sort-Object Name
if ($allDiffBackups -is [array]) {
    [Array]::Reverse($allDiffBackups)
    $allDiffBackups | select -Skip 1 | % {
        Write-Host "Deleting old differential backup. File: $($_.FullName)"
        # Remove the matching log file.
        Remove-Item -LiteralPath ([System.IO.Path]::ChangeExtension($_.FullName, ".log")) -ErrorAction SilentlyContinue
        $_
    } | Remove-Item
}
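If you want to try the cleanup logic without touching real backups, here's a sketch that runs the same idea against empty files in a temp folder. The file names are made up, and it sorts newest-first instead of reversing an array, which is a slight variation on my script.

```powershell
# Self-contained demo in a temp folder; file names are made up for illustration.
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ("backup-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demoDir | Out-Null

"0900", "1000", "1100" | ForEach-Object {
    $name = "backup-20131210-0800-diff-20131210-$_"
    New-Item -ItemType File -Path (Join-Path $demoDir "$name.7z")  | Out-Null
    New-Item -ItemType File -Path (Join-Path $demoDir "$name.log") | Out-Null
}

# Sort newest first by name (the timestamps sort lexically), keep one, delete the rest.
Get-ChildItem -File -Path (Join-Path $demoDir "*-diff-*.7z") |
    Sort-Object Name -Descending |
    Select-Object -Skip 1 |
    ForEach-Object {
        # Remove the matching log file, then the backup itself.
        Remove-Item -LiteralPath ([System.IO.Path]::ChangeExtension($_.FullName, ".log")) -ErrorAction SilentlyContinue
        Remove-Item -LiteralPath $_.FullName
    }

$remaining = Get-ChildItem -File -Path $demoDir | Select-Object -ExpandProperty Name
$remaining
```

Only the 1100 backup and its log should be left in the folder afterwards.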

Robocopy args to upload the backups

$robocopyArgs = @(
    "$backupOutputPath"; # Source path
    "$networkDestPath";  # Destination path
    "/Z";                # Use restartable mode when transferring files.
    "/FP";               # Log the full paths to files.
    "/NP";               # Don't log progress percentages.
    "/X";                # Log a list of 'extra' files that exist in the destination but not locally.
    "/UNILOG+:$logfile"; # Append to the specified log file (robocopy's documented syntax puts a colon before the path).
    "/TEE";              # Send output to the console in addition to the log file.
)

Those arguments will perform a shallow copy of all files in the source folder to the destination.

Note that since robocopy supports the /UNILOG+ and /TEE arguments I don't have to use Tee-Object to append to the log file.

Once again, I'll make sure the copy is successful and fail the script if the copy fails.

& $robocopy @robocopyArgs
if ($LASTEXITCODE -ge 8) # exit code when files failed to copy
{
    # robocopy exit codes: http://support.microsoft.com/kb/954404
    throw "robocopy failed with exit code $LASTEXITCODE"
}
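Robocopy's exit code is a set of bit flags, which is why the check above uses 8 as the threshold. Here's a sketch that breaks a code into its parts; the helper name is hypothetical and the meanings follow Microsoft's documentation of robocopy exit codes.

```powershell
# Hypothetical helper; flag meanings follow Microsoft's documentation of robocopy exit codes.
function Get-RobocopyExitFlags {
    param([Parameter(Mandatory)][int]$ExitCode)

    $meanings = @{
        1  = "One or more files were copied"
        2  = "Extra files or directories exist in the destination"
        4  = "Mismatched files or directories were detected"
        8  = "Some files or directories could not be copied"
        16 = "Fatal error; nothing was copied"
    }

    if ($ExitCode -eq 0) { return "No files were copied and nothing failed" }

    # Test each bit in ascending order and emit the meaning of every bit that is set.
    foreach ($bit in 1, 2, 4, 8, 16) {
        if ($ExitCode -band $bit) { $meanings[$bit] }
    }
}

Get-RobocopyExitFlags -ExitCode 9
```

An exit code of 9, for instance, means some files were copied (1) but others failed (8), so it is still treated as a failure.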

Passing args from PowerShell to executables

There are a million ways, of course, but my favorite is to put the arguments into an array and let PowerShell splat them. I really like how this keeps the script readable even when you are passing a ton of arguments around.

$cmdArgs = @(
    "/E";
    "/UNILOG+:$logfile";
)

& some_application.exe @cmdArgs

The only thing to be aware of here is that if one of your $cmdArgs items contains a space, PowerShell will automatically wrap that argument in double quotes when passing it to the executable.
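Here's a sketch of the splatting half of that, using a PowerShell function as a stand-in for an executable so it runs anywhere. It shows that each array item arrives as exactly one argument, even with a space in it; it does not show the quoting PowerShell adds for real executables. The function name and path are made up.

```powershell
# Hypothetical stand-in for an executable: reports each argument it receives.
function Show-Args {
    $i = 0
    foreach ($a in $args) {
        "arg[$i] = <$a>"
        $i++
    }
}

$cmdArgs = @(
    "/E";
    "C:\My Files\report.txt";   # hypothetical path containing a space
)

Show-Args @cmdArgs
# arg[0] = </E>
# arg[1] = <C:\My Files\report.txt>
```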

If you are having trouble getting your quotes and other special characters escaped properly, I've written several EchoArgs apps to help. They just echo back the arguments passed in so you can see how they are being received.

Epilogue

Hopefully this will help you get a jump start on your own backup strategy. Please, hit me up in the comments if you have any questions.

Want to learn PowerShell? Check out Windows PowerShell in Action by Bruce Payette. This is the book that got me started and it's one of the best tech books I've read.
