How To: Automated Encrypted Incremental Backups on Amazon S3 with Duplicity (OS X or Ubuntu)
Purpose: set up an automatic, encrypted off-site backup system that stores incremental duplicity backups on Amazon S3, from a Mac (Leopard) or Ubuntu. I already have an on-site backup system in place (nightly backups via rsync to an external hard drive), but I am wary that some day my house may explode and I’ll have nothing left. Enter my new friend: the encrypted off-site backup.
Before You Begin
Before you can start backing things up off-site in a secure fashion, you’ll need to get a few pieces of the puzzle in place: software installed (duplicity), a GPG key (for encryption), an Amazon S3 account set up (for storage), and a backup script that can be run automatically (for laziness’s sake!).
Getting an Amazon S3 account is easy to do: head over to http://aws.amazon.com/s3/ and sign up; grab your “Access Key ID” and “Secret Access Key” and you are ready to go. There is a lot you can do with S3 (and a lot of ways to access it), but for our purposes, this is pretty much all you need.
Lastly, I would recommend spending a little time reading about duplicity (see the man page) as well as GnuPG (man page). There is a lot to consider, and I just picked the options I thought would work best for me.
Install the Software: Duplicity
For this to work we need duplicity installed with all the correct dependencies. The easiest way to do this on your Mac is to simply use MacPorts, which has an up-to-date version in the repositories (see Installing MacPorts if you don’t have it installed already). If you already have MacPorts installed, all you should have to do is run the following from the Terminal:
$ sudo port install duplicity py25-socket-ssl py25-boto
If you are using Ubuntu, you could simply run sudo aptitude install duplicity to install the program (it is in the repositories); however, if you want to make sure you are using the latest version (which may not be available there yet), you can try this:
$ sudo apt-get build-dep duplicity
$ sudo aptitude install python-boto ncftp
$ wget http://savannah.nongnu.org/download/duplicity/duplicity-0.5.07.tar.gz
$ tar xvzf duplicity-0.5.07.tar.gz
$ cd duplicity-0.5.07/
$ sudo python setup.py install
If you ever want to upgrade again, just download and untar the latest version and run the last command (sudo python setup.py install) again. It will install the newest version for you.
If everything has installed correctly, you can do a quick test run on your local machine by backing up a folder to another local folder (first command) and then restoring it to a different folder (second command). If you look inside /test/backup-location/ you’ll see what a duplicity backup looks like:
$ duplicity --no-encryption /test/folder/ file:///test/backup-location/
$ duplicity --no-encryption file:///test/backup-location/ /test/restore-location/
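To confirm that the round trip really worked, you can compare the original folder with the restored copy. Here is a minimal sketch using plain diff; the function name verify_restore is my own invention and not part of duplicity:

```shell
# verify_restore ORIGINAL RESTORED
# Recursively compares two folders. Prints a one-line verdict and
# returns 0 only when the two trees have identical contents.
verify_restore() {
    original="$1"
    restored="$2"
    if diff -r "$original" "$restored" >/dev/null 2>&1; then
        echo "restore matches original"
    else
        echo "restore differs from original"
        return 1
    fi
}
```

With the test folders from above, you would call it as verify_restore /test/folder/ /test/restore-location/.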
Setting Up Encryption
For duplicity to really shine, it needs to have a gpg key to encrypt your files. If you don’t already have one, you can create it by running the following (read the documentation for more information):
$ gpg --gen-key
I used all the defaults when setting up my key and chose my own passphrase. Unfortunately, for this to work without user input (as an automatic cron job), the passphrase has to be stored somewhere on your computer, so I wouldn’t use one of your usual passwords (something really long and unique would be better). For the same reason, if you already have a gpg key (or want to use one for other purposes), I would recommend making a separate one just for the Amazon S3 backups.
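Since the passphrase ends up on disk, it is worth making sure that only your own user can read it. This is a rough sketch of one way to do that, not the only option; the function name store_passphrase and the file location in the usage note are hypothetical:

```shell
# store_passphrase FILE
# Reads a passphrase from standard input and writes it to FILE with
# owner-only permissions (rw-------).
store_passphrase() {
    file="$1"
    (
        # A restrictive umask means a newly created file is never
        # readable by other users, even for a moment.
        umask 077
        cat > "$file"
    )
    # Tighten permissions in case the file already existed.
    chmod 600 "$file"
}
```

You could run store_passphrase "$HOME/.duplicity-passphrase", type the passphrase, and press Ctrl-D to finish, which keeps it out of your shell history.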
Once you have your gpg key created you can check it out by running:
$ gpg --list-keys
This shows your new key, which probably looks something like this:
pub 1024D/CA4FA320 2008-11-15
uid Damon Timm (thornomad) firstname.lastname@example.org
sub 2048g/C1E64A4F 2008-11-15
Note the public key identifier “CA4FA320” (yours will be different); we will need it for our script.
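With the key ID in hand, the earlier local test can be repeated with encryption switched on. The sketch below assumes the example key ID from above and uses duplicity’s --encrypt-key option; the function name encrypted_round_trip is made up for illustration, and depending on your duplicity version you may be prompted for the passphrase on backup as well as on restore:

```shell
# Example key ID from above; substitute your own.
GPG_KEY="CA4FA320"

# encrypted_round_trip SOURCE BACKUP_URL RESTORE_DIR
# Backs SOURCE up to BACKUP_URL encrypted to GPG_KEY, then restores the
# backup into RESTORE_DIR. The restore only runs if the backup succeeded.
encrypted_round_trip() {
    src="$1"
    backup_url="$2"
    restore_dir="$3"
    duplicity --encrypt-key "$GPG_KEY" "$src" "$backup_url" &&
        duplicity "$backup_url" "$restore_dir"
}
```

For example: encrypted_round_trip /test/folder/ file:///test/backup-encrypted/ /test/restore-encrypted/ (both target folders are placeholders). This time the files in the backup folder should be GPG-encrypted.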
Final Step: Using a Backup Script
So, everything is on your system and, hopefully, working. Running a backup by hand takes a lot of typing on the command line, and the easiest way to avoid this chore is to use a backup script. A script can store your Amazon and GPG key information and make it so you don’t have to type anything ever again!
Backing things up is a very personal task, and everyone is going to want to do it a little differently. I created my own backup script, which you are welcome to check out. If you add any neat features or have suggestions, I would love to hear them and incorporate them.
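To give a feel for what such a script does, here is a stripped-down sketch. It is not my full script, and every value in it is a placeholder: the bucket name, source folder, passphrase file, and the S3_* variable names are all assumptions to replace with your own. It relies on duplicity reading the Amazon credentials and the GPG passphrase from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and PASSPHRASE environment variables, and on the s3+http:// URL form used by duplicity releases of this era (newer releases may expect a different scheme):

```shell
# Placeholders: replace every value below with your own.
S3_ACCESS_KEY_ID="your-access-key-id"
S3_SECRET_ACCESS_KEY="your-secret-access-key"
GPG_KEY="CA4FA320"                            # example key ID from above
SOURCE_DIR="$HOME/Documents"                  # hypothetical folder to back up
DEST_URL="s3+http://my-backup-bucket/laptop"  # hypothetical bucket and prefix
PASSPHRASE_FILE="$HOME/.duplicity-passphrase" # hypothetical passphrase file

# run_backup
# Runs one duplicity backup (full the first time, incremental afterwards).
# The work happens in a subshell so the exported secrets never leak into
# the calling shell's environment.
run_backup() {
    (
        AWS_ACCESS_KEY_ID="$S3_ACCESS_KEY_ID"
        AWS_SECRET_ACCESS_KEY="$S3_SECRET_ACCESS_KEY"
        # Stop before contacting S3 if the passphrase file cannot be read.
        PASSPHRASE=$(cat "$PASSPHRASE_FILE") || exit 1
        export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY PASSPHRASE
        duplicity --encrypt-key "$GPG_KEY" "$SOURCE_DIR" "$DEST_URL"
    )
}
```

Save this to a file, add a final line that calls run_backup, make the file executable and readable only by you (chmod 700), and schedule it with cron. A crontab entry such as 0 3 * * * /path/to/backup.sh (path hypothetical) would run it every night at 3 a.m.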