Bash Script: Incremental Encrypted Backups with Duplicity (Amazon S3)

Update (5/6/12): I have not been actively developing this script lately. Zertrin has stepped up to take over the reins and offers an up-to-date and modified version with even more capabilities. Check it out over at github.

This bash script was designed to automate and simplify the remote backup process of duplicity on Amazon S3. Once the script is configured, you can easily back up, restore, verify and clean your data (either via cron or manually) without having to remember lots of different command options and passphrases.

Most importantly, you can easily back up the script and your gpg key in a convenient passphrase-encrypted file. This comes in handy if/when your machine ever does go belly up. Code is hosted at github.

How to use

To get the latest code you can download a zip copy of the source or clone the git repository like so:

  • git clone git://github.com/thornomad/dt-s3-backup.git

You’ll also need a few things in place in order to use this script, specifically: gpg, duplicity, an Amazon S3 account, and (optionally) s3cmd. If you need help getting all these in order, I wrote another post about putting it all together. It’s not all that difficult, but it does require a few pieces of the puzzle to be in place.
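
On a Debian or Ubuntu box, for example, the pieces can usually be installed straight from the package manager (package names may vary on other distributions):

    $ sudo apt-get install duplicity gnupg s3cmd python-boto

The python-boto library is what duplicity uses to talk to S3, so make sure it is present as well.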

Once you have the script, you will need to fill out the foobar placeholder variables with your own specific information. I suggest testing the script first on a small directory of files, with a local directory as your destination, to make sure it is working.
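
To give you a rough idea, a filled-in settings block looks something like this (the variable names are partly reconstructed from the discussion in the comments below and all the values are placeholders, so check them against your copy of the script):

    # Amazon S3 credentials and the GPG key used for encryption/signing
    export AWS_ACCESS_KEY_ID="your-access-key-id"
    export AWS_SECRET_ACCESS_KEY="your-secret-key"
    GPG_KEY="your-gpg-key-id"
    PASSPHRASE="your-gpg-passphrase"

    # what to back up and where to send it
    ROOT="/home"
    DEST="s3+http://your-bucket-name/backup-folder/"
    INCLIST=( "/home/user/documents" "/home/user/photos" )

    # force a full backup every 14 days, keep the last two full sets, and email the log
    STATIC_OPTIONS="--full-if-older-than 14D --s3-use-new-style"
    CLEAN_UP_TYPE="remove-all-but-n-full"
    CLEAN_UP_VARIABLE="2"
    EMAIL_TO="you@example.com"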

Usage

From the README file:

COMMON USAGE EXAMPLES
=====================

* View help:
    $ dt-s3-backup.sh

* Run an incremental backup:
    $ dt-s3-backup.sh --backup

* Force a one-off full backup:
    $ dt-s3-backup.sh --full

* Restore your entire backup:
    $ dt-s3-backup.sh --restore
    You will be prompted for a restore directory

    $ dt-s3-backup.sh --restore /home/user/restore-folder
    You can also provide a restore folder on the command line

* Restore a specific file in the backup:
    $ dt-s3-backup.sh --restore-file
    You will be prompted for a file to restore to the current directory

    $ dt-s3-backup.sh --restore-file img/mom.jpg
    Restores the file img/mom.jpg to the current directory

    $ dt-s3-backup.sh --restore-file img/mom.jpg /home/user/i-love-mom.jpg
    Restores the file img/mom.jpg to /home/user/i-love-mom.jpg

* List files in the remote archive:
    $ dt-s3-backup.sh --list-current-files

* Verify the backup:
    $ dt-s3-backup.sh --verify

* Backup the script and gpg key (for safekeeping):
    $ dt-s3-backup.sh --backup-script
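
To run the backup unattended (the script was written with cron in mind), a crontab entry along these lines will do the trick; adjust the path and schedule to taste:

    # run an incremental backup every night at 3 AM
    0 3 * * * /location/of/your/dt-s3-backup.sh --backup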

Changes

You can view the changelog at github.

185 Comments (newest first)

  5. [...] backups I recommend use duplicity and DT-S3-Backup bash [...]

  7. John says:

    Hi there

    I keep getting the “Oops!! ./dt-s3-backup.sh was unable to run!
    We are missing one or more important variables at the top of the script.
    Check your configuration because it appears that something has not been set yet” error. Is there something I’m missing? I’ve gone through the script a number of times.
    Thanks in advance.

    • Damon says:

      So – there are a bunch of placeholder values in the script; I suspect you have missed one of them. I can’t remember offhand, but I think if it finds a foobar or the like in the script settings it will throw that error. Check out the development of the clone on github since I am not using this currently. Ta

  8. olo says:

    Hi,
    does this method support Amazon Glacier?

    • Zertrin says:

      Being only a wrapper script for duplicity, it supports all backends that duplicity supports.

      For support of Amazon Glacier, you should contact duplicity’s team upstream.

      This said, it seems that the way Glacier works is very different compared to “normal” backends (S3, FTP(S), SFTP, local…) so I’m not expecting to see support of Glacier soon.

  9. olo says:

    Does this method support Amazon Glacier?

    • John says:

      Glacier wouldn’t support remote differential backup without a local index – so you’d be better off backing up using duplicity locally and then copying the backups to Glacier – writing each incremental as it appeared.

  10. John says:

    Hi.

    Thanks for sharing this.

    Does S3 preserve the file permissions backed up by duplicity?

  12. Lou says:

    Is there any way to specify the duplicity time variable in order to restore a specific backup?
    Something like the man page gives here:

    duplicity [restore] [options] [--file-to-restore ] [--time time] source_url target_directory
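
    i.e. the raw duplicity equivalent of something like this (the paths and bucket are made up):

    duplicity restore --file-to-restore img/mom.jpg --time 2011-05-01 s3+http://my-bucket/backup-folder /tmp/restored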

  13. Adam K says:

    Does anyone know the reason I get the error below when I type in an email address?

    EMAIL_TO= “contact@email.com”
    EMAIL_FROM= “backup@email.com”
    EMAIL_SUBJECT= “Server 1 backup status”

    If I remove the email addresses and the subject, I don’t get any errors

    root@server1 [/usr/local/sbin]# sh dt-s3-backup.sh --backup
    dt-s3-backup.sh: line 122: contact@email.com: command not found
    dt-s3-backup.sh: line 123: backup@email.com: command not found
    dt-s3-backup.sh: line 124: Server 1 backup status: command not found
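
    The extra spaces after the equals signs are the likely culprit here: bash does not allow whitespace around the = in an assignment, so a line like EMAIL_TO= "contact@email.com" sets an empty EMAIL_TO and then tries to run the address itself as a command, which matches the “command not found” errors above. Written without the spaces they should work:

    EMAIL_TO="contact@email.com"
    EMAIL_FROM="backup@email.com"
    EMAIL_SUBJECT="Server 1 backup status"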

  15. Christos Prassas says:

    Hi,
    Could you please publish a version of the script for backing up to an ftp server other than Amazon?
    Your script is the only one that puts it all together (full or incremental backup, restore, email, encryption) and I must congratulate you!
    It would be very useful if we could do some backups to our own ftp servers (home ftp server, office ftp server, NAS, etc.).
    Thank you

    Regards
    Christos

    • Damon says:

      Hi Christos – I’m not actively developing the script at this time; however, you’re welcome to fork the project at github. Perhaps someone else has already forked it and you could ask them as well. Take care. Thanks.

    • Zertrin says:

      Hi Christos!

      Based on the many patches to Damon’s original code spread across the forks of this project on GitHub, I built a consolidated version of the script that should be able to back up to destinations other than Amazon S3 as well.

      You can find this version on my repo https://github.com/zertrin/duplicity-backup

      It works for me with an SFTP backend.

      Any feedback or merge request is welcome.

  16. James says:

    I have made some changes that look like they fix the script to handle paths containing spaces properly. I have tested backup and restore.

    I’ve never contributed to an open source project before so not sure whether/how to upload the code to “git”. If someone is looking after that I’ll gladly email my version of the script.

    Thanks

    James

  17. bmilesp says:

    Hello,

    Great script, I’ve just used it on Ubuntu 11.10 and everything works fine except that filenames containing spaces are split at the space and treated as separate files!

    Any idea how to fix this?

    Thank you very much, again!

    • bmilesp says:

      Oh, forgot to mention the error message. Say I have a file named “fileWith ASpace.png”

      I will get an error msg like:

      du: cannot access `/root/folder/path/fileWith’: no such file or directory
      du: cannot access `ASpace.png’: No such file or directory

      it seems to treat the space as a delimiter.

      thanks again.

  18. Stratos says:

    Thank you very much for your script. Is it possible to restore to a specific date?

    Great job!
    S

  19. Mark Ruys says:

    If you want to let the script send mail, and you use SELinux, you need to rewrite:

    ${MAIL} -s """${EMAIL_SUBJECT}""" $EMAIL_FROM ${EMAIL_TO} < ${LOGFILE}

    to:

    cat ${LOGFILE} | ${MAIL} -s """${EMAIL_SUBJECT}""" $EMAIL_FROM ${EMAIL_TO}

  20. [...] dt-s3-backup.sh: a slick shell script that ties all these tools together [...]

  21. Dave says:

    I was thinking of using this to backup media from my server (pictures and music) which is about 100Gb. I want to just do a single full backup and only incremental backups from then on. Do you think it would be reliable enough to do this?

    • Patrick says:

      Hi Dave

      I have similar needs, and I decided to just go with s3sync for backing up my photos. This won’t store them encrypted in S3, but they will be encrypted in transit to/from S3, and you can make the bucket private. This will keep anyone except rogue Amazon employees (or intruders that bypass Amazon’s protections) from snooping through your stuff. So, for really sensitive stuff (documents, personal information, etc) I use duplicity, and make a full backup every 2 weeks. For less sensitive stuff, like photos, I use s3sync to maintain the mirror incrementally.

      I don’t know if duplicity would work well if you kept doing incremental backups infinitely. I suspect that it wouldn’t, especially when it came time to restore from backup, since it would need to play back through all of the incremental backup files.

  22. Al says:

    Two questions.

    1) Does this include compression? If not, how would you recommend adding compression?

    2) If I reformat my HDD & would like to restore the directory, do I simply import my gpg keys to do so?

    • Damon says:

      Hey there -

      [1] I believe duplicity compresses everything by default; I think there may be a --no-compression flag if you don’t want it.

      [2] You can run the script to automatically backup your keys for you: $ dt-s3-backup.sh --backup-script. You then will have to import your new private key before attempting to restore. I would test your backup/restore system before you format your hard drive … at least, that’s what I would do!
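
      On the new machine that would look roughly like this (the file names below are just placeholders for whatever --backup-script actually wrote out):

      gpg -d your-backup-archive.txt | tar xvf -   # prompts for the passphrase, unpacks the archive
      gpg --import path/to/your-exported-key       # import the key(s) before restoring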

      Good luck.

  23. Edward says:

    This might be a dumb question, so I’ll apologize in advance if it is.

    Let’s say I back up a 20 GB MySQL full backup file to S3. Then, tomorrow, I run another full backup: when I push the new full backup, is it going to write the entire 20 GB back to S3 or does it know enough to only write the difference?

    • Damon says:

      The full backup will force a write of all 20 GB again. An “incremental” backup just does the changes. It’s a good idea to have a full backup every so often (maybe once a month), just in case there is some corruption during an incremental backup … but you don’t have to run them every day.

  24. Al says:

    A backup to file works, but I get the following message when I try to do a full to an s3 bucket:

    Traceback (most recent call last):
    File "/usr/bin/duplicity", line 1257, in
    with_tempdir(main)
    File "/usr/bin/duplicity", line 1250, in with_tempdir
    fn()
    File "/usr/bin/duplicity", line 1136, in main
    action = commandline.ProcessCommandLine(sys.argv[1:])
    File "/usr/lib/python2.6/dist-packages/duplicity/commandline.py", line 923, in ProcessCommandLine
    backup, local_pathname = set_backend(args[0], args[1])
    File "/usr/lib/python2.6/dist-packages/duplicity/commandline.py", line 816, in set_backend
    globals.backend = backend.get_backend(bend)
    File "/usr/lib/python2.6/dist-packages/duplicity/backend.py", line 153, in get_backend
    return _backends[pu.scheme](pu)
    File "/usr/lib/python2.6/dist-packages/duplicity/backends/botobackend.py", line 45, in __init__
    from boto.s3.key import Key
    ImportError: No module named boto.s3.key

    Traceback (most recent call last):
    File "/usr/bin/duplicity", line 1257, in
    with_tempdir(main)
    File "/usr/bin/duplicity", line 1250, in with_tempdir
    fn()
    File "/usr/bin/duplicity", line 1136, in main
    action = commandline.ProcessCommandLine(sys.argv[1:])
    File "/usr/lib/python2.6/dist-packages/duplicity/commandline.py", line 915, in ProcessCommandLine
    globals.backend = backend.get_backend(args[0])
    File "/usr/lib/python2.6/dist-packages/duplicity/backend.py", line 153, in get_backend
    return _backends[pu.scheme](pu)
    File "/usr/lib/python2.6/dist-packages/duplicity/backends/botobackend.py", line 45, in __init__
    from boto.s3.key import Key
    ImportError: No module named boto.s3.key

    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    An unexpected error has occurred.
    Please report the following lines to:
    s3tools-bugs@lists.sourceforge.net
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    Problem: AttributeError: 'S3Error' object has no attribute 'Code'
    S3cmd: 0.9.9.91

    Traceback (most recent call last):
    File "/usr/bin/s3cmd", line 1736, in
    main()
    File "/usr/bin/s3cmd", line 1681, in main
    cmd_func(args)
    File "/usr/bin/s3cmd", line 44, in cmd_du
    subcmd_bucket_usage(s3, uri)
    File "/usr/bin/s3cmd", line 70, in subcmd_bucket_usage
    if S3.codes.has_key(e.Code):
    AttributeError: 'S3Error' object has no attribute 'Code'

    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    An unexpected error has occurred.
    Please report the above lines to:
    s3tools-bugs@lists.sourceforge.net
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    Any suggestions?

    • Damon says:

      Do you have py-boto installed?

      • Al says:

        That solved it. For anyone else who has this problem & is wondering, py-boto is also known as python-boto.

        • I’m getting this same error, and I do have python-boto installed:

          # rpm -q python-boto
          python-boto-1.9b-2.fc12.noarch

          The backups seem to work fine, except for this error. Any ideas how to fix?

          • Sorry – and when I say “this same error” I’m referring only to this error:

            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
            An unexpected error has occurred.
            Please report the following lines to:
            s3tools-bugs@lists.sourceforge.net
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

            Problem: AttributeError: 'S3Error' object has no attribute 'Code'
            S3cmd: 1.0.0-rc1

            Traceback (most recent call last):
            File "/usr/bin/s3cmd", line 1899, in
            main()
            File "/usr/bin/s3cmd", line 1843, in main
            cmd_func(args)
            File "/usr/bin/s3cmd", line 79, in cmd_du
            subcmd_bucket_usage(s3, uri)
            File "/usr/bin/s3cmd", line 105, in subcmd_bucket_usage
            if S3.codes.has_key(e.Code):
            AttributeError: 'S3Error' object has no attribute 'Code'

            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
            An unexpected error has occurred.
            Please report the above lines to:
            s3tools-bugs@lists.sourceforge.net
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

  25. Chris Box says:

    Love the script – it works great. But --backup-script didn’t work for me because GPG was unable to pop up the pinentry dialog.

    The fix required was to add

    export GPG_TTY=`tty`

    before the
    tar c ${TMPDIR} | gpg -aco ${TMPFILENAME}

    line.
    Thanks,
    Chris

  26. Sebastian says:

    Hey,

    thx for this nice HowTo.
    The script runs fine when I start it from the shell. But if I let it run with crontab (Ubuntu 11.04):

    00 23 * * * /srv/cronjobs/backup_s3.sh --backup

    it isn’t working and this is the log I get:

    ——– START DT-S3-BACKUP SCRIPT ——–

    Local and Remote metadata are synchronized, no sync needed.
    Last full backup date: none
    Last full backup is too old, forcing full backup
    ———–[ Duplicity Cleanup ]———–
    Local and Remote metadata are synchronized, no sync needed.
    Last full backup date: none
    No old backup sets found, nothing deleted.

    ———[ Source File Size Information ]———
    /home/ 23M

    ——[ Destination File Size Information ]——
    Current Remote Backup File Size: 4.0K

    ——– END DT-S3-BACKUP SCRIPT ——–

    Any ideas?
    thx

    • Damon says:

      Hmm – are you running it as root or as a user? You can uncomment line 109 and then see your log file (from the cron run) and it will output the duplicity command that is being generated … you can compare this command to when you run it from the command line and see where you are going wrong.

      Not sure what’s up from that log, honestly.

  27. Raul says:

    Does it work with Rackspace’s Cloud Files?

    I cannot create an Amazon S3 account because of an error with my credit card (MasterCard). I’m not from the USA.

    I contacted support 48 hours ago without a response.

    Seems Amazon doesn’t want my money

    Thanks

  28. Shogun says:

    Hi,

    I am a newbie to Linux

    I want to run a cron job with this:

    1. I want to run a full backup first
    2. Run an incremental backup every day at 03:00 am
    3. And remove backups older than 30 days (already set up in the script)

    And start again: 1, 2 and 3.

    How do I create it?
    Which commands do I set?

    Please help me

    Thanks,

    • Damon says:

      Hi – welcome to linux. You are right, you need to set up a cron job.

      crontab -e

      Will allow you to edit your crontab.

      You don’t need to worry about step #1 because if it is your first backup, it will always be a full backup.

      To run a command every day at 3 AM you want the command to be:

      0 3 * * * /location/of/your/dt-s3-backup.sh --backup

      Good luck. Google “crontab” for more info.

      • Shogun says:

        Hi Damon,

        Thanks for your help, works perfect.

        Also thanks for the Script.

        I come from Windows 7

        I started with Linux (Ubuntu) and it is fantastic and powerful.

        Even so, I have to read more.

        Thanks

  29. Ed says:

    Hey there,

    Excellent bash script. I made one tiny change to dt-s3-backup.sh line 97 (for me anyway)

    LOG_FILE="duplicity-`date +%Y-%m-%d-%H-%M`.txt"

    the addition of -%H ensures the hour is correctly output in the logs. Prior to this I think it just had minutes – the hour was missing.

    Ed

    • Damon says:

      That’s a good idea – I really only added the %M tag for testing to keep track of runs I was making. I will add that to the script when I get a chance.

      Thanks.

  30. Michael says:

    Any hint why the duplicity options “remove-all-but-n-full” and “remove-older-than” don’t work as expected?
    No files are deleted on the S3 backend, so the backup files on S3 grow “forever”. Users of this script have to do this manually (or use a separate cron job).

    • Damon says:

      I am not sure – I was not having this problem when I was using it. The script just generates a duplicity command … you can view this command by uncommenting line 109 in the script … this will simply output the command that the script is generating in your logfile. You can then diagnose if the command is being generated with an error inherent in it, or if the problem is with duplicity.

      • Ed says:

        Hi Damon,

        I’m having the same issue as Michael re: lots of incremental files building up on aws. I have configured your script as follows:

        STATIC_OPTIONS="--full-if-older-than 14D --s3-use-new-style"

        CLEAN_UP_TYPE="remove-all-but-n-full"
        CLEAN_UP_VARIABLE="3"

        At the moment I’ve been running the script for a solid week; I started with a full backup and incrementals have been running hourly since then with no problems via a cronjob.

        In AWS I now have the 1 full backup I took the first time I set up the script, a ton of incrementals since then, and I also just ran another manual full backup.

        I can see the new full backup in aws but I still see all the older incrementals.

        My log file from the full backup has the following related to cleanup:

        ———–[ Duplicity Cleanup ]———–
        Local and Remote metadata are synchronized, no sync needed.
        Last full backup date: Sat May 7 01:12:47 2011
        No old backup sets found, nothing deleted.

        It looks like it’s not seeing any of the incrementals. Will it delete these when full backups hit 3?

        • Damon says:

          It is supposed to! You can try the recommendation I made above (uncomment line 109) and see what duplicity command is being generated …

          I don’t have anything to do with how duplicity is operating … the script just makes the command easier. If we could see the command maybe we can see the problem.

          When I was using this script it was deleting old backups as specified.

          Good luck!

          • Ed says:

            Hey Damon,

            I uncommented the “ECHO=$(which echo)” line in my script – it was line 106 so repeating the actual line here just in case others are wondering what should be on line 109 :)

            From my last log I can see the duplicity commands and it looks like your script is running the right duplicity commands (as you might have guessed!).

            I wonder if my commands are 100% right.

            I had left the defaults that you had:

            STATIC_OPTIONS="--full-if-older-than 14D --s3-use-new-style"

            I just realised my bucket is in the US Standard region. I’ll remove that flag and see if that makes a difference. Do you think it might?

            So I’m 100% clear – if you were running this script would it remove all the incremental files and just leave the full backups or will the incremental files remain forever in S3? Sorry if this is a really stupid question – just want to make sure I have everything setup correctly!

            Ed

            ——– START DT-S3-BACKUP SCRIPT ——–

            /usr/bin/duplicity -v3 --full-if-older-than 14D --s3-use-new-style --encrypt-key=MYKEYID --sign-key=MYKEYID --include=/$
            ———–[ Duplicity Cleanup ]———–
            /usr/bin/duplicity remove-all-but-n-full 3 --force --encrypt-key=MYKETID --sign-key=5B691EFF s3+http://MYBUCKET/

            ———[ Source File Size Information ]———

            • Damon says:

              Right – it will state in the log, too, when it is deleting things … it will list the backups it has removed. Is that $ meant to be there?

              • Ed says:

                Hey Damon,

                sorry that $ is just the terminal output where my backup list would normally be.

                I think I’m getting somewhere with this and it is probably just down to my lack of a full understanding of how duplicity works.

                My original rules as defined were:

                STATIC_OPTIONS="--full-if-older-than 14D --s3-use-new-style"
                CLEAN_UP_TYPE="remove-all-but-n-full"
                CLEAN_UP_VARIABLE="3"

                Given that I’ve only been running this script for a little over a week, I’m 99% sure that I currently only have 1 full backup (the initial one), and as 14 days have not passed the script has not run another one yet. Which therefore means the clean-up conditions (remove all but 3 full) are not yet being met.

                I’m going to change my static options to do a full backup every 7 days (a good balance of backup integrity vs. bandwidth costs) and set incrementals to run every hour in between.

                I will then change clean up type to keep the last 2 full backups and see how that works out.
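
                In config terms I believe that works out to:

                STATIC_OPTIONS="--full-if-older-than 7D --s3-use-new-style"
                CLEAN_UP_TYPE="remove-all-but-n-full"
                CLEAN_UP_VARIABLE="2"

                with the hourly incrementals left to the cron job.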

                I’ll let you know how it goes!

                Thanks again for a wonderful script and for your kind assistance and support here.

                Ed

  31. [...] (check also author’s blog post) – shell script performing a backup to a S3 bucket using [...]

  32. Michael says:

    Thanks for the script, it’s really helpful.
    One suggestion: include an option like:

    --collection-status

  33. Paul says:

    In check_variables you check for INCLIST so…: “if you want to include everything that is in root, you could leave this list empty (I think).” – then you’re wrong, right ;-)
    (I just tried using “/” as INCLIST, and that seems to work.)

    BTW, to STATIC_OPTIONS I added "--s3-european-buckets" because I’m in Europe, and on newer versions of duplicity I think it’s good to add "--s3-unencrypted-connection" for performance. (The data is encrypted already, right? And otherwise it’s really slow.)
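
    So my STATIC_OPTIONS line ends up looking something like this (keeping the defaults the script ships with):

    STATIC_OPTIONS="--full-if-older-than 14D --s3-use-new-style --s3-european-buckets --s3-unencrypted-connection"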

  34. Sam says:

    Hi
    Thanks for the script and detailed post.

    I am using your backup script. I configured everything (I think) and when I do
    --verify, I get no errors.

    However when I run
    dt-s3-backup.sh --list-current-files
    or
    dt-s3-backup.sh --full

    I get the following error
    BackendException: Boto requires a bucket name.

    I can connect using the s3cmd options (like s3cmd ls).

    I am running
    Ubuntu 8.04 LTS
    Python 2.5.2
    duplicity 0.6.11
    s3cmd 0.9.5

  35. Marc says:

    Ah ok, got it:
    sudo gpg --edit-key MYKEY
    trust (1)

    will allow you to force the trust on the key