
HOWTO: Automate remote backups using rdiff-backup and perl

by Gavin Henry on September 3, 2004.


Continuing with my backup articles (part two of my Amanda series coming soon...), I thought I would tell you about how I do my remote backups. The program I use is rdiff-backup, with a perl script to sort out e-mail notification and logfile generation. I will take you through my script and show you how to enable passwordless ssh access using public and private keys, so no interaction is required for fully automated backups.

What is it?
rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership, and modification times. Also, rdiff-backup can operate in a bandwidth efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back a hard drive up to a remote location, and only the differences will be transmitted. Finally, rdiff-backup is easy to use and settings have sensical defaults.

rdiff-backup, as you read, works like rsync (it uses the librsync library), so why am I not using rsync itself? Well, I couldn't be bothered configuring it; I just wanted to run one command that handled everything, including incremental backups. So let's begin.


As with all backups, you need to plan what you would like to back up. For this howto, I have chosen only one directory, using the simplest rdiff-backup method. Basically, I want to back up my perl scripts directory for safe keeping. I will use the fewest possible configuration parameters, but if you would like to see more options, as usual, type man rdiff-backup or visit the examples page.

Next, you should decide on what machine you will be sending the backups to. I will be using my home machine as the backup, which will receive the files from my work machine. Lastly, make sure you have ssh access to both machines, as we will be using scp.


First, we need to install rdiff-backup on both machines. If you have Dag Wieers' yum or apt sources configured, then all you need to do, as root, is run one of:

[ghenry@work myscripts]$ apt-get install rdiff-backup
[ghenry@work myscripts]$ yum install rdiff-backup
If not, grab the RPMs from the rdiff-backup site, half way down the front page. Install them the usual way with:
[ghenry@work myscripts]$ rpm -Uvh rdiff-backup-0.12.7-1.i386.rpm

Now we need to set up passwordless ssh access. If we call my home machine home and my work machine work (clever eh?), we need to tell home that work is allowed to log in using a key pair. The key pair needs to be generated on the work machine only. (See Daniel's ssh article for more info).

First, generate a public/private DSA key pair on work.

[ghenry@work myscripts]$ ssh-keygen -t dsa -f ~/.ssh/id_dsa
When you are asked for a passphrase, leave it empty; this is what gets rid of the password prompt. Now send the public key to home:
[ghenry@work myscripts] cd .ssh
[ghenry@work myscripts] scp ghenry@home:~/.ssh
Next, log in to home and add the public key to the list of authorized keys.
[ghenry@work myscripts] ssh ghenry@home
[ghenry@home backup] cd .ssh
[ghenry@home backup] cat >> authorized_keys2
[ghenry@home backup] chmod 640 authorized_keys2
[ghenry@home backup] rm -f
Note that the filename is authorized_keys2, not authorized_keys. That's it; you're ready to ssh from work to home without having to enter a password (as long as you have not previously messed with /etc/ssh/sshd_config).
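If you would rather not do the home side by hand, the append step can be wrapped in a small shell function. This is only a sketch: install_pubkey is a name I made up, and I am assuming the public key file is id_dsa.pub, which is what ssh-keygen -f ~/.ssh/id_dsa writes next to the private key. It appends the key only once and keeps the file private to your user (I use 600 here rather than 640).

```shell
#!/bin/sh
# Sketch only: install_pubkey is a hypothetical helper, not part of ssh.
# Usage: install_pubkey <public-key-file> <authorized-keys-file>
install_pubkey() {
    pubkey=$1
    authfile=$2
    # Keep the directory private to the user; sshd's StrictModes check
    # (on by default) refuses keys kept where other users can write.
    mkdir -p "$(dirname "$authfile")"
    chmod 700 "$(dirname "$authfile")"
    touch "$authfile"
    # Append the key only if that exact line is not already in the file
    grep -qxFf "$pubkey" "$authfile" || cat "$pubkey" >> "$authfile"
    chmod 600 "$authfile"
}

# Assumed file names, adjust to your own setup.
# On work:
#   scp ~/.ssh/id_dsa.pub ghenry@home:~/.ssh/
# On home:
#   install_pubkey ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys2
#   rm -f ~/.ssh/id_dsa.pub
```

Running it twice is harmless, which is handy if you are not sure whether you already added the key.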

Now that's all done, we can move on to rdiff-backup and the perl script.

For those of you unfamiliar with perl, I would suggest buying the O'Reilly Learning Perl book, as it is the best.

Here is my script called rdiff-script (which is always being updated and is available from my perl site). This is actually my first ever perl script, so gurus be very kind, and I know it can be shorter and cleaner, but I am getting there. I am still only on Chapter 8 of Learning Perl you know ;-):


#!/usr/bin/perl
use strict;
use warnings;
use Mail::Sendmail;
use POSIX qw(strftime);

#  program:	rdiff-script                    #
#  license:	GPL                             #
#  author:	Gavin Henry                     #
#  company:	Suretec Systems Ltd.            #
#  url:   #
#  version:	v1.0                            #
#                                               #
#  first draft : 30-08-04                       #
#  last update : 03-09-04			#

# Global variables
my $rdiff       = '/usr/bin/rdiff-backup';
my $option1     = '-v5';
my $option2     = '--print-statistics';
my $localdir    = '/your/localdir';
my $userhost 	= 'you@home';
my $remotedir 	= '/your/remotedir';
my @args        = ( $option1, $option2, $localdir, "$userhost\:\:$remotedir" );
my $to          = '';
my $from        = '';
my $sep         =  '-' x 76 . "\n";
my $time        = localtime;  
my $datestamp   =  strftime '%d.%m.%y.%T', localtime;

# Messages
print "\n", $sep, "Brought to you by Your Name.\n", $sep;
print "\n", $sep, "Initialising remote backup synchronsation on $time.\n", $sep;

# getting exit code and program output
my $bdata = `$rdiff @args`;
my $backup = $?;

# Send e-mail with a few details for success and failures
# Success
if ($backup == 0) {
    my %mails = (
        To      => $to,
        From    => $from,
        Subject => "Remote backup complete from $ENV{HOSTNAME} on $time",
        Message => "The remote backup has been completed on $ENV{HOSTNAME}"
                    . " on $time with the command:\n\n $rdiff @args\n"
                    . " The command's output was \n\n$bdata\n\n",
    );
    # sendmail() returns false on failure and sets $Mail::Sendmail::error
    sendmail(%mails) or warn "Cannot send e-mail: $Mail::Sendmail::error";

    # Success finish message
    print "\n", $sep, "Remote backup complete on $time. E-mail sent with details.\n", $sep;

    # Create a success logfile
    open my $log, '>>', "$datestamp-rdiff-backup-success.log"
      or die "Cannot create logfile: $!";
    print $log "Remote backup completed on $time, with the command:\n\n$rdiff @args\n\nOutput:\n\n $bdata\n\nAn e-mail has been sent.\n";
    close $log;
    print "Logfile created on $time.\n\n";

# Failure
} else {
    my %mailf = (
        To      => $to,
        From    => $from,
        Subject => "Remote backup failed from $ENV{HOSTNAME} on $time",
        Message => "The remote backup has failed on $ENV{HOSTNAME}"
                    . " on $time with the command:\n\n$rdiff @args\n\n"
                    . " The command's output was \n\n$bdata\n\n",
    );
    sendmail(%mailf) or warn "Cannot send e-mail: $Mail::Sendmail::error";

    # Failure finish message
    print "\n", $sep, "Remote backup failed on $time. E-mail sent with details.\n", $sep;

    # Create a failure logfile
    open my $log, '>>', "$datestamp-rdiff-backup-failed.log"
      or die "Cannot create logfile: $!";
    print $log "Remote backup failed on $time, with the command:\n\n$rdiff @args\n\nOutput:\n\n $bdata\n\nAn e-mail has been sent.\n";
    close $log;
    print "Logfile created on $time.\n\n";
}

# Exit with an error if the backup failed
die "Backup exited funny: $backup" unless $backup == 0;

# Program complete

For the above to work, you need to change a few variables, namely the e-mail addresses and the directories. It's pretty easy to see what's what.

You also need the Mail::Sendmail module, which can be installed as root as follows:

[ghenry@work myscripts]$ perl -MCPAN -e shell
cpan> install Mail::Sendmail

Once you have changed those, you can run the script. The first run creates a full backup, as there are no files on the remote machine yet. From then on it creates incremental backups (seen below): it works out the differences between the local machine and the remote one and sends only those. The script e-mails you on the success or failure of the backup. The output of rdiff-backup is included in the body of the e-mail and is also saved to a logfile in the directory from which the script was called.

[ghenry@database myscripts]$ ./rdiff-script
Brought to you by Your Name.
Initialising remote backup synchronsation on Fri Sep  3 11:05:08 2004.
Processing changed file .
Incrementing mirror file /home/ghenry/backup/perl
Processing changed file myscripts
Incrementing mirror file /home/ghenry/backup/myscripts
Processing changed file myscripts/
Incrementing mirror file /home/ghenry/backup/perl/myscripts/
Processing changed file myscripts/rdiff-script
Incrementing mirror file /home/ghenry/backup/perl/myscripts/rdiff-script
Processing changed file myscripts/rdiff-script~
Incrementing mirror file /home/ghenry/backup/perl/myscripts/rdiff-script~
Remote backup complete on Fri Sep  3 11:05:08 2004. E-mail sent with details.
Logfile created on Fri Sep  3 11:05:08 2004.
Now if it fails, you will get something like:
Brought to you by Your Name.
Initialising remote backup synchronsation on Fri Sep  3 11:33:29 2004.
ssh: fakedomain: Name or service not known
Fatal Error: Truncated header string (problem probably originated remotely)
Couldn't start up the remote connection by executing
    ssh -C ghenry@fakedomain rdiff-backup --server
Remember that, under the default settings, rdiff-backup must be
installed in the PATH on the remote system.  See the man page for more
information on this.  This message may also be displayed if the remote
version of rdiff-backup is quite different from the local version (0.12.7).
Remote backup failed on Fri Sep  3 11:33:29 2004. E-mail sent with details.
Logfile created on Fri Sep  3 11:33:29 2004.
Backup exited funny: 256 at ./rdiff-script line 89.
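The 256 in that last line is perl's raw $? value, not the exit code itself: the exit code sits in the high byte and the low seven bits hold a signal number if the command was killed. A rough sketch of the arithmetic, with decode_status being a name I made up for illustration:

```shell
#!/bin/sh
# Sketch only: decode_status is a hypothetical helper for reading the raw
# wait status that perl stores in $? after running a command in backticks.
# Usage: decode_status <raw-wait-status>
decode_status() {
    status=$1
    # High byte: the exit code the command passed to exit()
    code=$(( status >> 8 ))
    # Low seven bits: the signal that killed the command, 0 if none
    signal=$(( status & 127 ))
    echo "exit=$code signal=$signal"
}

decode_status 256    # prints "exit=1 signal=0"
```

So rdiff-backup exited with code 1 here, which is what you would expect when ssh cannot find the host.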
The contents of the logfile and e-mail look something like this for a successful run (for a failure they contain the same output as shown above):
Remote backup completed on Fri Sep  3 11:28:26 2004, with the command:
/usr/bin/rdiff-backup -v5 --print-statistics /home/ghenry/perl ghenry@home::/home/ghenry/backups/perl
Output: Executing ssh -C ghenry@home rdiff-backup --server

--------------[ Session statistics ]--------------
StartTime 1094210228.00 (Fri Sep  3 12:17:08 2004)
EndTime 1094210305.33 (Fri Sep  3 12:18:25 2004)
ElapsedTime 77.33 (1 minute 17.33 seconds)
SourceFiles 10719
SourceFileSize 61807347 (58.9 MB)
MirrorFiles 10719
MirrorFileSize 61807331 (58.9 MB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
DeletedFileSize 0 (0 bytes)
ChangedFiles 3
ChangedSourceSize 3139 (3.07 KB)
ChangedMirrorSize 3123 (3.05 KB)
IncrementFiles 3
IncrementFileSize 608 (608 bytes)
TotalDestinationSizeChange 624 (624 bytes)
Errors 0
An e-mail has been sent.
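Once a few increments have built up, you will want to see what is there and pull an old version back. Here is a minimal sketch using rdiff-backup's --list-increments and -r (restore as of) options. The function names and the RDIFF variable are my own inventions, and the host and paths in the usage lines are the same placeholders as in the script.

```shell
#!/bin/sh
# Sketch only: the function names and the RDIFF variable are placeholders.
# Point RDIFF at another command to try these without touching a real backup.
RDIFF=${RDIFF:-rdiff-backup}

# List the increments stored for a remote backup directory.
# Usage: list_increments <user@host> <remote-dir>
list_increments() {
    "$RDIFF" --list-increments "$1::$2"
}

# Restore a file or directory as it was some time ago (10D means ten days).
# Usage: restore_as_of <age> <user@host> <remote-path> <local-target>
restore_as_of() {
    "$RDIFF" -r "$1" "$2::$3" "$4"
}

# Example calls (placeholder host and paths):
#   list_increments you@home /your/remotedir
#   restore_as_of 10D you@home /your/remotedir/rdiff-script /tmp/rdiff-script
```

The user@host::path form is the same one the perl script builds in @args, so the ssh keys set up earlier are used here as well.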

This can all be automated via cron using the following settings, which will run the script once a day, every day, at 2am (shows all steps):

[ghenry@work myscripts] crontab -e 
0 2 * * * /home/ghenry/scripts/rdiff-script
[ghenry@work myscripts]$
crontab: installing new crontab
Of course, you can run it more than once a day and at any time you like; see man crontab for more info.
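One thing to watch: cron typically starts jobs in your home directory with a very small environment, while the script writes its logfiles to the current directory and reads HOSTNAME from the environment. A small wrapper can take care of both. This is a sketch under my own assumptions: run_in_logdir and the log directory are made-up names.

```shell
#!/bin/sh
# Sketch only: run_in_logdir is a hypothetical wrapper, not part of rdiff-script.
# Usage: run_in_logdir <log-directory> <command> [args...]
run_in_logdir() {
    logdir=$1
    shift
    mkdir -p "$logdir" || return 1
    # cron often does not export HOSTNAME, which the perl script reads from %ENV
    HOSTNAME=${HOSTNAME:-$(uname -n)}
    export HOSTNAME
    # Run in a subshell so the caller's working directory is left alone
    ( cd "$logdir" && "$@" )
}

# Example call (assumed paths):
#   run_in_logdir "$HOME/backup-logs" /home/ghenry/scripts/rdiff-script
```

Saved as its own script with that example call as the last line, the crontab entry would point at the wrapper instead of at rdiff-script, and all the logfiles end up in one place.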


I have only touched upon the settings of rdiff-backup, but you can see how easy it is. I have taken you through installing it, configuring ssh passkeys, testing and automating. It can be even easier than this if you don't want the perl script. Just enter the crontab entry to call rdiff-backup directly. Of course you won't get the nice e-mails and tailored logfiles :-)

Well, that's it for now. For any comments or corrections, please e-mail me.