Making backups (instructions for GNU/Linux)

The cardinal rule of backing up is: assume that any one of your hard drives could go up in smoke at any moment. Zap. Could be right now. Magnetic disks will fail, and it's not a matter of if, but of when. (Backups will also save you from some user error, although that is not their primary purpose.)

Once you are thoroughly convinced of that, you will be nearly paranoid enough to implement a good backup strategy. This may seem awfully depressing, but the great news is that storage is cheap these days. As of this writing, you can get a 500GB hard disk for under $100. Having a full backup of your files when your primary disk breaks into tiny pieces is worth a lot more than $100. I've lost three disks in the last three years. In no case did I lose any data.

I am not willing to mess with the hardware or software needed to configure a RAID, so my backup solution (based on jwz's PSA) involves a second hard disk and rsync on a cron job.

Whenever I get a new hard disk for a computer, I generally make it the primary disk (containing /, /home, and swap) and graduate the previous disk to the role of a backup and swap disk. The backup partition is formatted with ext3, just like my root and home partitions. Suppose that the backup partition on the second disk is mounted at /media/sdb1 and I want to back up my homedir to /media/sdb1/phil. (My system is pretty much a stock Ubuntu install, so there is little value in backing up stuff outside of /home.)

The following script, archive-homedir, rsyncs my homedir to the backup disk:

#!/bin/sh
rsync -vaxE --delete --ignore-errors /home/phil/ /media/sdb1/phil/
touch /media/sdb1/phil/last-backup

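The options mean, roughly: print the files being copied (-v); recurse and preserve timestamps, permissions, ownership, and symlinks (-a); don't descend into other filesystems mounted under your homedir (-x); preserve executability (-E, largely redundant next to -a); delete files in the backup when they get deleted on the main partition (--delete); and carry out those deletions even if rsync hits I/O errors (--ignore-errors). The script also touches a file so you can see at a glance when the last backup was made.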
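
One case the script does not cover is the backup disk not being mounted. The mount point is then just an empty directory, and rsync may end up creating the destination on the root filesystem instead. Here is a minimal sketch of a guard, assuming the destination directory already exists on the backup partition from an earlier run. The function name backup_home is made up for illustration; the rsync options are the ones used above.

```shell
#!/bin/sh
# Sketch only: back up a directory, but refuse to run when the destination
# directory is missing, which is what an unmounted backup disk looks like.
# backup_home is a hypothetical name, not part of the original script.
backup_home() {
    src=$1
    dest=$2
    if [ ! -d "$dest" ]; then
        echo "backup destination $dest not found; is the disk mounted?" >&2
        return 1
    fi
    rsync -vaxE --delete --ignore-errors "$src/" "$dest/"
    # Same marker file as in the original script.
    touch "$dest/last-backup"
}
```

With the paths from this post, the call would be backup_home /home/phil /media/sdb1/phil. The first backup onto a fresh disk needs the destination directory created by hand.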

The following crontab file, backup.crontab, causes a backup to happen every day at 6 AM:

0 6 * * * /path/to/archive-homedir

Install and activate it with crontab /path/to/backup.crontab. Be aware that this replaces any crontab you already have installed for your user. If your cron is configured to email you the job output, you will get the list of files that were backed up every morning. Watch out for messages telling you that your disks could not be read or written. These generally mean that you need an fsck or a new disk.
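
Cron can only email you about jobs that actually run, so a backup that silently stops running is easy to miss. The last-backup file touched by the script offers a cheap check. The following is a minimal sketch, assuming GNU find (for -mmin). The function name backup_is_stale and the 24-hour default are hypothetical choices for illustration.

```shell
#!/bin/sh
# Sketch only: succeed (exit status 0) when the marker file is missing or
# was last modified more than max_hours ago.
# backup_is_stale is a hypothetical helper name.
backup_is_stale() {
    marker=$1
    max_hours=${2:-24}
    # A missing marker means no backup has completed at all.
    [ -e "$marker" ] || return 0
    # find prints the path only if it was modified more than N minutes ago.
    [ -n "$(find "$marker" -mmin +"$((max_hours * 60))")" ]
}
```

For example, backup_is_stale /media/sdb1/phil/last-backup 36 && echo "backup is overdue" would complain once the marker is more than a day and a half old.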

I use the same kind of setup to back up my laptop disk over the network. On my laptop, archive-homedir looks like this:

rsync -vaxE --delete --ignore-errors --delete-excluded \
--filter="merge /path/to/archive-exclude-patterns" \
/home/phil/ desktop:/media/sdb1/laptop/

where archive-exclude-patterns is a file with a list of filter rules that instruct rsync to include or exclude certain files. I use this file to tell rsync not to back up some files that are not worth transferring over the network, like my web browser cache. My archive-exclude-patterns looks like this:

- /.local/share/tracker/
- /.local/share/Trash/
- /.mozilla/firefox/*/Cache/
- /.thumbnails/
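
Since --delete and --delete-excluded remove files on the receiving side, it can be worth previewing what a new set of filter rules will do before running them for real. Below is a minimal sketch using rsync's --dry-run option, which reports what would be transferred or deleted without changing anything. The function name preview_backup and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: show what a backup with the given filter rules would copy
# and delete, without touching the destination.
# preview_backup is a hypothetical name; arguments are source directory,
# filter rules file, and destination (local path or host:path).
preview_backup() {
    src=$1
    rules=$2
    dest=$3
    rsync -vaxE --dry-run --delete --delete-excluded \
        --filter="merge $rules" \
        "$src/" "$dest/"
}
```

With the laptop setup described here, the call would be preview_backup /home/phil /path/to/archive-exclude-patterns desktop:/media/sdb1/laptop. Excluded paths should be absent from the file list, and anything rsync would remove from the backup shows up on lines starting with "deleting".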

On my laptop, I don't run archive-homedir on a cron job, but I do run it whenever I'm on the same LAN as my desktop.
