Here I outline the solution I chose for backing up my life, er… I mean home folder. (I’m sure there’s life outside /home/tim somewhere…)
My requirements were:
- backup to dvd+rw
- >20GB of data to back up
- no obscure formats (in case I don’t have the backup tool when I need to restore)
I looked at several solutions for backups but ended up writing scripts to meet my needs.
The main script creates a tar file of my home directory, excluding certain items, which is then split into files suitable for writing to dvd+rw discs, with tar based verification, md5sums and file list text files created at the same time.
The reason for splitting to 3 files per disc is that the iso 9660 spec has a 2GB file size limit, and it’s important that the discs are as simple as possible (ie no UDF) to aid recovery in awkward situations. This is also why I avoided compression.
backup_home.sh
#!/bin/bash -v
#DVD+R SL capacity 4,700,372,992 bytes DVD, (see [wikipedia on DVD](http://en.wikipedia.org/wiki/DVD))
#ISO max file size 2GB. 4.38GB/3 = 1,566,790,997bytes = 1,494MB
#1,490MB to leave some space for listings and checksums
tar -cvv --directory /home tim --exclude-from backup_home_exclude.txt | split -b 1490m - /var/backups/tim/home/home.tar.split.
cd /var/backups/tim/home
md5sum home.tar.split.* > home.md5
cat home.tar.split.* | tar -t > home_file_list.txt
cat home.tar.split.* | tar -d --directory /home tim > home_diff.txt
ls -l home.* > home_backup_files.txt
backup_home_exclude.txt
tim/work*
tim/.Trash*
tim/.thumbnails*
This leaves me with a big pile of split files (named .aa, .ab etc) and a few text files. I proceeded to write 3 split files per disc, and put the 4 text files on every disc for convenience. I used gnome’s built in DVD writing to create the discs.
I also wanted to verify the md5 checksums as the discs were created, so I wrote another little script to make life easier. This ensures the newley written disc has been remounted properly, and runs the md5 check. So long as the 3 relevant checksums came out correctly on each disc I can be reasonably confident of recovering the data should I need it.
“eject -t” closes the cdrom, which is handy.
reload_and_verify.sh
#!/bin/bash -v
cd /media
eject
eject -t
mount /media/cdrom
cd cdrom
md5sum -c home.md5
cd /media
eject
In addition to the above mechanism (which is a pain at best, mostly due to media limitations) I keep my machines in sync with unison which I strongly recommend for both technical and non-technical users. I gather it also runs on microsoft (who?), so you might find it useful if you are mid transition.