Kevin's rsync based backup system readme.

NOTE: This is a work in progress. Use at your own risk!

Version 0.22

Each backup is defined by a directory named after the backup. Inside
that directory are settings and tab files instructing the backup system
on what to do when that named backup is run. There is also a "settings"
directory that contains global settings for all backups. If a named
backup sets the same setting as one in the global settings file, the
named backup setting takes priority.

The backup settings go into settings/settings.pl and
[backupname]/settings.pl. The syntax is simply perl variables. The only
setting that must be set is $BackupRoot. All others have defaults which
are listed and explained below.

The possible settings for those files (and their defaults) are:

$BackupRoot="none";                     # where to backup stuff to
                                        # backups will be in
                                        # hostname/_filesystem.date
                                        # under this path
                                        # This setting must be set.

$ArchiveMethod="link-dest|btrfs|zfs";   # what method to use for archiving
                                        # old backups. Should be global.
                                        #
                                        # link-dest means to use rsync
                                        # --link-dest (most compatible)
                                        # (default)
                                        #
                                        # cp is the old cp -al method. This
                                        # is very slow and provides no real
                                        # benefit above link-dest. Note that
                                        # this method is no longer supported.
                                        #
                                        # btrfs is for btrfs subvolumes and
                                        # snapshots. In order to use this
                                        # method your $BackupRoot must be
                                        # a btrfs filesystem and the btrfs
                                        # command must be available and
                                        # capable of making subvolumes and
                                        # snapshots of subvolumes. This is
                                        # not simply a case of doing the same
                                        # thing as link-dest but on btrfs. It
                                        # is a completely different archiving
                                        # system that is much faster,
                                        # especially for purging old backups.
                                        #
                                        # zfs is for zfs subvolumes and
                                        # snapshots. This is similar to the
                                        # btrfs method described above except
                                        # that it uses the zfs filesystem,
                                        # which is only reliable and
                                        # efficient on [Open]Solaris.
                                        # Note that I have not yet written
                                        # the code to support this option
                                        # and I may never, as btrfs on
                                        # Linux is rapidly replacing the
                                        # need for zfs, in addition to
                                        # Oracle rapidly removing the
                                        # availability of it.

$DefaultNumberOldBackups=30;            # number of backups to keep by default

$HumanReadableOutput="yes";             # Tell rsync to use human readable
                                        # outputs.

$BackupACLs="no";                       # Tells rsync to backup file ACLs (-A)

$BackupXATTRs="no";                     # Tells rsync to backup file XATTRs (-X)

$UseNetworkCompression="no";            # want network compression (rsync's -z)?

$UseSSHIdentyFile="none";               # set if you have a non-standard ssh
                                        # key location or leave on "none" if
                                        # not.
                                        # NOTE that this option conflicts with
                                        # $ExtraRsyncParams. I could probably
                                        # fix that but I am not going to
                                        # because I see no good reason to use
                                        # this option instead of the wonderful
                                        # ~/.ssh/config system already
                                        # provided by ssh.

$ForceChecksum="no";                    # set if you want to run rsync with
                                        # --checksum (slow and rarely needed)
                                        # I only recommend using this
                                        # temporarily if you believe that a
                                        # hardware failure has corrupted files
                                        # within the backup without changing
                                        # the mtime or file size.

$TransferWholeFiles="no";               # Tell rsync to always transfer whole
                                        # files instead of using delta xfers.
                                        # This is faster if doing a local only
                                        # transfer.

$UpdateFilesInPlace="yes";              # Use --inplace on rsync. Should not
                                        # be used if files are in use on the
                                        # backup system (doesn't matter on the
                                        # backup client). Note that this is
                                        # not compatible with --sparse.
                                        # If you enable both you will get
                                        # --sparse, not --inplace.
                                        # Note that if you use btrfs or zfs
                                        # instead of link-dest you want to
                                        # keep this turned on as it will allow
                                        # them to archive files at the block
                                        # level rather than the file level.

$WriteFilesSparsely="no";               # Use --sparse on rsync. This can
                                        # save disk space on the backup server
                                        # especially if you are backing up
                                        # disk or VM images. Unfortunately it
                                        # is not compatible with --inplace.
                                        # If you enable both you will get
                                        # --sparse, not --inplace.

$IgnoreHardLinks="no";                  # Do NOT use --hard-links on rsync.
                                        # This has nothing to do with the use
                                        # of --link-dest but rather it causes
                                        # rsync to not backup hard link
                                        # relationships that are on the backup
                                        # source. Unfortunately preserving
                                        # them can take a significant amount
                                        # of RAM depending on the number of
                                        # links and files involved, so some
                                        # systems simply can't handle it. I do
                                        # not recommend using this unless you
                                        # have experienced rsync running from
                                        # swap or running out of free memory.
                                        # Even if you have, try upgrading
                                        # rsync before turning this on.

$BackupIsFAT="no";                      # set if your backup device is FAT
                                        # formatted.

$BackupIsNAS="no";                      # set if your backup device is a NAS
                                        # device running NFS. Use of CIFS or
                                        # SMBFS is NOT supported as they
                                        # cannot handle hard links.

$CompensateForFAT="no";                 # set if you are backing up a FAT
                                        # filesystem and want to compensate
                                        # for the limitations of FAT.

$UseFilterFiles="no";                   # set if you use filter files (-F)

$ExtraRsyncParams="";                   # Any extra rsync parameters that I
                                        # didn't think to make options for.
                                        # Added to the rsync command line as
                                        # is. It is your responsibility to
                                        # make sure they work.

$HostLevelSubvolumes="yes";             # Make the host level directory
                                        # structure subvolumes instead of
                                        # plain directories. This option only
                                        # affects btrfs and zfs based backups,
                                        # not link-dest. Currently the effect
                                        # is very minimal; however, I do not
                                        # know what features btrfs will have
                                        # in the future so this may some day
                                        # become either useful or annoying.

Note that ssh specific settings such as user name and port numbers should
be specified in ~/.ssh/config. Those should not be different between
backups and normal ssh usage so the backups do not have their own settings
for them.

In each named backup directory there is a file called backuptab. This is
the list of what to backup. It also allows you to run external commands to
handle things like LVM snapshots.
The syntax for the backuptab file is:

host:/path:n            # does backup with n old copies
                        # ("d" means to use the global default, 0 means none)
host:r!command          # runs command on remote end
host:l!command          # runs command on local end (user@host ignored)
host:/*:n               # does a backup of all detected local filesystems
                        # (df -lT) except for known unimportant filesystem
                        # types (tmpfs iso9660 cd9660 squashfs)
host://:n               # does a backup of the root filesystem without
                        # excluding other mount points
                        # (no --one-file-system)

The global settings directory and the named backup directories can also
contain files named excludes and includes which will be used with rsync's
--exclude-from and --include-from options. Note that I find it much easier
to have only an excludes file and use the +/- syntax to specify both in the
same file. See man rsync for more info.

The final file, which can be in the named backup directories only, is the
infotab file. This file allows information about a backup client to be
gathered if that information is not stored in a file. This is useful for
backing up things like partition tables or anything else that can be dumped
to a text file. The syntax for the infotab file is:

host:command:file       # ssh host command > file

Incomplete backup handling with link-dest:

Assuming $Backup=$BackupRoot/$HostName/_path_to_filesystem

As backups are running they are named $Backup.incomplete. If a new backup
is started and $Backup.incomplete already exists, it will be used
(completed) with $Backup.current as the link-dest. Only when the backup is
completely finished will the $Backup.current symlink be updated and
$Backup.incomplete renamed to $Backup.%Y-%m-%d.%H-%M-%S. The
$Backup.current symlink serves as a convenient pointer to the most current
complete backup and is never pointed at an incomplete backup.

Incomplete backup handling with btrfs and zfs:

Assuming $Backup=$BackupRoot/$HostName/_path_to_filesystem

The target directory for rsync will always be simply $Backup.
This should be considered to be a working directory as it will either be
the same as $Backup.current or it will be a partially finished backup
(either aborted or still running) waiting to be completed. Once a backup is
finished, the working directory, which is actually a subvolume, will be
snapshotted to $Backup.%Y-%m-%d.%H-%M-%S and the $Backup.current symlink
will be pointed to it. The $Backup.current symlink serves as a convenient
pointer to the most current complete backup and is never pointed at an
incomplete backup.

The default rsync options are:
    --archive --one-file-system --hard-links --human-readable --inplace
    --numeric-ids --delete --delete-excluded --exclude-from=file
    --link-dest=previousbackup
with --verbose --progress --itemize-changes being added if you run with -v

The executable programs within this package are:

backup:         This is the main backup program and is run with the name of
                the backup directory you wish to run and optionally -v.

getinfo:        This is run by backup and gets non-file information from
                backup clients as specified by the infotab file.

purge:          This script purges old backup archives that have been
                tagged for deletion by backup. Note that this only applies
                to the link-dest method and does not work if you are using
                btrfs or zfs (it will tell you this if you try).

TODO:
*** btrfs support
*** optimized offsite replicator!
*** better error handler
*** zfs support (might never happen)
*** should have override setting for not re-using aborted incompletes
    I am not sure why this should be needed but it seems prudent.

Recent changes:
*0.14:  Rewrite deleteme/purge system to be compatible with the use of
        multiple filesystems.
        * Note that there is no longer a "deletion pool" aka deleteme/.
          Instead of moving old backups that need to be purged to
          ../deleteme/ they will now simply have ".ToBePurged" appended to
          their names in place.
          If you are upgrading you can simply 'rm -rf deleteme/' as it will
          no longer be used and the purge program will no longer be looking
          there for things to delete.
*0.15:  Added settings: $WriteFilesSparsely and $IgnoreHardLinks described
        above.
*0.22:  Fixed --itemize-changes in non-verbose mode.
        Make btrfs purge like --link-dest does
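
Example settings file:

As an illustration of the settings described earlier in this readme, a
global settings/settings.pl might look something like the sketch below. The
path /mnt/backup and all of the values chosen are hypothetical examples,
not recommendations. Only $BackupRoot is required; anything left out falls
back to the defaults listed in the settings section.

```perl
# settings/settings.pl (global settings; example values only)

$BackupRoot="/mnt/backup";              # hypothetical path, must be set
$ArchiveMethod="link-dest";             # the default, most compatible method
$DefaultNumberOldBackups=14;            # keep 14 old backups instead of 30
$UseNetworkCompression="yes";           # use rsync's -z
$BackupACLs="yes";                      # use rsync's -A
$BackupXATTRs="yes";                    # use rsync's -X
```

A named backup can override any of these by setting the same variable in
its own [backupname]/settings.pl, since the named backup setting takes
priority over the global one.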