Dirvish backup system

Dirvish is a great program for creating incremental backups using rsync as a backend. It utilizes hardlinking, so each additional backup costs very little disk space. The hardlinks let you store weeks' worth of backups while using only a small amount of disk beyond the first full copy. More information about dirvish can be found here:

http://www.dirvish.org/
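
To see why hard links make extra images so cheap, here is a minimal shell sketch. The directory names are hypothetical stand-ins for two dated images, and the demo runs in a scratch directory rather than a real vault. It links one unchanged file into a second image and shows that both names share a single inode, so the data is stored only once.

```shell
# Scratch directory standing in for a vault (hypothetical, not a real /bank).
demo=$(mktemp -d)
mkdir "$demo/20060829" "$demo/20060830"

# A file that did not change between the two backup runs.
echo "unchanged data" > "$demo/20060829/file.txt"

# Hard-link it into the newer image instead of copying it.
ln "$demo/20060829/file.txt" "$demo/20060830/file.txt"

# Both names point at the same inode, so no extra data blocks are used.
inode_old=$(ls -i "$demo/20060829/file.txt" | awk '{print $1}')
inode_new=$(ls -i "$demo/20060830/file.txt" | awk '{print $1}')
echo "inodes: $inode_old $inode_new"
```

Only files that are new or changed since the previous image need real disk space; everything else is just another name for data that is already stored.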

In this article I'll discuss how I configured a dirvish server on a Gentoo file server. This dirvish server backs up seven other linux boxes as well as two Windows XP boxes.

First we'll need to install the program

emerge -v dirvish

Next we'll need to create a master.conf file, which holds the 'global' configuration. Each vault can also have its own configuration based on its specific backup needs.

vi /etc/dirvish/master.conf

bank:
        /bank
exclude:
        /lost+found/
        /proc/
        /core
        /sys/
Runall:
# linux
        spock
        kirk
        sulu
        mccoy
        scotty
        checkov
        uhura

# windows
        picard
        riker

I currently have a 250GB drive installed on the linux server, formatted with reiserfs and mounted at /bank. Inside /bank we'll need to create our 'vaults', each with a dirvish directory inside it. This directory will contain the vault's custom config file.

cd /bank
mkdir -p spock/dirvish kirk/dirvish sulu/dirvish mccoy/dirvish scotty/dirvish checkov/dirvish uhura/dirvish picard/dirvish riker/dirvish
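
With this many hosts, the same step can be written as a loop. This is only a sketch: the BANK variable is an assumption added here, and it defaults to a scratch directory so the loop can be tried safely. Run it with BANK=/bank to create the real vaults.

```shell
# BANK is a hypothetical variable; it falls back to a scratch directory.
BANK=${BANK:-$(mktemp -d)}

# The same host list as in the Runall section of master.conf.
hosts="spock kirk sulu mccoy scotty checkov uhura picard riker"

for host in $hosts; do
    # Each vault gets its own dirvish directory for default.conf.
    mkdir -p "$BANK/$host/dirvish"
done

ls "$BANK"
```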

I will discuss /bank/spock/ first since it serves as the model for the rest of the linux boxes. Create a default.conf file in its dirvish directory.

vi /bank/spock/dirvish/default.conf

Here is where you put host information, any custom rsync options as well as the excludes:

client: spock
tree: /
index: gzip
speed-limit: 90
image-default: %Y%m%d
expire-default: +5 weeks
exclude:
        tmp
        var/tmp
        var/spool
        /usr/src/**/*.o
        lost+found/
        /usr/portage

My configuration file keeps 5 weeks of backups. As a rough example, a 7GB initial backup will probably grow to around 9GB total after 5 weeks. The exact figure depends on how much your files change, since rsync only brings over 'new' or 'changed' files for each incremental backup.
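
The estimate can be turned into a rough per-night figure. This is only arithmetic on the example numbers above, assuming one image per night for the whole 5 weeks (35 images); your own change rate will differ.

```shell
initial_gb=7    # size of the first full image (example figure)
total_gb=9      # estimated total once 5 weeks of images exist
images=35       # assumption: one image per night for 5 weeks

# Space used beyond the first image, in MB, spread over the images.
growth_mb=$(( (total_gb - initial_gb) * 1024 ))
per_image_mb=$(( growth_mb / images ))

echo "about ${per_image_mb}MB of new or changed data per nightly image"
```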

After creating the configuration file you'll need to initialize the vault.

dirvish --vault spock --init

What you want to see is a directory named in the date format inside the vault.

ls -al /bank/spock

drwxr-xr-x  3 root root 144 Aug 30 03:45 20060830
drwxr-xr-x  2 root root 112 Aug  6 11:11 dirvish

Now make sure there were no errors during the rsync process. You'll know there is a problem if you find an rsync_error file in 20060830. Errors are usually permission problems or vanished files. If there are errors, delete the 20060830 directory, correct the problem by modifying your excludes, and re-issue the init command above.
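
A small helper can make this check repeatable across vaults. This is a sketch, and check_image is a hypothetical name, not part of dirvish; it only looks for the rsync_error file mentioned above.

```shell
# Report whether an image directory contains an rsync_error file.
# Returns 0 if clean, 1 if rsync_error exists, 2 if the directory is missing.
check_image() {
    image_dir=$1
    if [ ! -d "$image_dir" ]; then
        echo "no such image directory: $image_dir" >&2
        return 2
    fi
    if [ -e "$image_dir/rsync_error" ]; then
        echo "errors recorded in $image_dir/rsync_error"
        return 1
    fi
    echo "no rsync_error in $image_dir"
}
```

For example, check_image /bank/spock/20060830 returns non-zero when the image needs to be deleted and re-initialized.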

Repeat this with the other linux hosts.

Now to back up the Windows boxes. This is a trickier process. You'll need to install cygwin on the Windows client. I was unable to get an SSH+rsync pull from linux to work; this is a well-documented problem with cygwin and SSH. Pushing causes no problems, but dirvish needs to pull files from Windows. Since this is on my local network, I felt comfortable with a straight rsync pull (the rsync daemon, without SSH). If that is not an option for you, you may want to look into a VPN tunnel from the linux server to the Windows box and run rsync through it.

First download cygwin and install it on the Windows host

http://www.cygwin.com/

Install the following packages:
+Net - Rsync
+Admin - cygrunsrv

Edit the Windows PATH to include C:\Cygwin\bin at the end (entries are separated by semicolons).

From (Gaztronics): http://www.gaztronics.net/rsync.php
On Windows 2000/XP, open the Control Panel and double click on the System applet. Click on the Advanced tab, then click the Environment Variables button. Double click on the PATH statement in the 'System Variable' screen (lower of the two), add the path on the end, and click OK. Click OK to close the Environment Variables screen, then click OK to close the System Properties dialogue box. The path will be dynamically reloaded (no need to reboot).

Create an rsyncd.conf file at C:/cygwin/etc/rsyncd.conf:

use chroot = false
strict modes = false
[backup]
    path = /cygdrive/c/
    comment = C Drive Backup
    read only = false

Again from Gaztronics, which is a great site:

Method 2. Step 4: If you are setting up on Windows 2003 Server (otherwise skip to the next step):

(1) Open the Windows File Explorer and go to the C: drive.

(2) Right click on the 'Cygwin' directory and select 'Properties'.

(3) Click on the 'Security' tab. The user 'Administrator' should be the first in the list and it will not have any permissions set for this folder.
(If the user 'Administrator' is not listed, you will need to add it.)

(4) Tick the 'Allow - Full Control' box in the "Permissions for Administrator" window.

(5) Click the Advanced button and tick the box for "Replace permission entries on all child objects with entries shown here that apply to child objects".

(6) Click the Apply button to set the permissions.

(7) Click the OK button to close the Advanced settings dialogue box.

(8) Click the OK button to close the Cygwin properties dialogue box.

Method 2. Step 5 Install Rsync as a Service from a 'Command Prompt' window with the following command line:

Windows 2003

(All versions & service packs, installed as Administrator)
cygrunsrv.exe -I "Rsync" -p /cygdrive/c/cygwin/bin/rsync.exe -a "--config=/cygdrive/c/cygwin/etc/rsyncd.conf --daemon --no-detach" -f "Rsync daemon service" -u Administrator -w password

Windows NT/2k/XP
(All versions & service packs, installed as Administrator)
cygrunsrv.exe -I "Rsync" -p /cygdrive/c/cygwin/bin/rsync.exe -a "--config=/cygdrive/c/cygwin/etc/rsyncd.conf --daemon --no-detach" -f "Rsync daemon service"

This installs the service. Start it (for example with cygrunsrv -S Rsync, or from the Services control panel), then verify that it's actually listening. I like to use netstat:

Start->Run->Cmd

netstat -an

Look for:

TCP    0.0.0.0:873            0.0.0.0:0              LISTENING
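
If you would rather not scan the netstat output by eye, the check can be scripted from a Cygwin shell. This is a sketch: rsync_listening is a hypothetical helper name, and it assumes netstat output in the format shown above.

```shell
# Succeeds if netstat-style input on stdin shows a TCP listener
# whose local address ends in port 873 (the rsync daemon port).
rsync_listening() {
    grep -Eq 'TCP[[:space:]]+[^[:space:]]*:873[[:space:]]+.*LISTENING'
}

# Try it on the sample line from above.
sample='  TCP    0.0.0.0:873            0.0.0.0:0              LISTENING'
if printf '%s\n' "$sample" | rsync_listening; then
    echo "rsync daemon is listening"
fi
```

On the Windows box you would feed it real output with netstat -an | rsync_listening. From the linux server, running rsync picard:: should also list the backup module if the daemon is reachable.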

Now we need to edit the default.conf for the Windows client, so back to the linux dirvish server. Excludes are a little weird:

cd /bank/picard/dirvish
vi default.conf

client: picard
tree: :backup
xdev: true
index: gzip
image-default: %Y%m%d
expire-default: +5 weeks
speed-limit: 90
rsync-option:
        --port
        873
        --modify-window
        2
exclude:
        System?Volume?Information
        cygwin
        Virtual?Machines
        WINDOWS
        Documents?and?Settings/LocalService/NTUSER.DAT
        Documents?and?Settings/frank/Local?Settings/Temporary?Internet?Files
        Documents?and?Settings/LocalService/ntuser.dat.LOG
        Documents?and?Settings/NetworkService/NTUSER.DAT
        Documents?and?Settings/NetworkService/ntuser.dat.LOG
        Documents?and?Settings/frank/NTUSER.DAT
        Documents?and?Settings/frank/ntuser.dat.LOG
        Documents?and?Settings/*/NTUSER.DAT
        Documents?and?Settings/*/ntuser.dat.LOG
        Documents?and?Settings/*/Local?Settings/Temp
        Documents?and?Settings/frank/Application?Data/Inbox
        Program?Files
        MSSQL7
        MSOCache
        i386
        RECYCLER
        hiberfil.sys
        pagefile.sys

The first thing I had trouble with was the cygwin directory: I couldn't back it up, so I excluded it. You'll also notice that you need to use '?' in place of spaces. Windows also has a lot of locked files that will cause problems, so you'll need to exclude some of them. If this is a big deal, you can use ntbackup to back up your Documents and Settings and then have dirvish back up the result, but you'll use a lot of disk for that.
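
The '?' works because it is a wildcard that matches exactly one character, and a space is one such character. The sketch below illustrates this with the shell's own pattern matcher, which treats '?' in a similar way; it is not rsync's matcher, and matches is a hypothetical helper name.

```shell
# matches PATTERN NAME: succeed if NAME matches the shell glob PATTERN.
matches() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# '?' stands in for the single space in the directory name.
matches 'Program?Files' 'Program Files' && echo "Program?Files matches 'Program Files'"

# It must match exactly one character, so a name without the space fails.
matches 'Program?Files' 'ProgramFiles' || echo "but not 'ProgramFiles'"
```

Keep in mind that the same pattern also matches any other single character in that position, so Program?Files would match Program_Files too.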

To automate the process and have dirvish run nightly, add this script to /etc/cron.daily and make it executable (chmod +x):

vi /etc/cron.daily/dirvish

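#! /bin/sh
# Include current disk usage in the cron mail
df -H

# Only run if dirvish is installed and configured
if [ -x /usr/sbin/dirvish ]
then
        if [ -f /etc/dirvish/master.conf ]
        then
                # Expire old images first (at low priority), then back up every vault in Runall
                nice /usr/sbin/dirvish-expire; /usr/sbin/dirvish-runall
        fi
fi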

One other problem I ran into is that the dirvish server needs to SSH to the linux boxes as root. I was very uncomfortable with that, but found a solid way to secure it: a combination of keys, sshd_config settings and a little perl script.

On each linux client, I adjust sshd_config with the following values:

PermitRootLogin forced-commands-only
PubkeyAuthentication yes
AuthorizedKeysFile      .ssh/authorized_keys

This still allows you to log in as a normal user, but a root login needs two things:
1) a key
2) an allowed command

On the dirvish box, I created keys:

cd /root
ssh-keygen -t dsa

Don't supply a passphrase.
This should create a /root/.ssh/ directory containing:

id_dsa
id_dsa.pub

id_dsa is the private key
id_dsa.pub is the public key

Take the contents of id_dsa.pub and paste them into the authorized_keys file on the linux client. It is located at /root/.ssh/authorized_keys; if it's not there, create it.

Now edit that entry so it looks similar to this. The options, key type, key and comment must all be on a single line in authorized_keys:

no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,from="10.0.0.32",command="/usr/local/bin/authprogs" ssh-dss AAAAB3NzaC1kc3MAAACBAKJwKYiMwIgjFSiblahblahblahblahtOlGkoHdlydYHbsDAlnoc+FOXqHk9erwfBmk7CHnvJ4D9OebWmCkPIpgc1VLCzxA9TEeV3xuonKvQf19KVwgHdm7+gw/l2gz+FCCM3CsWgIrh/v15MIQSuqwUrdXgbe8S2SIMoPH2z30x/jWsjeg6AAEg2Fg== root@dirvish

10.0.0.32 is my dirvish server. So when root logs in and matches the key, it's forced to run /usr/local/bin/authprogs.

Get the authprogs script here:
http://www.hackinglinuxexposed.com/tools/authprogs/src/authprogs

The article discussing this script is here:
http://www.hackinglinuxexposed.com/articles/20030115.html

You'll then need to create an authprogs.conf in the /root/.ssh directory. This is a list of allowed commands:

[ 10.0.0.32 ]
        rsync --server --sender -vlHogDtpr --bwlimit=9000 --numeric-ids . /

So root needs a key, and even with the key the only command allowed is the one listed above.
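
To make the mechanism concrete: when a key carries a command= option, sshd runs that command instead of whatever the client asked for, and passes the client's request in the SSH_ORIGINAL_COMMAND environment variable. The sketch below is a simplified, hypothetical stand-in written in shell to show the idea. It is not the authprogs script (which is perl and reads authprogs.conf), and is_allowed is an illustrative name.

```shell
# One allowed client and one allowed command, copied from authprogs.conf above.
allowed_ip="10.0.0.32"
allowed_cmd="rsync --server --sender -vlHogDtpr --bwlimit=9000 --numeric-ids . /"

# is_allowed IP COMMAND: succeed only on an exact match of both.
is_allowed() {
    [ "$1" = "$allowed_ip" ] && [ "$2" = "$allowed_cmd" ]
}

# sshd sets SSH_CLIENT to "ip client_port server_port"; keep only the IP.
client=${SSH_CLIENT:-}
client_ip=${client%% *}
requested=${SSH_ORIGINAL_COMMAND:-}

if is_allowed "$client_ip" "$requested"; then
    echo "allowed: $requested"
else
    echo "denied: ${requested:-<no command>}"
fi
```

Because the match is exact, changing speed-limit or other rsync options in default.conf changes the command dirvish sends, and the entry in authprogs.conf has to be updated to match.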