

Rsync

Rsync provides a mechanism for replication. It is specifically designed for synchronising files over low-bandwidth links [TM96]. The algorithm can, however, be applied to synchronisation over links of any speed.

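The algorithm works by dividing one copy of a file into blocks and taking a rolling checksum, which is very inexpensive to calculate, of each block. These checksums are then compared against the other copy of the file at every byte offset, which cheaply identifies blocks that are likely to be unchanged. Because the rolling checksum is weak, a strong MD5 checksum comparison is made whenever the rolling checksums match, in order to confirm that the block really is the same. Data that is not covered by a confirmed match is treated as changed, and only that data is transferred.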
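The two-level comparison can be illustrated with a minimal sketch. It assumes the weak checksum is the pair of 16-bit sums described in [TM96], and it uses hashlib.md5 as the strong checksum. The function names, the in-memory byte strings and the fixed block size are illustrative assumptions, not rsync's actual implementation, which also handles the trailing partial block and the literal data between matches.

```python
import hashlib
from collections import defaultdict

M = 1 << 16  # both running sums are kept modulo 2^16


def weak_checksum(block):
    """Return the (a, b) pair of the weak checksum for one block."""
    n = len(block)
    a = sum(block) % M
    # Each byte is weighted by its distance from the end of the block.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte to the right in constant time."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - block_len * out_byte + a_new) % M
    return a_new, b_new


def find_matches(old, new, block_size):
    """Return (offset_in_new, block_index_in_old) for confirmed matches.

    A trailing partial block of `old` is ignored in this sketch.
    """
    table = defaultdict(list)
    for index in range(len(old) // block_size):
        block = old[index * block_size:(index + 1) * block_size]
        table[weak_checksum(block)].append((hashlib.md5(block).digest(), index))

    matches = []
    if len(new) < block_size:
        return matches

    pos = 0
    a, b = weak_checksum(new[:block_size])
    while True:
        hit = None
        candidates = table.get((a, b))
        if candidates:
            # Weak checksums agree: pay for the strong checksum to confirm.
            strong = hashlib.md5(new[pos:pos + block_size]).digest()
            for digest, index in candidates:
                if digest == strong:
                    hit = index
                    break
        if hit is not None:
            matches.append((pos, hit))
            pos += block_size
            if pos + block_size > len(new):
                break
            a, b = weak_checksum(new[pos:pos + block_size])
        else:
            if pos + block_size >= len(new):
                break
            a, b = roll(a, b, new[pos], new[pos + block_size], block_size)
            pos += 1
    return matches
```

Because roll() updates the checksum in constant time, every byte offset of the second file can be tested cheaply, and the strong checksum is only computed for offsets where the weak checksum already agrees with some block.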

With a high-speed link, such as that afforded by a LAN, the computational expense of the block checksums can be avoided by sacrificing some bandwidth. An option to rsync (--whole-file) allows any differing files to be transferred whole rather than performing the block checksum phase. This option is particularly attractive where the servers on the LAN are already heavily loaded and the files being synchronised are predominantly small.

Rsync can be used to keep multiple front-end servers or back-end NFS servers in synchronisation with each other. In such a situation one server should be set up as a master, where all modifications are made, and rsync should be run periodically to propagate any changes to the other servers. Figure 2 illustrates a topology where our proverbial HTTP servers house their own data, which is synchronised using rsync.


Figure 2: Servers Synchronised Using Rsync
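As a rough sketch of how a master might push its content to the other servers, the helper below builds one rsync command line per replica. The function name, the host names and the /var/www path are hypothetical. The use of --delete is also an assumption: it makes each replica an exact mirror by removing files that no longer exist on the master, which may not be wanted in every setup. The commands are only constructed, not executed.

```python
import shlex


def rsync_push_commands(source_dir, replicas, whole_file=False):
    """Build one rsync argument list per replica host (nothing is run)."""
    # -a preserves permissions, times and symlinks; --delete (an assumption)
    # removes files on the replica that are gone from the master.
    options = ["-a", "--delete"]
    if whole_file:
        # Send differing files whole instead of computing block checksums.
        options.append("--whole-file")
    # A trailing slash tells rsync to copy the directory's contents.
    src = source_dir.rstrip("/") + "/"
    return [["rsync", *options, src, f"{host}:{src}"] for host in replicas]


if __name__ == "__main__":
    # Hypothetical replica names, for illustration only.
    hosts = ["web2.example.com", "web3.example.com"]
    for argv in rsync_push_commands("/var/www", hosts, whole_file=True):
        print(shlex.join(argv))
```

The printed commands could be run periodically on the master, for example from cron. Each replica gets its own rsync invocation, so the number of synchronisation runs grows with the number of servers.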

The main problem with this solution is the high load that is borne during the synchronisation process. Additionally, this solution is not particularly scalable: there is no mechanism for broadcasting the synchronisation, so a separate synchronisation process needs to be run for each server to which data is propagated. This solution is, however, advantageous in the absence of any other replication mechanism, as it allows for easy data recovery in the event of disk failure on one of the servers.


Horms
1999-03-07