Sunday, April 17, 2011

Synchronize home directories from multiple clients to a server

I'm using multiple Linux laptops/desktops and want them to "share" home directories.

NFS is unfortunately not an option. Therefore I was trying to create a bash script using rsync, but I can't figure out how to do it.

This is my example right now:


#!/bin/bash

# Push a directory to the server (note: the function name shadows the system `sync` command)
sync() { rsync -azvR --exclude-from=/home/ME/.rsync_excludes --delete -e 'ssh -ax' "$1" "$2"; }

sync /home/ME server.domain:/home/ME
#sync server.domain:/home/ME /home/ME    # reverse direction (pull)

I think this would work great if I were only using one single client machine which updates the server files. Correct?

What if I delete a file on one client? That file won't be deleted on the other client (after syncs, of course)?

Can I use rsync for this purpose? Should I look for another program? Hopefully not, though...

Edit: Since this solution shouldn't be only for me, I would appreciate it if it worked more or less automatically.

Edit 2: Maybe the solution has to involve some kind of repository - Subversion, Git, Mercurial or something else.
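
For the "automatic" edit, here is a minimal sketch of how the script above might be run unattended (for example from cron or a login script). The pull-before-push order, the wrapper's file name and the omission of --delete are assumptions for illustration, and note that this still does not propagate deletions made on a client, as the answers below explain.

#!/bin/bash
# sync_home.sh -- hypothetical wrapper: pull the server copy first, then push local changes.
# NOTE: this is not true two-way sync; a file deleted on this client will simply
# be pulled back from the server on the next run, which is why --delete is left out here.

EXCLUDES=/home/ME/.rsync_excludes

rsync -azv --exclude-from="$EXCLUDES" -e 'ssh -ax' server.domain:/home/ME/ /home/ME/
rsync -azv --exclude-from="$EXCLUDES" -e 'ssh -ax' /home/ME/ server.domain:/home/ME/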

From stackoverflow
  • Why not do this using Subversion? The linked article details how the author synchronises and stores history using source control (you don't have to use Subversion, obviously - there are alternatives).

    Hkkathome : Yeah, it would be an idea, but I would rather it were done "magically" and automatically. Especially since it is not just me using the systems. Thanks anyway.
    Brian Agnew : So (perhaps) on login the client performs a 'svn up' or similar ?
    kdgregory : Yeah, for an automagical approach, this should work better than anything with rsync
    Hkkathome : Okay, maybe. So I should set up an automatic add for all new files in the home directory then. Maybe...
    lhunath : Whatever you do, do NOT automate subversion. You'll notice the horror of doing an `svn up` on login as soon as you have network troubles or are using a laptop and open it up anywhere without network access. By the way; who wants to wait 10 minutes on slow SVN to up your homedir when logging in?
    lhunath : That aside; remember that you need to deal with conflict merging. If you automate subversion you'll corrupt your entire homedir as soon as there are conflicts (until you manually resolve them). See my reply for a solution that was built for this type of need.
    Brian Agnew : Yes. You obviously have to take into account potential connectivity issues. Conflicts are more interesting e.g. I manage stuff this way and conflicts tend not to be an issue since changes are small and infrequent. Other use cases apply.
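
    To make the "svn up on login" idea from this thread concrete, here is a minimal sketch of what such a hook might look like, assuming the home directory is already a Subversion working copy and that ~/.bash_profile is where the hook lives (both assumptions); lhunath's warnings about offline logins and conflicts still apply.

    # Hypothetical snippet for ~/.bash_profile: update the working copy on login,
    # but give up quickly and continue if the repository is unreachable.
    if [ -d "$HOME/.svn" ]; then
        (cd "$HOME" && timeout 30 svn update --quiet) || echo "home dir sync skipped (offline?)"
    fi
    # New files would still need a manual `svn add` and `svn commit` at some point.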
  • Looking at what you've done, this should work ... you just need to ensure that each client gets synced to the server after you're finished working on it. I use the following, which I invoke manually on a per-directory basis:

    # Pull a project directory down from the server (the trailing slash is normalized
    # so rsync copies the directory contents rather than nesting the directory).
    function syncDown() {
        f=${1%/}/;
        rsync -acuvz --exclude 'CVS' --exclude '*.class' --exclude '.classpath' server:projects/$f $f;
    }

    # Push the same directory back up to the server.
    function syncUp() {
        f=${1%/}/;
        rsync -acuvz --exclude 'CVS' --exclude '*.class' $f server:projects/$f;
    }


    If you're looking for unattended, automated synchronization, then you're not going to get it: you'll always have race conditions where you work on one client but that work gets overwritten by a sync from another.

    Hkkathome : Maybe this is the best way of doing it. I think the users can probably live with this. It is actually just running a couple of commands, with no need to add files to a repo and stuff like that. Thanks for your answer... my brain works slowly right now (Friday afternoon here in Stockholm, Sweden). :-)
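
    A hedged usage sketch of the functions above, where the project name and the working directory (the local directory that mirrors "projects" on the server) are assumptions:

    # Assuming the two functions are loaded, e.g. from ~/.bashrc, and you are in the
    # local directory that mirrors "projects" on the server:
    syncDown myproject    # pull the latest copy of myproject from the server
    # ... work on the files locally ...
    syncUp myproject      # push the finished work back before moving to another machine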
  • It looks like you probably already know this, but, just to emphasize the point for those who may see this question in the future:

    rsync only does one-way synchronization. If you want bi-directional sync, you need to use something else. (cvs/svn/git/etc. would be appropriate "something else"s, but a revision control system may not be the optimal choice if you don't need an update history.)

    In practical terms, this means if you're rsyncing from A to B, then each sync will make the directory on B look exactly like the directory on A - any changes made on B since the last sync will be lost (barring excludes and with the caveat that rsync will only delete files if --delete is specified). This sort of arrangement with an authoritative master version which is then pushed out to other locations is appropriate in many cases, but any sort of collaborative work is not among them.

    Hkkathome : Seems like the solution is to use some kind of repo. We are using Perforce already, but that would of course not work for this. :-) Gonna have a look at Git or Mercurial.
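
    To check the one-way semantics described above before committing to them, rsync's dry-run flag (-n / --dry-run) previews what a push would transfer or delete without changing anything; the paths are simply the ones from the question:

    # Nothing is transferred or deleted with -n; the output lists what *would* happen.
    # With --delete, files missing on this client would be removed on the server,
    # and files deleted on the server would be re-created from the local copy.
    rsync -avzn --delete -e 'ssh -ax' /home/ME/ server.domain:/home/ME/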
  • rsync is good to keep one location in sync with a master, or in other terms, to mirror A to B. That's not what you're doing, though. You'd have to rsync A to B and B to A, which brings a whole new set of problems. If a file disappeared, do you need to delete it on the other side or rsync it back? Maybe it was modified on the other side; you can't check.

    Anyway, the solution to this problem comes in the form of unison. That's a tool (works on Linux, OS X, Windows, BSD, ...; has CLI tools, GUI tools, and can be scheduled nicely in cron) which will keep your home directory or any other directory nicely in sync, and is built to deal with almost any type of conflict or problem. Those people thought it all out far better than we could here.

    Alternatively, there are SCMs. Many people use SCMs for managing their home directories. Subversion is popular for this, but I wouldn't recommend it at all: it will consume massive amounts of space, make everything horribly slow, and make keeping in sync depend on an active connection to the master repository. There are alternatives, like Git and others, but they all have their downsides.

    Either way, any SCM-based solution violates one very big rule of SCMs: You should never keep big binary data in there. SCMs are not made for this. You don't keep your photo collections, movies, documents, downloads, and stuff like that in an SCM, even though you may want to keep them in sync or keep a history on them (especially so for pictures/documents).

    It's important to understand that there is a difference between keeping backups and keeping in sync. Your backups should be kept in a remote/detached location and can contain a history of everything you own. I personally recommend rdiff-backup for this. It keeps history of everything beautifully, uses the rsync algorithm under the hood to minimize traffic, and the backup location looks like the most current state of the backup: you can just browse through it like normal files.

    To summarize, I recommend you combine unison and rdiff-backup for an all-round solution to keeping your data safe and reliably in sync.

    Hkkathome : This will probably be the solution for me. I vote you up for that. I will look more in to it and hopefully make this the right answer...
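
    For readers who want to try that combination, a minimal sketch follows; the profile name, host names and backup path are assumptions for illustration, not something prescribed by the answer:

    # Hypothetical Unison profile ~/.unison/home.prf (its contents shown as comments):
    #   root   = /home/ME
    #   root   = ssh://server.domain//home/ME
    #   ignore = Name .cache
    #   batch  = true
    # With that profile in place, a two-way sync is one cron-friendly command:
    unison home

    # Separately, keep a history-preserving backup on a detached host with rdiff-backup:
    rdiff-backup /home/ME backuphost::/backups/ME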
