
Using rsync, files-from, and FIFOs to Transfer a Dynamically Generated List of Files

I posted recently about using rsync to create a nested directory on a remote machine, and I thought I’d continue with another terrible idea I had while using rsync. I needed to sync files from a remote machine, with the list of files determined on the remote host at the moment rsync ran. I ultimately did use the trick from the last blog post, but this is another interesting way of solving the problem.

Rsync provides a --files-from parameter that lets you specify a file containing the list of files to transfer. The man page also describes syntax for telling rsync that this file lives on the remote machine:

In addition, the --files-from file can be read from the remote host instead of the local host if you specify a “host:” in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of “:” to mean “use the remote end of the transfer”.

This can come in handy for a variety of reasons, especially if you need an exact list of files that can’t be globbed on the remote side. It becomes even more interesting if that list should be generated fresh each time rsync is run. There are a few outrageous ways to accomplish this task:

  • Make the files-from file a FUSE mount point, so that reads of the file are generated dynamically. Others did not receive this idea well, but it was an option.
  • Use the trick from my last post on forcing rsync to create remote directories to run a script that generates the files-from file just before rsync runs. I hadn’t figured this trick out when I was originally considering the problem, or I would have used it from the start. I eventually did.
  • Alternatively, burn a fork and SSH into the machine first to run the script, and then run rsync.
  • Use a script running as a daemon on the remote to manage a FIFO that hands out a dynamically generated list whenever the file is read.
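
Whichever of these produces it, the list itself is plain text: one path per line, interpreted by rsync relative to the source directory of the transfer. Here is a minimal local sketch of generating such a list from a criterion a glob can’t express (here, file contents). The directory layout, the .needed suffix, and the keep marker are all invented for illustration, and the rsync command is only printed, not run:

```shell
#!/bin/sh
# Throwaway source tree; every name here is hypothetical.
src=$(mktemp -d)
list=$(mktemp)
mkdir -p "$src/a" "$src/b"
echo keep > "$src/a/one.needed"
echo skip > "$src/a/two.needed"
echo keep > "$src/b/three.needed"

# Pick files by content, then write one path per line,
# relative to the source directory.
(
    cd "$src" &&
    find . -type f -name '*.needed' -exec grep -l keep {} + |
    sed 's|^\./||' |
    sort
) > "$list"

cat "$list"

# Only print the command this list would feed; nothing is transferred.
echo "rsync --files-from=$list $src/ remote:/home/proof/"
```

The list ends up holding a/one.needed and b/three.needed; two.needed is left out because its contents don’t match.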

Despite the added complexity of another running daemon (and another point of failure), I was intrigued by the FIFO idea and ran off to the perlipc documentation. Some brief tinkering gave me this:

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(mkfifo);
 
my $FIFO = "/tmp/proof";
 
while (1) {
    unless (-p $FIFO) {
        unlink $FIFO;   # discard any failure, will catch later
        mkfifo($FIFO, 0700) or die "can't mkfifo $FIFO: $!";
    }
 
    # next line blocks till there's a reader
    open my $ffh, '>', $FIFO or die "can't open $FIFO: $!";
 
    for my $file (glob("/tmp/*.needed")){
        # do some filtering here
        print $ffh "$file\n";
    }
    close $ffh or die "can't close $FIFO: $!";
    sleep 2; # to avoid dup signals
}

This is mostly the example from the perlipc documentation, cleaned up and fitted with some example requirements. A production version would need to be more robust, but as long as this script is running on the remote, the following rsync line gets a freshly generated list of files on every run:

rsync --files-from=:/tmp/proof /tmp remote:/home/proof # /tmp/proof is the FIFO name
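
The FIFO hand-off can also be tried locally without rsync or Perl. The following is only a sketch of the same mechanism in plain sh, with cat standing in for rsync as the reader; the file names and the generate_list function are made up for the demonstration. It relies on the fact that opening a FIFO for writing blocks until a reader opens it, so the list is built at read time:

```shell
#!/bin/sh
# Local demonstration of the FIFO hand-off; cat plays the part of rsync.
dir=$(mktemp -d)
fifo="$dir/proof"
mkfifo "$fifo"
touch "$dir/a.needed" "$dir/b.other"

# Hypothetical generator: lists matching files at the moment it runs.
generate_list() {
    for f in "$dir"/*.needed; do
        printf '%s\n' "${f##*/}"
    done
}

# The redirection is opened before the function body runs, and that
# open blocks until a reader shows up.
generate_list > "$fifo" &

# Created after the writer started, but before anything reads the FIFO.
touch "$dir/late.needed"

# The reader arrives; only now does the writer build its list.
received=$(cat "$fifo")
wait
printf '%s\n' "$received"
rm -rf "$dir"
```

The output includes late.needed alongside a.needed, even though the writer was started before that file existed, which is the property the daemon depends on.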

Ultimately I fell back to making rsync run the command for me before it starts. I joined the command and rsync with a double ampersand (&&) so that the remote rsync starts only if the generator succeeds, since it relies on the list of files being there:

rsync --rsync-path="generate-files.pl && rsync" --files-from=:/output /tmp/ remote:/home/proof/
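
This works because the --rsync-path string is handed to a shell on the remote side, so ordinary shell operators apply. The sketch below mimics that short-circuit locally with stand-in functions (generate_ok, generate_fail, and fake_rsync are hypothetical names, not real commands). The stand-in generators write only to stderr, on the assumption that a real generator should keep stdout clean because rsync uses that channel to talk to its remote half:

```shell
#!/bin/sh
# Stand-ins for the remote commands; no network is involved.
generate_ok()   { echo "list written" >&2; return 0; }
generate_fail() { echo "generator failed" >&2; return 1; }
fake_rsync()    { echo "rsync started"; }

# Generator succeeds: the right-hand side of && runs.
ok_result=$(generate_ok && fake_rsync)

# Generator fails: the right-hand side is skipped and the
# failure status is what the caller sees.
fail_status=0
fail_result=$(generate_fail && fake_rsync) || fail_status=$?

echo "success case: '$ok_result'"
echo "failure case: '$fail_result' (status $fail_status)"
```

In the failure case nothing after && runs, so the transfer fails up front rather than reading a stale or missing list.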

I’m glad I didn’t go with FUSE, but I might make a proof of concept out of spite.
