I recently ran into a problem where I needed a remote directory to exist before rsyncing data over to it. Rsync will only create the last level of the remote path, meaning that the parent path must already exist:
rsync file [email protected]:/tmp/
rsync file [email protected]:/tmp/imaginary/
There are a few StackOverflow questions about this (OK, the last one is from SuperUser), but none of them solve the problem above. The man page for rsync has the answer tucked away under the --rsync-path parameter:
Use this to specify what program is to be run on the remote machine to start-up rsync. Often used when rsync is not in the default remote-shell’s path (e.g. --rsync-path=/usr/local/bin/rsync). Note that PROGRAM is run with the help of a shell, so it can be any program, script, or command sequence you’d care to run, so long as it does not corrupt the standard-in & standard-out that rsync is using to communicate.
We can use this knowledge and the example in the man page to make rsync do exactly what we want:
rsync -aq --rsync-path="mkdir -p /tmp/imaginary/ && rsync" file [email protected]:/tmp/imaginary/
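If you use this in more than one place, the command above could be wrapped in a small helper. This is only a sketch: the function names remote_mkdir_cmd and sync_with_mkdir are made up for illustration, and it assumes the remote directory name contains no single quotes.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsync): print a --rsync-path value that
# creates the given directory on the remote side before starting rsync.
# Assumption: the directory name contains no single quotes.
remote_mkdir_cmd() {
    printf "mkdir -p '%s' && rsync" "$1"
}

# Hypothetical wrapper: sync_with_mkdir SOURCE HOST REMOTE_DIR
# Runs the same rsync invocation as above with the directory filled in.
sync_with_mkdir() {
    rsync -aq --rsync-path="$(remote_mkdir_cmd "$3")" "$1" "$2:$3"
}
```

A call would then look like `sync_with_mkdir file user@host /tmp/imaginary/`, where user@host stands in for your own login and server.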
This technique is much more efficient than fork-execing an SSH process to run “mkdir -p” first. To test, I compared both versions (rsync only vs ssh, then rsync) 100 times in a for loop. It’s not the most scientific test in the world, but I think it represents some real-world usage:
$ time ./rsync_test.sh
$ time ./ssh-and-rsync.sh
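The two scripts being timed are not reproduced here. As a rough guess at their shape, each one could be a loop around one of the two approaches. Everything in this sketch is an assumption: the repeat helper, the function names, and the HOST and DIR placeholders are illustrative and not the original scripts.

```shell
#!/bin/sh
# Placeholders: override HOST and DIR with your own server and target path.
HOST=${HOST:-user@example.com}
DIR=${DIR:-/tmp/imaginary/}

# repeat N CMD [ARGS...]: run CMD N times, stopping at the first failure.
repeat() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
}

# Approach 1: one connection; mkdir runs on the remote side via --rsync-path.
rsync_only() {
    rsync -aq --rsync-path="mkdir -p $DIR && rsync" file "$HOST:$DIR"
}

# Approach 2: a separate ssh call to create the directory, then rsync.
ssh_then_rsync() {
    ssh "$HOST" "mkdir -p $DIR" && rsync -aq file "$HOST:$DIR"
}
```

With these defined, one script would end with `repeat 100 rsync_only` and the other with `repeat 100 ssh_then_rsync`, matching the 100 iterations described above.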
That's a 34-second decrease in wall time, and user and sys time nearly halved! Cheers to rsync for making this a feature, and to the StackOverflow questions for making me refuse to believe the truth.