Using UberFTP to get low level protocol access to GridFTP servers

fr4nk5ch31n3r edited this page Nov 13, 2012 · 3 revisions

Background

uberftp is an interactive command-line client for GridFTP developed by the University of Illinois and NCSA and is available from GitHub. It also supports non-interactive use, which makes it suitable for scripts. In addition to transferring data and listing remote directories (both also supported by globus-url-copy (guc)), it supports deleting remote files and directories, creating remote directories, renaming remote files, printing file contents (cat), chgrp and chmod, and it can return the size of remote files.

To be clear, all these features are supported by standard GridFTP servers, but not necessarily used by guc.

Data transfers with uberftp can also use multiple parallel streams and adjusted TCP buffer sizes. Concurrency and pipelining are currently not supported.

uberftp has another nice feature: the [l]quote command, which sends arbitrary strings over the control channel to the connected GridFTP servers. This gives direct low-level protocol access to GridFTP servers. It makes it easy to manually "replay" debugged guc sessions in interactive mode, or to use GridFTP functionality that uberftp does not support directly. But to be really useful, this facility must be scriptable. Is this possible?

Yes!

Scripting uberftp

Experiments with uberftp showed that it also accepts commands sent to stdin, for example with a command similar to the following:

$ echo "pwd; ls; bye" | uberftp -P 2811 gridftp.domain.tld

The problem is that with this command one cannot evaluate the output of uberftp during the session: uberftp exits after processing the piped commands, even without the bye at the end. Striped transfers, however, need the output of the SPAS command, which could look like the following:

229-Entering Striped Passive Mode.
 192,168,0,1,79,212
 192,168,0,2,79,127
 192,168,0,3,79,8
 192,168,0,4,79,64
229 End

To connect to these data nodes, one would use the following command:

SPOR 192,168,0,1,79,212 \
192,168,0,2,79,127 \
192,168,0,3,79,8 \
192,168,0,4,79,64
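
Deriving the SPOR line from a SPAS reply is mechanical, so it lends itself to scripting. The following is a minimal sketch, assuming the reply has the shape shown above (one address per line, six comma-separated numbers). The function name spas_to_spor is hypothetical and not part of uberftp.

```shell
#!/bin/sh
# Hypothetical helper (not part of uberftp): read a SPAS reply on stdin
# and print the matching SPOR command on stdout.
spas_to_spor() {
    # Keep only lines made of six comma-separated numbers (h1,h2,h3,h4,p1,p2),
    # strip whitespace, and join the addresses with single spaces.
    addrs=$(grep -E '^[[:space:]]*[0-9]+(,[0-9]+){5}[[:space:]]*$' \
        | tr -d ' \t\r' | paste -sd ' ' -)
    # Fail if the reply contained no address lines.
    [ -n "$addrs" ] || return 1
    printf 'SPOR %s\n' "$addrs"
}

# Example with the reply shown above.
spas_to_spor <<'EOF'
229-Entering Striped Passive Mode.
 192,168,0,1,79,212
 192,168,0,2,79,127
 192,168,0,3,79,8
 192,168,0,4,79,64
229 End
EOF
```

The printed line can then be passed to the [l]quote command described earlier. How the SPAS reply is captured in the first place is the remaining problem.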

So what is the solution to this issue?

Using named pipes and tail!

A simple setup could look like the following:

$ tail -f stdin_uberftp | uberftp [-P 2812 gridftp.domain.tld] > stdout.txt &

This starts uberftp with its stdin connected to the output of a tail command that reads from the named pipe stdin_uberftp. The output of uberftp is redirected to a file. The host and port (shown in brackets) can be omitted, because they can also be entered through the stdin of the uberftp command.
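
The setup command assumes that the named pipe stdin_uberftp already exists. A minimal sketch of the complete setup could look like this; the function names make_pipe and start_session are hypothetical, mkfifo is the standard utility for creating named pipes, and host and port are left out as described above.

```shell
#!/bin/sh
# Hypothetical helper: create the named pipe unless it already exists.
make_pipe() {
    [ -p "$1" ] || mkfifo "$1"
}

# Hypothetical helper: start uberftp in the background, reading from the
# pipe given as $1 and writing its output to the file given as $2.
start_session() {
    make_pipe "$1" || return 1
    tail -f "$1" | uberftp > "$2" &
}

# Usage (needs uberftp in PATH, so it is only shown as a comment):
#   start_session stdin_uberftp stdout.txt
```

The point of tail -f here is that it keeps reading after each writer closes the pipe, so the stdin of uberftp stays open between commands.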

To feed commands to the uberftp command, one can use the following example:

$ echo "command1" > stdin_uberftp

Make sure that each command ends with a newline, which is the default when echo is used without -n. The uberftp command will not exit until one quits the session, so consecutive commands can be sent with the echo command above.

To evaluate the output of uberftp, one can read the output file after each command invocation and perhaps truncate it to ease evaluation. The following example can be used:

$ cat stdout.txt | {grep|sed} <SOMETHING>
$ >stdout.txt ##  truncate output file

Fortunately, the uberftp command survives the truncation of the output file.
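
The cycle of sending a command, reading the output and truncating the file can be wrapped in a small function. This is a rough sketch under assumptions: the function name send_cmd and the variables FIFO, OUT and WAIT are hypothetical, and the fixed delay is only a crude stand-in for properly waiting until the reply has arrived.

```shell
#!/bin/sh
FIFO=${FIFO:-stdin_uberftp}   # named pipe that feeds uberftp
OUT=${OUT:-stdout.txt}        # file that receives the output of uberftp
WAIT=${WAIT:-1}               # seconds to wait for the reply (assumption)

# Hypothetical helper: send one command, print what arrived, then truncate.
send_cmd() {
    printf '%s\n' "$1" > "$FIFO"   # blocks until a reader has the pipe open
    sleep "$WAIT"                  # give the server time to answer
    cat "$OUT"                     # hand the collected output to the caller
    : > "$OUT"                     # truncate the output file
}

# Usage (with a running session, so it is only shown as a comment):
#   reply=$(send_cmd "quote SPAS")
```

One caveat worth checking: when the output file is opened with > as in the setup command, the writing process keeps its file offset, so output written after a truncation can be preceded by NUL bytes. Redirecting with >> (append mode) avoids this.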