# Using Curl for FTP over SSL file transfers

While not used much anymore, FTP still finds a use in legacy
settings, with the catch that it has to be used securely. Enter curl
[1], with its support for secure FTP connections (FTP over SSL). If
you haven't used curl, it is a great tool that lends itself to
scripted data transfers quite nicely. I'll quote from the curl man
page:

>curl is a tool to transfer data from or to a server, using one of
>the supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP,
>HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS,
>RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP). The
>command is designed to work without user interaction.
>curl offers a busload of useful tricks like proxy support, user
>authentication, FTP upload, HTTP post, SSL connections, cookies,
>file transfer resume, Metalink, and more.

Anyway, using curl with FTP over SSL is usually done something like
this:

```
curl -v --cacert /etc/ssl/certs/cert.pem \
 --ssl --ssl-reqd -T "/file/to/upload/file.txt" \
 ftp://user:pass@ftp.example.com:port
```

Let's go over these options: 

* -v: Gives verbose debugging output. Lines starting with ">" mean
  data sent by curl. Lines starting with "<" show data received by
  curl. Lines starting with "*" display additional information
  presented by curl.
* --cacert: Specifies which file contains the SSL certificate(s)
  used to verify the server. This file must be in PEM format.
* --ssl --ssl-reqd: --ssl tries to use SSL or TLS for the FTP
  connection, but if the server does not support SSL/TLS, curl will
  fall back to unencrypted FTP. Adding --ssl-reqd forces encryption:
  the transfer fails if the server cannot negotiate it.
* -T: Specifies a file to upload.

The last part of the command line,
ftp://user:pass@ftp.example.com:port, is simply a way to specify the
username, password, host and port all in one shot.

### How FTP Works

Before I get to the problem, I need to explain a bit about how FTP
works [2]. FTP operates in one of two modes: active or passive. In
active mode, the client connects to the server on a control port
(usually TCP port 21), then starts listening on a random high port
and sends this port number to the server over the control
connection. The server then connects back to the client on the
specified port (usually from source TCP port 20). Active mode isn't
used much or even recommended anymore, since the reverse connection
from the server to the client is frequently blocked, and can be a
security risk if not handled properly by intervening firewalls.

Contrast this with passive mode, in which the client makes an
initial connection to the server on the control port, asks for a
passive data connection, then waits for the server to send an IP
address and port number. The client connects to the specified IP
address and port and then sends the data. From a firewall's
perspective, this is much nicer, since the control and data
connections are in the same direction and the ports are
well-defined. Most FTP clients now default to passive mode, curl
included.
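
To make the active-mode handshake a little more concrete: per RFC 959,
the client announces its listening address with a PORT command whose
argument is six comma-separated numbers, the four bytes of the IP
address followed by the high and low bytes of the port. Here is a
minimal sketch of that encoding. The function name format_port_arg
and the address 192.0.2.10 are made up for illustration; curl does
this for you internally.

```shell
# Hypothetical helper: build the argument of an FTP PORT command
# from a dotted IPv4 address and a TCP port number.
format_port_arg() (
    ip=$1
    port=$2
    # High and low bytes of the 16-bit port number
    p1=$(( $port / 256 ))
    p2=$(( $port % 256 ))
    # Split the address on dots into $1..$4
    IFS=.
    set -- $ip
    echo "$1,$2,$3,$4,$p1,$p2"
)
```

For example, `format_port_arg 192.0.2.10 50000` prints
`192,0,2,10,195,80`, since 195 × 256 + 80 = 50000. If you ever do need
active mode with curl, the -P (--ftp-port) option selects it.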

### The problem

Now, a problem can arise when the server sends back the IP address
from a passive mode request. If the server is not configured
properly, it will send back its own host IP address, which is
almost always a private IP address and different from the address
the client connected to. Usually a firewall or router is doing
Network Address Translation (NAT) to map requests from the server's
public IP address to the server's internal IP address. When the
client gets this private IP address from the server, it tries to
connect to a non-routable IP address and the connection times out.
How do you know when this problem has manifested itself? Take a look
at this partial debug output from curl:

```
... 
> PASV
< 227 Entering Passive Mode (172,19,2,90,41,20)
* Trying 172.19.2.90...
```

Here the client has sent the PASV command, which asks the server for
a passive data connection. The server returns a string of six
decimal numbers, representing the IP address (first four numbers)
and port (last two numbers, the high and low bytes of the port, so
here 41 × 256 + 20 = 10516). Here the IP address is 172.19.2.90 - a
non-routable IP address as per [RFC 1918][3]. When the client tries
to connect to this address, it will fail.
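
If you want to check a reply like this from a script, the decoding is
easy to sketch in plain shell. The two function names below
(parse_pasv and is_rfc1918) are hypothetical helpers, not part of
curl, and the sketch assumes a well-formed 227 reply with the six
numbers in parentheses.

```shell
# Hypothetical helper: turn a 227 reply into "IP PORT".
parse_pasv() (
    reply=$1
    # Keep only the text between the parentheses: h1,h2,h3,h4,p1,p2
    nums=${reply#*\(}
    nums=${nums%%\)*}
    # Split the comma-separated numbers into $1..$6
    IFS=,
    set -- $nums
    # Anything other than six fields is not a PASV reply
    # (exit only leaves this function's subshell)
    [ $# -eq 6 ] || exit 1
    # The port is sent as high byte, low byte
    echo "$1.$2.$3.$4 $(( $5 * 256 + $6 ))"
)

# Hypothetical helper: succeed if a dotted IPv4 address falls in one
# of the RFC 1918 private ranges (10/8, 172.16/12, 192.168/16).
is_rfc1918() (
    IFS=.
    set -- $1
    [ $# -eq 4 ] || exit 2
    case $1 in
        10)  exit 0 ;;
        172) [ "$2" -ge 16 ] && [ "$2" -le 31 ] ;;
        192) [ "$2" -eq 168 ] ;;
        *)   exit 1 ;;
    esac
)
```

Calling `parse_pasv '227 Entering Passive Mode (172,19,2,90,41,20)'`
prints `172.19.2.90 10516`, and `is_rfc1918 172.19.2.90` succeeds,
which is exactly the symptom described above.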

### The solution...sort of

In 1998 [RFC 2428][4] was released, which specified 'Extended
Passive Mode', specifically meant to address this problem. In
extended passive mode, only the port is returned to the client; the
client assumes the IP address of the server has not changed. The
problem with this solution is that many FTP servers still do not
support extended passive mode. If you try, you will see something
like this:

```
> EPSV
* Connect data stream passively
> PASV
< 227 Entering Passive Mode (172,19,2,90,41,20)
* Trying 172.19.2.90...
```

...and we're back to the same problem again. 
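
For comparison, a server that does support extended passive mode
answers EPSV with a 229 reply that carries only a port, in the form
given by RFC 2428, for example `229 Entering Extended Passive Mode
(|||6446|)`. A rough sketch of pulling the port out of such a reply
follows. The function name parse_epsv is made up, and the sketch
assumes the server uses `|` as the delimiter, which the RFC recommends
but does not require.

```shell
# Hypothetical helper: extract the port from a 229 EPSV reply.
parse_epsv() (
    reply=$1
    # Keep the text between the parentheses, e.g. |||6446|
    field=${reply#*\(}
    field=${field%%\)*}
    # Strip the three leading delimiters and the trailing one
    port=${field#'|||'}
    port=${port%'|'}
    # Reject anything that is not a plain decimal number
    case $port in
        ''|*[!0-9]*) exit 1 ;;
    esac
    echo "$port"
)
```

There is no IP address to get wrong here: the client simply reuses
the address of the control connection.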

### The Real Solution

Curl has a neat solution to this problem, requiring two additional
options. The first is --disable-epsv, which prevents curl from
sending the EPSV command - it will just default to standard passive
mode. The second is --ftp-skip-pasv-ip, which tells curl to ignore
the IP address returned by the server, and to reuse the IP address
it already uses for the control connection, i.e. the server address
given on the command line. Let's put it all together:

```
curl -v --cacert /etc/ssl/certs/cert.pem \
--disable-epsv --ftp-skip-pasv-ip \
--ssl --ssl-reqd -T "/file/to/upload/file.txt" \
ftp://user:pass@ftp.example.com:port
```

If this succeeds, you'll see something like this:

```
* SSL certificate verify ok.
...
< 226- Transfer complete - acknowledgment message is pending.
> QUIT
< 221 Goodbye.
```

The final 226 Transfer complete is the sign that the file was
transferred to the server successfully.
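
Since the point of curl here is scripted transfers, you may want the
script itself to look for that 226 line. Below is a small sketch,
assuming you capture curl's verbose output; the function name
transfer_ok is hypothetical. Note that curl writes its -v output to
stderr, and that curl's own exit status (non-zero on failure) is
still the first thing to check; this is only an extra sanity check on
the log.

```shell
# Hypothetical helper: succeed if a verbose curl log (read from
# stdin) contains a 226 reply received from the server.
transfer_ok() {
    grep -q '^< 226'
}

# Usage sketch (not run here): curl prints -v output on stderr,
# so redirect it into the pipe.
#   curl -v ... 2>&1 | transfer_ok && echo "upload confirmed"
```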

[1]: http://curl.haxx.se/
[2]: gopher://gopher.unixlore.net/0/docs/rfc/rfc959.txt
[3]: gopher://gopher.unixlore.net/0/docs/rfc/rfc1918.txt
[4]: gopher://gopher.unixlore.net/0/docs/rfc/rfc2428.txt