SSH connection stability

Under:
IF
  • Your SSH connections are closed from home
  • You get disconnected from any nodes without any reasons?
  • ... and you are a PuTTY user
  • ... or an Uglix SSH client user
This page is for you. If you are another user, use different clients and so on, this page may still be informative and help you stabalize your connection (the same principles apply).

PuTTY users

PuTTY to connect to gateway (from a home connection), you have to

  • set a session, be sure to enable SSH

  • go to the 'Connection' menu and have the following options box checked

    • Disable Nagle's algorithm (TCP_NODELAY option)

    • Enable TCP keepalives (SO_KEEPALIVE option)

  • Furthermore, in 'Connection' -> 'SSH' -> 'Tunnels' enable the option

    • Enable X11 forwarding

    • Enable MIT-Magic-Cookie-1

  • Save the session

Documentation on those features (explanation for the interested) are added at the end of this document.


SSH Users

SSH users and owner of their system could first of all be sure to manipulate the SSH client configuration file and be sure settings are turned on by default. The client configuration is likely located as /etc/ssh_config or /usr/local/etc/ssh_config depending on where you have ssh installed.

But if you do NOT have access to the configuration file, the client can nonetheless pass on options from the command line. Those options would have the same name as they would appear in the config file.

Especially, KEEP_ALIVE is controlled via the SSH configuration option TCPKeepAlive.

% ssh -o TCPKeepAlive=yes

You will note in the next section that a spoofing issue exists with keep alive (I know it works well, but please consider the ServerAliveCountMax mechanism) so, you may use instead

% ssh -o TCPKeepAlive=no -o ServerAliveInterval=15

Note that the value 15 in our example is purely empirical. There are NO magic values and you need to test your connection and detect when (after what time) you get kicked out and disconnected and set the parameters from your client accordingly. Let's explain the default first and come back to this and a rule of thumb.

There are two relevant parameters (in addition of TCPKeepAlive):


ServerAliveInterval

Sets a timeout interval in seconds after which if no data has been received from the server, ssh will send a message through the encrypted channel to request a response from the server. The default is 0, indicating that these messages will not be sent to the server.

This option applies to protocol version 2 only.


ServerAliveCountMax

Sets the number of server alive messages (see above) which may be sent without ssh receiving any messages back from the server. If this threshold is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session. It is important to note that the use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive.

The default value is 3. If, for example, ServerAliveInterval (above) is set to 15, and ServerAliveCountMax is left at the default, if the server becomes unresponsive ssh will disconnect after approximately 45 seconds.


In our example

% ssh -o TCPKeepAlive=no -o ServerAliveInterval=15

The recipe should be: if you get disconnected after N seconds, play with the above and be sure to set a

time of ServerAliveInterval*ServerAliveCountMax <= 0.8*N, N being the timeout. Since ServerAliveCountMax is typically not modified, in our example we assume the default value of 3 and therefore a a 3x15 = 45 seconds (and we guessed a disconnect every minute or so). If you set the value too low, the client will send to much "chatting" to the server and there will be a traffic impact.


Appendix

Nagle's algorithm

This was written based on this article.

RPC implementations on TCP should disable Nagle. This reduces average RPC request latency on TCP, and makes network trace tools work a little nicer.

Determines whether Nagle's algorithm is to be used. The Nagle's algorithm tries to conserve bandwidth by minimizing the number of segments that are sent. When applications wish to decrease network latency and increase performance, they can disable Nagle's algorithm (that is enable TCP_NODELAY). Data will be sent earlier, at the cost of an increase in bandwidth consumption.


KeepAlive

The KEEPALIVE option of the TCP/IP Protocol ensures that connections are kept alive even while they are idle. When a connection to a client is inactive for a period of time (the timeout period), the operating system sends KEEPALIVE packets at regular intervals. On most systems, the default timeout period is two hours (7,200,000 ms).

If the network hardware or software drops connections that have been idle for less than the two hour default, the Windows Client session will fail. KEEPALIVE timeouts are configured at the operating system level for all connections that have KEEPALIVE enabled.

If the network hardware or software (including firewalls) have a idle limit of one hour, then the KEEPALIVE timeout must be less than one hour. To rectify this situation TCP/IP KEEPALIVE settings can be lowered to fit inside the firewall limits. The implementation of TCP KEEPALIVE may vary from vendor to vendor. The original definition is quite old and described in RFC 1122.


MIT Magic cookie

To avoid unauthorized connections to your X display, the command xauth for encrypted X connections is widely used. When you login, a .Xauthority file is created in your home directory ($HOME). Even SSH initiate the creation of a magic cookie and without it, no display could be opened. Note that since the .Xauthority file IS the file containing the MIT Magic cookie, if you ever run out of disk quota or the file system is full, this file CANNOT be created or updated (even from the sshd impersonating the user) and consequently, no X connections can be opened.

The .Xauthority file sometimes contains information from older sessions, but this is not important, as a new key is created at every login session. The Xauthority is simple and powerful, and eliminates many of the security problems with X.