5.24. Access related problems

During the operation of the device, access to the GUI or the CLI can stop working. It is important to distinguish between the different ways in which this access can fail:

5.24.1. Unable to access the GUI

For the Steelhead appliance, the GUI consists of two pieces: a management application and a web server which acts as the front end of the management application.

When connecting to the root of the GUI, for example via the URL http://10.0.1.5/, the web server answers the request with a redirection to /mgmt/, the path of the management application, in this example http://10.0.1.5/mgmt/. At that path the management application shows either the login screen or the overview screen.
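
The redirection can also be checked without a browser. The following is a minimal sketch, not a Riverbed tool. It assumes the web server answers either with an HTTP 3xx response whose Location header points at /mgmt/, or with an HTML page that mentions /mgmt/; which of the two the appliance uses is not stated here, so both are accepted. The function names are made up for this example.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def points_to_mgmt(status: int, location: Optional[str], body: str) -> bool:
    """Return True if a response to GET / looks like the redirection to /mgmt/.

    Assumption: the redirection is either an HTTP 3xx with a Location header
    or an HTML page (status 200) that refers to /mgmt/.
    """
    if 300 <= status < 400 and location:
        # Location may be absolute (http://10.0.1.5/mgmt/) or relative (/mgmt/).
        return urlsplit(location).path.rstrip("/") == "/mgmt"
    return status == 200 and "/mgmt/" in body


def check_gui_root(host: str, timeout: float = 5.0) -> bool:
    """Request the root of the GUI over HTTP without following redirects."""
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", "replace")
        return points_to_mgmt(resp.status, resp.getheader("Location"), body)
    finally:
        conn.close()
```

Calling check_gui_root("10.0.1.5") raises an OSError if the web server cannot be reached at all, which separates a connection problem from a missing redirection.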

If the redirection page does not get loaded, then there is a problem with connecting to the web server. Check with ping to see if the primary interface of the Steelhead appliance is still reachable and with telnet to port 80 to see if the web server is running.
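
Where telnet is not available, the same check on port 80 can be approximated with a few lines of Python. This is only a sketch: the function name and the default timeout are choices made for this example, and it tells nothing more than whether a TCP session to the port can be set up.

```python
import socket


def port_is_open(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP session to host:port can be set up within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

For example, port_is_open("10.0.1.5") returning False while ping still gets replies suggests that the web server, rather than the network path, is the problem.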

Figure 5.185. Steelhead GUI redirection screen

If the redirection page does get loaded but the management application does not, the cause is one of the following situations, which have all been observed:

  • When the /var partition is full. This can be checked with the command support show disk:

    Figure 5.186. Output of the command "support show disk"

    Filesystem         1024-blocks      Used Available Capacity Mounted on
    [...]
    /dev/sda3             16513448  16513448         0     100% /var
    [...]
    


    If this is the case, please call Riverbed TAC to help clean up the /var partition.

  • When the eUSB flash memory is locked for a long time and the machine isn't running a RiOS version which deals with it. See KB articles S15568 and S15587 for the full details.

  • When the management application is failing. The next step is to connect to the CLI and issue the command pm process webasd restart.

  • When there is something wrong with the optimized sessions towards that Steelhead appliance. Try using HTTPS instead of HTTP to overcome any optimization-related issues.
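
The /var check from the first item in this list is easy to automate once the output of support show disk has been captured. The sketch below assumes the six-column layout shown in Figure 5.186; the function name and the threshold parameter are made up for this example.

```python
from typing import List, Tuple


def full_filesystems(output: str, threshold: int = 100) -> List[Tuple[str, int]]:
    """Return (mount point, capacity %) for filesystems at or above threshold."""
    hits = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have six columns: device, 1024-blocks, used, available,
        # capacity and mount point. The header and "[...]" lines do not.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        try:
            capacity = int(fields[4].rstrip("%"))
        except ValueError:
            continue
        if capacity >= threshold:
            hits.append((fields[5], capacity))
    return hits
```

Fed with the output in Figure 5.186, the function reports /var at 100%. A lower threshold, for example 95, gives a warning before the partition is completely full.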

5.24.2. Unable to connect to the CLI

There are several steps in the setup of the SSH session:

  • (1) The command to start the SSH client.

  • (2) The setup of the TCP session over which the SSH session is set up.

  • (3) The SSH server banner.

  • (4) The host key exchange.

  • (5) The DNS lookup of the client IP address by the server.

  • (6) The login banner.

  • (7) The password authentication.

  • (8) The first output of the login shell.

Figure 5.187. An SSH session with verbose logging

1 edwin@t43>ssh -v admin@10.0.1.5 
  OpenSSH_5.8p2_hpn13v11 FreeBSD-20110503, OpenSSL 0.9.8q 2 Dec 2010
  debug1: Reading configuration data /usr/home/edwin/.ssh/config
  debug1: Reading configuration data /etc/ssh/ssh_config
2 debug1: Connecting to 10.0.1.5 [10.0.1.5] port 22.
  debug1: Connection established.
  debug1: identity file /usr/home/edwin/.ssh/id_dsa type -1
  debug1: identity file /usr/home/edwin/.ssh/id_dsa-cert type -1
3 debug1: Remote protocol version 1.99, remote software version OpenSSH_5.2
  debug1: match: OpenSSH_5.2 pat OpenSSH*
  debug1: Remote is not HPN-aware
  debug1: Enabling compatibility mode for protocol 2.0
  debug1: Local version string SSH-2.0-OpenSSH_5.8p2_hpn13v11 FreeBSD-20110503
  debug1: SSH2_MSG_KEXINIT sent
  debug1: SSH2_MSG_KEXINIT received
  debug1: kex: server->client aes128-ctr hmac-md5 none
  debug1: kex: client->server aes128-ctr hmac-md5 none
  debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
  debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
  debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
  debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
4 debug1: Server host key: RSA ae:f3:6b:8c:6f:a0:e7:4d:01:1f:9c:cf:91:ad:14:e7
  The authenticity of host '10.0.1.5 (10.0.1.5)' can't be established.
  RSA key fingerprint is ae:f3:6b:8c:6f:a0:e7:4d:01:1f:9c:cf:91:ad:14:e7.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added '10.0.1.5' (RSA) to the list of known hosts.
  debug1: ssh_rsa_verify: signature correct
  debug1: SSH2_MSG_NEWKEYS sent
  debug1: expecting SSH2_MSG_NEWKEYS
5 debug1: SSH2_MSG_NEWKEYS received
  debug1: Roaming not allowed by server
  debug1: SSH2_MSG_SERVICE_REQUEST sent
  debug1: SSH2_MSG_SERVICE_ACCEPT received
6 Riverbed Steelhead
  debug1: Authentications that can continue: publickey,password
  debug1: Next authentication method: publickey
  debug1: Trying private key: /usr/home/edwin/.ssh/id_rsa
  debug1: Trying private key: /usr/home/edwin/.ssh/id_dsa
  debug1: Trying private key: /usr/home/edwin/.ssh/id_ecdsa
  debug1: Next authentication method: password
7 admin@10.0.1.5's password: 
  debug1: Authentication succeeded (password).
  Authenticated to 10.0.1.5 ([10.0.1.5]:22).
  debug1: HPN to Non-HPN Connection
  debug1: Final hpn_buffer_size = 2097152
  debug1: HPN Disabled: 0, HPN Buffer Size: 2097152
  debug1: channel 0: new [client-session]
  debug1: Enabled Dynamic Window Scaling

  debug1: Requesting no-more-sessions@openssh.com
  debug1: Entering interactive session.
8 Last login: Sun Jul 29 01:55:15 2012 from 10.0.1.1
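
When a session fails, the last debug line printed by ssh -v shows how far the setup got. The sketch below maps some of the debug lines from Figure 5.187 to the step numbers in the list above. It is an illustration only: the marker strings are copied from this one OpenSSH client version and differ between versions, and steps 5 and 8 are left out because the client output in the figure contains no debug line for them.

```python
from typing import Optional, Tuple

# (step number from the list above, description, marker in the ssh -v output).
# Assumption: markers as printed by the client version shown in Figure 5.187.
STAGES = [
    (2, "TCP session set up", "debug1: Connection established."),
    (3, "SSH server banner received", "debug1: Remote protocol version"),
    (4, "host key received", "debug1: Server host key:"),
    (6, "authentication methods offered", "debug1: Authentications that can continue:"),
    (7, "authentication succeeded", "debug1: Authentication succeeded"),
]


def last_stage(log: str) -> Optional[Tuple[int, str]]:
    """Return the furthest (step, description) whose marker appears in log."""
    reached = None
    for step, description, marker in STAGES:
        if marker in log:
            reached = (step, description)
    return reached
```

A log that ends after the Remote protocol version line gives (3, "SSH server banner received"), so the problem lies in the host key exchange or later.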

5.24.2.1. SSH TCP session does not get set up

If the TCP session times out, then there is either a routing issue, a firewall issue, or the Steelhead appliance is turned off.

If the SSH client comes back with ssh: connect to host 10.0.1.5 port 22: Connection refused, then there is no SSH server running on that IP address.

The easiest way to check if the SSH server is running is to use telnet to set up a TCP session to the primary interface on port 22:

Figure 5.188. Telnet session to the SSH service

[~] edwin@t43>telnet 10.0.1.5 22  
Trying 10.0.1.5...
Connected to 10.0.1.5.
Escape character is '^]'.
SSH-1.99-OpenSSH_5.2

If the TCP session is set up and an SSH banner is displayed, then the SSH service is working. If the TCP session is set up but no SSH banner is displayed, then the kernel on the Steelhead appliance is still working but the user-land is having problems.
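
The outcomes described in this section (timeout, connection refused, connected without banner, banner received) can be told apart with a small probe. This is a sketch rather than a supported tool: the function names, the returned strings and the timeout are choices made for this example.

```python
import socket
from typing import Optional


def banner_from(data: bytes) -> Optional[str]:
    """Return the SSH identification string in data, or None if there is none."""
    text = data.decode("ascii", "replace").strip()
    return text if text.startswith("SSH-") else None


def probe_ssh(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Return 'timeout', 'refused', 'unreachable', 'no banner' or the banner."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "timeout"        # routing issue, firewall or appliance turned off
    except ConnectionRefusedError:
        return "refused"        # no SSH server listening on that address
    except OSError:
        return "unreachable"
    with sock:
        try:
            data = sock.recv(256)
        except OSError:
            return "no banner"  # TCP works, but nothing was sent in time
    return banner_from(data) or "no banner"
```

Against the appliance in Figure 5.188, probe_ssh("10.0.1.5") would be expected to return the banner line shown there. A result of "no banner" corresponds to the case where the kernel still accepts the TCP session but the user-land is having problems.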

5.24.2.2. SSH host key has changed

When a Steelhead appliance gets replaced in the network, the IP address or hostname will stay the same but the SSH host key will be different. OpenSSH will give a warning like this:

Figure 5.189. Setup of an SSH session to the CLI of a replaced Steelhead appliance

$ ssh admin@10.0.1.5
Riverbed Steelhead
Password:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
e:f3:6b:8c:6f:a0:e7:4d:01:1f:9c:cf:91:ad:14:e7.
Please contact your system administrator.
Add correct host key in /usr/home/edwin/.ssh/known_hosts to get rid of this message.
Offending RSA key in /usr/home/edwin/.ssh/known_hosts:50
RSA host key for 10.0.1.5 has changed and you have requested strict checking.
Host key verification failed.

If this happens, first confirm that the device was indeed replaced, since the same warning is shown when somebody is intercepting the session.

To overcome this issue with the OpenSSH SSH client, remove the offending line from the known_hosts file named in the warning, in this example line 50 of /usr/home/edwin/.ssh/known_hosts. With the PuTTY SSH client a dialog box will be shown with the option to replace the key.
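
With OpenSSH, the command ssh-keygen -R 10.0.1.5 removes the stored keys for that host. Where the file has to be edited by a script instead, the warning itself says which line to remove. The sketch below extracts that information and drops the line. It assumes the "Offending ... key in file:line" wording shown in Figure 5.189, which differs between OpenSSH versions, and the function names are made up for this example.

```python
import re
from typing import Optional, Tuple

# Assumption: wording of the warning as shown in Figure 5.189.
OFFENDING = re.compile(r"^Offending \w+ key in (?P<path>.+):(?P<line>\d+)\s*$", re.M)


def offending_entry(ssh_output: str) -> Optional[Tuple[str, int]]:
    """Return (known_hosts path, line number) named in the warning, if any."""
    match = OFFENDING.search(ssh_output)
    if match is None:
        return None
    return match.group("path"), int(match.group("line"))


def drop_line(text: str, lineno: int) -> str:
    """Return text without its 1-based line number lineno."""
    lines = text.splitlines(keepends=True)
    if not 1 <= lineno <= len(lines):
        raise ValueError("line number out of range")
    del lines[lineno - 1]
    return "".join(lines)
```

For the warning in Figure 5.189, offending_entry() returns the known_hosts path and the number 50. Reading that file, passing its contents through drop_line() and writing the result back has the same effect as removing the line by hand; keep a copy of the original file.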

5.24.2.3. SSH Reverse DNS timeout

If the SSH client has a delay of about 60 seconds before displaying the password prompt, then the issue is most likely related to a failing reverse DNS lookup. Confirm that the DNS servers configured on the Steelhead appliance are correct and reachable.
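
The reverse lookup is done by the Steelhead appliance, not by the client, so it cannot be measured directly from a workstation. As a rough cross-check, the sketch below times a reverse lookup of the client IP address on whatever machine it runs on. It only says something about the appliance if that machine uses the same DNS servers; this is an assumption, and the function name is made up for this example.

```python
import socket
import time
from typing import Optional, Tuple


def time_reverse_lookup(ip: str) -> Tuple[Optional[str], float]:
    """Return (hostname or None, seconds taken) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        name = socket.gethostbyaddr(ip)[0]
    except OSError:
        # Lookup failed, for example no PTR record or no answer from the servers.
        name = None
    return name, time.monotonic() - start
```

A lookup of the client address, in Figure 5.187 this is 10.0.1.1, which takes many seconds or ends without a name points in the same direction as the delayed password prompt.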

5.24.2.4. SSH login shell not started at all

If, after the authentication, the login shell is not started then there is most likely a disk failure on the Steelhead appliance.

5.24.2.5. Prompt only disappears after a very long delay

It can happen that the SSH client asks for the username and password, but does not get to the CLI. This has been experienced when an appliance has a high system load, for example because of the generation of a lot of process dumps. The way out is to reboot the appliance in single user mode and remove all process dumps from /var/opt/tms/snapshots/.staging/.

Next steps: Please contact the Riverbed TAC for assistance. Access to the serial console is required, as well as permission to reboot the Steelhead appliance.

5.24.2.6. Reduced CLI shell

During the login to the CLI, the login shell cli opens a communication channel to the process mgmtd. If this channel cannot be established, the prompt will show CLI> and only a subset of the commands will be accepted. Once the communication channel to the mgmtd process is available, the shell will connect to it and all commands will be available again.

If the process mgmtd does not come back, the next step would be to boot the appliance in single user mode and investigate why the process mgmtd didn't start.