During the operation of the device, access to the GUI or the CLI may stop working. It is important to distinguish the different ways in which this access can fail.
For the Steelhead appliance, the GUI consists of two pieces: a management application and a web server which acts as the front end for the management application.
When connecting to the root of the GUI, for example via the URL http://10.0.1.5/, the web server answers the request with a redirection to /mgmt/, the path of the management application, in this example http://10.0.1.5/mgmt/. After the redirection, the management application shows either its login screen or its overview screen.
If the redirection page does not get loaded, then there is a problem with connecting to the web server. Check with ping to see if the primary interface of the Steelhead appliance is still reachable and with telnet to port 80 to see if the web server is running.
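The web server check above can be sketched in Python. This is a minimal sketch, not a Riverbed tool: the function names and result strings are hypothetical, and it assumes that the redirection is sent as an HTTP 3xx response with a Location header pointing at /mgmt/. If the appliance sends an HTML redirection page instead, the probe reports an unexpected response, which still shows that the web server is answering.

```python
import http.client


def classify_gui_response(status, location):
    """Classify the answer of the web server to a request for "/".

    Assumption: the redirect is an HTTP 3xx with a Location header
    that contains the management application path /mgmt/.
    """
    if status in (301, 302, 303, 307, 308) and location and "/mgmt/" in location:
        return "redirect-ok"
    return "unexpected-response"


def probe_gui(host, port=80, timeout=5.0):
    """Request the root of the GUI and report what the web server did."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return classify_gui_response(response.status, response.getheader("Location"))
    except (OSError, http.client.HTTPException):
        # No TCP session or no valid HTTP answer: the web server is the problem.
        return "unreachable"
    finally:
        conn.close()
```

Calling probe_gui("10.0.1.5") from a management workstation then returns "unreachable" when the web server itself cannot be contacted, and one of the other two values when it answers.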
If the redirection page does get loaded but the management application does not, this behaviour has been observed in the following situations:
When the /var partition is full. This can be checked with the command support show disk:
Figure 5.186. Output of the command "support show disk"
Filesystem  1024-blocks      Used Available Capacity Mounted on
[...]
/dev/sda3      16513448  16513448         0     100% /var
[...]
If this is the case, please call Riverbed TAC to help clean up the /var partition.
When the eUSB flash memory is locked for a long time and the machine isn't running a RiOS version which deals with it. See KB articles S15568 and S15587 for the full details.
When the management application is failing: The next step would be to connect to the CLI and issue the command pm process webasd restart.
When there is something wrong with the optimized sessions towards that Steelhead appliance. Try using HTTPS instead of HTTP to rule out any optimization-related issues.
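For the first situation in the list above, the check for a full partition can be sketched in Python. This is a minimal sketch that assumes the output of support show disk has been saved as text and has the six-column layout shown in Figure 5.186; the function name full_partitions and the threshold parameter are hypothetical.

```python
def full_partitions(disk_output, threshold=100):
    """Return (mount point, capacity) pairs at or above the threshold.

    Assumes the six-column layout of Figure 5.186:
    filesystem, 1024-blocks, used, available, capacity, mount point.
    """
    full = []
    for line in disk_output.splitlines():
        fields = line.split()
        # Skip the header, "[...]" placeholders and anything else
        # that is not a six-column data row.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        try:
            capacity = int(fields[4].rstrip("%"))
        except ValueError:
            continue
        if capacity >= threshold:
            full.append((fields[5], capacity))
    return full


sample = """\
Filesystem 1024-blocks Used Available Capacity Mounted on
[...]
/dev/sda3 16513448 16513448 0 100% /var
[...]
"""
print(full_partitions(sample))
```

A lower threshold, for example 95, gives an earlier warning before the partition is completely full.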
There are several steps in the setup of the SSH session:
(1) The command to start the SSH client.
(2) The setup of the TCP session over which the SSH session is set up.
(3) The SSH server banner.
(4) The host key exchange.
(5) The DNS lookup of the client IP address by the server.
(6) The login banner.
(7) The password authentication.
(8) The first output of the login shell.
Figure 5.187. An SSH session with verbose logging
(1) edwin@t43>ssh -v admin@10.0.1.5
    OpenSSH_5.8p2_hpn13v11 FreeBSD-20110503, OpenSSL 0.9.8q 2 Dec 2010
    debug1: Reading configuration data /usr/home/edwin/.ssh/config
    debug1: Reading configuration data /etc/ssh/ssh_config
(2) debug1: Connecting to 10.0.1.5 [10.0.1.5] port 22.
    debug1: Connection established.
    debug1: identity file /usr/home/edwin/.ssh/id_dsa type -1
    debug1: identity file /usr/home/edwin/.ssh/id_dsa-cert type -1
(3) debug1: Remote protocol version 1.99, remote software version OpenSSH_5.2
    debug1: match: OpenSSH_5.2 pat OpenSSH*
    debug1: Remote is not HPN-aware
    debug1: Enabling compatibility mode for protocol 2.0
    debug1: Local version string SSH-2.0-OpenSSH_5.8p2_hpn13v11 FreeBSD-20110503
    debug1: SSH2_MSG_KEXINIT sent
    debug1: SSH2_MSG_KEXINIT received
    debug1: kex: server->client aes128-ctr hmac-md5 none
    debug1: kex: client->server aes128-ctr hmac-md5 none
    debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
    debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
    debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
    debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
(4) debug1: Server host key: RSA ae:f3:6b:8c:6f:a0:e7:4d:01:1f:9c:cf:91:ad:14:e7
    The authenticity of host '10.0.1.5 (10.0.1.5)' can't be established.
    RSA key fingerprint is ae:f3:6b:8c:6f:a0:e7:4d:01:1f:9c:cf:91:ad:14:e7.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '10.0.1.5' (RSA) to the list of known hosts.
    debug1: ssh_rsa_verify: signature correct
    debug1: SSH2_MSG_NEWKEYS sent
    debug1: expecting SSH2_MSG_NEWKEYS
(5) debug1: SSH2_MSG_NEWKEYS received
    debug1: Roaming not allowed by server
    debug1: SSH2_MSG_SERVICE_REQUEST sent
    debug1: SSH2_MSG_SERVICE_ACCEPT received
(6) Riverbed Steelhead
    debug1: Authentications that can continue: publickey,password
    debug1: Next authentication method: publickey
    debug1: Trying private key: /usr/home/edwin/.ssh/id_rsa
    debug1: Trying private key: /usr/home/edwin/.ssh/id_dsa
    debug1: Trying private key: /usr/home/edwin/.ssh/id_ecdsa
    debug1: Next authentication method: password
(7) admin@10.0.1.5's password:
    debug1: Authentication succeeded (password).
    Authenticated to 10.0.1.5 ([10.0.1.5]:22).
    debug1: HPN to Non-HPN Connection
    debug1: Final hpn_buffer_size = 2097152
    debug1: HPN Disabled: 0, HPN Buffer Size: 2097152
    debug1: channel 0: new [client-session]
    debug1: Enabled Dynamic Window Scaling
    debug1: Requesting no-more-sessions@openssh.com
    debug1: Entering interactive session.
(8) Last login: Sun Jul 29 01:55:15 2012 from 10.0.1.1
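When a session stalls, the eight steps above can be matched against a captured ssh -v log to see how far the setup got. The following Python sketch does that. The marker strings are an assumption taken from the sample log in Figure 5.187 (OpenSSH 5.8 client) and may differ for other client versions; the names STAGE_MARKERS and last_stage_reached are hypothetical.

```python
# (step number, description, marker text as seen in Figure 5.187).
# Assumption: these marker strings match the OpenSSH client in use.
STAGE_MARKERS = [
    (2, "TCP session setup", "debug1: Connecting to"),
    (3, "SSH server banner", "debug1: Remote protocol version"),
    (4, "host key exchange", "debug1: Server host key"),
    (5, "DNS lookup of the client by the server", "debug1: SSH2_MSG_NEWKEYS received"),
    (6, "login banner", "debug1: Authentications that can continue"),
    (7, "password authentication", "debug1: Next authentication method: password"),
    (8, "first output of the login shell", "Last login:"),
]


def last_stage_reached(verbose_log):
    """Return (step number, description) of the last step seen in the log."""
    reached = (1, "SSH client started")
    for number, description, marker in STAGE_MARKERS:
        if marker in verbose_log:
            reached = (number, description)
    return reached


stalled = (
    "debug1: Connecting to 10.0.1.5 [10.0.1.5] port 22.\n"
    "debug1: Connection established.\n"
    "debug1: Remote protocol version 1.99, remote software version OpenSSH_5.2\n"
)
print(last_stage_reached(stalled))
```

The step returned is the last one that started, so the problem lies in that step or in the transition to the next one.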
If the TCP session times out, then there is either a routing issue or a firewall issue, or the Steelhead appliance is turned off.
If the SSH client comes back with "ssh: connect to host 10.0.1.5 port 22: Connection refused", then there is no SSH server running on that IP address.
The easiest way to check if the SSH server is running is to use telnet to set up a TCP session to the primary interface on port 22:
Figure 5.188. Telnet session to the SSH service
[~] edwin@t43>telnet 10.0.1.5 22
Trying 10.0.1.5...
Connected to 10.0.1.5.
Escape character is '^]'.
SSH-1.99-OpenSSH_5.2
If the TCP session is set up and there is an SSH banner, then the SSH service is working. If the TCP session is set up but the SSH banner is not displayed, then the kernel on the Steelhead appliance is still working but the user-land is having problems.
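The same distinction between a timeout, a refused connection, a missing banner and a working SSH service can be made without telnet. The following Python sketch opens a TCP session to port 22 and waits for the banner. The function name probe_ssh and its result strings are hypothetical, and the five-second timeout is an arbitrary choice.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Open a TCP session to the SSH port and classify what happens."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"        # no SSH server listening on that address
    except (socket.timeout, TimeoutError):
        return "timeout"        # routing issue, firewall issue or appliance off
    except OSError:
        return "unreachable"    # for example: no route to host
    with sock:
        sock.settimeout(timeout)
        try:
            banner = sock.recv(256)
        except OSError:
            # TCP session is up but nothing was sent within the timeout.
            return "no-banner"
    if banner.startswith(b"SSH-"):
        return "ok"             # SSH service is answering
    return "no-banner"          # kernel answers, user-land does not
```

For the appliance in the examples, probe_ssh("10.0.1.5") would be expected to return "ok" when the banner of Figure 5.188 is received.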
When a Steelhead appliance gets replaced in the network, the IP address or hostname will stay the same but the SSH host key will be different. OpenSSH will give a warning like this:
Figure 5.189. Setup of an SSH session to the CLI of a replaced Steelhead appliance
$ ssh admin@10.0.1.5
Riverbed Steelhead
Password:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
e:f3:6b:8c:6f:a0:e7:4d:01:1f:9c:cf:91:ad:14:e7.
Please contact your system administrator.
Add correct host key in /usr/home/edwin/.ssh/known_hosts to get rid of this message.
Offending RSA key in /usr/home/edwin/.ssh/known_hosts:50
RSA host key for 10.0.1.5 has changed and you have requested strict checking.
Host key verification failed.
If this happens, confirm that the device was replaced.
To overcome this issue with the OpenSSH SSH client, remove the offending line from the known_hosts file named in the warning (line 50 of /usr/home/edwin/.ssh/known_hosts in this example). With the PuTTY SSH client, a dialog box will be shown with the option to replace the key.
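The removal of the offending line can be sketched in Python as follows. This is a minimal sketch that assumes the warning text has the format shown in Figure 5.189; the function names are hypothetical. Only use it after the replacement of the device has been confirmed.

```python
import re

# Matches e.g. "Offending RSA key in /usr/home/edwin/.ssh/known_hosts:50"
OFFENDING = re.compile(r"^Offending (\S+) key in (.+):(\d+)$", re.MULTILINE)


def find_offending_entry(warning):
    """Return (path, line number) from the OpenSSH warning, or None."""
    match = OFFENDING.search(warning)
    if match is None:
        return None
    return match.group(2), int(match.group(3))


def remove_line(text, line_number):
    """Return text without the given 1-based line."""
    lines = text.splitlines(keepends=True)
    if not 1 <= line_number <= len(lines):
        raise ValueError("line number outside of file")
    del lines[line_number - 1]
    return "".join(lines)


warning = (
    "Offending RSA key in /usr/home/edwin/.ssh/known_hosts:50\n"
    "RSA host key for 10.0.1.5 has changed and you have requested strict checking.\n"
)
print(find_offending_entry(warning))
```

The OpenSSH command ssh-keygen -R 10.0.1.5 removes all keys stored for that host from the known_hosts file and is usually the simpler route when it is available.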
If the SSH client has a delay of about 60 seconds before displaying the password prompt, then the issue is most likely related to reverse DNS lookup failure: Confirm that the DNS servers configured on the Steelhead appliance are correct and reachable.
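A quick way to see whether reverse DNS for the client IP address is slow or failing is to time the lookup. The Python sketch below is a rough check only: it uses the resolver of the machine it runs on, so it is representative only if that machine uses the same DNS servers as the Steelhead appliance. The function name timed_reverse_lookup and its resolver parameter are hypothetical.

```python
import socket
import time


def timed_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds spent) for a reverse DNS lookup."""
    start = time.monotonic()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        hostname = None
    return hostname, time.monotonic() - start
```

Calling timed_reverse_lookup("10.0.1.1") with the address of the SSH client gives the name found, if any, and how long the lookup took. A lookup that takes tens of seconds or returns no name points at the DNS configuration.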
If, after the authentication, the login shell is not started then there is most likely a disk failure on the Steelhead appliance.
It can happen that the SSH client asks for the username and password, but doesn't get to the CLI. This has been experienced when an appliance has a high system load, for example because of the generation of a lot of process dumps. The way out of that is to reboot the appliance in single user mode and remove all process dumps from /var/opt/tms/snapshots/.staging/.
Next steps: Please contact the Riverbed TAC for assistance. Access to the serial console is required, as well as permission to reboot the Steelhead appliance.
During the login to the CLI, the login shell cli opens a communication channel to the process mgmtd. If this channel cannot be made, the prompt will show CLI> and only accept a certain subset of commands. Once the communication channel to the mgmtd process is available, the shell will connect to it and all commands will be available again.
If the process mgmtd does not come back, the next step would be to boot the appliance in single user mode and investigate why the process mgmtd didn't start.