5.13. Data store related errors

The data store is the database of references to the data segments the Steelhead appliance has learned. There are two possible classes of issues with regard to the data store:

5.13.1. Data store identification issues

The data store ID is a unique identifier for the data store on a Steelhead appliance. There is only one legitimate reason for the data store IDs on two Steelhead appliances to be the same, and that is because they are part of a data store synchronization cluster: the backup node in the cluster assumes the data store ID of the master node.
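
For reference, such a cluster is set up by pointing the two appliances at each other and marking one of them as the master. The following is a minimal sketch of that configuration on the RiOS CLI, with SSH1 as the master (192.168.1.6) and SSH2 as the backup (192.168.1.7); the IP addresses are only examples and the exact datastore sync commands and prompts may differ between RiOS versions:

SSH1 (config) # datastore sync master
SSH1 (config) # datastore sync peer-ip 192.168.1.7
SSH1 (config) # datastore sync enable

SSH2 (config) # datastore sync peer-ip 192.168.1.6
SSH2 (config) # datastore sync enable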

Once a Steelhead appliance is taken out of a data store synchronization cluster, its data store should be cleared before it is redeployed. If this is not done and the two Steelhead appliances later detect each other in the field, they will complain about the duplicate data store ID.

Figure 5.61. Duplicate data store ID warnings from a former data store synchronization cluster node

SH sport[123] [splice/oob.NOTICE] 1 {- -} Got a iochannel for an existing oob to peer: 192 \
    .168.1.6
SH sport[123] [splice/oob.ALERT] 1 {- -} ALARM Peer sport id (478552) is the same as mine! \
     Closing connection to peer with store id (478552) remote address (192.168.1.6:7800). 
SH sport[123] [connect_pool.NOTICE] - {- -} Destroying pool to peer: 192.168.1.6

If this happens, remove the obsolete data store synchronization configuration from the Steelhead appliance and restart the optimization service, either in the GUI with the option Clear the data store or with the CLI command restart clean.
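
On the CLI this comes down to something like the following; a minimal sketch, assuming the backup node from the earlier example, with the exact commands possibly differing between RiOS versions:

SSH2 (config) # no datastore sync enable
SSH2 (config) # restart clean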

Sometimes a data store synchronization cluster is deployed as a serial cluster, and it is possible that the first Steelhead appliance in the cluster tries to peer with the second Steelhead appliance in the cluster. This will show up in the logs as:

Figure 5.62. Duplicate data store ID warning in a data store synchronization cluster

SH sport[2967]: [sport/config_info.ALERT] - {- -} ALARM Peer (10.0.1.5) store id (685523)  \
    is the same as mine! Check for replicated datastore steelheads used as a peer.

If this happens, add peering rules on the two Steelhead appliances to prevent auto-discovery between the two of them:

Figure 5.63. Prevent optimization from the data store synchronization peer 192.168.1.7

SSH1 # show in-path peering rules
Rule Type C Source             Destination        Port   Peer            SSL-Cap
---- ---- - ------------------ ------------------ ------ --------------- -------
1    pass A all-ip             all-ip             all    192.168.1.7     no-chk
      desc: Prevent optimization from SSH2 (inpath0_0)
2    pass A all-ip             all-ip             all    all-ipv4        in-cap
      desc: Default rule to passthrough connections destined to currently bypassed SSL cli \
    ent-server pairs
3    auto A all-ip             all-ip             443    all-ipv4        cap
      desc: Default rule to auto-discover and attempt to optimize connections destined to  \
    port 443 as SSL
def  auto A all-ip             all-ip             all    all-ipv4        no-chk
---- ---- - ------------------ ------------------ ------ --------------- -------
3 user added rule(s)
(C) Cloud-accel mode:            A=Auto
                                 P=Passthru
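
The pass-through rule at position 1 in this list can be created with something like the following; a minimal sketch, assuming the in-path IP addresses 192.168.1.6 for SSH1 and 192.168.1.7 for SSH2, with the exact rule options possibly differing between RiOS versions:

SSH1 (config) # in-path peering rule pass peer 192.168.1.7 rulenum 1
SSH2 (config) # in-path peering rule pass peer 192.168.1.6 rulenum 1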

5.13.1.1. Duplicate data store IDs on Steelhead Mobile Clients

Data store IDs on Steelhead Mobile Clients are generated by the PCs themselves when the Steelhead Mobile Client software is started for the first time. Before Steelhead Mobile version 3.1.3, the source for this data store ID was the Security Identifier (SID) provided by the Windows operating system.

Although the Security Identifier is supposed to be unique, in practice it is not always. For the normal operation of the Windows operating system and its various integrations with Active Directory this is not an issue, but for applications which use it as a source of uniqueness for the machine it is a problem.

Duplicate data store IDs on Steelhead Mobile Clients are caused by the deployment of new computers by cloning them from a master computer: the clones of this master machine all have the same Security Identifier, which causes the same data store ID to be generated on their Steelhead Mobile Clients.
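
To check whether two machines indeed ended up with the same Security Identifier, the Sysinternals utility PsGetSid can be run on each of them and the output compared; a minimal sketch, where CLONE1 is a hypothetical hostname and the SID value is truncated:

C:\> psgetsid
SID for \\CLONE1:
S-1-5-21-...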

Figure 5.64. Detection of duplicate labels

SH sport[123] [segpage.ERR] - {- -} Duplicate DEF {}25 hash 3792051065887642647 vs. 420433 \
    /96599509:1411#0{}115 hash 7975538566276211596, memcmp() -1
SH sport[123] [defunpacker.ALERT] - {- -} ALARM (clnt: 10.0.1.1:60663 serv: 192.168.1.1:53 \
     cfe: 10.0.1.1:2770 sfe: 192.168.1.5:7810) name maps to more than one segment has a st \
    eelhead been improperly installed and/or steelhead mobiles are sharing config files po \
    ssibly through copying or cloning?

In this case the clnt and the cfe IP addresses are the same, which indicates that this is a Steelhead Mobile Client.

To overcome this issue, the following steps need to be taken:

  • After the installation of the Steelhead Mobile Client on the master machine, remove the configuration file C:\Documents and Settings\All Users\Application Data\Riverbed\Steelhead_Mobile\config\sport.xml.

  • After the cloning of the machine, run the program newsid.exe on the newly cloned machine. This will cause it to generate a new Security Identifier.

  • Stop the Steelhead Mobile Client, issue the command rbtdebug --new-sid in the directory C:\Program Files\Riverbed\Steelhead Mobile and restart the Steelhead Mobile Client.

  • Restart the Steelhead Mobile Client with the command rbtsport -C in the directory C:\Program Files\Riverbed\Steelhead Mobile.

Now the Steelhead Mobile Client is ready to be used.
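
As a minimal sketch, with the Steelhead Mobile Client stopped as described above and assuming the default installation directory, the last two steps come down to the following commands in a Windows command prompt:

cd "C:\Program Files\Riverbed\Steelhead Mobile"
rbtdebug --new-sid
rbtsport -C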

Note that this issue is resolved in Steelhead Mobile Client versions 3.1.3, 3.2 and later, where the data store ID is determined by a different method than the Security Identifier.

5.13.2. Data store contents issues

If there are multiple Steelhead appliances in the network with the same Data store ID, data referencing will be hampered:

  • The Steelhead appliance can have a reference with a label which the peer Steelhead appliance doesn't know about. The peer Steelhead appliance will request to have the definition sent again and no error will be raised.

  • The Steelhead appliance can generate a new reference with a label which the peer Steelhead appliance already has. Both Steelhead appliances will destroy the reference and raise an error.

  • The Steelhead appliance can have a reference with a label which the peer Steelhead appliance has, but with a different checksum. Both Steelhead appliances will destroy the reference and an error will be raised.

In the log files this will show up as:

Figure 5.65. Detection of duplicate labels

SH sport[123] [segpage.ERR] - {- -} Duplicate DEF {}25 hash 3792051065887642647 vs. 420433 \
    /96599509:1411#0{}115 hash 7975538566276211596, memcmp() -1
SH sport[123] [defunpacker.ALERT] - {- -} ALARM (clnt: 10.0.1.1:60663 serv: 192.168.1.1:53 \
     cfe: 10.0.1.5:2770 sfe: 192.168.1.5:7810) name maps to more than one segment has a st \
    eelhead been improperly installed and/or steelhead mobiles are sharing config files po \
    ssibly through copying or cloning?

In this case the issue was happening between the Steelhead appliances with the in-path IP addresses 10.0.1.5 (the cfe) and 192.168.1.5 (the sfe). The TCP session being optimized was between the IP addresses 10.0.1.1 and 192.168.1.1.

5.13.2.1. False alerts about duplicate labels

When the data store on a Steelhead appliance gets cleared, the contents are only forgotten on that Steelhead appliance; the other devices in the network still hold the references. In due time these will be removed by the garbage collection methods in the optimization service, but until then they are still there. If the Steelhead appliance with the empty data store generates a new label which still exists on another Steelhead appliance, the checksum of the new reference will most likely not match the checksum of the old reference. The Steelhead appliance which receives the new definition will detect the duplicate definition and complain about it with the same kind of logs as shown above.

Keeping track of when data stores are cleared makes it easier to recognize these false positives.

To clear this alarm, use the command service error reset:

Figure 5.66. Use of the command "service error reset"

SH # service error reset
Please note that it may take a few seconds for the alarm to reset.