5.12. Data Store Synchronization

Data Store Synchronization is a technique that lets two Steelhead appliances synchronize their data stores so that both hold the same references. It is not specific to serial clusters or high-availability clusters; it should also be used for parallel clusters and Interceptor / WCCP clusters.

Figure 5.59. Data Store Synchronization can be used in various designs

 .----------.               .----------.          .----------.          
 |  Router  |               |  Router  |          |  Router  |
 '----------'               '----------'          '----------'
      |                        |    |                 |
      |                   .----'    '---.         .--------.
      |                   |             |         |   IC   |
 .----------.             |             |         '--------'
 |   SH 1   |---.   .-----------.  .-----------.      |       .--------.
 '----------'   |   |    SH 1   |  |   SH 2    |      |   .---|  SH 1  |-.
      |         |   '-----------'  '-----------'      |   |   '--------' |
      |         |         |   |      |  |             |   |   .--------. |
 .----------.   |         |   '------'  |             |   |---|  SH 2  |-'
 |   SH 2   |---'         |             |             |   |   '--------'
 '----------'             |             |             |   |
      |              .----------.  .----------.  .----------.
      |              |  Switch  |  |  Switch  |  |  Switch  |
 .----------.        '----------'  '----------'  '----------'
 |  Switch  |
 '----------'
Serial deployment       Parallel deployment      Virtual in-path deployment

This has two advantages:

When a Data Store Synchronization cluster is configured, one of the Steelhead appliances is made the master and the other one the slave. This has nothing to do with the direction of the data flow; it determines which data store ID is used in the replicated data store.

When one of the Steelhead appliances fails, whether it is the master or the slave, the replacement Steelhead appliance must always be configured as the slave. The Steelhead appliance that was not replaced always becomes the master.
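
The replacement rule above can be written down as a small Python sketch. This is only an illustration of the rule, not appliance code: the function name roles_after_replacement and the appliance names are hypothetical.

```python
def roles_after_replacement(old_roles, failed, replacement):
    """Return the roles to configure after one appliance of a pair is replaced.

    old_roles maps appliance name -> "master" or "slave" (hypothetical names).
    The surviving appliance always becomes the master and the replacement
    always becomes the slave, whatever the roles were before the failure.
    """
    if failed not in old_roles:
        raise ValueError(f"{failed} is not part of the cluster")
    survivors = [name for name in old_roles if name != failed]
    if len(survivors) != 1:
        raise ValueError("a synchronization cluster consists of two appliances")
    return {survivors[0]: "master", replacement: "slave"}


# The old master fails: the former slave is promoted, the new unit is the slave.
print(roles_after_replacement({"sh1": "master", "sh2": "slave"}, "sh1", "sh3"))
```

Note that the previous role of the surviving appliance does not matter: a former slave is promoted to master just as a former master keeps its role.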

When a Data Store Synchronization cluster is unconfigured or redeployed in the network, both Steelhead appliances need to have their data store synchronization feature disabled and their data stores cleared (via the command restart clean on the CLI). If this doesn't happen, both Steelhead appliances will raise Service Alarm errors, because each will see another Steelhead appliance in the network with the same data store ID, a situation which should not happen.

The Steelhead appliances in a Data Store Synchronization cluster should be the same hardware model and run the same RiOS version. If there is a mismatch, the synchronization service will raise an alarm and the status of the Steelhead appliance will become degraded.
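
Before pairing two appliances, the requirement above can be checked from an inventory list. The following Python sketch assumes the model and RiOS version of each appliance have already been collected by hand; the Appliance class, the sync_mismatches function and the sample values are hypothetical and only illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    # Hypothetical inventory record; the values are filled in by hand.
    name: str
    model: str
    rios_version: str


def sync_mismatches(a, b):
    """List the reasons why two appliances should not be synchronized."""
    problems = []
    if a.model != b.model:
        problems.append(f"model mismatch: {a.model} vs {b.model}")
    if a.rios_version != b.rios_version:
        problems.append(f"RiOS version mismatch: {a.rios_version} vs {b.rios_version}")
    return problems


sh1 = Appliance("sh1", "5050", "6.5.2")
sh2 = Appliance("sh2", "5050", "6.5.1")
for problem in sync_mismatches(sh1, sh2):
    print(problem)
```

An empty list means that both appliances meet the same-hardware, same-version requirement.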

When a Steelhead appliance in a Data Store Synchronization cluster comes back after a reboot or power-down, the references learned while it was away are replicated to it from the other Steelhead appliance. In the log files this can be seen as:

Figure 5.60. Data Store Synchronization catching up after a restart

SH sport[123] [replicate/client.INFO] - {- -} Connected from: 40282 to: 7744 
SH sport[123] [replicate/client.NOTICE] - {- -} Client Connected to 10.23.18.212:7744 tota \
    l_pages: 85258240 pages_in_use: 85258238 
SH sport[123] [replicate/client.INFO] - {- -} Recvd header info with version: 3 store_id_: \
     482671 
SH sport[123] [replicate/storesync.INFO] - {- -}  current store_id: 482671 remote store_id \
    : 482671 
SH sport[123] [replicate/storesync.NOTICE] - {- -} Running with storeid 482671 
SH sport[123] [replicate/client.NOTICE] - {- -} Requesting keepup start 
SH sport[123] [replicate/client.NOTICE] - {- -} Requesting catchup start npages: 85258240 
SH sport[123] [replicate/client.INFO] - {- -} [ping] Scheduling ping every 60 seconds.  
SH sport[123] [replicate/sync_server.NOTICE] - {- -} Came to end of store.  Will end catch \
    up next time client request indexes. 
SH sport[123] [replicate/sync_server.NOTICE] - {- -} requesting end to catchup cur_page_:  \
    85258240 npages_: 85258240 
SH sport[123] [replicate/sync_server.NOTICE] - {- -} Done with catchup  Sent: 24573 pages  \
     
SH sport[123] [replicate/client.NOTICE] - {- -} client catchup is complete. catchup pages_ \
    written: 31720
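
When checking many log files for a completed catch-up, the relevant lines can be picked out with a script. The Python sketch below is not a Riverbed tool. It assumes the log text looks like the excerpt in Figure 5.60, including the line wrapping used there (a space and a backslash at the end of the line, four spaces of indent on the continuation line). The function names unwrap and catchup_summary are hypothetical.

```python
import re


def unwrap(lines, indent=4):
    """Rejoin lines wrapped as in the figure (' \\' + indented continuation)."""
    records, pending = [], None
    for raw in lines:
        line = raw.rstrip()
        if pending is not None:
            # Continuation line: drop the wrap indent and glue it back on.
            rest = line[indent:] if line.startswith(" " * indent) else line.lstrip()
            line, pending = pending + rest, None
        if line.endswith("\\"):
            # Drop the backslash and the single space printed before it.
            body = line[:-1]
            pending = body[:-1] if body.endswith(" ") else body
        else:
            records.append(line)
    if pending is not None:
        records.append(pending)
    return records


def catchup_summary(lines):
    """Extract the store IDs and the catch-up result from replicate log lines."""
    text = "\n".join(unwrap(lines))
    ids = re.search(r"current store_id: (\d+) remote store_id: (\d+)", text)
    done = re.search(r"client catchup is complete\. catchup pages_written: (\d+)", text)
    return {
        "store_ids_match": bool(ids) and ids.group(1) == ids.group(2),
        "complete": done is not None,
        "pages_written": int(done.group(1)) if done else None,
    }


# A few lines taken from the excerpt above, wrapped the same way.
SAMPLE = [
    "SH sport[123] [replicate/storesync.INFO] - {- -}  current store_id: 482671 remote store_id \\",
    "    : 482671 ",
    "SH sport[123] [replicate/client.NOTICE] - {- -} Requesting keepup start ",
    "SH sport[123] [replicate/client.NOTICE] - {- -} client catchup is complete. catchup pages_ \\",
    "    written: 31720",
]
print(catchup_summary(SAMPLE))
```

If the log files are not wrapped, unwrap returns the lines unchanged apart from stripped trailing whitespace, so the same function can be used on raw system logs.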