Restoring a Data Grid

To restore a data grid, the following entities must be restored:
  • The primary realm service’s database
  • The grid configuration in the realm service
  • The state keepers
  • The nodes of each copyset

Procedure

  1. Determine the ActiveSpaces checkpoint to use for restoring the data grid.
  2. Determine the realm service database backup associated with the checkpoint.
  3. Stop all data grid processes, for example, clients, proxies, nodes, state keepers, and tibdgadmind.
  4. Stop all realm services, for example, the primary, backup, and satellite servers.
  5. Copy each node's checkpoints directory to a safe place.
  6. Restore the primary realm service's database from the backup associated with the checkpoint.
  7. Restart any other realm services.
  8. Restore the following, in this order:
    1. Restore the grid configuration in the primary realm service from the ActiveSpaces checkpoint. This ensures that the grid configuration is consistent, because the realm service database is backed up outside of the checkpoint process. Remember that operations such as adding or deleting a table are not synchronized between the database backup and the checkpoint.
    2. Restore the state keepers from the checkpoint. Ensure all state keepers are running.
    3. Restore each node from its respective checkpoint data directory. Ensure all nodes are running.
  9. Restart the following items:
    1. Restart any remaining data grid processes, such as tibdgadmind or proxies.
    2. Restart ActiveSpaces clients.
    If you restore to a checkpoint that is not the latest one, the files for any checkpoints taken after the restored checkpoint are removed from each node's checkpoints subdirectory. Therefore, save each node's checkpoints subdirectory before the restore, in case you later decide to restore from a more recent checkpoint.
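
Step 5 of the procedure (saving each node's checkpoints directory before the restore) can be scripted. The following Python sketch is one possible approach, not part of ActiveSpaces: the function name backup_checkpoints, the node names, and all paths are hypothetical, and the caller is assumed to know where each node's checkpoints directory is located. It uses only the Python standard library and refuses to overwrite an existing backup.

```python
import shutil
import time
from pathlib import Path


def backup_checkpoints(checkpoint_dirs, backup_root, stamp=None):
    """Copy each node's checkpoints directory to backup_root/<stamp>/<node name>.

    checkpoint_dirs maps a node name to the path of that node's checkpoints
    directory. Both are supplied by the caller; this sketch assumes nothing
    about where ActiveSpaces places them.
    """
    stamp = stamp or time.strftime("%Y%m%dT%H%M%S")
    dest_root = Path(backup_root) / stamp

    # Validate every source first so that nothing is copied if one is missing.
    for node_name, src in checkpoint_dirs.items():
        if not Path(src).is_dir():
            raise FileNotFoundError(
                f"checkpoints directory for node {node_name} not found: {src}"
            )

    copied = {}
    for node_name, src in checkpoint_dirs.items():
        dest = dest_root / node_name
        # copytree raises FileExistsError if dest already exists,
        # so an earlier backup is never silently overwritten.
        shutil.copytree(src, dest)
        copied[node_name] = dest
    return copied
```

Run it only after all data grid processes are stopped (steps 3 and 4), so that the directories are not changing while they are copied. A call might look like backup_checkpoints({"s1.n1": "/data/s1.n1/checkpoints"}, "/backup/grid"), where the node name and both paths are placeholders for your own environment.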