Error Recovery Policy
During startup the EMS server can encounter a number of errors while it recovers information from the stores.
Potential errors include:
- Low-level file errors. For example, corrupted disk records.
- Low-level object-specific errors. For example, a record that is missing an expected field.
- Inter-object errors. For example, a session record with no corresponding connection record.
When the EMS server encounters one of these errors during startup, the recovery policy is:
- By default, the server aborts startup when it detects a corrupt disk record. Because the state cannot be safely restored, the server cannot proceed with the rest of the recovery. You can then examine your configuration settings for errors and, if necessary, copy the store and configuration files for examination by TIBCO Support.
- You can direct the server to delete bad records by including the -forceStart command-line option. Deleting the bad records prevents corruption of the server runtime state.
- The server exits if it runs out of memory during startup.
It is important to back up all stores before restarting the server with the -forceStart option, because data is lost when the problematic records are deleted. To back up file-based stores, create a copy of the store files. For grid stores and FTL stores, back up the associated ActiveSpaces or FTL deployment. Refer to the TIBCO ActiveSpaces Administration and TIBCO FTL Administration product guides for instructions on creating backups for these products.
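As an illustration of the file-based case, the following Python sketch copies store files into a timestamped backup directory. It is a minimal sketch, not a TIBCO-provided tool: the function name backup_store_files, the "*.db" file pattern, and the example paths are assumptions, so substitute the store file locations from your own configuration. It also assumes the server is stopped, so the files do not change during the copy.

```python
import shutil
import time
from pathlib import Path


def backup_store_files(store_dir, backup_root, pattern="*.db"):
    """Copy store files matching `pattern` into a new timestamped directory.

    `pattern` is an assumption; use the file names your stores are
    actually configured with.
    """
    store_dir = Path(store_dir)
    sources = [p for p in sorted(store_dir.glob(pattern)) if p.is_file()]
    if not sources:
        raise FileNotFoundError(f"no files matching {pattern!r} in {store_dir}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"ems-stores-{stamp}"
    # Fail rather than overwrite if a backup with this timestamp exists.
    target.mkdir(parents=True, exist_ok=False)

    copied = []
    for src in sources:
        dst = target / src.name
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        copied.append(dst)
    return copied
```

A call might look like backup_store_files("/path/to/datastore", "/path/to/backups"), where both paths are placeholders. Keeping each backup in its own timestamped directory means an earlier copy is never overwritten by a later one.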
Keep in mind that different types of records are kept in the stores. The most obvious are the persistent JMS messages that your applications have sent. However, other internal records are also stored. If a consumer record that persists durable subscriber state were corrupted and then deleted with the -forceStart option, all JMS messages persisted for that durable subscription would also be lost, even though those messages are themselves valid (not corrupted), because the durable subscription itself would not be recovered.
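The dependency between consumer records and messages can be pictured with a small conceptual model. This is only a sketch of the reasoning above, not how the EMS server stores or recovers data: the function surviving_messages and its data shapes are hypothetical, and it assumes for simplicity that each message is held for exactly one durable subscription.

```python
def surviving_messages(consumers, messages):
    """Return the ids of messages that are still deliverable after recovery.

    consumers: dict mapping durable name -> True if its record is corrupt
    messages:  list of (message_id, durable_name) pairs, all uncorrupted
    """
    # Only durables whose consumer record is intact are recovered.
    recovered = {name for name, corrupt in consumers.items() if not corrupt}
    # A valid message is lost if the durable it was held for is gone.
    return [msg_id for msg_id, durable in messages if durable in recovered]


consumers = {"orders": True, "audit": False}   # "orders" record is corrupt
messages = [(1, "orders"), (2, "orders"), (3, "audit")]
print(surviving_messages(consumers, messages))  # [3]
```

In this model, messages 1 and 2 are intact on disk but are still lost, because the single corrupt "orders" consumer record was deleted.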
When running with the -forceStart option, the server still reports any errors found during recovery, but it deletes the problematic records and recovery proceeds. The server may report more issues in this mode than it does without -forceStart, because without the option it stops at the very first error.
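The difference between the two behaviors can be modeled in a few lines of Python. This is a conceptual sketch of the policy described here, not the server's implementation: Record, RecoveryAborted, and recover are hypothetical names, and a single corrupt flag stands in for every kind of recovery error.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: int
    corrupt: bool = False


class RecoveryAborted(Exception):
    """Raised when recovery stops at the first bad record (default policy)."""


def recover(records, force_start=False):
    """Return (recovered_records, reported_errors) under the chosen policy."""
    recovered, errors = [], []
    for rec in records:
        if not rec.corrupt:
            recovered.append(rec)
            continue
        message = f"record {rec.record_id} is corrupt"
        if not force_start:
            # Default policy: stop at the very first error.
            raise RecoveryAborted(message)
        # Forced start: report the error, drop the record, keep going.
        errors.append(message)
    return recovered, errors


store = [Record(1), Record(2, corrupt=True), Record(3), Record(4, corrupt=True)]
recovered, errors = recover(store, force_start=True)
print(len(recovered), len(errors))  # 2 2
```

With force_start=False the same input raises RecoveryAborted at record 2, so the problem with record 4 is never seen. That is why a forced start can report more issues than a default start.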