Data Redistribution

Data redistribution runs in the background and does not block ongoing operations while data is being transferred.

When the sending copyset completes its data migration, it briefly pauses live operations while ownership is assigned to the new copyset. During this interval, transactions that were started during the migration might fail, as might iterator creation and query execution. If a row moves during data redistribution, a transacted read of that row can become invalid, causing the transaction to fail at commit. Because of this possibility, an application must not act on a transacted read until it learns that the transaction commit has returned successfully. After ownership is reassigned, there is a period during which the other processes in the data grid (nodes and proxies) learn about the new owner of the migrated data. Operations issued during that window can therefore return a timeout error at the client.
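The safe client pattern described above is to treat commit failures and timeouts during redistribution as transient and retry, acting on transacted reads only after a successful commit. The sketch below illustrates that pattern; the session and error classes are stand-ins invented for this example, not the actual ActiveSpaces API.

```python
import time

class TransientGridError(Exception):
    """Stands in for a timeout or ownership error raised while the
    grid processes learn about the new copyset configuration."""

class FakeSession:
    """Hypothetical stub: fails a fixed number of times, then commits."""
    def __init__(self, failures_before_success):
        self._remaining_failures = failures_before_success

    def commit_transaction(self):
        if self._remaining_failures > 0:
            self._remaining_failures -= 1
            raise TransientGridError("ownership still propagating")
        return "committed"

def commit_with_retry(session, attempts=5, backoff_s=0.01):
    """Retry a commit that may fail transiently during redistribution.

    The application should only act on its transacted reads after this
    function returns successfully.
    """
    for attempt in range(attempts):
        try:
            return session.commit_transaction()
        except TransientGridError:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
```

For example, a session that fails twice before the new configuration has propagated still commits on the third attempt: `commit_with_retry(FakeSession(2))` returns `"committed"`.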

Statements and table listeners created before the data redistribution are out of date once the redistribution completes, and the client receives an invalid resource error when using them. Such an object must be destroyed in the client application and re-created.
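One way to handle this is to catch the invalid resource error, destroy the stale handle, and re-create it from the same definition. The sketch below assumes hypothetical statement and error classes for illustration; they are not the actual ActiveSpaces API.

```python
class InvalidResourceError(Exception):
    """Stands in for the invalid resource error returned for handles
    created before a data redistribution."""

class FakeStatement:
    """Hypothetical stub: a statement that may predate a redistribution."""
    def __init__(self, valid=True):
        self.valid = valid

    def execute(self):
        if not self.valid:
            raise InvalidResourceError("statement predates redistribution")
        return "rows"

def execute_with_recreate(stmt, recreate):
    """Run a statement; on an invalid resource error, re-create and retry.

    `recreate` is a callable that builds a fresh statement from the same
    definition (e.g. the same SQL text). Returns the result and whichever
    statement handle is now current.
    """
    try:
        return stmt.execute(), stmt
    except InvalidResourceError:
        fresh = recreate()  # old handle is discarded; build a new one
        return fresh.execute(), fresh
```

The caller keeps the returned handle for subsequent executions, so a stale statement is replaced at most once per redistribution.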

An existing copyset that sends data to a new copyset retains its data on disk until the data redistribution process is complete. The rows previously owned by the copyset are deleted as a background operation.

Note: For capacity planning purposes, be aware that the portion of data contributed by a copyset can exist on both the old and the new copyset at the moment the redistribution completes. For example, in a one-to-two copyset redistribution scenario where the one existing copyset contains 100 GB of data and contributes 50 GB to a new copyset, total aggregate disk usage can temporarily reach 150 GB (this does not account for any additional disk usage by background activities such as compaction).
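The worst-case arithmetic in the note above can be expressed as a simple planning helper (illustrative only; compaction and other background overhead are excluded, as in the note):

```python
def peak_disk_usage_gb(existing_gb, contributed_gb):
    """Worst-case aggregate disk usage during redistribution: the
    contributed rows briefly exist on both the old and new copysets.
    Background activities such as compaction are not included."""
    return existing_gb + contributed_gb

# Example from the note: a 100 GB copyset contributing 50 GB to a new
# copyset peaks at 150 GB of aggregate usage until the background
# delete of the old rows completes.
print(peak_disk_usage_gb(100, 50))  # -> 150
```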

Caution:

The following are important considerations to manage copysets and optimize your ActiveSpaces deployments:

  • The first copyset defined in the grid is responsible for global transaction coordination, and cannot be removed.

  • Carefully consider how the load on the tibdgnode processes changes when data is redistributed. For example, when moving from 10 copysets to five, every tibdgnode process carries approximately twice the load it had before the redistribution.

  • Checkpoint data is not redistributed. If checkpoints are in use, a copyset might still service checkpoint requests even after all of its other data has been redistributed, so removing a copyset might impact checkpoint availability. (Use the tibdg checkpoint list command to identify checkpoints that are no longer available due to copyset removal.)

  • Ensure that data redistribution is complete before running the tibdg copyset remove command. If ActiveSpaces is in the process of redistributing data, it does not allow you to remove a copyset.

  • Monitor the redistribution process using the tibdg status command.

  • Exercise caution when removing copysets, as this action permanently alters the grid configuration.