There are two mechanisms to define partitions:
Programmatically using an API
Administration tools
Programmatic definition of partitions is covered in this section. See the ActiveSpaces® Transactions Administration Guide for details on defining partitions using the administration tools.
Note: Mixing partition management via the API and administration commands must be done carefully to ensure that an operator using the administrative commands does not see unexpected results.
Partitions can be dynamically defined as needed. A partition definition consists of:
a cluster-wide unique name
optional partition properties
initial active node
optional list of replica nodes ordered by priority
The optional replica node list defines the replica nodes for a partition, in priority order. If the current active node for a partition fails, the first replica node becomes the active node. Each replica node definition consists of (a minimal sketch follows this list):
node name
whether to use synchronous or asynchronous replication
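As an illustration, the following fragment (meant to run inside a transaction, as in Example 7.2 below) builds a two-replica list in priority order. The node names are illustrative, and the ASYNCHRONOUS constant is an assumption based on the synchronous/asynchronous choice described above:

// Replica nodes in priority order: if the active node fails, node "B"
// becomes the active node; if "B" is also unavailable, "C" takes over.
// SYNCHRONOUS is used by the snippets in this chapter; ASYNCHRONOUS is
// assumed from the synchronous/asynchronous replication choice above.
ReplicaNode[] replicas = new ReplicaNode[]
{
    new ReplicaNode("B", SYNCHRONOUS),   // highest priority replica
    new ReplicaNode("C", ASYNCHRONOUS)   // lower priority replica
};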
Once a partition is defined, it cannot be removed.
Partitions should be defined and enabled on all nodes in the cluster that need knowledge of the partition.
The supported partition properties are listed below; a sketch showing how they are passed to the define call follows the list:
broadcast partition definition updates - controls the broadcasting of partition definition updates to all nodes in the cluster. Disabling this behavior is useful for simulating a multi-master scenario. See the section called “Simulating a multi-master scenario” for details.
force replication - used to control replication during a migration. If this value is set to true, replication will be done to all currently active replica nodes for the partition. The default value for this property causes no replication to be done to the current replica nodes during a migration. This option can be used to force resynchronization of all replica nodes for a partition. In general, this should be avoided, and a replica node should be brought offline and back online to resynchronize replica data.
number of threads - controls the number of threads that are used to perform object migration. If the number of objects in the partition is less than the value of the objects locked per transaction property, only a single thread will be used. If the value of the number of objects locked per transaction property is zero, or the calling transaction has one or more partitioned objects locked, the value of this property is ignored and the caller's thread will be used to perform the migration.
objects locked per transaction - controls the number of objects that are locked during a migration or update per transaction. Setting this property to a value greater than zero allows application work to continue concurrently while a partition is being migrated or updated. Setting this property to a value of zero causes all objects in the partition to be locked in a single transaction.
restore from node - define which node a partition should be restored from in a multi-master scenario. This property should be used if the partition is active on multiple nodes in the cluster. If this property is set, the current node must be the current active node for the partition when the partition is defined. If this value is not specified a cluster wide broadcast is used to determine which node a restore should be done from.
sparse partition audit - controls whether the node list defined for a sparse partition is audited to match the node list of the current active node when the sparse partition is enabled. Disabling this audit may be useful to resolve partition definition ordering dependencies across nodes in a cluster. For example, a sparse partition may be restored before the partition on an active node is restored following a failover. In this case, if the sparse partition audit is not disabled, the enabling of the sparse partition will fail with a node list mismatch exception.
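These properties are passed through the optional properties argument of the define call. The snippets in this guide pass null to accept the defaults; the following is a minimal fragment (run inside a transaction; the partition and node names are illustrative):

PartitionManager.definePartition(
    "Fred",   // cluster-wide unique partition name
    null,     // optional partition properties - null accepts the defaults above
    "A",      // initial active node
    new ReplicaNode[] { new ReplicaNode("B", SYNCHRONOUS) });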
Example 7.2, “Defining and enabling a partition” shows the steps required to define and enable a new partition. The steps are:
Call PartitionManager.definePartition(...) to define the partition name, the optional partition properties, the initial active node, and optionally one or more replica nodes.
Call PartitionManager.enablePartitions(...) to enable all defined partitions in the cluster, or partition.enable(...) to enable a specific partition.
The PartitionManager.definePartition(...) method is synchronous - it blocks until the active node in the partition definition is available. If the active node for the partition is not available, a com.kabira.platform.ResourceUnavailableException is thrown, for example:
Java main class sandbox.highavailability.InvalidActiveNode.main exited with an exception.
com.kabira.platform.ResourceUnavailableException: Remote node 'bogus' cannot be accessed, current state is 'Undiscovered'
    at com.kabira.platform.disteng.DEPartitionManager.definePartition(Native Method)
    at com.kabira.platform.highavailability.PartitionManager.definePartition(PartitionManager.java:810)
    at com.kabira.platform.highavailability.PartitionManager.definePartition(PartitionManager.java:444)
    at sandbox.highavailability.InvalidActiveNode$1.run(InvalidActiveNode.java:16)
    at com.kabira.platform.Transaction.execute(Transaction.java:484)
    at com.kabira.platform.Transaction.execute(Transaction.java:542)
    at sandbox.highavailability.InvalidActiveNode.main(InvalidActiveNode.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at com.kabira.platform.MainWrapper.invokeMain(MainWrapper.java:65)
The amount of time to wait for a node to become active before the com.kabira.platform.ResourceUnavailableException is thrown is controlled by the nodeActiveTimeoutSeconds configuration value. See the ActiveSpaces® Transactions Administration Guide for details.
The PartitionManager.enablePartitions(...) and partition.enable(...) methods are synchronous - they block until any required object migrations are complete.
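For example, to enable a single previously defined partition rather than all defined partitions, a fragment along these lines can be used (partition name illustrative; run inside a transaction):

// Look up a previously defined partition by name; PartitionNotFound
// is thrown if no partition with this name has been defined.
Partition fred = PartitionManager.getPartition("Fred");

// Enable only this partition in the cluster.
fred.enable(JOIN_CLUSTER);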
Note: When enabling multiple partitions using the PartitionManager.enablePartitions(...) method, …
The PartitionManager.EnableAction parameter to the enable methods controls how partitions are activated into the cluster (a brief sketch follows this list):
JOIN_CLUSTER - No object deletion or merging is done as part of activating partitions. This action is appropriate for initializing new nodes, or following hard failures that required shared memory to be removed.
JOIN_CLUSTER_PURGE - Deletes any preexisting partitioned objects on the local node for the partition(s) being enabled before doing any object migration. This option is typically used after a node was gracefully removed from service and it is now being brought back online.
JOIN_CLUSTER_RESTORE - Objects are both deleted and merged as part of activating the partitions. This option is used to restore partitions that were in a multi-master scenario. Objects are restored from the node specified in the restore from node partition property.
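As a brief sketch, selecting the enable action for the common case of a node rejoining the cluster after being gracefully removed from service (run inside a transaction):

// The node was gracefully removed from service: purge any preexisting
// local partitioned objects before current data is migrated back from
// the cluster.
PartitionManager.enablePartitions(
    PartitionManager.EnableAction.JOIN_CLUSTER_PURGE);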
When restoring from a multi-master scenario, ObjectNotUnique exceptions can occur based on the PartitionManager.EnableAction value:
JOIN_CLUSTER - An ObjectNotUnique exception can occur because shared memory is not cleared when the partition is enabled. If the restore from node has an object with the same key value, an ObjectNotUnique exception is thrown and object migration will terminate.
JOIN_CLUSTER_PURGE - An ObjectNotUnique exception cannot occur because local shared memory is cleared before object migration occurs.
JOIN_CLUSTER_RESTORE - An ObjectNotUnique exception cannot occur. The local object instance is deleted and copied from the restore from node.
Objects that exist only on a single node during a multi-master scenario are treated differently depending on the PartitionManager.EnableAction value. This inconsistency can occur if objects were created or deleted during a multi-master scenario. Assuming that an object exists on the node being restored, but not on the restore from node, the following summarizes the behavior:
JOIN_CLUSTER - The object instance on the node being restored is orphaned - it is no longer a partitioned object since the node being restored from has no knowledge of this object.
JOIN_CLUSTER_PURGE - The object on the node being restored is deleted since shared memory is cleared before the objects are migrated from the restore from node.
JOIN_CLUSTER_RESTORE - The object on the node being restored is kept during the merge with the restore from node. This object appears to have been resurrected from the perspective of the restore from node, if it was previously deleted on that node.
Partition definition and enabling should be done in a separate transaction from any object creations that will use the new partition, to minimize the risk of deadlocks. Creating objects in a partition that is not active causes a com.kabira.platform.ResourceUnavailableException, for example:
[A] com.kabira.platform.ResourceUnavailableException: partition 'Odd' is not active
[A]     at com.kabira.platform.ManagedObject.createSMObject(Native Method)
[A]     at com.kabira.snippets.highavailability.FlintStone.<init>(PartitionDefinition.java:17)
[A]     at com.kabira.snippets.highavailability.PartitionDefinition$2.run(PartitionDefinition.java:101)
[A]     at com.kabira.platform.Transaction.execute(Transaction.java:303)
[A]     at com.kabira.snippets.highavailability.PartitionDefinition.main(PartitionDefinition.java:93)
[A]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[A]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[A]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[A]     at java.lang.reflect.Method.invoke(Method.java:597)
[A]     at com.kabira.platform.MainWrapper.invokeMain(MainWrapper.java:49)
Example 7.2. Defining and enabling a partition
// $Revision: 1.1.2.2 $

package com.kabira.snippets.highavailability;

import com.kabira.platform.Transaction;
import com.kabira.platform.Transaction.Rollback;
import com.kabira.platform.annotation.Managed;
import com.kabira.platform.highavailability.PartitionManager;
import static com.kabira.platform.highavailability.PartitionManager.EnableAction.JOIN_CLUSTER;
import com.kabira.platform.highavailability.PartitionMapper;
import com.kabira.platform.highavailability.ReplicaNode;
import static com.kabira.platform.highavailability.ReplicaNode.ReplicationType.SYNCHRONOUS;

/**
 * This snippet shows how to define and enable new partitions
 * <p>
 * <h2> Target Nodes</h2>
 * <ul>
 * <li> <b>domainnode</b> = "A"
 * </ul>
 */
public class PartitionDefinition
{
    @Managed
    private static class FlintStone
    {
        FlintStone(String name)
        {
            this.name = name;
        }
        final String name;
    }

    //
    // Partition mapper that puts objects in partitions by name
    //
    private static class AssignPartitions extends PartitionMapper
    {
        @Override
        public String getPartition(Object obj)
        {
            FlintStone flintStone = (FlintStone) obj;
            return flintStone.name;
        }
    }

    /**
     * Main entry point
     *
     * @param args Not used
     */
    public static void main(String[] args)
    {
        new Transaction("Partition Definition")
        {
            @Override
            protected void run() throws Rollback
            {
                //
                // Install the partition mapper
                //
                PartitionManager.setMapper(FlintStone.class, new AssignPartitions());

                //
                // Set up the replicas
                //
                ReplicaNode[] fredReplicas = new ReplicaNode[]
                {
                    new ReplicaNode("B", SYNCHRONOUS)
                };
                ReplicaNode[] barneyReplicas = new ReplicaNode[]
                {
                    new ReplicaNode("B", SYNCHRONOUS)
                };

                //
                // Define two partitions
                //
                PartitionManager.definePartition("Fred", null, "A", fredReplicas);
                PartitionManager.definePartition("Barney", null, "C", barneyReplicas);

                //
                // Enable the partitions
                //
                PartitionManager.enablePartitions(JOIN_CLUSTER);
            }
        }.execute();

        new Transaction("Create Objects")
        {
            @Override
            protected void run() throws Rollback
            {
                //
                // Create Fred and Barney
                //
                FlintStone fred = new FlintStone("Fred");
                FlintStone barney = new FlintStone("Barney");

                //
                // Display assigned partitions
                //
                System.out.println(fred.name + " is in "
                        + PartitionManager.getObjectPartition(fred).getName());
                System.out.println(barney.name + " is in "
                        + PartitionManager.getObjectPartition(barney).getName());
            }
        }.execute();
    }
}
When Example 7.2, “Defining and enabling a partition” is run, it outputs the information in Example 7.3, “Partitioning example output”.
There are two mechanisms to migrate partitions:
Programmatically using an API
Administration tools
Programmatic migration of partitions is covered here. See the ActiveSpaces® Transactions Administration Guide for details on migrating partitions using the administration tools.
Migrating a partition involves optionally changing the partition properties, active node, or replica nodes associated with the partition. Partition migration must be done on the current active node for the partition. Depending on the changes to the partition, the active node may change, and object data may be copied to other nodes in the cluster.
Partition properties specified when a partition was defined can be overridden. This only affects the properties for the duration of the partition.migrate() execution. It does not change the default properties associated with a partition.
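A minimal fragment, adapted from Example 7.4 below (node and partition names are illustrative; this must run in a transaction on the partition's current active node):

// Migrate the partition so that node "C" becomes the active node,
// with "B" and "A" as synchronous replicas. Passing null for the
// properties argument leaves the partition's default properties
// unchanged.
Partition partition = PartitionManager.getPartition("Fred");
ReplicaNode[] newReplicas = new ReplicaNode[]
{
    new ReplicaNode("B", SYNCHRONOUS),
    new ReplicaNode("A", SYNCHRONOUS)
};
partition.migrate(null, "C", newReplicas);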
Example 7.4, “Migrating a partition” shows an example of migrating a partition.
Example 7.4. Migrating a partition
// $Revision: 1.1.2.1 $

package com.kabira.snippets.highavailability;

import com.kabira.platform.Transaction;
import com.kabira.platform.Transaction.Rollback;
import com.kabira.platform.annotation.Managed;
import com.kabira.platform.highavailability.Partition;
import com.kabira.platform.highavailability.PartitionManager;
import static com.kabira.platform.highavailability.PartitionManager.EnableAction.JOIN_CLUSTER;
import com.kabira.platform.highavailability.PartitionMapper;
import com.kabira.platform.highavailability.PartitionNotFound;
import com.kabira.platform.highavailability.ReplicaNode;
import static com.kabira.platform.highavailability.ReplicaNode.ReplicationType.SYNCHRONOUS;
import com.kabira.platform.property.Status;

/**
 * This snippet shows how to migrate a partition
 * <p>
 * <h2> Target Nodes</h2>
 * <ul>
 * <li> <b>domainnode</b> = "A"
 * </ul>
 */
public class PartitionMigration
{
    @Managed
    private static class AnObject
    {
    }

    //
    // Partition mapper that maps all objects to the snippet partition
    //
    private static class UpdateMapper extends PartitionMapper
    {
        @Override
        public String getPartition(Object obj)
        {
            return PARTITION_NAME;
        }
    }

    /**
     * Main entry point
     *
     * @param args Not used
     * @throws java.lang.InterruptedException
     */
    public static void main(String[] args) throws InterruptedException
    {
        new Transaction("Initialization")
        {
            @Override
            protected void run() throws Rollback
            {
                //
                // Install the partition mapper
                //
                PartitionManager.setMapper(AnObject.class, new UpdateMapper());

                //
                // Define the partition
                //
                ReplicaNode[] replicas = new ReplicaNode[]
                {
                    new ReplicaNode("B", SYNCHRONOUS),
                    new ReplicaNode("C", SYNCHRONOUS)
                };
                PartitionManager.definePartition(PARTITION_NAME, null, "A", replicas);

                //
                // Enable the partition
                //
                PartitionManager.enablePartitions(JOIN_CLUSTER);
            }
        }.execute();

        new Transaction("Create Object")
        {
            @Override
            protected void run() throws Rollback
            {
                m_object = new AnObject();

                //
                // Get the partition for the object
                //
                Partition partition = PartitionManager.getObjectPartition(m_object);

                //
                // Display the active node
                //
                System.out.println("Active node is " + partition.getActiveNode());
            }
        }.execute();

        new Transaction("Migrate Partition")
        {
            @Override
            protected void run() throws Rollback
            {
                //
                // Get the partition for the object
                //
                Partition partition = PartitionManager.getObjectPartition(m_object);

                //
                // Partition migration can only be done on the current
                // active node
                //
                if (partition.getActiveNode().equals(
                        System.getProperty(Status.NODE_NAME)) == false)
                {
                    return;
                }

                //
                // Migrate the partition to make node C the active node.
                //
                // Partition migration can only be done from the current
                // active node.
                //
                // The default properties specified when the partition was
                // defined are used.
                //
                ReplicaNode[] replicas = new ReplicaNode[]
                {
                    new ReplicaNode("B", SYNCHRONOUS),
                    new ReplicaNode("A", SYNCHRONOUS)
                };
                partition.migrate(null, "C", replicas);
                System.out.println("Migrated partition to node C");
            }
        }.execute();

        waitForMigration();

        new Transaction("Display Active Node")
        {
            @Override
            protected void run() throws Rollback
            {
                //
                // Get the partition for the object
                //
                Partition partition = PartitionManager.getObjectPartition(m_object);

                //
                // Display the active node again
                //
                System.out.println("Active node is " + partition.getActiveNode());
            }
        }.execute();
    }

    private static void waitForMigration() throws InterruptedException
    {
        System.out.println("Waiting for migration...");

        while (m_partitionFound == false)
        {
            Thread.sleep(5000);

            new Transaction("Wait for Migration")
            {
                @Override
                protected void run()
                {
                    try
                    {
                        Partition partition = PartitionManager.getPartition(PARTITION_NAME);

                        //
                        // Wait for partition to migrate to node C
                        //
                        m_partitionFound = partition.getActiveNode().equals("C");
                    }
                    catch (PartitionNotFound ex)
                    {
                        // not available yet
                    }
                }
            }.execute();
        }

        System.out.println("Partition migration complete.");
    }

    private static boolean m_partitionFound = false;
    private static AnObject m_object = null;
    private static final String PARTITION_NAME = "Partition Migration Snippet";
}
When Example 7.4, “Migrating a partition” is run, it outputs the information in Example 7.5, “Partition migration output” (annotation added and output reordered for clarity).
Example 7.5. Partition migration output
#
# Partition is active on node A
#
[C] Active node is A
[B] Active node is A
[A] Active node is A

#
# Nodes B and C are waiting for partition migration
#
[C] Waiting for migration...
[B] Waiting for migration...

#
# Node A migrates partition to node C
#
[A] Migrated partition to node C
[A] Waiting for migration...

#
# Migration completes
#
[C] Partition migration complete.
[B] Partition migration complete.
[A] Partition migration complete.

#
# Partition is now active on node C
#
[C] Active node is C
[B] Active node is C
[A] Active node is C