


Managing Spaces
Spaces are the main feature offered by ActiveSpaces. A space can be seen as a form of virtual shared memory, as well as a data distribution and synchronization mechanism, that many applications distributed over a network can use to store, retrieve, and consume data in a platform-independent manner.
A successful connection to a particular metaspace enables the user to define, drop, join, and leave spaces, and also get an existing space’s definition and list of members.
Defining a Space
In ActiveSpaces, a space must first be defined in the metaspace before it can be joined by applications and agents. After a space is defined, it is actually created when a member of the metaspace joins it (and becomes the first member of that space). Conversely, it is destroyed (although it remains defined) when the last member leaves it (and there are no more members of the space).
There are two ways to define a user space within a metaspace: through the Admin CLI tool, or by using API calls.
If the space definition does not exist (that is, it was not defined earlier using the Admin CLI tool, or by another application using defineSpace), the space definition is created and stored in the metaspace (more specifically, it is stored in the system spaces).
Space Definition Through the Admin CLI
To use the Admin CLI tool to define a space, you must first connect to the desired metaspace using the connect command, and then use the define space or create space command.
The following example shows the use of the define space CLI command:
define space name 'myspace' (field name 'key' type 'integer', field name 'value' type 'string') key ('key')
Space Definition Through the API
Using the SpaceDef object In the Java API, defining a space is done using the defineSpace method of the Metaspace object, which takes a SpaceDef object as its sole parameter. In the C API, you create the space definition by calling the tibasSpaceDef_Create() function and then define the space in the metaspace.
If the space was already defined in the metaspace, then defineSpace compares the space definition that was passed to it as an argument with the space definition currently stored in the metaspace; if the definitions match then defineSpace returns successfully, otherwise an error is thrown.
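For example, the following Java sketch builds a space definition equivalent to the Admin CLI example above and defines it in a connected metaspace. The FieldDef, FieldType, putFieldDef, and setKey names are illustrative assumptions not described in this section; check the API reference for the exact signatures.
// Build a space definition equivalent to the CLI example and define it in the metaspace
// (metaspace is assumed to be a connected Metaspace object)
SpaceDef spaceDef = SpaceDef.create("myspace");
spaceDef.putFieldDef(FieldDef.create("key", FieldType.INTEGER));   // assumed helper
spaceDef.putFieldDef(FieldDef.create("value", FieldType.STRING));  // assumed helper
spaceDef.setKey("key");                                            // assumed key setter
metaspace.defineSpace(spaceDef);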
Using the admin object execute method A space can also be defined through the API by using the admin object’s execute method to execute a define space admin language command.
Example of defining a space using the admin language:
define space name 'myspace' (field name 'key' type 'integer', field name 'value' type 'string') key ('key')
 
Getting a Space Definition
You can get the space definition for a space that has been previously defined in the metaspace by using the Metaspace object’s getSpaceDef method, which takes a space name as a parameter and returns a copy of that space’s SpaceDef object, or throws an exception if no space of that name is currently defined in the metaspace.
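For example, a minimal Java sketch (metaspace is assumed to be a connected Metaspace object):
// Retrieve a copy of the definition of the space named "myspace"
SpaceDef spaceDef = metaspace.getSpaceDef("myspace");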
Dropping a Space
You can delete a space’s space definition from a metaspace by invoking the dropSpace method of the Metaspace object. This call succeeds only if there are no members of that space at the time the method is invoked. It is also possible to drop a space using the Admin tool.
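For example, a minimal Java sketch (metaspace is assumed to be a connected Metaspace object and no member has joined the space):
// Remove the definition of "myspace" from the metaspace
metaspace.dropSpace("myspace");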
Space Definition
A space definition is composed of two parts:
1. The definition of the structure of the tuples stored in the space: the field definitions and the key field definition.
2. The definition of the space’s attributes and policies, such as distribution, replication, capacity and eviction, and persistence.
The space definition is represented in the API by a SpaceDef object that is either created from scratch by invoking the SpaceDef’s create() method, or returned by the metaspace or space’s getSpaceDef methods.
Managing Space Distribution
You can set or get a space’s distribution policy by using the SpaceDef object’s setDistributionPolicy and getDistributionPolicy methods, respectively. The value of the distribution policy argument can be either DISTRIBUTED (which is the default value) or NON_DISTRIBUTED.
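For example, a minimal Java sketch (spaceDef is assumed to be a SpaceDef object; the exact location of the DistributionPolicy enumeration is an assumption):
// Make the space non-distributed instead of using the default DISTRIBUTED policy
spaceDef.setDistributionPolicy(DistributionPolicy.NON_DISTRIBUTED);
DistributionPolicy policy = spaceDef.getDistributionPolicy();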
Performing Space Replication
To provide fault-tolerance and to prevent loss of tuples if one of the seeders of a space suddenly disappears from the metaspace, you can specify that space data is replicated, that is, backed up on one or more other seeders.
Replication in ActiveSpaces is performed in a distributed active-active manner: seeders both seed some tuples and replicate tuples assigned to other seeders. This means that the replication is distributed: rather than having a designated backup for each seeder that replicates all of the tuples that this seeder seeds, the tuples that it seeds are replicated by any of the other seeders.
ActiveSpaces also allows you to group seeders to help prevent the loss of replicated data. This is known as host-aware replication. When you use host-aware replication, the data from the seeders in one group is replicated on seeders that reside in other groups.
For example, if you group all of the seeders that reside on one device into a group and that device goes down, no data loss will occur, because the replicated data is guaranteed not to reside on any of the seeders in that group, but instead resides with seeders that were not on the device which went down.
Seeders work in an active-active mode, in the sense that there are no “backup seeders” just waiting for a “primary seeder” to fail before becoming active. Instead, all seeders are always active, both seeding and replicating tuples. This is a more efficient mode of replication: if a seeder fails, there is no need to redistribute the tuples amongst the remaining seeders to keep the distribution balanced. All that is needed is for ActiveSpaces to rebuild the replication degree, which is less work than also redistributing the tuples, and therefore has less impact on performance when a seeder fails.
A degree of replication of 0 (the default value) means there is no replication. In this case, if one of the members that has joined the space as a seeder fails suddenly, the tuples that it was seeding disappear from the space. If, instead, that member leaves in an orderly manner by invoking the call to leave the space or the call to disconnect from the metaspace, there is no loss of data.
A degree of replication of 1 means that each tuple seeded by one member of the space will also be replicated by one other seeder of the space. If a seeder suddenly disappears from the metaspace, the tuples that it was seeding will automatically be seeded by the nodes that were replicating them, and no data is lost. Seeders do not have designated replicating members; rather, all of the tuples seeded by a particular member of the space are evenly replicated by all the other seeders in the space. This has the advantage that even after a seeder failure, the tuples are still evenly balanced over the remaining set of seeder members of that space. It also means that ActiveSpaces's fault-tolerance is achieved in an active-active manner.
A degree of replication of 2 (or 3 or more) means that each tuple seeded by a member of the space is replicated by 2 (or 3 or more) other seeder members of the space. This means that up to 2 (or 3 or more) seeder members of a space can suddenly disappear from the metaspace (before the degree of replication can be rebuilt by the remaining members of the space) without any data being lost.
A special replication degree of REPLICATE_ALL is also available. When it is specified for a space, all of the tuples in that space will be replicated by all the seeder members of the space. This allows the fastest possible performance for get operations on the space, at the expense of scalability: each seeder member has a coherent copy of every single tuple in the space, and can therefore perform a read operation locally, using either its seeded or replicated view of the tuple.
The degree of replication can be set or retrieved using the SpaceDef object’s setReplicationCount or getReplicationCount methods. The default value is 0, that is, no replication.
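For example, a minimal Java sketch (spaceDef is assumed to be a SpaceDef object):
// Each tuple is replicated by one seeder in addition to the seeder that seeds it
spaceDef.setReplicationCount(1);
int replicationCount = spaceDef.getReplicationCount();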
Host-Aware Replication
With host-aware replication, you can ensure that replicated data does not reside on the same system as the original data, and therefore is not lost if that system goes down. When more than one seeder resides on the same system, you can use host-aware replication rather than increasing the replication degree to ensure that replicated data exists on other systems.
With host-aware replication, you group seeders based upon their member names. To organize seeders into groups, use member names of the form:
<group_name>.<member_name>
ActiveSpaces groups all seeders with the same group_name together, and their data is replicated on seeders outside of that group.
 
You can set up host-aware replication in several ways, including:
1. By setting the member name in the MemberDef object through the ActiveSpaces API. See Using the ActiveSpaces API Set to Implement Host-Aware Replication.
2. By starting as-agents that run as seeders and using the as-agent -name parameter to set up member names that use the host-aware replication naming convention.
For more information on setting up host-aware replication using as-agent, see Host-Aware Replication in the TIBCO ActiveSpaces Administration Guide.
Using the ActiveSpaces API Set to Implement Host-Aware Replication
The following examples show how to set the member name in the MemberDef object for each of the API sets:
Java API
MemberDef memberDef = MemberDef.create();
memberDef.setMemberName("mymachinename.seeder_n");
C API
tibasMemberDef memberDef;
tibasMemberDef_Create(&memberDef);
tibasMemberDef_SetMemberName(memberDef, "mymachinename.seeder_n");
.NET API
MemberDef memberDef = MemberDef.Create();
memberDef.MemberName = "mymachinename.seeder_n";
 
Synchronous and Asynchronous Replication
Along with the degree of replication, it is also possible to define whether replication will happen in synchronous or asynchronous mode for the space.
Asynchronous Replication
Asynchronous replication is the default behavior. It offers the best performance, since the amount of time it takes to complete an operation that modifies one of the tuples in the space does not depend on the degree of replication. While the asynchronous replication protocol of ActiveSpaces has very low latency and is resilient in most failure scenarios, it does not offer an absolute guarantee that the data modification has been replicated to the desired degree by the time the operation completes.
Using asynchronous replication does not imply a degradation in the coherency or consistency of the view of the data between members of a space. Under normal operating conditions all members of the space will be notified of the change in the data almost at the same time as the operation returns, even when asynchronous replication is used.
Synchronous Replication
Synchronous replication offers the highest level of safety, at the expense of performance. When synchronous replication is in use for a space, it means that an operation that modifies one of the tuples in the space will only return an indication of success when that modification has been positively replicated up to the degree of replication required for the space.
Comparison of asynchronous and synchronous replication
One implication of a space’s replication type for operations on that space is that asynchronous replication is more permissive: it allows operations that modify the data stored in a space even when the degree of replication cannot be achieved at the time for lack of enough seeders. (Replication will be achieved automatically, however, as soon as enough seeders have joined the space.) Synchronous replication does not allow such operations. Asynchronous replication, then, is a best-effort quality of replication, while synchronous replication is a strict enforcement of the replication degree.
The type of replication for a space can be set or queried using the SpaceDef object’s setSyncReplicated and isSyncReplicated methods, respectively. Those methods take and return a boolean and the default value is false, that is, asynchronous replication.
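For example, a minimal Java sketch (spaceDef is assumed to be a SpaceDef object):
// Switch the space from the default asynchronous replication to synchronous replication
spaceDef.setSyncReplicated(true);
boolean isSync = spaceDef.isSyncReplicated();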
Defining Capacity
You can define a capacity for the space to control the amount of memory used by the seeders for storing tuples in the space. The capacity is expressed in number of tuples per seeder and defaults to -1, which means an infinite number of tuples per seeder.
If a capacity is specified, then you must specify an eviction policy that indicates the outcome of an operation that would result in an additional tuple being seeded by a seeder that is already at capacity. The two choices for the eviction policy are NONE, which means that the operation fails with the appropriate exception being stored in the Result object, or LRU, which means that the seeder evicts the least recently read or modified tuple from the space, using the Least Recently Used (LRU) eviction algorithm.
Specifying a capacity and an eviction policy of LRU for a space means that the space can effectively be used as a cache, and when used in conjunction with persistence, allows access to a persistent data-store in a “cache-through” mode of operation.
If you specify a capacity setting, ActiveSpaces enforces the capacity limitation at one-second intervals and applies the configured eviction policy.
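For example, the following Java sketch caps each seeder at 100,000 tuples and evicts the least recently used tuple when a seeder is at capacity. The setCapacity and setEvictionPolicy methods and the EvictionPolicy enumeration are assumptions consistent with the description above; check the API reference for the exact names.
// Limit each seeder to 100,000 tuples and evict with the LRU algorithm when full
spaceDef.setCapacity(100000);                     // assumed setter
spaceDef.setEvictionPolicy(EvictionPolicy.LRU);   // assumed setter and enumeration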
Implementing Persistence
Since ActiveSpaces stores the tuples of a space in the memory of the space's seeders, the data contained in a space disappears when the last seeder of the space leaves. To maintain the data in the space through periods when no seeders are available, you can persist this data to another (slower but more permanent) data storage system such as a database, a key-value store, or even a file system. ActiveSpaces allows a space to be marked as persistent, meaning that the space can be “rehydrated” from the persistence layer at startup, and changes to the space can be reflected to that persistence layer.
ActiveSpaces provides two types of persistence:
Shared-Nothing Persistence Each node that joins a space as a seeder maintains a copy of the space data on its local disk: it writes its data to disk and reads the data back from its local disk when needed for recovery and for cache misses. This type of built-in persistence is implemented by the ActiveSpaces libraries.
Shared-All Persistence All nodes share a single persister or a set of persisters. If you choose to implement shared-all persistence, your application must implement it using the ActiveSpaces API.
For details on how to set up persistence, see Setting up Persistence.
If the space is defined as both persistent and with a capacity and an eviction policy of LRU, then you can use ActiveSpaces to cache access to the persistence layer in “cache-through” mode. In this case, applications can transparently access the data stored in the persistent layer through the space. If the data associated with a particular key field value is not in the space at the time of the read request (a “cache miss”), then it is transparently fetched from the persistence layer, and stored in the space such that a subsequent request for a get on the same key value can be serviced directly and much faster by the space (a “cache hit”).
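For example, the following Java sketch marks a space as persisted with the built-in shared-nothing persistence and an LRU-bounded capacity, the combination described above for cache-through operation. The setPersisted and setPersistenceType methods and the PersistenceType enumeration are assumptions based on this description; check the API reference for the exact names.
// Persisted space with built-in shared-nothing persistence, usable as a cache-through layer
spaceDef.setPersisted(true);                                  // assumed setter
spaceDef.setPersistenceType(PersistenceType.SHARE_NOTHING);   // assumed setter and enumeration
spaceDef.setCapacity(100000);                                 // assumed setter
spaceDef.setEvictionPolicy(EvictionPolicy.LRU);               // assumed setter and enumeration
metaspace.defineSpace(spaceDef);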
When making a query on a space using a browser or a listener on a transparently cached space, there is a difference in behavior between the shared-nothing and the shared-all persistence modes of operation:
With the built-in shared-nothing persistence, the query can return ALL of the tuples stored in the space, regardless of whether they are present in the cached records in RAM or only on persistent storage. Tuples that are already cached are returned faster than tuples that have been evicted to disk, but every matching record is returned. However, to do this, the fields being queried in the space MUST have indexes defined on them.
With external shared-all persistence, listeners and browsers only return the matching records that are present in the RAM-cached subset of the space, and will NOT return records that are only present in the persistence layer at the time the query is issued.
A space can be marked as persisted (in which case it needs at least one persister, as well as the minimum number of seeders, to be usable) or not persisted (in which case it does not need a persister but still needs seeders).
