Rank and Weight

Rendezvous fault tolerance software sorts the members of a group, assigning each member a unique rank. Rank determines which members are active.

A member with rank n takes precedence over a member with rank n+1. In this sense, n represents a higher rank than n+1.

If the active goal of a group is n, then the members with rank 1 through n are active. The member with rank n+1 is known as the ranking inactive member. If one of those active members fails, then Rendezvous fault tolerance software instructs the ranking inactive member to activate.
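The relationship between rank, active goal, and the ranking inactive member can be sketched in a few lines of Python. This is an illustrative model only, not Rendezvous code; the function name `split_by_active_goal` and the member names are hypothetical.

```python
from typing import List, Optional, Tuple


def split_by_active_goal(
    ranked: List[str], active_goal: int
) -> Tuple[List[str], Optional[str]]:
    """Split members (listed in rank order, rank 1 first) by the active goal.

    Returns the active members and the ranking inactive member,
    or None when every member is already active.
    """
    active = ranked[:active_goal]
    # The member with rank n+1 sits at index n of the zero-based list.
    ranking_inactive = ranked[active_goal] if len(ranked) > active_goal else None
    return active, ranking_inactive


if __name__ == "__main__":
    ranked = ["m1", "m2", "m3", "m4"]  # hypothetical members, rank 1 first
    print(split_by_active_goal(ranked, 2))

    # If active member m1 fails, the remaining members move up in rank,
    # so the former ranking inactive member (m3) is now among the active.
    survivors = [m for m in ranked if m != "m1"]
    print(split_by_active_goal(survivors, 2))
```

In this model, a failure simply removes a member from the ranked list; the member that was ranking inactive then falls within the first n positions, which corresponds to the instruction to activate.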

The most important factor in assigning rank is weight. When a process joins a fault tolerance group, it specifies its weight as a parameter. Weight represents the ability of a member to fulfill its function—relative to other members of the same group.

To rank the members of a group, Rendezvous fault tolerance software sorts the members by weight. The member with the highest weight receives rank 1 (so it outranks all other members); the member with the next highest weight receives rank 2; and so on. When two or more members have the same weight, Rendezvous fault tolerance software ranks them in a way that is opaque to programs.
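As a rough illustration of ranking by weight, the following Python sketch sorts hypothetical members from highest to lowest weight. The function `rank_members` is an assumption made for this example and is not part of Rendezvous; in particular, the way it orders members of equal weight is an artifact of the sketch.

```python
from typing import Dict, List


def rank_members(weights: Dict[str, int]) -> List[str]:
    """Return member names in rank order; index 0 holds rank 1."""
    # Sort by weight, highest first. sorted() is stable, so members with
    # equal weight keep their insertion order here. That tie-break belongs
    # to this sketch only; the real order among equal weights is opaque
    # and must not be relied on.
    return sorted(weights, key=lambda name: weights[name], reverse=True)


if __name__ == "__main__":
    ranked = rank_members({"A": 200, "B": 100, "C": 150})
    for rank, name in enumerate(ranked, start=1):
        print(rank, name)
```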

Weight Values

Each member specifies its weight as a positive integer.

Zero is a special, reserved value; Rendezvous fault tolerance software assigns zero weight to processes with resource errors, so they activate only as a last resort when no other members are available. Programs must always assign weights greater than zero.

(For further details, see Disabling a Member, and DISABLING_MEMBER.)
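A program that computes its weight at run time might guard against accidentally passing the reserved value. The helper below is a hypothetical sketch of such a check; the name `checked_weight` is an assumption, and the sketch deliberately enforces no upper bound because this section does not state one.

```python
def checked_weight(weight: int) -> int:
    """Return weight unchanged if it is a valid program-assigned weight."""
    # Reject non-integers (bool is excluded explicitly because it is a
    # subclass of int in Python).
    if isinstance(weight, bool) or not isinstance(weight, int):
        raise TypeError("weight must be an integer")
    # Zero is reserved for members with resource errors, so programs
    # must supply a value greater than zero.
    if weight <= 0:
        raise ValueError("weight must be greater than zero")
    return weight
```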

Assigning Weight

Weight lets you influence the ranking of member processes based on external knowledge of the operating environment. Assign weight after considering properties such as hardware speed, hardware reliability, and load factors.

For example, if member A runs on a computer that is much faster than the computer that runs member B, then assign higher weight to A than to B. Greater weight expresses your opinion that A fulfills its task more effectively than B. As a result, A is ranked before B, and takes precedence.

If members C, D and E all run on equally fast computers with approximately equal load factors, then assign them equal weight. Equal weight expresses no preference for any process over the others. Rendezvous fault tolerance software ranks them in a way that is opaque to programs.

Rank among Members with Different Weight

Members of greater weight always outrank members of lower weight. For example, if member A has weight 200, and member B has weight 100, then A always outranks B.

Inactive members of greater weight preempt active members of lower weight. For example, if B (weight 100) is already active when A (weight 200) starts, then Rendezvous fault tolerance software instructs B to deactivate, and instructs A to activate in its place.
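The preemption rule can be modeled for the simplest case, a group with an active goal of 1. The sketch below is illustrative only: the function `preemption_actions` and the action tuples are hypothetical and do not correspond to any Rendezvous call.

```python
from typing import List, Tuple


def preemption_actions(
    active_member: str,
    active_weight: int,
    newcomer: str,
    newcomer_weight: int,
) -> List[Tuple[str, str]]:
    """Model the instructions issued when a newcomer joins (active goal 1)."""
    # Only a strictly greater weight leads to preemption.
    if newcomer_weight > active_weight:
        return [("deactivate", active_member), ("activate", newcomer)]
    # Otherwise the current active member stays active.
    return []


if __name__ == "__main__":
    # B (weight 100) is active when A (weight 200) starts.
    print(preemption_actions("B", 100, "A", 200))
```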

Ranking Members with Equal Weight

If members C and D have equal weight, their relative rank is opaque to programs. That is, their relative rank does not necessarily depend on the order in which the two processes start. Consider these (possibly surprising) consequences:

If member process C starts before member D, you cannot deduce from this order that C outranks D. Nor can you deduce the reverse, that D outranks C.
If the active goal for the group is 1, and D starts immediately after C, then you cannot assume which of the two processes becomes the active member first.

Status Quo among Members with Equal Weight

A ranking inactive member never preempts an active member with the same weight—despite its precedence in rank.

For example, if members E and F have equal weight, with E outranking F, and F already active, then E does not preempt F to become active in its place.

Contrast that example with a situation in which neither E nor F is active, and a new active member is needed to complete the active goal—in this case E activates (rather than F) because E outranks F.
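One way to model both situations is to order members by weight first, then prefer members that are already active, and only then fall back on rank. The Python sketch below takes that approach; it is a simplified model under stated assumptions, not the Rendezvous algorithm, and `choose_active` is a hypothetical name.

```python
from typing import Dict, List, Set


def choose_active(
    ranked: List[str],
    weights: Dict[str, int],
    currently_active: Set[str],
    active_goal: int,
) -> List[str]:
    """Pick the members that should be active.

    ranked lists member names in rank order (rank 1 first).
    """
    order = sorted(
        ranked,
        key=lambda m: (
            -weights[m],                # greater weight always wins
            m not in currently_active,  # equal weight: keep the status quo
            ranked.index(m),            # otherwise rank decides
        ),
    )
    return order[:active_goal]


if __name__ == "__main__":
    ranked = ["E", "F"]                 # E outranks F
    weights = {"E": 100, "F": 100}      # equal weight
    print(choose_active(ranked, weights, {"F"}, 1))   # F is already active
    print(choose_active(ranked, weights, set(), 1))   # neither is active
```

With F already active, the model keeps F; with neither member active, it selects E because E outranks F.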

Adjusting Weight

In addition to specifying weight when a process joins a fault tolerance group, sophisticated programs can adjust their weight at any time to reflect changing conditions.

For example, a member might track the changing load factor of its host computer, and adjust its weight accordingly. Rendezvous fault tolerance software automatically recomputes the ranking of members whenever a member changes its weight.
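As a sketch of how a member might derive a new weight from its host's load, the function below scales a base weight by the idle fraction of the machine. The formula, the name `weight_from_load`, and the convention that load runs from 0.0 (idle) to 1.0 (saturated) are all assumptions made for illustration; the call that actually applies the new weight to the member is not shown.

```python
def weight_from_load(base_weight: int, load: float) -> int:
    """Scale base_weight down as load rises from 0.0 to 1.0."""
    # Clamp the load so out-of-range readings cannot produce odd weights.
    load = min(max(load, 0.0), 1.0)
    # Never return less than 1: zero is a reserved weight value.
    return max(1, round(base_weight * (1.0 - load)))


if __name__ == "__main__":
    for load in (0.0, 0.25, 1.0):
        print(load, weight_from_load(100, load))
```

A program using a scheme like this would probably apply a new weight only when the computed value changes noticeably, since every adjustment triggers a recomputation across the group.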

Each weight adjustment causes every member to recompute the relative ranking of all the members of the group. For large groups, this recomputation can affect performance.

For examples, see Adjusting Member Weights.