Diagnoses

Scanning for Problems described a quick scanning technique for locating problems in rvtrace output, looking for non-zero values in the Bad, Gaps, and Rbytes columns of the multicast data tables. When such a scan indicates problems, look more closely at the retransmit statistics in nearby intervals.

Rseq measures retransmission requests for missed multicast or broadcast packets. Non-zero Rseq values generally indicate a problem. The ratio Rdata/Data measures the severity of the problem. Small ratios indicate low-level problems, which you can investigate as time permits. Ratios of 2% or greater indicate potentially serious network problems that warrant further investigation. A high ratio that lasts for only one interval could indicate an intermittent problem, which could become more serious in other situations.
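The severity rule above can be sketched in a few lines of Python. This is only an illustration: the function name, its parameters, and the returned labels are hypothetical, and the packet counts are assumed to have been read by hand from the Data and Rdata columns of one interval. Only the 2% threshold comes from the text.

```python
def retransmit_severity(data_packets: int, rdata_packets: int,
                        threshold: float = 0.02) -> str:
    """Classify one interval by the ratio Rdata/Data.

    Hypothetical helper: the labels are illustrative, and only the
    2% threshold is taken from the guidance above.
    """
    if data_packets <= 0:
        return "no-data"      # the ratio is undefined without data packets
    if rdata_packets == 0:
        return "ok"           # nothing was retransmitted in this interval
    ratio = rdata_packets / data_packets
    return "serious" if ratio >= threshold else "low-level"


# Illustrative counts, not taken from real rvtrace output.
print(retransmit_severity(1000, 5))    # ratio 0.5%
print(retransmit_severity(1000, 50))   # ratio 5%
```

Applying the same check to several consecutive intervals helps distinguish a sustained problem from a high ratio that appears in only one interval.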

Notice that Rseq tabulates packets that serve a feedback mechanism within the protocol. A data receiver becomes a feedback sender when it detects that it has missed data packets. So the Rseq value in source rows indicates a data receiver sending retransmission requests. Conversely, the Rseq value in destination rows indicates a data sender receiving retransmission requests.

Consider the following two examples.

Difficulty at One Specific Receiver

Rseq Reveals Difficulty at a Receiver shows rvtrace output for three intervals, which indicate a difficulty at one specific receiver. The administrator must investigate that receiver, its network hardware, and its load.

Several situations could cause this pattern in rvtrace display output. For example:

One slow computer is flooded by too much data from a network of faster senders; the receiver cannot process inbound data as fast as the rest of the network.
One receiver has intermittent network interface failures or a loose network cable.

Figure 155: Rseq Reveals Difficulty at a Receiver

In Rseq Reveals Difficulty at a Receiver, the first interval shows 9 sequence gaps in the multicast statistics table—that is, 9 gaps in the stream of multicast packets. The Rseq column of the multicast retransmit table contains further details; the host at address 10.101.3.246 requested 2211 packets for retransmission, while the other hosts requested a total of 11 packets. Conclude that the locus of the problem is at 10.101.3.246, and that retransmit requests from the other receivers are side effects.
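The reasoning in this paragraph, where one host accounts for nearly all retransmission requests, can be sketched as follows. The function name and the 90% cutoff are assumptions chosen for illustration, not part of rvtrace. The text gives only a combined total for the other hosts, so they appear here as a single placeholder entry.

```python
from typing import Dict, Optional, Tuple


def dominant_requester(rseq_by_host: Dict[str, int],
                       min_share: float = 0.9) -> Optional[Tuple[str, float]]:
    """Return (host, share) when one host sent most of the requests.

    Hypothetical helper: min_share is an assumed cutoff, not a value
    documented for rvtrace.
    """
    total = sum(rseq_by_host.values())
    if total == 0:
        return None           # no retransmission requests at all
    host, count = max(rseq_by_host.items(), key=lambda kv: kv[1])
    share = count / total
    return (host, share) if share >= min_share else None


# Rseq counts from the first interval described above. The second key is
# a placeholder for the combined total of the remaining hosts.
interval_1 = {"10.101.3.246": 2211, "all-other-hosts": 11}
result = dominant_requester(interval_1)
print(result)
```

With these counts, 10.101.3.246 accounts for 2211 of the 2222 requests, which supports the conclusion that the problem is local to that receiver.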

The second interval of Rseq Reveals Difficulty at a Receiver shows zero sequence gaps—the problem has abated. Nonetheless, the Rdata and Rbytes columns indicate that retransmissions continue as Rendezvous daemons recover from the problem by resending stored data.

By the third interval of Rseq Reveals Difficulty at a Receiver, everything has returned to normal.

Difficulty at One Specific Sender

Rseq Reveals Difficulty at a Sender shows output indicating a difficulty at one specific sender. The administrator must investigate that sender, its sending applications, and its network hardware.

Several situations could cause this pattern in rvtrace display output. For example:

The sender is flooding the network—that is, it is sending packets faster than most other daemons on the network can receive them.
The sender has intermittent network interface failures or a loose network cable.

In Rseq Reveals Difficulty at a Sender, the multicast statistics table shows 411 sequence gaps, that is, 411 gaps in the stream of multicast packets. Moreover, all the missing packets originate at one sender, 10.101.3.246. The Rseq column of the multicast retransmit table contains further details: both receivers in the network, 10.101.3.74 and 10.101.3.237, sent retransmit requests to 10.101.3.246. Conclude that the locus of the problem is at 10.101.3.246.
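The sender pattern, where several receivers all request retransmission from the same source, can be sketched in the same spirit. The function, the tuple layout, and the 90% cutoff are hypothetical. The per-receiver counts below are invented for illustration, because the text does not say how the requests were split between the two receivers.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple


def suspect_sender(requests: Iterable[Tuple[str, str, int]],
                   min_share: float = 0.9) -> Optional[str]:
    """Return a sender that at least two receivers asked to retransmit.

    Each request is (receiver, sender, rseq_count). Hypothetical helper:
    min_share is an assumed cutoff, not an rvtrace setting.
    """
    by_sender = defaultdict(int)
    receivers = defaultdict(set)
    total = 0
    for receiver, sender, count in requests:
        if count <= 0:
            continue          # ignore rows without requests
        by_sender[sender] += count
        receivers[sender].add(receiver)
        total += count
    if total == 0:
        return None
    sender, count = max(by_sender.items(), key=lambda kv: kv[1])
    if count / total >= min_share and len(receivers[sender]) >= 2:
        return sender
    return None


# Addresses from the example above; the counts are illustrative only.
example = [
    ("10.101.3.74", "10.101.3.246", 200),
    ("10.101.3.237", "10.101.3.246", 211),
]
print(suspect_sender(example))
```

Requiring at least two requesting receivers is a deliberate choice in this sketch: when a single receiver sends all the requests, the pattern points at that receiver rather than at the sender.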

The Rdata column of the multicast statistics table shows that before the end of the interval, the sender had retransmitted all 411 missing packets. The problem was a brief glitch—the Rendezvous reliable transport mechanisms easily smoothed over this temporary rough spot. Nonetheless, if such behavior recurs, the administrator must investigate the problem.

Figure 156: Rseq Reveals Difficulty at a Sender