Connected: An Internet Encyclopedia
4.2.3.1 Sorcerer's Apprentice Syndrome



There is a serious bug, known as the "Sorcerer's Apprentice Syndrome," in the protocol specification. While it does not cause incorrect operation of the transfer (the file will always be transferred correctly if the transfer completes), this bug may cause excessive retransmission, which may cause the transfer to time out.

Implementations MUST contain the fix for this problem: the sender (i.e., the side originating the DATA packets) must never resend the current DATA packet on receipt of a duplicate ACK.
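The required sender behavior can be sketched as a small decision function. This is only an illustrative sketch, not text from the specification: the name `handle_ack` and the action strings are hypothetical, and block-number wraparound is ignored for simplicity.

```python
def handle_ack(current_block, ack_block):
    """Decide what the DATA sender does when an ACK arrives.

    current_block is the block most recently sent and not yet acknowledged.
    Returns (new_current_block, action), where action is "send_next" or
    "ignore". (Hypothetical helper; names are assumptions for illustration.)
    """
    if ack_block == current_block:
        # The outstanding block was acknowledged: advance to the next one.
        return current_block + 1, "send_next"
    # Duplicate or stale ACK: never resend DATA in response to it.
    # Only the sender's own retransmit timer may trigger a resend.
    return current_block, "ignore"
```

For example, with block 7 outstanding, `handle_ack(7, 7)` advances to block 8, while a late duplicate `handle_ack(7, 6)` leaves the sender waiting on block 7 without transmitting anything.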

DISCUSSION:

The bug is caused by the protocol rule that either side, on receiving an old duplicate datagram, may resend the current datagram. If a packet is delayed in the network but later successfully delivered after either side has timed out and retransmitted a packet, a duplicate copy of the response may be generated. If the other side responds to this duplicate with a duplicate of its own, then every datagram will be sent in duplicate for the remainder of the transfer (unless a datagram is lost, breaking the repetition). Worse yet, since the delay is often caused by congestion, this duplicate transmission will usually cause more congestion, leading to more delayed packets, etc.

The following example may help to clarify this problem.

          TFTP A                  TFTP B

      (1)  Receive ACK X-1
           Send DATA X
      (2)                          Receive DATA X
                                   Send ACK X
             (ACK X is delayed in network,
              and A times out):
      (3)  Retransmit DATA X

      (4)                          Receive DATA X again
                                   Send ACK X again
      (5)  Receive (delayed) ACK X
           Send DATA X+1
      (6)                          Receive DATA X+1
                                   Send ACK X+1
      (7)  Receive ACK X again
           Send DATA X+1 again
      (8)                          Receive DATA X+1 again
                                   Send ACK X+1 again
      (9)  Receive ACK X+1
           Send DATA X+2
      (10)                         Receive DATA X+2
                                   Send ACK X+2
      (11) Receive ACK X+1 again
           Send DATA X+2 again
      (12)                         Receive DATA X+2 again
                                   Send ACK X+2 again

Notice that once the delayed ACK arrives, the protocol settles down to duplicate all further packets (steps 5-8 and 9-12). The problem is caused not by either side timing out, but by both sides retransmitting the current packet when they receive a duplicate.
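The exchange above can be reproduced with a small event-driven simulation. This is a rough sketch under stated assumptions, not part of the specification: the timing constants, the `simulate` function, and the choice to delay only the first ACK for block 1 are all made up for illustration, and no packets are lost.

```python
import heapq

LATENCY = 1.0    # assumed normal one-way delay (arbitrary units)
TIMEOUT = 5.0    # assumed retransmit timeout of the sender (TFTP A)
SLOW_ACK = 5.5   # assumed one-way delay of the single delayed ACK

def simulate(num_blocks, fix_enabled):
    """Return how many DATA packets A sends when the first ACK 1 is delayed."""
    events = []           # heap of (time, seq, destination, kind, block)
    seq = 0               # tie-breaker so equal-time events stay in send order
    data_sent = 0
    current = 1           # block A is waiting to have acknowledged
    expected = 1          # next new block B expects
    ack_delayed = False   # has the one slow ACK been sent yet?

    def post(time, dest, kind, block):
        nonlocal seq
        heapq.heappush(events, (time, seq, dest, kind, block))
        seq += 1

    def send_data(now, block):
        nonlocal data_sent
        data_sent += 1
        post(now + LATENCY, "B", "DATA", block)
        post(now + TIMEOUT, "A", "TIMEOUT", block)

    send_data(0.0, 1)
    while events:
        now, _, dest, kind, block = heapq.heappop(events)
        if dest == "B":                    # B only ever receives DATA
            if block == expected:
                expected += 1              # new block: accept it
            elif block != expected - 1:
                continue                   # neither new nor a duplicate
            delay = LATENCY
            if block == 1 and not ack_delayed:
                delay, ack_delayed = SLOW_ACK, True
            post(now + delay, "A", "ACK", block)
        elif kind == "TIMEOUT":
            if block == current:           # still unacknowledged: retransmit
                send_data(now, current)
        elif block == current:             # ACK for the outstanding block
            current += 1
            if current <= num_blocks:
                send_data(now, current)
        elif block == current - 1 and not fix_enabled and current <= num_blocks:
            send_data(now, current)        # the bug: resend on duplicate ACK
    return data_sent

print(simulate(5, fix_enabled=False))  # 10: every block is sent twice
print(simulate(5, fix_enabled=True))   # 6: only block 1 is sent twice
```

Under these assumptions the unfixed sender transmits every block twice after the single delay, while the fixed sender pays only for the one genuine timeout retransmission.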

The fix is to break the retransmission loop, as indicated above. This is analogous to the behavior of TCP. It is then possible to remove the retransmission timer on the receiver, since the resent ACK will never cause any action; this is a useful simplification where TFTP is used in a bootstrap program. It is OK to allow the timer to remain, and it may be helpful if the retransmitted ACK replaces one that was genuinely lost in the network. The sender still requires a retransmit timer, of course.
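A receiver without a retransmission timer can be sketched as a loop that reacts only to arriving DATA packets. This is a minimal illustration, assuming block numbers start at 1 and do not wrap; the function name `receive_without_timer` and its list-based interface are hypothetical.

```python
def receive_without_timer(incoming):
    """Process DATA block numbers in arrival order.

    Returns (acks_sent, blocks_accepted). The receiver never retransmits on
    its own; it only answers DATA packets as they arrive.
    """
    expected = 1
    acks, accepted = [], []
    for block in incoming:
        if block == expected:
            accepted.append(block)   # new data: store it and acknowledge
            expected += 1
            acks.append(block)
        elif block == expected - 1:
            acks.append(block)       # duplicate DATA: repeat the last ACK
        # any other block number is silently ignored
    return acks, accepted
```

If an ACK is lost, the sender's timer resends the DATA packet and the receiver simply repeats the ACK, so the transfer still makes progress without a receiver-side timer. For instance, `receive_without_timer([1, 1, 2, 3, 3])` acknowledges all five arrivals but accepts each block only once.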

