Connected: An Internet Encyclopedia
Programmed Instruction Course
Section 3 - The IP Protocol
The most useful software tool for testing Internet operation
at the IP level is Ping.
``Ping'' is one of the most useful network debugging tools available.
It takes its name from a submarine sonar search - you send a short
sound burst and listen for an echo - a ping - coming back.
In an IP network, `ping' sends a short data burst - a single packet -
and listens for a single packet in reply. Since this tests the most
basic function of an IP network (delivery of a single packet), it's
easy to see how you can learn a lot from a few `pings'.
Ping is implemented using the required ICMP
Echo function, documented in
RFC 792, which all hosts
should implement. Of course, administrators can disable
ping messages (this is rarely a good idea, unless security
considerations dictate that the host should be unreachable anyway),
and some implementations have (gasp) even been known not
to implement all required functions. However, ping is usually
a better bet than almost any other network software.
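To make the Echo function concrete, here is a minimal sketch of how an echo request could be assembled. The header layout (type 8, code 0, checksum, identifier, sequence number) follows RFC 792; the function names, the identifier value, and the all-zero 56-byte payload are assumptions chosen for illustration, not taken from any particular ping implementation.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload

# Hypothetical identifier and payload, for illustration only.
packet = build_echo_request(identifier=0x1234, sequence=0, payload=bytes(56))
print(len(packet))  # 64: 8 header bytes plus 56 data bytes
```

A receiver can verify the packet by running the same checksum over the whole message; an undamaged packet yields zero. Actually sending the packet needs a raw socket, which is left out here.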
Many versions of ping are available. For the remainder of this discussion,
I assume use of BSD UNIX's ping, a freely available, full-featured
implementation found on many UNIX systems.
Most PC-based pings do not have the advanced features I
describe. As always, read the manual for whatever version you use.
What Ping can tell you
- Ping places a unique sequence number on each packet it transmits,
and reports which sequence numbers it receives back. Thus, you
can determine if packets have been dropped, duplicated, or reordered.
- Ping checksums each packet it exchanges. You can detect some
forms of damaged packets.
- Ping places a timestamp in each packet, which is echoed back
and can easily be used to compute how long each packet exchange
took - the Round Trip Time (RTT).
- Ping reports other ICMP messages that might otherwise get
buried in the system software. It reports, for example, if
a router is declaring the target host unreachable.
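The sequence-number bookkeeping described above can be sketched in a few lines. This is only an illustration: the function name and the sample reply list are made up, and it assumes sequence numbers start at 0, as BSD ping's do.

```python
def classify_sequence(received, sent_count):
    """Classify replies by icmp_seq, given in arrival order.

    Assumes requests were numbered 0 .. sent_count - 1.
    """
    seen = set()
    duplicated = []
    reordered = []
    highest = -1
    for seq in received:
        if seq in seen:
            duplicated.append(seq)   # same reply arrived more than once
            continue
        seen.add(seq)
        if seq < highest:
            reordered.append(seq)    # arrived after a later-numbered reply
        else:
            highest = seq
    dropped = [s for s in range(sent_count) if s not in seen]
    return {"dropped": dropped, "duplicated": duplicated, "reordered": reordered}

# Hypothetical arrival order: 2 arrives late and twice, 4 never arrives.
report = classify_sequence([0, 1, 3, 2, 2, 5], sent_count=6)
print(report)
```

Note that a packet only counts as dropped once you are sure it will not show up later, so run this over a finished session rather than a live one.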
What Ping can not tell you
- Some routers may silently discard undeliverable packets. Others
may believe a packet has been transmitted successfully when it
has not been. (This is especially common over Ethernet, which
does not provide link-layer acknowledgments.) Therefore, ping
may not always provide reasons why packets go unanswered.
- Ping can not tell you why a packet was damaged, delayed, or
duplicated. It can not tell you where this happened either,
although you may be able to deduce it.
- Ping can not give you a blow-by-blow description of every
host that handled the packet and everything that happened at
every step of the way. It is an unfortunate fact that no
software can reliably provide this information for a TCP/IP network.
Ping should be your first stop for network troubleshooting.
Having problems transferring a file with FTP? Don't fire up
your packet analyzer just yet. Leave your TDR in the box for now.
Relax. Put on some Yanni. Don't even ``su'' - ping is a
non-privileged command on most systems. Start one running and just watch
it for at least two minutes. That's enough time for most
periodic network problems to show themselves. Once you've seen about
a hundred packets, you should be getting a good feel for how
this host is responding. Are the round-trip times consistent?
Seeing any packet loss? Are the TTL values sane? Start
pinging other hosts. Try the machine next to you - the problem
might be closer than you think. Try the last router - maybe
the remote system is overloaded (especially if it's a popular
Internet site like this one). Don't know what the last router is? Use
traceroute or guess - changing the last number in the IP address to 1
usually gets you something interesting.
Check other sites with similar network topologies (other
remote LAN sites, or other Internet sites, or other sites using
the same backbone). Starting to learn something about how
your network is responding? Good. And - oh, yeah, go check
that FTP. It's probably done by now.
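The "change the last number to 1" guess is easy to script. A small sketch, assuming an IPv4 address and the common (but by no means universal) convention that the router sits at .1; the function name and the sample address, taken from the documentation range 192.0.2.0/24, are illustrative only.

```python
import ipaddress

def guess_router(host: str) -> str:
    """Guess a router address by setting the last octet of `host` to 1."""
    addr = ipaddress.IPv4Address(host)
    # Keep the first three octets, replace the last one with 1.
    return str(ipaddress.IPv4Address((int(addr) & 0xFFFFFF00) | 1))

print(guess_router("192.0.2.77"))  # 192.0.2.1
```

It is only a guess. If nothing answers at that address, you have learned nothing about the network.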
Here's a list of common BSD ping options, and when you might want to use them:
- -c count
Send count packets and then stop. The other way to stop is
to type CTRL-C. This option is convenient for scripts that
periodically check network behavior.
- -f
Flood ping. Send packets as fast as the receiving host can handle them,
at least one hundred per second. I've found this most useful to
stress a production network being tested during its down-time.
Fast machines with fast Ethernet interfaces (like SPARCs) can
basically shut down a network with flood ping, so use this with caution.
- -l preload
Send preload packets as fast as possible, then fall into
a normal mode of behavior. Good for finding out how many packets
your routers can quickly handle, which is in turn good for diagnosing
problems that only appear with large TCP window sizes.
- -n
Numeric output only. Use this when, in addition to everything else,
you've got nameserver problems and ping is hanging trying to give
you a nice symbolic name for the IP addresses.
- -p pattern
Pattern is a string of hexadecimal digits to pad the end
of the packet with. This can be useful if you suspect data-dependent
problems, as links have been known to fail only when certain
bit patterns are presented to them.
- -R
Use IP's Record Route option to determine what route the ping
packets are taking. There are many problems with using this,
not the least of which is that the option is placed on the
request and the target host is under no obligation to place
a corresponding option on the reply. Consider yourself lucky
if this works.
- -r
Bypass the routing tables. Use this when, in addition to everything
else, you've got routing problems and ping can't find a route
to the target host. This only works for hosts that can be directly
reached without using any routers.
- -s packetsize
Change the size of the test packets. Try it - why not? Check
large packets, small packets (the default), very large packets
that must be fragmented, packets that aren't a neat power of two.
Read the manual to find out exactly what you're specifying here -
BSD ping doesn't count either IP or ICMP headers in packetsize.
- -v
Verbose output. You see other ICMP packets that are not normally
considered ``interesting'' (and rarely are).
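For scripts that check network behavior periodically, it can help to build the command line in one place. The helper below is a hypothetical sketch, not part of ping: it covers only the -c, -n, -s and -p switches, using their BSD ping meanings (-n is numeric output), so check your own manual before relying on it.

```python
def ping_command(host, count=None, numeric=False, packetsize=None, pattern=None):
    """Build an argument list for BSD-style ping (hypothetical helper)."""
    argv = ["ping"]
    if count is not None:
        argv += ["-c", str(count)]       # stop after `count` packets
    if numeric:
        argv.append("-n")                # numeric output, no name lookups
    if packetsize is not None:
        argv += ["-s", str(packetsize)]  # data bytes, headers not counted
    if pattern is not None:
        argv += ["-p", pattern]          # hex digits used to pad the packet
    argv.append(host)
    return argv

print(ping_command("localhost", count=10, numeric=True))
```

The resulting list can be handed to `subprocess.run(argv, capture_output=True, text=True)` and the output inspected afterwards. The flood option is deliberately left out, since a script is a bad place to discover what it does to a network.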
Sample ping sessions
This ping session shows a ten packet exchange over the loopback
interface. One line is printed for every reply received. Note
that for each sequence number, a single reply is received,
and they are all in order. The IP TTL values are reported,
as are the round-trip times. Both are very consistent.
At the end of the session,
statistics are reported. Pinging the loopback interface is a
good way to test a machine's basic network configuration, since
no packets are physically transmitted. Any problem
in such a test is cause for alarm.
meikro$ ping -c10 localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=4 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=5 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=6 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=7 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=8 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=9 ttl=255 time=2 ms
--- localhost ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 2/2/2 ms
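If you capture a session like this one in a file, the reply lines are easy to pick apart. The sketch below assumes the BSD output format shown above (`icmp_seq=`, `ttl=`, `time=` fields); other pings print things differently, and the function name is made up.

```python
import re

# Matches the reply lines of BSD-style ping output, as in the session above.
REPLY = re.compile(r"icmp_seq=(\d+) ttl=(\d+) time=([\d.]+) ms")

def parse_replies(output: str):
    """Return a list of (icmp_seq, ttl, rtt_ms) tuples from ping output."""
    return [(int(seq), int(ttl), float(ms)) for seq, ttl, ms in REPLY.findall(output)]

sample = """\
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=2 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=2 ms
"""

replies = parse_replies(sample)
rtts = [ms for _, _, ms in replies]
print(replies)
print(min(rtts), sum(rtts) / len(rtts), max(rtts))
```

Once the replies are tuples, checking for odd TTL values or a changing RTT is a one-line comparison.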
The next session shows a more interesting example - a router
on the remote side of a medium speed (128Kbps) link. The initial
timings show consistent link behavior. However, about 50 seconds
into the trace, we see greater fluctuations in the RTT, which
approaches one second for several packets. From packet 53 to
54, we see a factor of 26 reduction in RTT. But since
reductions in RTT rarely cause problems, this is not
as troublesome as the change from packet 54 to 55, a factor
of 7 increase in RTT. So what should the RTT be?
Well, we're transferring 56 data bytes, plus an 8 byte ICMP header
(64 ICMP bytes), plus a 20 byte IP header - 84 byte packets.
At 128 kilobits per second,
84 bytes should require about 84*(8/128000) s = 5.25 ms
to transfer. Since the packet has to go both ways, we expect
10-15 ms round-trip times. None of these values are that
low; clearly there are problems with this link.
More than anything else, it is simply overcrowded.
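The arithmetic is worth checking for yourself. A minimal sketch, assuming the header sizes used above (8-byte ICMP header, 20-byte IP header with no options) and ignoring link-layer framing, propagation and queueing delay; the function name is illustrative.

```python
def serialization_ms(data_bytes, link_bps, icmp_header=8, ip_header=20):
    """Time in milliseconds to clock one ping packet onto a link."""
    packet_bytes = data_bytes + icmp_header + ip_header
    return packet_bytes * 8 * 1000 / link_bps

one_way = serialization_ms(56, 128_000)
print(one_way, 2 * one_way)  # 5.25 10.5
```

So about 10.5 ms is the floor for the round trip on this link. Everything above that is time spent somewhere other than on the wire, mostly in queues.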
access 9 >ping sl-stk-3-S17-128k.sprintlink.net
PING sl-stk-3-S17-128k.sprintlink.net (184.108.40.206): 56 data bytes
64 bytes from 220.127.116.11: icmp_seq=0 ttl=254 time=35.653 ms
64 bytes from 18.104.22.168: icmp_seq=1 ttl=254 time=28.797 ms
64 bytes from 22.214.171.124: icmp_seq=2 ttl=254 time=28.559 ms
64 bytes from 126.96.36.199: icmp_seq=3 ttl=254 time=39.533 ms
64 bytes from 188.8.131.52: icmp_seq=4 ttl=254 time=28.621 ms
64 bytes from 184.108.40.206: icmp_seq=5 ttl=254 time=28.159 ms
64 bytes from 220.127.116.11: icmp_seq=50 ttl=254 time=848.810 ms
64 bytes from 18.104.22.168: icmp_seq=51 ttl=254 time=828.579 ms
64 bytes from 22.214.171.124: icmp_seq=52 ttl=254 time=753.865 ms
64 bytes from 126.96.36.199: icmp_seq=53 ttl=254 time=778.202 ms
64 bytes from 188.8.131.52: icmp_seq=54 ttl=254 time=29.913 ms
64 bytes from 184.108.40.206: icmp_seq=55 ttl=254 time=220.931 ms
64 bytes from 220.127.116.11: icmp_seq=56 ttl=254 time=173.661 ms
64 bytes from 18.104.22.168: icmp_seq=57 ttl=254 time=144.990 ms
64 bytes from 22.214.171.124: icmp_seq=58 ttl=254 time=28.520 ms
access 10 >
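The jumps discussed above can be picked out mechanically. This sketch uses the RTTs for packets 50 through 58 from the session; the function name and the threshold factor of 5 are arbitrary choices for illustration.

```python
def rtt_jumps(samples, factor=5.0):
    """Find consecutive replies whose RTT changes by more than `factor`.

    `samples` is a list of (icmp_seq, rtt_ms); returns (seq_a, seq_b, ratio).
    """
    jumps = []
    for (seq_a, a), (seq_b, b) in zip(samples, samples[1:]):
        ratio = b / a
        if ratio > factor or ratio < 1 / factor:
            jumps.append((seq_a, seq_b, ratio))
    return jumps

# RTTs in milliseconds, copied from the session above.
samples = [(50, 848.810), (51, 828.579), (52, 753.865), (53, 778.202),
           (54, 29.913), (55, 220.931), (56, 173.661), (57, 144.990),
           (58, 28.520)]

for seq_a, seq_b, ratio in rtt_jumps(samples):
    print(f"{seq_a} -> {seq_b}: x{ratio:.2f}")
```

With this threshold it flags 53 to 54 (the factor of 26 reduction), 54 to 55 (the factor of 7 increase), and also 57 to 58, where the RTT falls back to its uncongested value.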
What you might see
- Dropped packets
An unfortunate fact of life. Detect them by noting when the sequence
numbers skip, and the missing number does not appear again later.
This is probably caused by a router queueing packets for
a relatively slow link, and the queue simply grew too large.
Early TCP implementations dropped packets at a truly alarming rate,
but things have gotten better. Even so, there are common situations,
typically involving crowded wide-area networks, in which even modern TCP
implementations can't operate steady-state without dropping packets.
There's no reason to pull your hair out over this, since TCP will
retransmit missing data, but this won't make your network run faster.
Also, if you have fast links that aren't showing much congestion,
the cause of trouble may be elsewhere - link-level failures are the
next most common cause of packet loss.
I'd suggest using the techniques mentioned above to narrow down as
much as possible where packets are being dropped, and try to understand
why this is happening, even if fixing it is beyond your control.
- Fluctuating Round Trip Times
Another fact of life. Pretty much caused by the same things that
cause packet loss. Again, not serious cause for alarm, but
don't expect optimum performance from TCP. Remember that
TCP generates an internal RTT estimate that affects protocol
behavior. If the actual RTT changes too much, TCP may
never be able to make a satisfactory estimate.
Both dropped packets and RTT fluctuations may
occur periodically - a batch of slow packets every
30 seconds, for instance. If you see this symptom, check
for routing updates or other periodic traffic with the
same period as the problem. Poor network performance can
often be traced to slow links being clogged with various
kinds of automated updates.
- Connectivity that comes and goes
Again, look for periods between problems that are multiples of
some common number - 10 and 15 seconds are good things to check.
If a router is sending error messages when connectivity disappears,
that router's the first place to start looking. However, just because
you can always reach hop 5, for instance, doesn't mean that
your problem isn't hop 3. Hop 3's router may be erroneously timing
out routing information for your target, but handling hop 5's
routing information just fine. Of course, check hop 5 first
if that's where your packets seem to check out but never leave.
- Ping works fine but TELNET/FTP/Mail/News/... doesn't
Good news - it's (probably) not a hardware problem. Use a packet
tracer of some sort to see what TTL values are being generated
by your hosts. If they're too low, you can see this kind of
behavior. It could also be a software or configuration problem - can other
machines connect to the offending host? Can it talk to itself?
On the other hand, it could be a hardware problem,
if one of your links is showing data-dependent behavior. The
telltale symptom is when FTP (for example) can transfer some files fine,
but others always have problems. Once you've found an offending
file, try breaking it into smaller and smaller pieces and see
which ones don't work. If the pieces become too small to
detect problems, duplicate them several times to get a larger file.
Once you've found a small pattern that you suspect is causing your
grief, see if you can load it into ping packets (BSD PING's `-p' switch)
and reproduce the trouble.
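Turning a suspect chunk of a file into something `-p' will accept is a one-liner. A sketch, assuming the limit of 16 pad bytes that the BSD ping manual documents (check your own version); the function name and the sample bytes are made up.

```python
def ping_pattern(chunk: bytes, max_pad_bytes: int = 16) -> str:
    """Return `chunk` as hex digits suitable for ping's -p switch.

    Only the first `max_pad_bytes` bytes are used; 16 is assumed here
    as the BSD ping limit.
    """
    if not chunk:
        raise ValueError("need at least one byte of pattern")
    return chunk[:max_pad_bytes].hex()

# Hypothetical suspect bytes cut from an offending file.
print(ping_pattern(b"\xff\x00\xff\x00"))  # ff00ff00
```

Ping repeats the pattern to fill the packet, so a short pattern is enough to put the same bits on the link over and over.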