The Address Resolution Protocol
When a device needs to send an IP packet to another device on the local network, the IP software will first check to see if it knows the hardware address associated with the destination IP address. If so, then the sender just transmits the data to the destination system, using the protocols and addressing appropriate for whatever network medium is in use by the two devices. However, if the destination system's hardware address is not known, then the IP software has to locate it before any data can be sent. At this point, IP will call on ARP to locate the hardware address of the destination system.
ARP achieves this by issuing a low-level broadcast onto the network, requesting that the system that is using the specified IP address respond with its hardware address. If the destination system is powered up and on the network, it will see this broadcast (as will all of the other devices on the local network), and it will send an ARP response to the original system. Note that the response is not broadcast back over the network, but is instead sent directly to the requesting system.
ARP packets are carried directly in data-link frames, just as IP packets are, rather than inside IP packets. As such, ARP packets are completely separate from IP packets, and even have a different protocol ID (the type field of the frame) of "0806" in hexadecimal, instead of the "0800" used with IP.
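As a rough illustration of how a receiver can tell the two apart, the sketch below reads the two-byte type field of a frame and classifies it. It assumes an untagged Ethernet II frame (no VLAN tag), and the function name classify_frame is purely illustrative.

```python
import struct

ETHERTYPE_IP = 0x0800
ETHERTYPE_ARP = 0x0806


def classify_frame(frame: bytes) -> str:
    """Return 'ip', 'arp', or 'other' based on the Ethernet type field."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination address, 6-11: source address, 12-13: type.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_IP:
        return "ip"
    if ethertype == ETHERTYPE_ARP:
        return "arp"
    return "other"
```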
ARP packets contain several fields, although only five of them are used to provide the bulk of ARP's functionality: the hardware address of the source, the IP address of the source, the hardware address of the destination, the IP address of the destination, and a "message-type" field that indicates whether the current ARP packet is a request or a response to a request.
When a device issues an ARP request, it fills in three of the four address-related fields, providing its own hardware and IP address, as well as the IP address of the target (the target's hardware address is unknown, and so that field is filled with zeroes). In addition, it will set the message-type field to indicate that the current packet is an ARP request, and then broadcast the request onto the local network for all devices to see.
All of the local devices should monitor the network for ARP broadcasts, and whenever they see a request for themselves (as indicated in the destination IP address field of the ARP request), they should generate a response packet and send it back to the requesting system. The response packet will consist of the local device's IP and hardware addresses (placed into the sender fields), and the IP and hardware address of the original sender (placed in the destination fields of the response packet). The response will also be marked as such, with the message-type field indicating that the current packet is an ARP response. The new ARP packet is then unicast directly to the original requester, where it is received and processed.
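The request and response exchange described above can be sketched in a few lines. The example assumes Ethernet hardware addresses (6 bytes) and IPv4 addresses (4 bytes), which gives the usual 28-byte ARP packet; the function names are illustrative, and the sketch only builds the packets rather than transmitting them.

```python
import socket
import struct

# hardware type, protocol type, hardware length, protocol length,
# message type, sender hardware, sender IP, target hardware, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
OP_REQUEST, OP_REPLY = 1, 2


def build_arp(op, sender_mac, sender_ip, target_mac, target_ip):
    """Pack one ARP packet for Ethernet (type 1) and IPv4 (0x0800)."""
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, op,
                       sender_mac, socket.inet_aton(sender_ip),
                       target_mac, socket.inet_aton(target_ip))


def build_request(my_mac, my_ip, wanted_ip):
    # The target hardware address is unknown, so it is filled with zeroes.
    return build_arp(OP_REQUEST, my_mac, my_ip, bytes(6), wanted_ip)


def build_reply(request, my_mac, my_ip):
    """Return a response if the request asks for my_ip, otherwise None."""
    _, _, _, _, op, sha, spa, _, tpa = struct.unpack(ARP_FORMAT, request)
    if op != OP_REQUEST or tpa != socket.inet_aton(my_ip):
        return None
    # The original sender's addresses go into the destination fields.
    return build_arp(OP_REPLY, my_mac, my_ip, sha, socket.inet_ntoa(spa))
```

A device would call build_reply for every ARP broadcast it sees, and would transmit the result (when it is not None) directly to the requester.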
The ARP Cache
When the requesting system gets an ARP response, it will store the hardware and IP address pair of the requested device into a local cache. The next time that system needs to send data, it will check the local cache, and if an entry is found it will go ahead and use it, eliminating the need to broadcast another request.
Likewise, the system that responded to the ARP broadcast will also store the hardware and IP addresses of the system that sent the original broadcast. Since the request already carries the sender's addresses, the responder does not have to issue an ARP broadcast of its own, either to return the ARP response or to answer the data that is likely to follow.
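Most systems have a very limited ARP cache, with only enough room for a few entries. These entries will be overwritten as needed. This can be a problem on busy networks: when a system exchanges data with more devices than its cache can hold, entries that are still in use get overwritten, and another ARP broadcast is needed each time one of those devices is contacted again.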
Note that many large, multi-user systems and network routers often have very large ARP caches in order to prevent these types of problems from occurring in the first place. For example, high-end Cisco routers have an ARP cache that is large enough to hold several hundred entries, since the router is likely to exchange data with each PC on the local network quite frequently. Having a large cache on these types of devices is essential to smooth operation, since otherwise the servers and routers could only communicate with a few PCs simultaneously.
Systems should flush entries from their ARP caches after they have been unused for a while. This allows well-known IP addresses to be moved to a different machine - or for a well-known machine to be given a new IP address - without communication problems coming about due to stale (and invalid) address mappings. ARP cache timeout values that are too high will cause problems whenever an IP address is moved to a different machine, since the other hosts that have an older entry in their caches will still try to send data to the old (and now invalid) hardware address.
Conversely, an ARP time-out that is too short will also result in problems, especially on busy networks with lots of devices. If network clients are constantly flushing their ARP caches due to short time-out values, then many broadcasts will be required. This will have a direct, negative impact on performance, since the IP software will not be able to send any data until an ARP broadcast has been sent and responded to.
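The trade-offs above are easier to see with a small model of a cache. The following is a minimal sketch, not a description of any particular operating system: the class name, the default size of eight entries, and the 60-second idle timeout are all assumptions chosen for illustration. Entries expire after going unused for the timeout period, and the least recently used entry is overwritten when the cache is full.

```python
import time
from collections import OrderedDict


class ArpCache:
    """Toy ARP cache with an idle timeout and a fixed number of slots."""

    def __init__(self, max_entries=8, timeout=60.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.timeout = timeout
        self.clock = clock
        self._entries = OrderedDict()  # ip -> (hardware address, last used)

    def learn(self, ip, mac):
        """Store or refresh a mapping, evicting the oldest one if full."""
        self._entries.pop(ip, None)
        if len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # least recently used entry
        self._entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the hardware address, or None if an ARP request is needed."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        if self.clock() - last_used > self.timeout:
            del self._entries[ip]  # stale entry: flush it
            return None
        # Using an entry restarts its idle timer.
        self._entries[ip] = (mac, self.clock())
        self._entries.move_to_end(ip)
        return mac
```

A lookup that returns None is the point at which the IP software would have to broadcast an ARP request before sending any data.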
Proxy ARP
Sometimes it is useful to have a device respond to ARP broadcasts on behalf of another device. This is particularly useful on networks with dial-in servers that connect remote users to the local network. In such a scenario, a remote user might have an IP address that appears to be on the local network, although the user's system would not be physically present (it would be at a remote location, connected through the dial-in server).
Systems that were trying to communicate with this node would believe that the device was local, and would use ARP to try to find the associated hardware address. However, since the system is remote, it would not see (nor respond to) the ARP lookups. Normally, this type of problem is handled through Proxy ARP, which allows a dial-in server to respond to ARP broadcasts on behalf of any remote devices that it services.
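The decision a proxying device makes can be reduced to a single check. In this sketch, remote_clients is a hypothetical set holding the IP addresses of the dial-in users currently being serviced; the function name and parameters are illustrative only.

```python
def proxy_arp_answer(target_ip, own_ip, own_mac, remote_clients):
    """Return the hardware address to place in an ARP response,
    or None if this device should stay silent."""
    # Answer for ourselves, and also for any remote client we service.
    # In both cases the response carries the server's own hardware
    # address, so that local traffic for the client reaches the server.
    if target_ip == own_ip or target_ip in remote_clients:
        return own_mac
    return None
```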
Inverse ARP
Inverse ARP works in exactly the opposite way from regular ARP. Rather than a device needing to find the hardware address associated with a known IP address, Inverse ARP is used to find the IP address for a known hardware address. This protocol is used only with Frame Relay and ATM networks. It isn't needed on LAN topologies where all of the devices can find each other easily.
Reverse ARP
Reverse ARP is used to allow a diskless workstation to request its own IP address, simply by providing its hardware address in a RARP request broadcast. A Reverse ARP server will see this request, assemble a RARP response that contains an IP address for the requester to use, and then send it back to the requesting device. The workstation will then use this IP address for its networking activity. RARP servers aren't very common anymore, although Linux and MacOS still support the protocol, as do many X Window System terminals.
A special host (called a RARP server) watches for RARP broadcasts (RARP packets have their own unique ethertype of "8035"). When one is seen, the server attempts to find an entry for the requesting device's hardware address in a local table of IP-to-hardware address mappings. If a match is found, it returns a RARP response to the requesting device, providing it with the IP address needed to continue the boot process.
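A RARP server's lookup step might look something like the sketch below. It assumes Ethernet and IPv4 addresses, uses the RARP message-type values of 3 (request) and 4 (response), and takes the mapping table as a plain dictionary keyed by hardware address; the function name and the table layout are illustrative.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"
RARP_REQUEST, RARP_REPLY = 3, 4


def rarp_reply(request, table, server_mac, server_ip):
    """Return a RARP response for a known hardware address, else None."""
    _, _, _, _, op, _, _, target_mac, _ = struct.unpack(ARP_FORMAT, request)
    if op != RARP_REQUEST:
        return None
    assigned_ip = table.get(target_mac)
    if assigned_ip is None:
        return None  # no entry for this device, so stay silent
    # The assigned address is returned in the target IP address field.
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, RARP_REPLY,
                       server_mac, socket.inet_aton(server_ip),
                       target_mac, socket.inet_aton(assigned_ip))
```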
DHCP ARP
DHCP ARP is used by devices that obtain an IP address using an address-allocation protocol such as DHCP (thus the moniker). The purpose of DHCP ARP is to allow the device to probe the network for any other device that may be using the assigned IP address already, prior to the device actually trying to use the address. This helps prevent problems with duplicate or overlapping address assignments, as the requester can just reject the assignment if another device responds to the DHCP ARP query.
With DHCP ARP, the requesting device issues a normal ARP request, except that instead of putting its own IP address in the Source Protocol Address field, it puts in "0.0.0.0". The rest of the packet looks like a normal ARP request, with the local hardware address in the Source Hardware Address field, the questionable IP address in the Destination Protocol Address field, and the Destination Hardware Address field containing zeroes.
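The probe described above differs from an ordinary request only in its Source Protocol Address field. The sketch below builds such a probe and checks whether a response indicates that the address is taken; it assumes Ethernet and IPv4 addresses, and the function names are illustrative.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"
OP_REQUEST, OP_REPLY = 1, 2


def build_probe(my_mac: bytes, candidate_ip: str) -> bytes:
    """Build an ARP request that asks who is using candidate_ip."""
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, OP_REQUEST,
                       my_mac, socket.inet_aton("0.0.0.0"),  # no IP yet
                       bytes(6), socket.inet_aton(candidate_ip))


def address_in_use(response: bytes, candidate_ip: str) -> bool:
    """True if the response comes from a device already using the address."""
    fields = struct.unpack(ARP_FORMAT, response)
    op, sender_ip = fields[4], fields[6]
    return op == OP_REPLY and sender_ip == socket.inet_aton(candidate_ip)
```

If address_in_use returns True for any response, the requester can reject the assignment and ask for a different address.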
Gratuitous ARP
Gratuitous ARP is used when a device issues an ARP broadcast for the sole purpose of keeping other devices informed of its presence on the network. There is no information gained by the requesting device in this scenario. However, other devices that already know about this device will update their cache entries, keeping them from being expired too quickly.
When a Gratuitous ARP request is broadcast, the sender puts its hardware and IP address information into the appropriate sender fields, and also places its IP address in the destination IP address field. It does not put its hardware address in the destination hardware field, however. Other devices on the network will see this broadcast, and if they have the sender's information in their cache already, they'll either restart the cache entry's count-down timers or modify the entry to use the new hardware address (in case the network adapter on the sending host has been changed).
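Both halves of this exchange can be sketched briefly. The example assumes Ethernet and IPv4 addresses, and models the receiver's cache as a plain dictionary that maps an IP address to a (hardware address, last-seen time) pair; that layout and the function names are illustrative. In keeping with the description above, the receiver only touches entries it already has.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"
OP_REQUEST = 1


def build_gratuitous(my_mac: bytes, my_ip: str) -> bytes:
    """Build a request whose sender and destination IP are both my_ip."""
    ip = socket.inet_aton(my_ip)
    # The destination hardware address field is left as zeroes.
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, OP_REQUEST,
                       my_mac, ip, bytes(6), ip)


def handle_gratuitous(packet: bytes, cache: dict, now: float) -> bool:
    """Refresh an existing cache entry; return True if one was updated."""
    fields = struct.unpack(ARP_FORMAT, packet)
    sender_mac, sender_ip, target_ip = fields[5], fields[6], fields[8]
    if sender_ip != target_ip:
        return False  # not a gratuitous ARP
    ip = socket.inet_ntoa(sender_ip)
    if ip not in cache:
        return False  # unknown device: nothing to update
    # Restart the timer and pick up a changed hardware address, if any.
    cache[ip] = (sender_mac, now)
    return True
```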
UnARP
Similarly, UnARP provides a mechanism for de-registering ARP entries. When a device wishes to leave the network, it can issue an UnARP broadcast, causing the other devices to clear their ARP caches of entries for this device. This variation is not standardized, and isn't widely used.
UnARP dictates that a special ARP packet should be broadcast whenever a node disconnects from the network, explicitly telling other devices that the node is going away, and that the cache entry for that host should be flushed. This would allow another device (such as another DHCP client) to reuse the IP address immediately, without having to worry about stale caches causing any problems.
This material is excerpted from Internet Core Protocols: the Definitive Guide courtesy of O'Reilly & Associates.