Introduction to the Domain Name System
DNS short for Domain name System is a protocol used primarily for converting hostnames like www.example.com into IP addresses like 192.168.1.10, and vice-versa. At the IP level, all hosts on the Internet refer to each other by IP addresses, not by the hostnames that users enter into programs like web browsers and telnet clients. This means that a system needs a way of finding out the IP address associated with a hostname before they can communicate. Although there are several ways this can be done (such as reading the /etc/hosts file or querying an NIS Server), DNS is the most common.
As well as looking up IP addresses for hostnames, the DNS protocol can also be used to find the hostname associated with an IP address. This is most often used for finding the hostname of a client that is connecting to a server, such as a webserver or SSH daemon. DNS can also be used to look up the address of a mail server for a domain, and additional information about a host such as its location, operating system or owner. However, by far its most common application is converting hostnames to IP addresses.
Most systems use the DNS protocol to send requests to a server, which does most of the work of resolving a hostname into an IP address. A normal system is only a DNS client, and never has to answer requests from servers. Almost all companies, organizations and ISPs will already have one or more DNS servers on their network that all the other hosts can use. If your company already has a DNS server, then there is no need to read this page - instead, see the Network Configuration page for information on how to set up your Linux system as a DNS client.
Zones
The domain name system is divided into zones (also called domains), each of which has a name like example.com or foo.com.au. Zones are arranged in a hierarchy, which means that the foo.com.au zone is part of the com.au zone, which in turn is part of the au domain. At the very top of the hierarchy is the . or root zone, upon which the entire DNS system depends.
For each zone, there is at least one DNS server that is primarily responsible for providing information about it. There may also be several secondary or slave servers that have copies of information from the primary, and act as backups in case the master server for the zone is unavailable. A single DNS server may host multiple zones, or sometimes may not host any at all. A server is typically responsible for providing information about the zones that it hosts, and for looking up information in other zones when requested to by DNS clients.
For a zone hosted by a server to be available to DNS clients that do not query that server directly, it must be registered in the parent zone. The most common parent domains like .com, .net and .com.au are managed by companies that charge for zones registered under them. This means that you cannot simply set up a DNS server that hosts a domain like example.com and expect it to be visible to the rest of the Internet - you must also pay for it to be registered with one of the companies that adds sub-domains to the .com domain.
Each zone contains multiple DNS records, each of which has a name, type and values. The most common type of record is the address or A record, which associates a hostname with an IP address. Other types include the NS or name server record which specifies the DNS server for the zone or a sub-domain, and the MX or mail server record type which defines a host that should receive mail for the zone.
Master and slaves
Every zone should have at least one secondary server in case the primary is down or un-contactable for some reason. Secondaries can also share the load on the primary server, because other servers looking up records in the domain will randomly choose a server to query instead of always asking the primary first. In fact, there is no way for other systems to know which server is the master and which are the slaves for a particular zone.
Slave servers can request a copy of all the records in a zone at once by doing a zone transfer. This is done a secondary DNS server when a zone is first added to it, and periodically when it detects that the zone has changed or the records in it have expired. A master server can also be configured to notify slaves when a zone changes so that they can perform a zone transfer immediately, ensuring that they are always up to date.
Every zone has a serial number, which is simply a counter that must be incremented each time any record in the zone is changed. The serial is used by slave servers to determine if a zone has changed, and thus if a transfer is needed. Most of the time, it does not matter what the serial number is as long as it gets incremented. However, some domain authorities require it to be in a certain date-based format, such as YYYYMMDDnn.
Normally a single server hosts either entirely master zones, or entirely slaves. However, this does not have to be the case- a DNS server can be both a master for some zones and a slave for others. There is no upper limit on the number of servers a zone can have, although few have more than three. The important .com and root domains have 13 servers, as they are critical to the functioning of the Internet and frequently accessed. Generally, the more slaves a domain has the better, as long as they can all be kept synchronized.
Lookup
When a server receives a request from a client to lookup a record, it first checks to see if the record is in one of the zones that it hosts. If so, it can supply the answer to the client immediately. However, if the record is not in a hosted zone then the server must query other servers to find it. It starts by querying one of the servers responsible for the root zone, which will reply with the address of another DNS server. It then queries that other server, which will either provide an answer, or the address of yet another DNS server to ask. This process continues until a server that is responsible for the domain is found and an answer retrieved from it. If the record that the client asked for does not actually exist, then one of the servers in the query process will say so, and the search will be terminated.
For example, imagine if a DNS client asked a server for the IP address of www.webmin.com. The steps that would be followed by the server to find the address are:
- Ask one of the root servers, such as a.root-servers.net (198.41.0.4) for the address of www.webmin.com. The server would reply with a list of servers for the .com domain, one of which is a.gtld-servers.net (192.5.6.30).
- Ask the .com server for the address of www.webmin.com. The reply would be a list of servers, one of which is au.webmin.com (203.89.239.235), the master server for the webmin.com domain.
- As the server for webmin.com for the address of www.webmin.com. The reply would be 216.136.171.204, which is the correct IP address.
- The resulting IP address is returned to the client, along with a TTL (time to live) so that the client knows how long it can cache the address for.
As you can see, a DNS server can find the address of any host on the Internet by following the simple process used in the steps above. The only addresses that it cannot discover are those of the root servers; instead, they are read from a file when the server program starts. Because the addresses of the root servers very rarely change, it is safe for a DNS server to store them in a fixed file.
If the steps above were followed exactly for every DNS request, then the root servers would have to be queried every time a client anywhere in the world wanted to lookup an IP address. Even though there are 13 of them, there is no way that they could deal with this massive amount of network traffic. Fortunately, DNS servers do not really query the root servers for every request - instead, they cache results so that once the IP address of a server for the .com domain is known, there is no need to ask for root servers for it again. Because every response from a server includes a TTL, other servers know how long it can be safely cached for.
The relationships between IP addresses and their hostnames are stored in the DNS in a different way to the relationship between hostnames and addresses. This is done so that it is possible to lookup a hostname from an IP using a similar process to the steps above. However, this means that there may be a mismatch between the relationship between an IP address and hostname, and between the hostname and IP address. For example, www.webmin.com resolves to 216.136.171.204, but 216.136.171.204 resolves to usw-pr-vhost.sourceforge.net! This can be confusing, but is an inevitable result of the way that queries for IP addresses work.
Reverse lookup
When a client wants to find the hostname for an IP address like 216.136.171.204, it converts this address to the record 204.171.136.216.in-addr.arpa. As you can see, this is just the IP address reversed with in-addr.arpa appended to the end. The special in-addr.arpa zone is hosted by the root DNS servers, and its sub-domains are delegated to other DNS servers in exactly the same way that forward zones are. Typically each of the final class C zones (like 171.136.216.in-addr.arpa) will be hosted by the DNS server for the company or ISP that owns the matching class C network, so that it can create records that map IP addresses in that network to hostnames. All of these records are of the special PTR or reverse address type.
The biggest problem with this method of reverse zone hosting is that there is no easy way for anything smaller than a class C network (which contains 256 addresses) to be hosted by a single DNS server. So if a server hosts the zone example.com which contains just a single record, www.example.com with IP address 1.2.3.4, the same server cannot also control the reverse mapping for the IP address 1.2.3.4. Instead, this will be under the control of the ISP or hosting company whose network the webserver for www.example.com is on. Only organizations big enough to own an entire class C network can host the reverse zone for that network on their own DNS server.
Private ranges
Many organizations have an internal network that uses private IP addresses such as those starting with 192.168. A network like this might not be connected to the Internet at all, or connected only through a firewall doing NAT. Some people even have networks like this at home, with several machines connected to a small LAN. Only one of these machines (the gateway) might have a single real Internet IP address assigned by an ISP.
On a private network like this, it can also make sense to run a DNS server to assign hostnames to the systems on the internal LAN. It is quite possible to host a zone called something like home or internal that contains records for internal systems, as well as a reverse zone for the 192.168 network so that IP addresses can be looked up as well. The server can also be set up to resolve real Internet hostnames by querying the root servers, just as any normal Internet-connected DNS server would. However, it will never receive queries from outside the LAN for records in the home network, because as far as the rest of the Internet is concerned that zone does not exist.