DNS: A Classic UDP Protocol

by: burt rosenberg
at: university of miami
date: 5 april 2021

History and Purpose of DNS

The Domain Name System (DNS) is the directory for the Internet. It is a distributed database that is queried for important information, such as the IP address for a host name or the mail servers for a domain.

We have two main reasons for studying this protocol. First, the issues it introduces are not limited to the communication aspects. DNS is the forerunner of a class of distributed databases running with eventually consistent semantics. While databases, consensus between running processes, and so on, are topics in their own right, I hereby claim them the concern of a course like CSC424.

Second, as a programmer you will be called upon to solve practical problems and you will have to design solutions involving networked computers. DNS provides ideas as to how to solve some difficult problems you might encounter.

The Data Model

The DNS data model is a tree of Domain Names, with a collection of Resource Records attached to each domain name.

Each node in the tree has a label. The root node has an empty label, no other label is empty, and sibling labels must be distinct. A domain name is a sequence of these labels, created by dot-separating the labels along a path in the tree. If the path ends at the root, then it is a Fully Qualified Domain Name (FQDN). An FQDN will end in a dot (because the last label is the empty string). An example of an FQDN is cs.miami.edu. with its trailing dot. Without the trailing dot, cs.miami is a domain name relative to the dot-edu domain.

A domain is a subtree of this tree. The domain name of the node that is the root of this subtree is the name of the domain.

                                  FQDN
                                  
                    * ""  root    .""
                   / \
                  / 
                 * "edu"         "edu".""
                / \
                   \ 
                    * "miami"    "miami"."edu".""
                   / \
                  /
+------+  <====  * "cs"          "cs"."miami"."edu".""
|  RR  |
+------+
|  RR  |          The * are nodes, each with its FQDN a . separated list of node labels.
+------+          To each node is attached a collection of RR's such as shown for cs.miami.edu.
|      |          The RR's have types such as A, SOA, NS, etc, each with a value and a TTL
   ...
|      |
+------+          THE DOMAIN NAME SYSTEM DATA ARCHITECTURE
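
The label structure in the figure can be sketched in a few lines of Python. This is only an illustration of the data model, not part of any DNS library. The function names labels, is_fqdn and ancestors are made up for this sketch.

```python
def labels(name: str) -> list[str]:
    """Split a dot-separated domain name into its labels, leaf first."""
    return name.split(".")


def is_fqdn(name: str) -> bool:
    """A name is fully qualified when its last label is the empty root label."""
    return labels(name)[-1] == ""


def ancestors(fqdn: str) -> list[str]:
    """List the FQDN of every node on the path from the name up to the root."""
    # Drop the empty root label; what is left are the non-empty labels.
    parts = [p for p in labels(fqdn)[:-1] if p]
    path = [".".join(parts[i:]) + "." for i in range(len(parts))]
    return path + ["."]  # the root itself is written as a single dot


if __name__ == "__main__":
    print(labels("cs.miami.edu."))
    print(is_fqdn("cs.miami.edu."), is_fqdn("cs.miami"))
    print(ancestors("cs.miami.edu."))
```

Each entry returned by ancestors names a domain, in the sense above, that contains the node.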
            

A zone is a collection of connected nodes that has a common ancestor in the tree. That is, it is a domain with zero or more sub-domains removed. The DNS system is a client-server architecture running on UDP with the name servers listening on well-known port 53. A name server serves the RRs for all domain names in the zone over which it has authority, and is called an authoritative server for the zone. A name server can serve multiple zones.

The root node of the zone is called the start of authority (SOA) of the zone. A zone begins at its SOA node but does not necessarily contain the entire domain below that node. All the nodes in a zone have their Resource Records maintained by a single authority, and name servers are made responsible for answering all queries in their zone of authority. Name servers are themselves recorded in the DNS system as NS RRs (Name Server Resource Records). An NS RR for the zone is also kept in the parent zone and names a server authoritative for the zone. The IP address of that server is then found with a further lookup.

Example: A user wishes to browse www.cs.miami.edu, and so the browser needs the IP address at which an http server is listening on port 80. It needs the contents of the A record for the domain name www.cs.miami.edu. The process of satisfying the query is called name resolution and the software that does this is a name resolver.

  1. The name resolver is configured with the IP addresses of name servers for the root domain. The resolver picks one and queries it for the NS record for edu.
  2. The NS record returns an IP address of a name server authoritative for the .edu domain. (Actually it returns a name, and DNS is again used to resolve the name to an IP address.) The resolver queries at that IP for the NS record for miami.edu.
  3. And so on until the resolver has the IP address of a name server authoritative for cs.miami.edu. The domain www.cs.miami.edu is in the cs.miami.edu zone, hence the resolver can ask for the A RR (A rec) of domain www.cs.miami.edu.
  4. The IP address answer to this query is returned by the resolver as the result of the resolution.

raritan% dig NS .

;; ANSWER SECTION:
.			233784	IN	NS	k.root-servers.net.

raritan% dig A k.root-servers.net

;; AUTHORITY SECTION:
k.root-servers.net.	259200	IN	A	193.0.14.129

raritan% dig @193.0.14.129 NS edu.

;; AUTHORITY SECTION:
edu.			172800	IN	NS	f.edu-servers.net.

raritan% dig @f.edu-servers.net. NS miami.edu

;; AUTHORITY SECTION:
miami.edu.		172800	IN	NS	cgacena1.miami.edu.

;; ADDITIONAL SECTION:
cgacena1.miami.edu.	172800	IN	A	129.171.32.1

raritan% dig NS cs.miami.edu

;; ANSWER SECTION:
cs.miami.edu.		360	IN	NS	ns1.cs.miami.edu.

raritan% dig A ewell.cs.miami.edu

;; ANSWER SECTION:
ewell.cs.miami.edu.	289	IN	A	192.31.89.12


raritan% dig @ns1.cs.miami.edu A ewell.cs.miami.edu

;; ANSWER SECTION:
ewell.cs.miami.edu.	360	IN	A	192.31.89.12

;; AUTHORITY SECTION:
cs.miami.edu.		360	IN	NS	ns1.cs.miami.edu.
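
The walk from the root down to the answer, as in the steps and the dig transcript above, can be sketched without touching the network. The table SERVERS below is a hypothetical in-memory stand-in for the four name servers seen in the transcript. That cgacena1.miami.edu. refers queries for cs.miami.edu. on to ns1.cs.miami.edu. is an assumption of the sketch. Servers are keyed by name rather than by IP address because the transcript does not show every address.

```python
# Hypothetical stand-in for the name servers in the transcript.
# "delegations" maps a child zone to the name of a server authoritative for it.
# "a" holds the A records the server can answer itself.
SERVERS = {
    "k.root-servers.net.": {"delegations": {"edu.": "f.edu-servers.net."}, "a": {}},
    "f.edu-servers.net.": {"delegations": {"miami.edu.": "cgacena1.miami.edu."}, "a": {}},
    "cgacena1.miami.edu.": {"delegations": {"cs.miami.edu.": "ns1.cs.miami.edu."}, "a": {}},
    "ns1.cs.miami.edu.": {"delegations": {}, "a": {"ewell.cs.miami.edu.": "192.31.89.12"}},
}


def in_zone(name, zone):
    """True when the FQDN `name` is the zone root or lies below it."""
    return name == zone or name.endswith("." + zone)


def resolve_a(name, server="k.root-servers.net.", max_hops=8):
    """Follow referrals from the root until some server answers the A query."""
    trail = [server]
    for _ in range(max_hops):
        zone_data = SERVERS[server]
        if name in zone_data["a"]:
            return zone_data["a"][name], trail
        # No answer here: look for a delegated zone that contains the name.
        referrals = [ns for zone, ns in zone_data["delegations"].items()
                     if in_zone(name, zone)]
        if not referrals:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = referrals[0]
        trail.append(server)
    raise LookupError("too many referrals")


if __name__ == "__main__":
    address, trail = resolve_a("ewell.cs.miami.edu.")
    print(address, "via", " -> ".join(trail))
```

A real resolver also has to look up the address of each server named in a referral, which this sketch skips.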

Resource Records

Each node stores a collection of Resource Records (RR). Each RR has a type, a class, a TTL (time to live) and a value. Some of the possible types are given in the box.

SOA
Start Of Authority. Placed at the root node of the zone to contain important information about the zone.
NS
Name Server. Gives the name of a name server authoritative for the zone. It is placed both at the root node of the zone and in the parent zone. This glues the DNS tree together.
A
The IPv4 address for the domain name.
MX
A Mail Exchanger record.
CNAME
A Canonical Name record.

The resolver can return either authoritative or non-authoritative answers. An answer from a name server authoritative for the domain name is an authoritative answer. Any name server can return a non-authoritative answer out of its cache of previously queried names, as long as the answer has not outlived its TTL.

The last two dig's in the example in the above box show a non-authoritative answer with a shortened TTL, followed by an authoritative one. The A record received for ewell had TTL 289. Going back for an authoritative answer, the TTL was 360. The non-authoritative answer had been sitting in a cache for 71 seconds (360 - 289).
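
The arithmetic behind the shortened TTL can be sketched as a cache entry that counts down. The class CachedRecord and the clock readings are made up for this illustration. A real cache would read the time from the system clock.

```python
class CachedRecord:
    """A cached RR value together with the moment it was stored."""

    def __init__(self, value, ttl, stored_at):
        self.value = value
        self.ttl = ttl              # TTL as received from the authoritative server
        self.stored_at = stored_at  # clock reading, in seconds, when it was cached

    def remaining_ttl(self, now):
        """TTL the cache reports at time `now`; zero or less means expired."""
        return self.ttl - (now - self.stored_at)

    def is_fresh(self, now):
        """True while the cached answer may still be handed out."""
        return self.remaining_ttl(now) > 0


if __name__ == "__main__":
    # Stored with TTL 360 at an arbitrary time 1000, then queried 71 seconds later.
    record = CachedRecord("192.31.89.12", ttl=360, stored_at=1000)
    print(record.remaining_ttl(now=1071))
```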

The program dig gives command line access to the DNS system.

Eventually Consistent Databases

DNS is a distributed database. It is one of the first major distributed databases and it pioneered ideas whose significance became clear only recently. A problem in distributed databases is consistency: getting the same answer to a query each time it is asked. Since the database is distributed, different servers might have different notions of the state of the world, and might answer queries differently.

DNS adopted a strategy of caching results, so replies might contain stale data — the answer is wrong. DNS solved this problem by redefining right and wrong. It introduced the notion of eventual consistency, although this term was not invented until much later. See Eventually Consistent - Revisited by Amazon's CTO Werner Vogels.

Definition: An eventually consistent database might not be entirely consistent — answers to a particular query might disagree. But eventually (within the TTL) all answers will agree.

The consistency rule is that, of two different replies to a query, the more recent value is the more correct. The SOA has a serial number that increments with each change to the zone. The more recent value is the one from the database with the larger serial number. The various players in this game will eventually converge on this value. Authoritative name servers will chat among themselves and zone transfer when a peer has a newer database. Cached values will eventually expire and be requested again, and possibly by then the source has been updated.
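
A minimal sketch of the serial number rule follows, assuming plain integer comparison. Production name servers compare serials with wraparound arithmetic, which is left out here. The names ZoneCopy and refresh, the later serial 2020121401 and the changed address are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneCopy:
    """One server's copy of a zone: the SOA serial and the records."""
    serial: int
    records: dict = field(default_factory=dict)


def refresh(secondary: ZoneCopy, primary: ZoneCopy) -> bool:
    """Copy the primary's zone if its serial is larger; report whether a transfer happened."""
    if primary.serial > secondary.serial:
        secondary.serial = primary.serial
        secondary.records = dict(primary.records)  # stands in for a zone transfer
        return True
    return False


if __name__ == "__main__":
    primary = ZoneCopy(2020121401, {"ewell": "192.31.89.13"})    # hypothetical update
    secondary = ZoneCopy(2020121400, {"ewell": "192.31.89.12"})
    print(refresh(secondary, primary), secondary)
    print(refresh(secondary, primary))  # already converged, nothing to copy
```

Until the refresh runs, the two servers give different answers for ewell. After it, they agree, which is the eventual part of eventual consistency.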

Negative Replies

An RR contains a TTL specifying how long it is valid. A non-answer, that a particular RR does not exist, is also a form of reply. Its TTL is given in the SOA RR of the zone authoritative for that domain name.
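
A sketch of caching a negative reply follows. It assumes the usual rule that the negative TTL is the smaller of the SOA record's own TTL and its minimum field. The class NegativeCache, the clock readings and the name nohost.cs.miami.edu. are made up for the illustration.

```python
def negative_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Seconds a 'no such record' reply may be cached: the smaller value wins."""
    return min(soa_ttl, soa_minimum)


class NegativeCache:
    """Remembers (name, type) pairs known not to exist, until their TTL runs out."""

    def __init__(self):
        self._expires = {}  # (name, rr_type) -> clock reading when the entry expires

    def remember_missing(self, name, rr_type, ttl, now):
        self._expires[(name, rr_type)] = now + ttl

    def known_missing(self, name, rr_type, now):
        expiry = self._expires.get((name, rr_type))
        return expiry is not None and now < expiry


if __name__ == "__main__":
    ttl = negative_ttl(soa_ttl=360, soa_minimum=180)
    cache = NegativeCache()
    cache.remember_missing("nohost.cs.miami.edu.", "A", ttl, now=0)
    print(ttl, cache.known_missing("nohost.cs.miami.edu.", "A", now=100))
```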

Name Servers and Clients

The DNS system is implemented using a client-server architecture on UDP and TCP, both at well-known port number 53. UDP is used in preference. A server will make a best-effort attempt to reply, but if there is some sort of inadequacy the only thing the client can do is ask again, or ask another server. There is no point to the overhead of a connection-oriented protocol. TCP is used when the answer exceeds the size of an unfragmented UDP packet. This will certainly occur when master/slave name servers exchange the zone database.

Name servers should not be confused with the nodes in the DNS database tree. This is very easy to do: to mentally place the name server "at" the node for which it serves up answers. This is a bad mental image because it is not true and it leads to mistakes when trying to fix problems.

Typically the servers authoritative for a zone are in some large server farm, and the farm is serving many zones with nothing in common except that they are paying the same server farm. For instance, Amazon Route 53 is a DNS service in the cloud. When a consumer purchases a Domain Name, there is usually a name server solution bundled into the purchase, so many proud owners of domain names have no idea about their name server, or even that such a thing exists.
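
To make the UDP exchange concrete, here is a sketch that builds a query packet by hand. The wire format is a 12 byte header followed by a question made of length-prefixed labels, a type and a class. The function names and the fixed query id 0x1234 are choices made for this sketch. The send_query function is shown for completeness and needs a reachable name server to do anything.

```python
import socket
import struct

QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "MX": 15, "TXT": 16}


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            if len(raw) > 63:
                raise ValueError("a label is limited to 63 bytes")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"  # the empty root label


def build_query(name, rr_type="A", query_id=0x1234, recursion_desired=True):
    """Build a one-question DNS query packet."""
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: id, flags, then counts of questions, answers, authority, additional.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPE[rr_type], 1)  # class IN
    return header + question


def send_query(packet, server_ip, timeout=2.0):
    """Send the packet over UDP to port 53 and return the raw reply bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server_ip, 53))
        reply, _ = sock.recvfrom(512)
        return reply


if __name__ == "__main__":
    print(build_query("ewell.cs.miami.edu.").hex())
```

If the reply does not arrive before the timeout, the only recourse is the one described above: ask again, or ask another server.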

$TTL    360
$ORIGIN miami.edu.
cs      360     IN      SOA     ns1.cs.miami.edu. burt.cs.miami.edu.  (
                                2020121400      ; serial
                                720             ; refresh
                                180             ; retry
                                1400000         ; expire
                                180             ; minimum
                                )
                IN      NS      ns1.cs.miami.edu.
                IN      NS      ns-350.awsdns-43.com.
                IN      A       192.31.89.16
                IN      MX      10 ASPMX.L.GOOGLE.COM.
                IN      MX      20 ALT1.ASPMX.L.GOOGLE.COM.
                IN      TXT     "v=spf1 ip4:192.31.89.12 ip4:129.171.34.11 include:_spf.google.com ~all"
$ORIGIN cs.miami.edu.
;
; nodes in sub-domains
;
www             IN      A       192.31.89.16
web             IN      A       192.31.89.16
lee             IN      A       192.31.89.5
ewell           IN      A       192.31.89.12
svn             IN      CNAME   meade.cs.miami.edu.
ssh             IN      CNAME   lee.cs.miami.edu.
;
gmail.zinc      IN      CNAME   ghs.google.com.
gcal.zinc       IN      CNAME   ghs.google.com.

Shown above is an edited snippet of a text-file based representation of RR's for an implementation of DNS called BIND, for Berkeley Internet Name Domain, a.k.a. named, pronounced name-dee.

This file is available to the named server running on port 53. The default TTL is set to 360 seconds (six minutes). Attached to the domain cs.miami.edu are RR's for the SOA, the NS's for this domain, an A rec, the mail servers for this domain, and an encoded message to mail servers implementing the anti-spam SPF protocol.

The second ORIGIN directive then makes the labels that follow relative to the cs.miami.edu domain. These include A rec's for the web server, under the common name www.cs.miami.edu, and for a few machines. Records for a further subdomain, such as gmail.zinc.cs.miami.edu, are also specified.

Completely apart from the DNS specification, BIND has a master/slave architecture. Name servers should be replicated for high availability. If no authoritative name servers are available, and the TTL's of its RR's expire, the domain is for all intents and purposes off the net.
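
How a name in the zone file is expanded against the current ORIGIN can be sketched as follows. The function name qualify is made up for this sketch, and it covers only the trailing-dot rule, not the rest of the zone file syntax.

```python
def qualify(owner: str, origin: str) -> str:
    """Expand a zone-file name against the current $ORIGIN."""
    if owner.endswith("."):
        return owner                  # already fully qualified, left alone
    if origin == ".":
        return owner + "."            # relative to the root
    return owner + "." + origin       # relative names get the origin appended


if __name__ == "__main__":
    print(qualify("cs", "miami.edu."))
    print(qualify("www", "cs.miami.edu."))
    print(qualify("gmail.zinc", "cs.miami.edu."))
    print(qualify("ghs.google.com.", "cs.miami.edu."))
    # Forgetting the trailing dot silently produces a name inside the zone.
    print(qualify("ghs.google.com", "cs.miami.edu."))
```

The last call shows why the trailing dot matters on the right-hand side of records such as CNAME and MX.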

Availability is provided for by replication. The consistency of the databases is achieved through serial numbers and a master server that copies out its RR database to slave servers on a periodic basis. The refresh, retry and expire fields in the SOA RR are the parameters for this copy-out. A numerically increasing serial number identifies the more recent RR database.

MX, CNAME and TXT RR's

The Mail Exchanger (MX) RR type announces a host name that is accepting mail (by the usual protocol and at the usual well known port) for the domain. There are multiple MX's to provide redundancy, so that mail relays are not hindered by undeliverable mail due to server outages. An MX record includes a priority. Attempts to deliver email will first be made at the MX with the lowest priority number, and then work towards higher numbers.

The Canonical Name (CNAME) record forwards one domain name to another. In the example it is used to provide a service name, svn.cs.miami.edu, the subversion server, with a hostname, meade.cs.miami.edu.

There are tradeoffs with a more direct approach, where svn is an A rec and the IP address provided is that of the host meade.cs.miami.edu. The CNAME requires a client seeking svn.cs.miami.edu to return to the resolution process with meade.cs.miami.edu for an A rec. There are further issues with SSL, which expects the server to present a certificate for a certain domain name. Issues with CNAME are discussed in RFC 2219, and other RFC's.

The TXT rec will deliver an arbitrary string associated with the domain name. TXT recs have become a way of introducing new types without having to lobby and reissue the core protocol documents.
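
The priority rule can be sketched as a sort followed by a loop. The callback try_host is hypothetical and stands in for an actual attempt to hand the mail to a host.

```python
def delivery_order(mx_records):
    """Return mail hosts in the order delivery should be attempted."""
    return [host for _, host in sorted(mx_records, key=lambda mx: mx[0])]


def deliver(mx_records, try_host):
    """Try each host in priority order; return the first that accepts, else None."""
    for host in delivery_order(mx_records):
        if try_host(host):
            return host
    return None


if __name__ == "__main__":
    # The two MX records from the zone file, as (priority, host) pairs.
    mx = [(20, "ALT1.ASPMX.L.GOOGLE.COM."), (10, "ASPMX.L.GOOGLE.COM.")]
    print(delivery_order(mx))
    # Pretend the preferred host is down: delivery falls through to the next one.
    print(deliver(mx, lambda host: host.startswith("ALT1")))
```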
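
The extra trip through the resolution process that a CNAME causes can be sketched with a lookup table. The table RECORDS holds three entries taken from the zone file snippet. The function resolve_address and its hop limit are made up for this sketch.

```python
# A few records from the zone file, keyed by fully qualified name.
RECORDS = {
    "lee.cs.miami.edu.": ("A", "192.31.89.5"),
    "ssh.cs.miami.edu.": ("CNAME", "lee.cs.miami.edu."),
    "svn.cs.miami.edu.": ("CNAME", "meade.cs.miami.edu."),
}


def resolve_address(name, records=RECORDS, max_hops=8):
    """Follow CNAMEs until an A record is found; return the address and the chain."""
    chain = [name]
    for _ in range(max_hops):
        entry = records.get(chain[-1])
        if entry is None:
            raise LookupError(f"no record for {chain[-1]}")
        rr_type, value = entry
        if rr_type == "A":
            return value, chain
        chain.append(value)  # CNAME: start over with the canonical name
    raise LookupError("CNAME chain too long, possibly a loop")


if __name__ == "__main__":
    print(resolve_address("ssh.cs.miami.edu."))
```

Looking up svn.cs.miami.edu. in this table fails, because the edited snippet has no A rec for meade. With an A rec for the service name there would be no second step at all.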

Email, carried by the TCP protocol officially known as SMTP, the Simple Mail Transfer Protocol, is easily open to attacks. Among the more concerning, phishing is an attempt by the attacker to place an email in your email box that appears to be from a trusted sender. This is one step among several by which the attacker entices you into an action that will compromise your security.

The problems of email spam and phishing are significant, concerning, and not easily solved. The zone file above includes a TXT record for the SPF protocol. SPF, the Sender Policy Framework, combats email spam and spoofing.

The TXT RR above uses an spf1 token to identify the record as an announcement in the SPF protocol. The announcement lists the IP addresses of servers most likely to be sending email matching the domain name of the sender's email address. It cannot be fully enforced that mail from, say, burt@cs.miami.edu, come from a machine with an IP address announced in this TXT rec, because the original SMTP protocol was very permissive about how email was routed. Email servers might not reject out-of-policy email, and instead warn the reader of the infraction.
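
A much simplified sketch of how a receiving mail server might consult this record follows. It looks only at the ip4 terms. It ignores the include term and the closing ~all, both of which a real SPF evaluation must handle. The function names are made up for this sketch.

```python
import ipaddress

SPF_TXT = "v=spf1 ip4:192.31.89.12 ip4:129.171.34.11 include:_spf.google.com ~all"


def ip4_networks(spf_txt):
    """Collect the networks named by the ip4 terms of an SPF record."""
    terms = spf_txt.split()
    if not terms or terms[0] != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    return [ipaddress.ip_network(term[len("ip4:"):], strict=False)
            for term in terms[1:] if term.startswith("ip4:")]


def sender_listed(sender_ip, spf_txt=SPF_TXT):
    """True when the sending machine's address matches one of the ip4 terms."""
    address = ipaddress.ip_address(sender_ip)
    return any(address in network for network in ip4_networks(spf_txt))


if __name__ == "__main__":
    print(sender_listed("192.31.89.12"))   # listed in the record
    print(sender_listed("192.31.89.16"))   # the web server, not listed
```

A False result here does not mean the mail is rejected. As noted above, the receiving server may only flag it.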

Security

None. This is a problem.