domain name system - Why does DNS work the way it does?




This is a Canonical Question about DNS (Domain Name System).





If my understanding of the DNS system is correct, the .com registry holds a table that maps domains (www.example.com) to DNS servers.




  1. What is the advantage? Why not map directly to an IP address?


  2. If the only record that needs to change when I configure a DNS server to point to a different IP address is located at that DNS server, why isn't the process instant?


  3. If the only reason for the delay is DNS caches, is it possible to bypass them, so I can see what is happening in real time?



Answer



Actually, it's more complicated than that - rather than one "central registry (that) holds a table that maps domains (www.example.com) to DNS servers", there are several layers of hierarchy.




There's a central registry (the Root Servers) which contains only a small set of entries: the NS (nameserver) records for all the top-level domains - .com, .net, .org, .uk, .us, .au, and so on.



Those servers just contain NS records for the next level down. To pick one example, the nameservers for the .uk domain just have entries for .co.uk, .ac.uk, and the other second-level zones in use in the UK.



Those servers just contain NS records for the next level down - to continue the example, they tell you where to find the NS records for google.co.uk. It's on those servers that you'll finally find a mapping between a hostname like www.google.co.uk and an IP address.
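You can watch this chain of delegations yourself with dig's +trace option, which performs the whole walk from the root downwards instead of relying on your local resolver. Here's a heavily abbreviated sketch - the exact servers, TTLs, and addresses you see will differ:

jamezpolley@host:~$ dig +trace www.google.co.uk A

; (one record shown per delegation step; real output lists many more)
.                  518400  IN  NS  a.root-servers.net.  ; the root servers
uk.                172800  IN  NS  nsa.nic.uk.          ; referral from the root to .uk
google.co.uk.      172800  IN  NS  ns1.google.com.      ; referral from .uk to google.co.uk
www.google.co.uk.  300     IN  A   216.58.212.227       ; the final answer (illustrative address)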



As an extra wrinkle, each layer will also serve up 'glue' records. Each NS record maps a domain to a hostname - for instance, the NS records for .uk list nsa.nic.uk as one of the servers. To get to the next level, we need to find out what the NS records for nic.uk are, and they turn out to include nsa.nic.uk as well. So now we need to know the IP of nsa.nic.uk, but to find that out we need to make a query to nsa.nic.uk, but we can't make that query until we know the IP for nsa.nic.uk...



To resolve this quandary, the servers for .uk add the A record for nsa.nic.uk into the ADDITIONAL SECTION of the response (response below trimmed for brevity):




jamezpolley@li101-70:~$ dig nic.uk ns

; <<>> DiG 9.7.0-P1 <<>> nic.uk ns
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21768
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 0, ADDITIONAL: 14

;; QUESTION SECTION:

;nic.uk. IN NS

;; ANSWER SECTION:
nic.uk. 172800 IN NS nsb.nic.uk.
nic.uk. 172800 IN NS nsa.nic.uk.

;; ADDITIONAL SECTION:
nsa.nic.uk. 172800 IN A 156.154.100.3
nsb.nic.uk. 172800 IN A 156.154.101.3



Without these extra glue records, we'd never be able to find the nameservers for nic.uk, and so we'd never be able to look up any domains hosted there.
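You can see a referral and its glue for yourself by asking a root server about .uk with recursion switched off - the root isn't authoritative for .uk, so instead of an answer it hands back the NS records (in the AUTHORITY section) along with glue A records (in the ADDITIONAL section). A trimmed sketch; the exact counts and records may have changed since this was written:

jamezpolley@host:~$ dig @a.root-servers.net uk NS +norecurse

;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 8, ADDITIONAL: 9

;; AUTHORITY SECTION:
uk.          172800  IN  NS  nsa.nic.uk.
uk.          172800  IN  NS  nsb.nic.uk.

;; ADDITIONAL SECTION:
nsa.nic.uk.  172800  IN  A   156.154.100.3
nsb.nic.uk.  172800  IN  A   156.154.101.3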



To get back to your questions...




a) What is the advantage? Why not map directly to an IP address?




For one thing, it allows edits to each individual zone to be distributed. If you want to update the entry for www.mydomain.co.uk, you just need to edit the information on mydomain.co.uk's nameserver. There's no need to notify the central .co.uk servers, or the .uk servers, or the root nameservers. If there were a single central registry that had to be notified about every change to every DNS entry anywhere in the hierarchy, it would be absolutely swamped with traffic.
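This also means you can verify your own edits immediately by asking your nameserver directly, without involving any other layer of the hierarchy. A sketch, where ns1.mydomain.co.uk is a hypothetical server name and 203.0.113.42 is just a documentation-range placeholder address:

jamezpolley@host:~$ dig @ns1.mydomain.co.uk www.mydomain.co.uk A +noall +answer

www.mydomain.co.uk.  3600  IN  A  203.0.113.42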




Before 1982, this was actually how name resolution happened. One central registry was notified about all updates, and they distributed a file called hosts.txt which contained the hostname and IP address of every machine on the internet. A new version of this file was published every few weeks, and every machine on the internet would have to download a new copy. Well before 1982, this was starting to become problematic, and so DNS was invented to provide a more distributed system.



For another thing, this would be a Single Point of Failure - if the single central registry went down, the entire internet would be offline. Having a distributed system means that failures only affect small sections of the internet, not the whole thing.



(To provide extra redundancy, there are actually 13 separate clusters of servers that serve the root zone. Any changes to the top-level domain records have to be pushed to all 13; imagine having to coordinate updating all 13 of them for every single change to any hostname anywhere in the world...)
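You can list those root servers yourself - they're named a.root-servers.net through m.root-servers.net, and each name actually fronts many anycast instances spread around the world (output trimmed):

jamezpolley@host:~$ dig . NS +short
a.root-servers.net.
b.root-servers.net.
c.root-servers.net.
...
m.root-servers.net.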




b) If the only record that needs to change when I configure a DNS server to point to a different IP address is located at that DNS server, why isn't the process instant?




Because DNS utilises a lot of caching, both to speed things up and to decrease the load on the NSes. Without caching, every single time you visited google.co.uk your computer would have to go out to the network to look up the servers for .uk, then .co.uk, then google.co.uk, then www.google.co.uk. Those answers don't actually change much, so looking them up every time is a waste of time and network traffic. Instead, when the NS returns records to your computer, it includes a TTL value that tells your computer to cache the results for that many seconds.



For example, the NS records for .uk have a TTL of 172800 seconds - 2 days. Google are even more conservative - the NS records for google.co.uk have a TTL of 4 days. Services which rely on being able to update quickly can choose a much lower TTL - for instance, telegraph.co.uk has a TTL of just 600 seconds on their NS records.
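You can check any zone's TTLs yourself by asking one of its authoritative servers directly - for instance, using nsa.nic.uk from the earlier example (the values may have changed since this was written):

jamezpolley@host:~$ dig @nsa.nic.uk uk NS +noall +answer

uk.  172800  IN  NS  nsa.nic.uk.
uk.  172800  IN  NS  nsb.nic.uk.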



If you want updates to your zone to be near-instant, you can choose to lower your TTL as far as you like. The lower you set it, the more traffic your servers will see, as clients refresh their records more often. Every query that has to travel all the way to your servers is slower for the client than one answered from its local cache, so you'll also want to consider the tradeoff between fast updates and a fast service.
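You can watch this caching in action by querying your own resolver twice in a row: the second answer comes from its cache, with a TTL that has been counting down in the meantime. An illustrative sketch, assuming the first query is what populates the cache:

jamezpolley@host:~$ dig telegraph.co.uk NS +noall +answer | head -1
telegraph.co.uk.  600  IN  NS  use2.akam.net.

jamezpolley@host:~$ sleep 30; dig telegraph.co.uk NS +noall +answer | head -1
telegraph.co.uk.  570  IN  NS  use2.akam.net.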





c) If the only reason for the delay is DNS caches, is it possible to bypass them, so I can see what is happening in real time?




Yes, this is easy if you're testing manually with dig or similar tools - just tell it which server to contact.



Here's an example of a cached response:



jamezpolley@host:~$ dig telegraph.co.uk NS


; <<>> DiG 9.7.0-P1 <<>> telegraph.co.uk NS
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36675
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;telegraph.co.uk. IN NS

;; ANSWER SECTION:

telegraph.co.uk. 319 IN NS ns1-63.akam.net.
telegraph.co.uk. 319 IN NS eur3.akam.net.
telegraph.co.uk. 319 IN NS use2.akam.net.
telegraph.co.uk. 319 IN NS usw2.akam.net.
telegraph.co.uk. 319 IN NS use4.akam.net.
telegraph.co.uk. 319 IN NS use1.akam.net.
telegraph.co.uk. 319 IN NS usc4.akam.net.
telegraph.co.uk. 319 IN NS ns1-224.akam.net.

;; Query time: 0 msec

;; SERVER: 97.107.133.4#53(97.107.133.4)
;; WHEN: Thu Feb 2 05:46:02 2012
;; MSG SIZE rcvd: 198


The flags section here doesn't contain the aa flag, so we can see that this result came from a cache rather than directly from an authoritative source. In fact, we can see that it came from 97.107.133.4, which happens to be one of Linode's local DNS resolvers. The fact that the answer was served out of a cache very close to me means that it took 0 msec for me to get an answer; but as we'll see in a moment, the price I pay for that speed is that the answer is almost 5 minutes out of date.



To bypass Linode's resolver and go straight to the source, just pick one of those NSes and tell dig to contact it directly:



jamezpolley@li101-70:~$ dig @ns1-224.akam.net telegraph.co.uk NS


; <<>> DiG 9.7.0-P1 <<>> @ns1-224.akam.net telegraph.co.uk NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23013
;; flags: qr aa rd; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:

;telegraph.co.uk. IN NS

;; ANSWER SECTION:
telegraph.co.uk. 600 IN NS use2.akam.net.
telegraph.co.uk. 600 IN NS eur3.akam.net.
telegraph.co.uk. 600 IN NS use1.akam.net.
telegraph.co.uk. 600 IN NS ns1-63.akam.net.
telegraph.co.uk. 600 IN NS usc4.akam.net.
telegraph.co.uk. 600 IN NS ns1-224.akam.net.
telegraph.co.uk. 600 IN NS usw2.akam.net.

telegraph.co.uk. 600 IN NS use4.akam.net.

;; Query time: 9 msec
;; SERVER: 193.108.91.224#53(193.108.91.224)
;; WHEN: Thu Feb 2 05:48:47 2012
;; MSG SIZE rcvd: 198


You can see that this time, the results were served directly from the source - note the aa flag, which indicates that the results came from an authoritative source. In my earlier example, the results came from my local cache, so they lack the aa flag. I can see that the authoritative source for this domain sets a TTL of 600 seconds. The results I got earlier from a local cache had a TTL of just 319 seconds, which tells me that they'd been sitting in the cache for 600 - 319 = 281 seconds - almost 5 minutes - before I saw them.




Although the TTL here is only 600 seconds, some ISPs will attempt to reduce their traffic even further by forcing their DNS resolvers to cache results for longer - in some cases, for 24 hours or more. It's traditional (in a we-don't-know-if-this-is-really-necessary-but-let's-be-safe kind of way) to assume that any DNS change you make won't be visible everywhere on the internet for 24-48 hours.
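If you want a rough picture of how far a change has spread, you can ask several well-known public resolvers and compare what each one's cache is holding - for example, Google's 8.8.8.8 and OpenDNS's 208.67.222.222 (www.mydomain.co.uk is a hypothetical placeholder again):

jamezpolley@host:~$ for r in 8.8.8.8 208.67.222.222; do echo "--- $r"; dig @$r www.mydomain.co.uk A +noall +answer; done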

