Historically, recursor sends the full domain name to any intermediary as it finds its way to the root or authoritative DNS. This meant that if you were going to , the root server and the . com server would both be queried with the full domain name i. e.
the the cloudflare, and the com parts, even though the root servers just need to redirect the recursive to dot com independent of anything else in the fully qualified domain name. This ease of access to all this personal browsing information via DNS presents a grave privacy concern to many. This has been addressed by several resolvers’ software packages, though not all solutions have been widely adapted or deployed. The DNS resolver, 1. 1.
1. 1, provides, on day one, all defined and proposed DNS privacy protection mechanisms for use between the stub resolver and recursive resolver. For those not familiar, a stub resolver is a component of your operating system that talks to the recursive resolver. By only using DNS Query Name Minimisation defined in RFC7816, DNS resolver, 1. 1. 1.
1, reduces the information leaked to intermediary DNS servers, like the root and TLDs. That means that DNS resolver, 1. 1. 1. 1, only sends just enough of the name for the authority to tell the resolver where to ask the next question.
With DNS aggressive negative caching, as described in RFC8198, we can further decrease the load on the global DNS system. This technique first tries to use the existing resolvers negative cache which keeps negative or non existent information around for a period of time. For zones signed with DNSSEC and from the NSEC records in cache, the resolver can figure out if the requested name does NOT exist without doing any further query. So if you type dot something and then dot something, the second query could well be answered with a very quick “no” NXDOMAIN in the DNS world. Aggressive negative caching works only with DNSSEC signed zones, which includes both the root and a 1400 out of 1544 TLDs are signed today. Initially, we thought about building our own resolver, but rejected that approach due to complexity and go to market considerations.
Then we looked at all open source resolvers on the market; from this long list we narrowed our choices down to two or three that would be suitable to meet most of the project goals. In the end, we decided to build the system around the Knot Resolver from CZ NIC. This is a modern resolver that was originally released about two and a half years ago. By selecting the Knot Resolver, we also increase software diversity. The tipping point was that it had more of the core features we wanted, with a modular architecture similar to OpenResty. The Knot Resolver is in active use and development.
When a resolver needs to get an answer from an authority, things get a bit more complicated. A resolver needs to follow the DNS hierarchy to resolve a name, which means it has to talk to multiple authoritative servers starting at the root. For example, our resolver in Buenos Aires, Argentina will take longer to follow a DNS hierarchy than our resolver in Frankfurt, Germany because of its proximity to the authoritative servers. In order to get around this issue we prefill our cache, out of band, for popular names, which means when an actual query comes in, responses can be fetched from cache which is much faster. Over the next few weeks we will post blogs about some of the other things we are doing to make the resolver faster and better, Including our fast caching.
One issue with our expansive network is that the cache hit ratio is inversely proportional to the number of nodes configured in each data center. If there was only one node in a data center that’s nearest to you, you could be sure that if you ask the same query twice, you would get a cached answer the second time. However, as there’s hundreds of nodes in each of our data centers, you might get an uncached response, paying the latency price for each request. One common solution is to put a caching load balancer in front of all your resolvers, which unfortunately introduces a single point of failure. We don’t do single point of failures.