
DNS evolution: Trust, privacy and everything else

We used to think of computer networks as being constructed on two fundamental common infrastructure components: names and addresses. Every connected device had a stable protocol address to allow all other devices to initiate a communication with this device by addressing a data packet to this protocol-specific address. And every device was also associated with a name, allowing human users and human-use applications to use a more convenient alias for these protocol addresses.

By mapping names to protocol addresses, the realm of human use could navigate the network's services by symbolic names while at the level of packets, the data flow was directed by the network based on topology information of where these device addresses were located.

But that's 1980s thinking and 1980s network architectures. Architectures have evolved, and today's Internet architecture has dispensed with that view of IP addresses. These days, due in no small part to the exhaustion of the IPv4 address pool, but equally due to an evolving architecture that needs to cope with the massive explosion of numbers of devices in networks, we have shifted to a client/server architecture where clients initiate connections and servers respond.

This means that clients don't require a permanently assigned address, and can make use of an address only while they are communicating. Equally, on the server side we've seen the aggregation of uniquely named service points into service delivery platforms, and the multiplexing function that directs clients to the appropriate service rendezvous point is performed at the application level rather than as an infrastructure function.

We're now using the address infrastructure in very different ways than the way we had envisaged in the 1980s. The Internet's name infrastructure is subject to the same pressures for evolution.

How is the Domain Name System (DNS) responding? There are three parts to this survey: the first is trust, the second is privacy and the third is all the other stuff!

Trust

Can you believe what the DNS tells you? The answer is you probably can't!

This obvious failure in the trust model has been exploited in many ways by many parties. The DNS is seen as an overt control channel. For example, access to a named service can be blocked if that name does not resolve in the DNS. So we've seen the rise of deliberate lies in the DNS where content and services that are categorized as harmful are removed from access by withholding the associated name resolution in the DNS.

A number of open resolvers have turned this filtering of the DNS into a positive attribute, and there are a number of so-called 'clean feed' resolvers that do not resolve a collection of service names. Not only is this a distinguishing feature in the realm of competitive open resolvers, we've also seen a number of national regimes placing the onus on ISPs to block certain services, and given that addresses are no longer uniquely associated with individual services, the implementation of these national regulations is invariably performed through DNS blocking.

This has also been used to monetize the DNS, where 'no such domain' DNS responses are rewritten to send the user to a search site, through so-called NXDOMAIN rewriting. It has also been used in the IPv6 transition environment, where DNS address records are synthesized to allow the user to be steered through a transitional environment. The motives vary, but the result is the same, insofar as the DNS answer is a lie.

Read: Dealing with IPv6 fragmentation in the DNS

Then there are the hostile efforts to replace a genuine response with a lie in order to mislead the user. There is the technique of response guessing, which tries to insert a fake response before the 'normal' response, as in UDP the first response received is the one that is used. There are manipulated glue records and even attacks on fragmented packets. The insidious nature of these forms of attack is that they rely on the host system running quite normally. It's the infrastructure of the name system itself that is being perverted here, and the applications are completely unaware of this manipulation.

The response to this need to add a trust mechanism to the DNS is to add a digital signature to the 'answers' in the DNS. The idea is that a digital signature attached to a DNS answer can allow the receiver of the answer to be assured that the answer is current, that it is authentic, that it has not been manipulated or altered, and that it cannot be repudiated. DNSSEC, the framework for adding digital signatures into the DNS, was some 10 years in the making. A key pair can be associated with a delegated zone in the DNS, and a set of four further resource record (RR) types (DNSKEY, RRSIG, DS and NSEC) is defined in this framework to sign each entry and to aid in validation of the signature.
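
As a concrete illustration, that signature material is visible to anyone who asks for it. The following sketch uses Python with the dnspython library; the library, the resolver address and the query names are illustrative assumptions, not part of DNSSEC itself. It fetches a zone's DNSKEY records and the RRSIG covering an A record, and checks the signature locally:

    # A minimal sketch of checking one DNSSEC signature by hand, assuming
    # the dnspython library is installed and 'example.com' is a signed zone.
    import dns.dnssec
    import dns.message
    import dns.name
    import dns.query
    import dns.rdataclass
    import dns.rdatatype

    zone = dns.name.from_text('example.com')
    name = dns.name.from_text('www.example.com')
    resolver = '8.8.8.8'   # illustrative resolver; TCP avoids truncation of large answers

    # Fetch the zone's public keys (the DNSKEY RRset).
    q = dns.message.make_query(zone, dns.rdatatype.DNSKEY, want_dnssec=True)
    r = dns.query.tcp(q, resolver, timeout=5)
    dnskeys = r.find_rrset(r.answer, zone, dns.rdataclass.IN,
                           dns.rdatatype.DNSKEY)

    # Fetch a signed answer: the A RRset plus the RRSIG that covers it.
    q = dns.message.make_query(name, dns.rdatatype.A, want_dnssec=True)
    r = dns.query.tcp(q, resolver, timeout=5)
    a_rrset = r.find_rrset(r.answer, name, dns.rdataclass.IN,
                           dns.rdatatype.A)
    rrsig = r.find_rrset(r.answer, name, dns.rdataclass.IN,
                         dns.rdatatype.RRSIG, dns.rdatatype.A)

    # Raises dns.dnssec.ValidationFailure if the signature does not verify.
    dns.dnssec.validate(a_rrset, rrsig, {zone: dnskeys})
    print('RRSIG over', name, 'verifies against the zone DNSKEY')

Of course this checks only one link in the chain. A full validator would also verify the DNSKEY RRset against the parent zone's DS record, and so on up to the root trust anchor, which is exactly the serial query sequence discussed below.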

The commentary on DNSSEC deployment varies considerably. Some 25% of the world's users cannot reach a DNS-named service if the DNSSEC signature cannot be validated. That's triple the level from early 2014, so there is certainly some momentum in the adoption of validation. But at the same time, the number of DNSSEC-signed zones appears to be pitifully low. Of the hundreds of millions (perhaps billions these days) of delegated zones in the DNS, we see some 8M signed zones.

Read: Why has DNSSEC increased in some economies and not others?

Perhaps the most relevant metric is the ranking of domain names by usage and the ratio of DNSSEC-signed zones in that set. The Alexa top 25 list is a good place to start. None of these are DNSSEC-signed names. A late 2016 study of the Alexa top 1M found 1% of names are DNSSEC-signed. A similar study of all of .com, .org, and .net second-level domains found that between 0.75% and 1.0% of all domains in these three zones publish a DNSKEY RR. So turning on validation in DNS queries has very little downside, given that few popular DNS names appear to be DNSSEC-signed in the first place!

That last metric is perhaps the most telling. Why is signing so low? Are service publishers unconcerned about misdirection of users? Of course not - they are extremely concerned, as this is all about revenue. Is this ignorance about DNSSEC? Again, not at all. This is a choice. They want to prevent users being misdirected, but equally they want to reduce their dependence on intermediaries, and they want the user connection experience to be as fast as possible. If DNSSEC were all we had, then service publishers would be using it! But it's not the only show in town.

These days service publishers use Transport Layer Security (TLS). Can users be misdirected with TLS? Not normally. It requires the private key to leak, or the certificate system to be corrupted either at the CA level or in the user's trusted key set. It's fast, because all the credentials needed to validate the certificate are provided in the TLS handshake.

Is DNSSEC fast? Well, no. Validation is a serial query sequence back to the root. Is it simple? No. There is local key management, the ZSK/KSK key split, limited automation, and limited support for high-resilience hosting. And the fundamental criticism is that it doesn't stop lying in the DNS for DNSSEC-signed zones anyway!

Really? The entire point was that lies were just not possible with DNSSEC, wasn't it? Well yes, but that's not the way we built it.

Recursive resolvers perform DNSSEC validation. Stub resolvers - the ones that sit over there with the user - don't. Instead, the recursive resolver withholds the response if DNSSEC validation fails. But what if it's the recursive resolver that tells the DNS lie? The stub resolver is none the wiser given that it's not validating, so the lie stands. If the ISP is blocking some names, performing NXDOMAIN substitution or redirecting actual names, then the stub resolver is just caught in the lie. All that effort to sign the zone and validate the DNS answer offers absolutely no real protection against being misdirected! TLS just seems to offer a solution that is faster, simpler and more robust. No wonder no one uses DNSSEC in the service world. More pain, and no real gain.
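
The division of labour is visible in the response itself: a validating recursive resolver asserts a successful validation by setting the AD ('Authenticated Data') flag, and a non-validating stub can do no more than take that assertion on trust. A small sketch of that act of faith, again using dnspython with an illustrative resolver address and query name:

    # The stub's 'validation': ask a recursive resolver and inspect the AD
    # flag it sets. No cryptography is performed here at all - the stub is
    # simply trusting whatever the recursive resolver says.
    import dns.flags
    import dns.message
    import dns.query

    q = dns.message.make_query('www.example.com', 'A', want_dnssec=True)
    resp = dns.query.udp(q, '8.8.8.8', timeout=5)

    if resp.flags & dns.flags.AD:
        print('resolver claims the answer validated - we take its word for it')
    else:
        print('resolver makes no validation claim')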

Read: DNSSEC validation revisited

Is DNSSEC good for anything else? As long as 75% of users sit behind non-validating DNS resolver systems, and 99.9% of users don't directly validate DNS responses in any case, then we cannot place critical information in the DNS in a secure fashion and expect everyone to be protected by DNSSEC. So the incentives for putting critical information into the DNS and protecting it with DNSSEC are not looking very convincing. There is simply no natural market-based incentive for deployment.

It has been said a number of times that the heart of many issues with the DNS lies in the DNS' choice of transport protocol. The use of UDP as the first-choice protocol, with a fallback to TCP, means it's challenging to place large quantities of information in DNS answers while still operating within what we've become accustomed to in terms of speed and robustness. Validation is a very inefficient process, and the inefficiency is increased by the DNS model where the onus is placed on the client that is requesting the information, not the server that is the source of this information. End clients do not validate because every validation operation would entail further DNS queries in order to construct the validation chain, and the incremental time penalties would be unacceptable in terms of user expectation.

Interestingly, we know how to make it faster, and the approach is to package up all the answers to the DNSSEC validation chain construction queries and include them as additional information with the original signed answer (RFC 7901). However, it's unlikely that this is viable in a DNS-over-UDP framework. If we want to go down a TLS-like path, and package a validation chain with the DNSSEC-signed response, then we are looking at DNS over TCP (or TLS).

The price of the 'trust' solution is significant, and it raises the bar for the benefits that trusted answers in the DNS need to provide. If all this is about is protecting users from a Kaminsky-style attack, then that's just not enough. The benefit needs to be far more than that to help justify the considerable costs of moving the DNS from UDP to a TCP-styled platform.

Privacy

Everybody looks at the DNS. Everybody. Because if the Internet is funded by users, and it is, then what users do on the Internet is of paramount interest. How can you find out what users do? Easy. Look at the DNS. Every transaction on the net starts with a DNS query, and the DNS exposes every action. But it's worse than that. The DNS is needlessly and senselessly chatty. It over-exposes information.

At all points in the DNS, queries and responses are collected, packaged, analyzed, profiled, replayed and traded.

How can we make the DNS less the go-to system that exposes users and user behaviours to business and government alike?

We can improve its privacy.

There was little in the way of motivation to do anything about this for years. After all, if the Internet actors are busy constructing a global economy based on surveillance capitalism, why should they make the task any harder than necessary? The watershed moment that changed the stance for many of the industry actors was the papers leaked by Edward Snowden.

Government agencies had spent considerable sums in weaponizing the Internet as a highly effective surveillance tool that operated at the scale of national populations. Their motivations were not overly concerned with your future purchases, but more with your personal profile. Are you a threat? Will you be a threat in the future? And of all the components of the Internet system, the one that laid out all this information in a clear text, pre-packaged format was the DNS. So we've been changing aspects of the DNS to try and stop the most blatant forms of information leakage.

The first of these is so-called Query Name Minimization. The change is intended to stop the DNS name resolution process from leaking extraneous information. This largely relates to the interaction between recursive resolvers and the authoritative name servers. The task of the recursive resolver is to find the 'right' name server to ask, and it starts at the root and asks the query there. The root zone server's response will direct the query to the name servers of the relevant delegated domain name, and the process repeats until the resolver has an answer.

Read: DNS Query Privacy

But in this process, every name server - from the root down - is aware of the full DNS name that is being resolved. Query Name Minimization trims the name in these queries so that only the next label is exposed to each name server. Root servers will only see Top Level Domain name queries, Top Level name servers will only see second level name queries and so on.
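
A small sketch may help to show what this trimming looks like in practice. The following fragment (Python with the dnspython library, and a query name invented purely for illustration) simply prints the progressively longer query names that a minimizing resolver would expose at each level of the delegation hierarchy, rather than sending the full name to every server:

    # Illustrative only: the query names a qname-minimizing resolver exposes
    # at each delegation level, one extra label at a time.
    import dns.name

    qname = dns.name.from_text('www.department.example.com.au')

    for depth in range(2, len(qname.labels) + 1):
        servers_for = qname.split(depth - 1)[1]   # the zone whose servers are asked
        exposed = qname.split(depth)[1]           # the name they get to see
        print('servers for', servers_for, 'see a query for', exposed)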

There has been some further work to understand what is the most robust query type for this discovery process. The initial suggestion of NS queries has been supplanted by A queries. The issue of CNAME rewriting and the equally vexed question of Empty Non-Terminal domains (and the variable behaviour of name servers in such situations) have added some complications to this question.

Read: DNS Query Privacy revisited

There is widespread use of this approach, although some implementations have taken some license with the specification and used their own re-interpretation of the technique. Some resolvers, apparently including Google's public resolver service, only perform Query Name Minimization for the first three levels of the DNS name. It appears very much as if the recursive resolver is deliberately withholding full query name information from the root servers and the top-level and second-level domain name services, but is willing to disclose the full query information to servers for zones that are deeper in the name hierarchy. Is this approach motivated by protecting user interests, or by an effort to deny information to the authoritative servers located at the upper levels of the name hierarchy?

Of course, if we were serious about user privacy the Client Subnet extension would never have been specified. The exposure of full query names emitted by a recursive resolver is to some extent mitigated by the inability to conclusively associate such queries with an end user. But if the query is also loaded with the IP address of the end client, or even the network subnet of the end client, then all pretence of privacy protection has been shredded! While Query Name Minimization could be seen as a positive step towards concealing extraneous information in the DNS, the use of Client Subnet is a gigantic leap backward.
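
To see just how much the Client Subnet extension gives away, here is what it takes to attach it to a query. The sketch uses dnspython, a documentation prefix in place of a real client address, and an illustrative query name:

    # EDNS Client Subnet: the client's subnet travels upstream with the query.
    # 192.0.2.0/24 is a documentation prefix standing in for a real client.
    import dns.edns
    import dns.message

    ecs = dns.edns.ECSOption('192.0.2.0', 24)
    q = dns.message.make_query('www.example.com', 'A')
    q.use_edns(options=[ecs])

    # The OPT pseudo-record of the query now carries the client subnet,
    # readable by the authoritative server and any on-path observer.
    print(q)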

A generic response to privacy considerations in the Internet has been channel encryption. Telnet was replaced by ssh because of the issues of running sessions over the Internet in the clear. Similarly, HTTP has been largely replaced by HTTPS for much the same reason. The DNS is increasingly an anachronism in still passing queries and responses in the clear. Not only does it permit eavesdropping, but it also enables efforts to manipulate the responses, all to the detriment of the user.

However, to repeat an earlier observation, the heart of many issues with the DNS lies in the DNS' choice of transport protocol. Encryption normally involves a number of steps, including the presentation and validation of credentials to confirm that the client is talking to the party they intended to talk to, and also to establish a session encryption key to allow the subsequent data exchange to be encrypted in a manner known to the two parties, but unknown to all others.

This is challenging in UDP. The effort to implement TLS over UDP, namely Datagram TLS (DTLS, RFC 6347), has the overhead of the exchange of credentials and session cipher establishment, so it's a long step away from a single packet exchange of query and response. DTLS also should avoid IP-level fragmentation, but it cannot avoid the large payloads associated with this session establishment process.

The result is that fragmentation is pushed up to the application layer and DTLS needs to handle payloads that extend across multiple DTLS datagrams. It appears that the additional overheads of DTLS equate to the overheads of TLS over TCP, but with some added fragility relating to packet fragmentation that is not replicated in TCP. This fragility of DTLS means that when we refer to DNS over TLS (DoT), we are in fact referring to DNS over TLS over TCP. It has been implemented and deployed over the path from the stub resolver to the recursive resolver.

DoT adds encryption to the stub-to-resolver path, and encryption not only hides the query and response stream from eavesdroppers but also prevents alteration or manipulation of the response by third parties. The recursive resolver can still lie about the response, and unless the stub resolver is performing DNSSEC validation (and it's likely not!) and the domain name is signed (which it most likely is not!) then any DNS lie from the recursive resolver will go unnoticed, whether or not the transport channel from the recursive resolver to the stub resolver uses TLS. A lie is still a lie no matter how secure the case used to carry the lie. So it does not eliminate the potential for manipulation of DNS information, but it limits the number of entities that are in a position to perform such manipulation, and where and how the manipulation can be performed.
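
In practical terms, DoT is simply the same DNS payload carried inside a TLS session on TCP port 853. A minimal sketch using dnspython, with a well-known public resolver address used purely as an example of a service that offers DoT:

    # The same query, once in the clear over UDP port 53 and once inside a
    # TLS session on TCP port 853 (DoT).
    import dns.message
    import dns.query

    q = dns.message.make_query('www.example.com', 'A')

    in_the_clear = dns.query.udp(q, '1.1.1.1', timeout=5)
    encrypted = dns.query.tls(q, '1.1.1.1', port=853, timeout=5)

    # Same answers, but only the second exchange is hidden from onlookers
    # on the path between the stub and the recursive resolver.
    print(in_the_clear.answer)
    print(encrypted.answer)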

How far should channel cloaking go? Should the identity of the other party be obscured? Should the fact that these are DNS transactions be obscured?

DoT makes no effort to cloak its use. The use of TCP port 853 for DoT is a visible signal that there is an active DoT connection. The use of a novel port number is likely to trigger many firewall configurations. The IP address of the remote end is clearly visible, as is the TCP header. The TLS handshake may get around to using ECH and encrypting the server name at some point in the future, but in the case of DoT this is probably a minor artefact, given that name-based overloading of service IP addresses is not happening in DoT today and is unlikely to do so in the future.

Is DoT going anywhere? It's unlikely in my view. Right now it requires users to play with their DNS settings, and that is a massive barrier to widespread use. Some users may use it as a means to jump across one set of recursive resolvers' DNS filters, such as those provided by their ISP, to hook up with another provider, but the overt signalling that this is happening means that an ISP can readily block this if it so chooses.

Read: DNSSEC 'and' DNS over TLS

In theory, the use of TCP permits larger DNS payloads, and there is a glimmer of hope that we could use DNS chaining to make DNSSEC validation fast and efficient on end systems using DoT. But so many other preconditions, including server provisioning of DNS Chained responses and a reliable way for the DNS to manage large responses, mean that it is still a distant glimmer of a possibility and nothing more.

DoT is seen as a replacement for the existing DNS infrastructure service, where the DNS is a service located on the common platform and applications use the same DNS resolution calls to the platform as they have always used. It's a platform approach to securing the DNS. Adoption is probably going to require some form of automated provisioning that typically involves the local access service provider. Given that the major compromise threat actor here is the same access ISP and given that the ISP is operating the recursive resolver in any case, it's very challenging to understand the incremental benefit of DoT deployment as an ISP service.

Perhaps its benefit is as a barrier against other hosts in the local network. Local residential and enterprise environments are cluttered with IP stacks from many providers. A compromised stack is inside the external firewalls and is trusted merely by its physical location. DoT shifts a DNS host's conversation into a protected channel where the protection is against other hosts on your local network!

DNS over HTTPS (DoH) uses the same TCP and TLS foundations, but adds an HTTP context to the transactions. There are a couple of changes here that are interesting. The first is the switch to TCP port 443. It looks like any other HTTPS traffic and is not so readily identified in the network. Second, the DoH servers do not need to use dedicated IP addresses. Like the web itself, the HTTP protocol allows for named service points. And with TLS 1.3 and ECH, even the server name can be concealed in an encrypted envelope.
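
A DoH query is, from the network's perspective, just another HTTPS request to port 443. A minimal sketch using dnspython (which relies on an HTTP client library being installed), with the DoH service URL chosen purely as an example of a public DoH endpoint:

    # DNS over HTTPS: the DNS message is carried as the body of an HTTPS
    # exchange, indistinguishable on the wire from other port 443 traffic.
    import dns.message
    import dns.query

    q = dns.message.make_query('www.example.com', 'A')
    resp = dns.query.https(q, 'https://cloudflare-dns.com/dns-query', timeout=5)
    print(resp.answer)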

But there is a little more to this. HTTPS is an application level protocol, and this approach allows an application to bypass the DNS services provided by the platform. This means that no ISP-based configuration is necessarily relevant and the application can not only conceal its DNS transactions from the local and remote network, it can also hide these same transactions from the platform and other applications.

If this heads to HTTP/3 using QUIC then a number of capabilities are unlocked. Not only is the transport control protocol cloaked behind an encryption envelope, but a number of DNS requests can be made on a single transport channel simultaneously.

How far can we go with this effort to advance a privacy agenda in the DNS? Once we've deployed QName minimization, discarded EDNS Client Subnet, adopted DoH over HTTP/3 as the application-to-recursive resolver protocol, and pushed DNSSEC validation to the application via chained DNSSEC responses, then much of the achievable trust and privacy agenda has been realized.

At that point, much of the ability for an onlooker to associate an end-entity identity with a DNS request is severely curtailed, and while a recursive resolver is still privy to these user transactions, the use of DNSSEC all the way to the edge makes response manipulation by the resolver particularly challenging.

Should we encrypt the paths between a recursive resolver and authoritative servers? Assuming that ECS has been abolished, there is little that such transactions directly reveal about the identity of the end user, and the larger the pool of clients served by a recursive resolver the larger the crowd each individual's queries can hide in.

Other Topics

There are a number of other aspects to the technology evolution of the DNS, and I'll put them all into this section!

The DNS has traditionally used a 7-bit ASCII code for names. Upper and lower case are equivalent, and in addition to the Latin characters, DNS labels can use hyphens and number characters. In certain circumstances the underscore is also permitted.

The expansion of the DNS into a larger character repertoire has not been a stellar success. The design decision was to preserve the capabilities of the DNS system and use encoding to map the larger character set into this restricted alphabet.

Unicode, as the underlying character repertoire, was not all that good a choice. Unicode is a contract between an application and a printer. It does not matter that there are multiple ways in Unicode to print the same glyph on a printer. The printer does not care! But the DNS cares. The DNS has no concept of 'what it means', and alternate Unicode strings that are presented in an identical way on a screen actually map to distinct DNS names.

Read: DNS homographic abuse

So the effort in the use of Unicode in the DNS has been one of trying to push the Unicode glyph set back into the box, and try and specify canonical subsets of Unicode that minimize display similarity. This is a tough ask, made even harder by the increasing variance in display glyphs used to display the same Unicode code point.

This is most evident in emoji characters. Why is this a problem? Because the Internet still works on a rather crude model of what you see is what you get. If there are alternate ways of coding the same visual outcome, then these are distinct labels in the DNS and can be associated with distinct service points. The possibility of duping unsuspecting users is, of course, a natural outcome of this process.
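
Python's built-in IDNA codec is enough to show the problem. The two labels below render identically in most fonts, yet they encode to entirely different DNS names (this relies on the older IDNA 2003 rules built into Python; a dedicated IDNA 2008 library would apply a stricter rule set):

    # Two visually confusable labels map to two distinct DNS A-labels.
    latin = 'example'
    lookalike = 'ex\u0430mple'   # Cyrillic small 'a' (U+0430) replaces the Latin 'a'

    print(latin.encode('idna'))      # b'example'
    print(lookalike.encode('idna'))  # b'xn--...': a completely different label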

These days 'DNS Abuse' is a current topic, particularly in the ICANN world. The phrase describes an effort to engender a level of self-regulation in the DNS supply industry, where behaviours are governed largely by contractual provisions between the registrant and the registrar, and between the registrar and a common registry. It allows various forms of abusive behaviour that leverage the DNS, including criminal activity, to be sanctioned through contractual enforcement, including the take-down of DNS names.

It's a lot like the self-regulatory measures that are common in the finance industry, but without the reporting framework, without any common legal framework for enforcement and without any penalties for breaches. My suspicion is that it will turn out to be no more effective than these measures in the finance sector, and probably more ineffectual! It's unlikely to be successful in reducing the levels of abusive and criminal behaviour that leverage the DNS and the Internet.

DNS Fragmentation is also a perennial topic in the evolution of the name space. The pressures for a single consistent name space are embedded in the concept of a single network. Communications systems rely on assumptions of referential integrity, and referential integrity typically implies that the same DNS name refers to the same resource.

Read: It's time to consider avoiding IP fragmentation in the DNS

We've seen this concept being tested many times, from the alternate root systems of a couple of decades ago, through to private name spaces in today's enterprise environment. A good case in point about fragmentation is the use of search terms as a replacement for the DNS. The objective of a search engine is to try and customize the responses to best match the known preferences of the querier, so when I attempt to pass you a pointer to a digital resource, the search term that exposes this resource for me may not expose the same resource when you enter that term in your context.

However, there are other forms of name fragmentation that are enabled by DoH. DoH enables a name space to be lifted out of common network infrastructure and placed into the context of an attribute of an application. The application can direct DoH queries to server infrastructure of its own choosing, and provide responses that pertain to the application, as distinct from a lookup in a common distributed database. The ability in HTTP to push objects to the application client also allows the use of so-called 'resolverless' DNS, where an application can improve the performance of name resolution functions by performing them in advance of the time when they are needed.

DNS 'name flattening' has been a constant pressure in the DNS. No one wants their critical service names to be buried deep down in the DNS under a bunch of other names. Not only do such names take longer to resolve, they also increase the set of dependencies, as presumably there are a greater number of service providers involved all the way down the name hierarchy.

So folk want shorter names. The shorter the better. The result is that the name space is under constant erosive pressure to flatten down. The ultimate place is the top level of the DNS, the root zone, and as the price premium for a top-level domain comes down, the pressure to inflate this zone with a significantly larger number of entries is constant.

But perhaps the forces of evolutionary pressure are more fundamental and parallel the evolutionary forces of the Internet itself. The silicon industry is indeed prodigious, and there are many more processors in this world than people. While we have constructed the DNS name space using an analogue of natural language terms as a means of facilitation of human use, this is not necessarily the dominant use pattern of the DNS.

One view of the DNS is as a universal signalling and tunnelling protocol, and the use of the DNS as a command and control channel for bot armies testifies to the efficacy of such use of the DNS! It's likely that, as the number of such devices increases, the use of the DNS as a command and control mechanism increases in importance and the human use becomes increasingly marginalized. Human-use DNS may well become an esoteric luxury goods business.

The high-touch activity of DNS name management is unsustainable in a shift from human to largely automated use, and the business models and institutions that populate this space will need to adjust to a names business that provides names not as a branding attribute but as an undistinguished commodity. In the same way that we have transformed IP addresses from end-point identifiers to ephemeral session tokens, we may well see the DNS as a code base for the command and control of highly distributed automated systems, which is a far cry from the distributed database lookup that we originally constructed for the human-use model of the DNS.

When we think of a DNS query as a set of instructions to a DNS server, and the DNS server as a distributed processing environment, then the DNS changes from a distributed database to a distributed computation and signalling environment. It is certainly a different future for the DNS as we knew it, but its probable commoditization is in line with the plight of carriage, switching and content in the Internet!

Read: Where is the DNS heading?

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.