ISC-TN-2012-1-Draft1                                     Paul Vixie, ISC
April-2012                                     Vernon Schryver, Rhyolite

DNS Response Rate Limiting (DNS RRL)

ISC Technical Note Series

This memo describes a methodology in use inside ISC which may be of use to other members of the Internet technical community. Use of the methods and formats depicted herein is free of any and all license or other encumbrance by the authors or their employers.

Copyright Notice

Copyright (C) Internet Systems Consortium, Inc. (2012). All Rights Reserved. Distribution of this memo is unlimited, if full attribution is given.

Abstract

This memo describes a method of limiting the rate of responses by a DNS server in order to blunt the impact of DNS reflection and amplification attacks. Such attacks depend upon IP source address forgery by attackers, where the forgery is undetectable at a distance. DNS servers that respond to all queries without rate limiting are at risk of abuse, in which a stream of potentially very large responses is transmitted toward victims whose IP address was forged and who did not in fact solicit those responses. Rate limiting can make a DNS server less useful for such attacks.

1 - Overview

1.1. This memo specifies a method of limiting the rate of responses by a DNS server in order to blunt the impact of DNS reflection and amplification attacks. A reflection attack is one where the IP source address of the victim is forged in a stream of DNS requests, and the forgery is undetectable by the responding DNS server. An amplification attack is one where the responses sent to the victim are larger than the forged requests sent by the attacker.

1.2. Normal and healthy DNS request flows from recursive name servers to authority name servers contain few duplicates, since a normal and healthy recursive DNS server has a cache containing prior answers for reuse.
By keeping a moderate amount of state about which requestor has recently received which response, it is possible to silently drop requests that are part of attack flows with little or no impact on non-attack requests.

1.3. Notably, the request flow from a stub DNS resolver to a recursive DNS server is extremely duplicative due to the lack of caching by most applications. The rate limiting methods described in this document are thus inapplicable to recursive DNS servers. Such servers should be more difficult to abuse since they will be on the same LAN or campus or ISP as their clients, and should have access control lists preventing use by the rest of the Internet. The networks these servers are connected to should have access control lists which prevent requests having forged IP addresses inside the network from entering that network. Deliberately open recursive DNS servers are outside the scope of this document.

1.4. The methods described here have been implemented for ISC BIND Version 9 (BIND9). However, they are easily implemented in any software DNS server, and are to the best of our knowledge unencumbered.

2 - Operator Behaviour

2.1. This response rate limiting method has the following configurable parameters, with recommended defaults as shown.

2.2.1. RESPONSES-PER-SECOND (5). This is the maximum number of times that a requestor will be told the same answer within a one-second interval. Note that many possible questions can yield the same answer -- for example, many nonexistent subdomains of an existing zone will all be told NXDOMAIN with a negative proof consisting of that zone's start-of-authority (SOA) record. We therefore apply the rate limit to the answer rather than to the question.

2.2.2. ERRORS-PER-SECOND (5). This is similar to RESPONSES-PER-SECOND but applies only to the REFUSED, FORMERR and SERVFAIL response codes (which are "errors").

2.2.3. LOG-ONLY (False).
This is a testing mode in which responses are not actually dropped but the normal operational logging activity still takes place. This allows an operator to find out what responses would be dropped without actually dropping anything.

2.2.4. WINDOW (5). This is the period (in seconds) over which rates are measured and averaged, and during which memory of rate limit excesses is retained. If a given requestor solicits the same answer too often then similar queries will be dropped for WINDOW seconds, other than as described under LEAK-RATE and TC-RATE.

2.2.5. IPV4-PREFIX-LENGTH (24). Requestor IP (version 4) addresses are grouped into buckets of size 2 ^ (32 - IPV4-PREFIX-LENGTH). The default (24) corresponds to a "class C network" of 256 host addresses. Since the purpose of rate limiting is to protect distant networks against forged-source DNS request floods, this approximates the granularity of the victim's network topology.

2.2.6. IPV6-PREFIX-LENGTH (56). Requestor IP (version 6) addresses are grouped similarly to IP (version 4) addresses as described for IPV4-PREFIX-LENGTH, but since IP version 6 addresses are 128 bits long it is necessary to approximate the size of victim networks differently in IP version 6. The default (56) is the usual size allocated to a household or small enterprise network.

2.2.7. LEAK-RATE (3). When a query would be dropped due to rate limiting, we randomly respond anyway, on average once per LEAK-RATE queries. This gives the victim whose IP address is being forged some chance of getting an answer even during a flood of forgeries. When enabled, LEAK-RATE must be from 2 to 10 and should approximate the real victim's retry count on a legitimate query. If LEAK-RATE is set to zero then this behaviour is disabled.

2.2.8. TC-RATE (2). When a query would be dropped due to rate limiting, we randomly send back a truncated response instead, on average once per TC-RATE queries.
This tells a victim whose address is being forged to retry using TCP. It is recommended that TC-RATE be set lower than LEAK-RATE. If TC-RATE is set to zero then this behaviour is disabled.

2.2.9. MAX-TABLE-SIZE (10000). This is the upper bound on the number of state blobs maintained within this server. It should be set to the product of window size and maximum queries per second, which allows for the worst case scenario in which all queries are unique and each response requires its own state blob. Estimating around 64 bytes of storage per blob, a WINDOW of five seconds, and a query rate of 2000 queries per second, 10000 state blobs should take about 640 kilobytes, comfortably under one megabyte of server memory.

2.2.10. MIN-TABLE-SIZE (1000). This is the initial size to be allocated for an empty state blob table at startup time. Since growing this table has a cost, an operator might decide to start with a larger than default size table.

3 - Responder Behaviour

3.1. When generating a response, a server will take the requestor's IP address and mask it according to either IPV4-PREFIX-LENGTH or IPV6-PREFIX-LENGTH. It will then impute a domain name, which is the wildcard name (if a wildcard match occurred), the zone name (if no match occurred), or otherwise the query name. Finally it will derive a boolean error indicator (was the response code REFUSED, FORMERR or SERVFAIL, or was it not?). This tuple of masked address, imputed name and error indicator is used to select a state blob, creating it if necessary.

3.2. If the selected state blob indicates that this response has been sent too often to requestors on this network, then consider whether to send a truncated response, or a leaked response, or no response. In any case increment a counter to indicate that the response has been considered.

3.3. When a state blob's age goes over WINDOW, and its counter has not been incremented within WINDOW, then discard the state blob.

3.4.
In the event that the creation of a new state blob would cause the table to exceed MAX-TABLE-SIZE, the least recently used state blob should be discarded.

3.5. Note: Conceptually speaking, a state blob is either filling, full, or draining. To be filling means that the rate limit has not been exceeded. To be full means that the rate limit has been exceeded. To be draining means that the rate limit was once exceeded and the rate has not yet returned to zero.

4 - Victim Behaviour

4.1. A victim whose IP address is forged in a large request flow should normally expect to experience unavoidable congestion on their Internet link. However, if all forged requests are sent to a small number of servers, and if those servers implement rate limiting as described in this document, then the victim will see only a small amount of unsolicited response traffic. At the same time, the victim will have higher than normal UDP retry and truncation/TCP counts if they themselves ask the affected servers any question which solicits the same answer as is being solicited in the attack.

5 - Attacker Behaviour

5.1. A forged-source reflective amplifier attacker who wants to be successful will either have to select authority servers that do not practice rate limiting or will have to select a large number of authority servers and use round robin to distribute the attack flows. Each authority server will have to be asked a randomized question within one of that server's zones in order to get an amplification effect. An attacker would do well to select DNSSEC-signed zones and to use DNSSEC signalling in their forged queries to maximize response size.

6 - Acknowledgements

6.1. Many folks have been asking for this for many years. Peter Losher deserves special thanks for having the "ISC.ORG/ANY" problem on his servers.
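
The responder behaviour of section 3, using the defaults of section 2, can be sketched in Python as follows. This is only an illustration under stated assumptions and is not the BIND9 implementation. All function, class and label names are hypothetical. The sketch assumes a simple fixed-window counter in which a blob may account for RESPONSES-PER-SECOND (or ERRORS-PER-SECOND) times WINDOW responses before it is treated as full; the memo does not prescribe a particular counting scheme. LOG-ONLY and MIN-TABLE-SIZE are omitted.

```python
import ipaddress
import random
from collections import OrderedDict

# Recommended defaults from section 2 (Python names are illustrative).
RESPONSES_PER_SECOND = 5
ERRORS_PER_SECOND = 5
WINDOW = 5
IPV4_PREFIX_LENGTH = 24
IPV6_PREFIX_LENGTH = 56
LEAK_RATE = 3
TC_RATE = 2
MAX_TABLE_SIZE = 10000
ERROR_RCODES = {"REFUSED", "FORMERR", "SERVFAIL"}


def mask_requestor(addr):
    """Map a requestor address to its bucket (2.2.5, 2.2.6)."""
    ip = ipaddress.ip_address(addr)
    plen = IPV4_PREFIX_LENGTH if ip.version == 4 else IPV6_PREFIX_LENGTH
    return ipaddress.ip_network(f"{ip}/{plen}", strict=False)


def impute_name(qname, zone, wildcard=None, matched=True):
    """Pick the name that stands for the answer (3.1)."""
    if wildcard is not None:   # a wildcard match occurred
        return wildcard
    if not matched:            # no match, e.g. NXDOMAIN within the zone
        return zone
    return qname


class RateLimiter:
    def __init__(self, rng=None, max_table_size=MAX_TABLE_SIZE):
        self.table = OrderedDict()   # key -> [window_start, counter]
        self.rng = rng or random.Random()
        self.max_table_size = max_table_size

    def decide(self, addr, name, rcode, now):
        """Return "send", "truncate", "leak" or "drop" for one response."""
        is_error = rcode in ERROR_RCODES
        key = (mask_requestor(addr), name.lower(), is_error)
        blob = self.table.get(key)
        # Assumption: a blob older than WINDOW starts over (compare 3.3).
        if blob is None or now - blob[0] >= WINDOW:
            blob = [now, 0]
            self.table[key] = blob
        self.table.move_to_end(key)          # mark as most recently used
        if len(self.table) > self.max_table_size:
            self.table.popitem(last=False)   # discard least recently used (3.4)
        blob[1] += 1                         # always count the response (3.2)
        rate = ERRORS_PER_SECOND if is_error else RESPONSES_PER_SECOND
        if blob[1] <= rate * WINDOW:
            return "send"
        # Over the limit: truncate or leak at random; zero disables either.
        if TC_RATE and self.rng.randrange(TC_RATE) == 0:
            return "truncate"
        if LEAK_RATE and self.rng.randrange(LEAK_RATE) == 0:
            return "leak"
        return "drop"
```

A server would call decide() just before transmitting each UDP response, passing the name returned by impute_name(). Requestors in the same /24 (or /56) that solicit the same answer share one blob, so a flood forged from many addresses inside the victim's network is limited as a single flow, while a different network, or an error response for the same name, is counted separately.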