The Role of DNS in Botnet Command and Control (C&C)

DNS is powerful, ubiquitous and yet ignored by most organizations. Today, cybercriminals rely on DNS for rallying infected devices to join a botnet and to mitigate takedowns by authorities. In 2011, cybercriminals started covertly tunneling botnet communications over DNS traffic to evade detection by security solutions, despite security researchers widely publishing this threat in 2004!

QUESTION: What do you know about 101.cnc.com?
• Are any devices outside your network trying to resolve such domain hostnames through your network?
• Are any devices within your network trying to resolve hostnames to that domain?

ANSWER: Analyze logs...
Stored on DNS servers, proxies or Web servers, or firewalls.

RESULT: Post-damage forensics
• Locates infected devices delegated to be name servers for botnet C&C.
• Locates infected devices attempting to tunnel botnet C&C communications over DNS.

If you cannot answer the above questions, either because you don't keep these logs, they're not readily available, or you wouldn't know how to analyze them, you're likely blind to infected devices that have compromised your network by performing these botnet command and control (C&C) activities.

A botnet's principal single point of failure, and its beacon to security researchers, is its Internet-wide C&C architecture. From 2007 to 2008, cybercriminals began building distributed or hybrid C&C topologies leveraging more advanced DNS-based C&C rallying mechanisms, such as third-party dynamic DNS services or their own distributed DNS services, to enable C&C communications to be redirected through their own distributed proxy service. Infected devices within insecure home or business networks host these services. DNS is used to add robustness and mobility, to remove single points of failure within the architecture and to provide anonymity for the cybercriminals running botnet C&C servers (see page 2). Fluxing the domain names and/or the IP addresses in DNS records used by botnets makes them more difficult for the security community to take down or take over.

Today, most botnets rely on a mix of P2P-, HTTP- or IRC-based protocols to communicate between bots and/or C&C servers. However, in late 2011, security researchers began publishing papers and blogs on botnets, such as "Morto", "Feederbot" and "Katusha/Timestamper", that use a covert C&C communication method known as DNS tunneling to add stealth.[1] DNS tunneling is not new; it has existed since 1998, and the first implementation was published on Slashdot in 2000. In 2004, Dan Kaminsky widely presented his implementation for tunneling arbitrary data over DNS to the security community, but it lost the community's short-term attention as other exploited DNS vulnerabilities, such as DNS cache poisoning, became more prevalent.[2] Today, many popular DNS tunnels exist that are readily available for cybercriminals to build botnets that bypass firewall filters or Web proxies.[3] Ethical hackers have constructed a reverse shellcode exploit that could provide cybercriminals VPN and remote access into an insecure network using valid DNS syntax to avoid detection.[4] Furthermore, with the future adoption of DKIM, IPv6 and other extensions to the basic DNS protocol, big and complex packets within DNS traffic will become more common. This will help DNS-based botnet C&C communications blend in more easily and efficiently, since they will appear normal in DNS query streams (see page 2).

In the arms race between cybercriminal organizations and the security community, C&C techniques have become so robust, stealthy and mobile that botnets are ubiquitous in both home and business networks, despite so-called "next generation" security solutions' best attempts to prevent all malware. The "defense-in-depth" strategy needs to migrate from adding prevention layers to adding containment layers. DNS traffic is often examined only after security incidents; for example, Google discovered the advanced and persistent "Aurora" botnet that breached its network by analyzing DNS logs after the damage occurred. The most costly damage is no longer the time lost for IT to remediate infected devices, but the stolen data containing sensitive company or personal information, which legal and regulatory bodies must resolve.

[Figure: "Defense-in-depth" strategy migration. Detect malware (infected device / insecure network); prevent malware (uninfected device / secure network); contain botnets (infected device / secure network).]
[Figure: DNS-based rallying mechanisms help cybercriminals stop takedowns by removing single points of failure. One example of dynamic and distributed DNS services: bots query name servers (ns1.cnc.tld, ns2.cnc.tld) for flux.cnc.tld, and DNS and HTTP traffic is redirected through distributed proxy services to the C&C.]

[Figure: DNS tunneling for covert C&C communications helps avoid detection by blending in. Leaked data is split across the labels of successive port 53 queries for cnc.tld, and commands are returned in the responses, bypassing networks that only allow basic resolvers with no proxy, no filter and no sinkhole.]
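The two questions in the opening box can be checked against whatever DNS query logs an organization retains. The following is a minimal Python sketch, not a finished tool. It assumes a hypothetical whitespace-separated log format (timestamp, client IP, query name, query type); real DNS server, proxy and firewall logs differ, so the parsing would need to be adapted. The function name is illustrative only.

```python
from collections import defaultdict


def clients_resolving(log_lines, domain):
    """Map each client IP to the set of names it queried at or under `domain`.

    Assumed (hypothetical) log line layout:
        <timestamp> <client_ip> <query_name> <query_type>
    """
    domain = domain.lower().rstrip(".")
    hits = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip lines that do not match the assumed layout
        client = parts[1]
        qname = parts[2].lower().rstrip(".")
        # Match the domain itself or any hostname beneath it,
        # but not look-alikes such as "notcnc.com".
        if qname == domain or qname.endswith("." + domain):
            hits[client].add(qname)
    return dict(hits)
```

Client addresses in the result that fall outside your own address space speak to the first question; internal addresses speak to the second.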
The Past, Present and Future of Significant Botnet C&C Techniques
C&C Attributes | Past | Present | Future
RALLYING MECHANISMS | Centralized topology using static IP lists | Distributed or hybrid topology using domain flux and/or IP flux (via DNS records) |
> Static Lists | IP addresses | Domain names and/or IP addresses |
> Domain Flux > Seeding | Predictable timestamp | Dynamic content hidden on popular websites (e.g. Twitter trends) that can be customized in do-it-yourself kits |
> Domain Flux > Crypto | Static | Frequently changing |
> Domain Flux > Names | Random characters | Dictionary word combinations |
> Domain Flux > Volume | Hundreds of domains | Tens of thousands of domains |
> IP Flux > Records | Single flux networks changing A resource records (first seen in the Storm/Peacomm botnets in 2007) | Double flux networks changing both A and NS resource records (first seen in the Asprox botnet in 2008) |
> IP Flux > Service | Existing dynamic DNS services or "personalized" third-level domain (3LD) services. Alternatively, custom DNS servers on bulletproof hosts, which allow a cybercriminal to bypass the laws or contractual terms of service regulating Internet content and service use in their country of operation and which are unlikely to cooperate with authorities. | As dynamic DNS services are taking a more aggressive stance against botnet abuse, and governments are cooperating more quickly with the security community, cybercriminals are building their own distributed DNS services using multiple compromised hosts. Often these are initially bootstrapped via custom DNS servers on bulletproof hosts. |
COMMUNICATION | Centralized topology using IRC- or HTTP-based protocols | Distributed or hybrid topology using P2P- and/or HTTP-based protocols | Hybrid topology with protocol tunneling such as DNS traffic
> IRC > Client | Common IRC client | Cybercriminal's custom IRC client |
> HTTP > DIY Kits | Paid do-it-yourself malware exploit kits (e.g. Mpack, ICEPack, Fiesta) | Paid or open-source do-it-yourself botnet kits (e.g. Zeus, SpyEye, TDSS) |
> HTTP > Protocol | Unencrypted | Encrypted |
> HTTP > Hosts | Privately owned Web servers | Public Web 2.0 services (e.g. Amazon Elastic Compute Cloud, Google App Engine) and social network sites (e.g. Twitter, Facebook, Google Groups) |
> P2P > Port | Non-standard port numbers used by P2P protocols | Standard port numbers used by common encrypted protocols (e.g. SSH, HTTPS) |
> P2P > Protocol | Unauthenticated | Authenticated |
> P2P > Discovery | Centralized in cache servers | Distributed hash tables across the network |
> DNS | Not used | Phone home, data exfiltration and/or bot instructions | Trickled, non-consecutive DNS queries over long time periods to further mitigate detection
C&C RALLYING MECHANISM DESCRIPTIONS

The rallying mechanism enables new bots to locate their peers or the C&C servers and join the botnet. While rallying can also be related to botnet recruitment and propagation, the following mechanisms are only for the purposes of networking the bots.

If the security community is 100% successful in shutting down or hijacking the rallying mechanisms, the botnet falls apart into a benign collection of discrete, unorganized infections. However, if even a few C&C servers remain alive, the botnet can adapt and reconfigure itself to be undetected or protected behind the virtual walls of international jurisdiction. Several movie analogies come to mind, such as Terminator's shape-shifting T-1000 series cyborg or Star Trek's Borg collective; both these entities are very resilient unless the entire control mechanism is eliminated. Today, many botnets use a hybrid of up to all three of the following techniques, where one may initiate the rallying, one maintains the rallying, and another backs up the rallying if the other one or two are disrupted.

Static Lists

Early botnets primarily used hardcoded static lists of IP addresses or domain names. However, many firewalls can add an optional feed of known bad IP addresses to help mitigate this legacy technique, and it is often not agile enough for today's large botnet operations. While some compromised hosts will initially rely on static IPs to bootstrap communications with the botnet, they then switch to one of the following, more robust methods. For added mobility, cybercriminals used domain names with round-robin/multi-homing techniques to associate multiple IP addresses with a single DNS record, or dynamic DNS services, but without abusing them via IP flux, which is described below.

Domain Flux

The botnet uses domain names cryptographically generated by a Domain Generation Algorithm (DGA), which makes it more difficult for static reputation systems to maintain an accurate list of all possible C&C domains or for the security community to attempt to hijack the domain. Many cybercriminals register only a few of the possible generated domains at a time using dynamic DNS services. In limited recent cases such as the "Android bot", URL Flux has been used, which is similar to domain flux in that the bot uses a list of usernames generated by a Username Generation Algorithm (UGA), from which it selects a username to visit on a Web 2.0 site.

IP Flux

Modern botnets primarily use one or more hard-coded domain names for DNS servers to resolve to many different IP addresses over a short span of time. This technique is also widely known as "Fast Flux" Service Networks (FFSN), as it's also associated with spam and phishing attacks. However, the term "IP Flux" best describes the result of rapidly changing the location (i.e. IP address) to which the domain name of an Internet host (A) or authoritative name server (NS) resolves, caused by rapid and repeated changes to DNS records using very low time-to-live (TTL) cache settings. Compared with static IP lists, taking down malicious DNS records is often more difficult than taking down compromised IP addresses, because many records can be established for the same or many IP addresses.

These locations are actually a network of compromised hosts that act as front-end nodes to proxy DNS and C&C communication protocols to a group of backend C&C servers, commonly referred to as a "fast flux mothership" (see page 2). This second layer of abstraction further increases the anonymity, security, high availability and load balancing of the botnet. It makes it nearly impossible to filter only by IP address, ASN or geo-location, and adds resiliency to takedown attempts, as it shifts the centralizing agent of control from the C&C servers to the distributed DNS architecture. In many ways the idea is comparable to Content Delivery Networks (CDN). It has evolved and advanced since The Honeynet Project Research Alliance first discovered its use.

The evolution for cybercriminals to use their own authoritative name servers has added greater robustness and mobility to IP Flux, and makes successful takedown more difficult for the security community. Alternatively, if the compromised devices are redirected to the cybercriminals' own recursive DNS servers, bots are able to resolve domain names to different IP addresses relative to the rest of the Internet; for example, if a security researcher or other network device tries to access the domain, it may appear not to exist. This also allows the bot to resolve well-known domain names (e.g. google.com) to C&C servers.
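The IP Flux behavior described in this section (one name resolving to many addresses with very low TTLs) suggests a simple screening heuristic. The Python sketch below is illustrative only: the function name, the input shape (tuples of name, IP address and TTL collected from observed A record answers) and both thresholds are assumptions, not values from this paper. Because the technique is comparable to a CDN, legitimate CDN-hosted names would also be flagged, so the output is a list of candidates for review rather than a verdict.

```python
from collections import defaultdict


def flux_candidates(answers, min_ips=5, max_ttl=300):
    """Return names that resolved to many distinct IPs, always with low TTLs.

    `answers` is an iterable of (name, ip, ttl_seconds) tuples taken from
    observed A record responses. The thresholds are illustrative
    assumptions and would need tuning per network.
    """
    ips = defaultdict(set)
    highest_ttl = defaultdict(int)
    for name, ip, ttl in answers:
        key = name.lower().rstrip(".")
        ips[key].add(ip)
        highest_ttl[key] = max(highest_ttl[key], ttl)
    return sorted(
        name
        for name in ips
        if len(ips[name]) >= min_ips and highest_ttl[name] <= max_ttl
    )
```

Extending the same counting to NS records would be one way to look for the double flux variant, under the same caveats.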
C&C COMMUNICATION DESCRIPTIONS

Once the bots have joined the botnet, they regularly maintain communications to receive new commands, to send back data to the C&C servers, such as sensitive company or personal information, or to learn how to adapt in response to the security community's efforts to disrupt or take down the botnet's operations. There are advantages and disadvantages, as the following table explains.

Evolution | Past | Present
Topology | Centralized | Distributed or hybrid, yet many are still centralized
Protocols | IRC or HTTP | P2P
Setup | Easy | Hard
Detection | Easy | Hard
Communication | Small delays | Small to medium delays
Resiliency | Bad | Good
Anonymity | Bad | Good

Based on the communication topology, different push and pull control mechanisms will be used together with the communication protocol. Also, command authentication, such as passwords or encryption certificates, can be added to the communication protocol to help prevent outsiders from taking command of the botnet from the cybercriminals, especially with P2P-based protocols.

Direction / Topology | Centralized | Distributed
Push | IRC-based protocols | DDoS & spam attacks
Pull | HTTP-based protocols, IP Flux rallying mechanisms | P2P-based protocols

Centralized Topologies

All early botnets, and still the majority of botnets today, use centralized topologies via HTTP-based, IRC-based or other protocols because they are easier to set up and ensure that new commands are disseminated to large botnet populations quickly. However, centralized C&C servers are easier to detect and become a single point of failure for the botnet (see page 2).

Centralized Communications via IRC-based Protocols

Only one year after the IRC protocol was invented in 1988, programmers created the first bots to enable chat room (aka. channel) operators to log in, to ensure the channel remained open, and to give them non-malicious control. At the turn of the century, many first-generation cybercriminals were very familiar with IRC as a simple, synchronized and scalable means to chat between thousands of hosts, so it was a natural evolution to utilize it for the first C&C communications in 1999. Despite the advent of instant-messaging (IM) protocols such as ICQ, AIM, and MSN Messenger that gained popularity over IRC for the masses, many "old school" networking and security professionals still use IRC. In fact, the original C&C functionality of three evolved IRC-based bot families (Agobot, SDBot, and GTBot) still constitutes a large percentage of today's botnet infections, especially since some of the source code was published by its author, with occasional infections by variants of the DSNX, Q8, kaiten, and Perlbot IRC-based families. While IM is almost the same in principle as IRC, there have been only a few botnets based on IM protocols due to the difficulty of creating individual IM accounts for each bot.

Centralized Communications via HTTP-based Protocols

However, as the security community adapted to use network firewalls to block seldom used or unnecessary ports at the Internet gateway, cybercriminals realized that a more ubiquitous C&C protocol was needed to blend in with normal user traffic. Ports 80 and 443, used for unencrypted and encrypted Web traffic over HTTP/S, are almost universally allowed through firewalls, and a few GET and POST requests used for C&C can easily be lost amongst the exponentially growing volume of legitimate Web traffic. The growth of HTTP-based botnets greatly accelerated with advances in do-it-yourself kits, developed mainly by professional Russian cybercriminals for aspiring amateur cybercriminals, and in mid-2011 several botnet kits were leaked. Recently, public or social Web services have been gaining popularity as C&C hosts via obfuscated commands due to their added anonymity, openness and scalability. However, the security research community can also leverage this openness to quickly shut such botnets down. IDS/IPS solutions can often detect suspicious URI strings or nonstandard HTTP headers (e.g. Entity-Info, Magic-Number) used by botnets (e.g. Bredolab).

Centralized Communications via Other Protocols

FTP isn't commonly seen in the wild; however, several phishing or banking Trojan horses regularly drop off stolen data to FTP servers. Some botnets use custom UDP-only protocols, which, while easily blocked by business networks, are often able to bypass misconfigured firewalls.
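The nonstandard HTTP headers mentioned for HTTP-based C&C (Entity-Info, Magic-Number) lend themselves to a very small check. The Python sketch below is a hedged illustration of the idea, not an IDS/IPS rule from any product: the function name is hypothetical, the header set contains only the two names cited in this paper, and it assumes a raw request with CRLF line endings.

```python
# Header names cited in this paper as used by some botnets (e.g. Bredolab).
# A real deployment would maintain this list from current threat research.
SUSPICIOUS_HEADERS = {"entity-info", "magic-number"}


def suspicious_headers(raw_request):
    """Return the names of suspicious headers found in a raw HTTP request.

    Only the header block (everything before the first blank line) is
    inspected, and the request line itself is skipped.
    """
    head = raw_request.split("\r\n\r\n", 1)[0]
    found = []
    for line in head.split("\r\n")[1:]:
        name, sep, _value = line.partition(":")
        if sep and name.strip().lower() in SUSPICIOUS_HEADERS:
            found.append(name.strip())
    return found
```

A match is only a prompt for investigation; header names are trivial for malware authors to change, which is one reason signature-style checks like this age quickly.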
Distributed Topologies (via P2P-based protocols)

Peer-to-peer (P2P) communications were created to distribute file sharing (e.g. MP3s) amongst large populations. From 1999 to 2003, P2P topologies and protocols quickly evolved to add robustness, stealth and mobility in response to the recording industry's and ISPs' attempts to disrupt communications and/or prosecute guilty individuals; this is exactly what cybercriminals also seek for their botnet C&C communications. Using structured P2P communications as a C&C topology was first envisioned as early as 2000, but the first botnets to use it appeared in 2003, the security research community began to publish its use in 2005, and it wasn't until 2006 that such botnets achieved some limited success. The bots are able to loosely communicate amongst their peers using the same or similar non-RFC TCP, UDP (used to bypass NAT situations) or encrypted ICMP protocols as many file sharing clients (see page 2). This topology offers the botnet better anonymity and resiliency, without any single points of failure, at the expense of higher setup overhead and communication latency. However, since the knowledge about participating peers is distributed throughout the botnet itself, which gives the security research community equal access to this information, cybercriminals evolved the standard P2P protocols to include proprietary authentication.

A future evolution for P2P-based botnet C&C would be to blend in with common encrypted P2P protocol traffic that is ubiquitous within business networks. Fortunately, only one such protocol really exists today: Skype. Despite known malware instances using Skype plugins and its API, to the best of the security community's knowledge, Skype-based botnets are still exclusively theoretical. In 2005, researchers presented an extremely distributed C&C topology using random, unstructured P2P communications broadcast to any other available peers. While one of the very first experimental P2P botnets in 2003 had used such a method, it was not successful, and no other botnets have since been reported to use this topology.

Hybrid Topologies

Advanced hybrid, hierarchical C&C architectures combine the stealth of a few centralized C&C servers with the robustness of distributed peers to prevent takedown. For example, one group of bots act as servants, since they behave as both clients and servers; these have static, non-private IP addresses and are accessible from the global Internet. The second group of bots act only as clients, since they don't accept incoming connections. The second group contains the remaining bots, including: (1) bots with dynamic IP addresses; (2) bots with private IP addresses; or (3) bots behind firewalls such that they cannot be reached from the global Internet. Only servant bots are candidates in peer lists. Another example is the Hierarchical Kademlia bot, which extends the base Kademlia bot. Each level in the hierarchy consists of a set of clusters or islands of bots. These clusters use Kademlia for intra-cluster communication. Each cluster has a super peer that is responsible for communicating with other super peers in the next level up in the hierarchy. The super peers thus facilitate inter-cluster communication (see page 2).

Overall, despite the advancements that cybercriminals have developed, some of the oldest botnet C&C communication techniques are still being used today due to their availability via open or leaked source code, or do-it-yourself kits. The table below provides a few data points published by the security community over the past few years.

C&C Communications | Apr 2008 Arbor Networks | 2008 Symantec | 2009 Symantec | Q2 2010 Microsoft | 2011 govcert.nl
Centralized / IRC | 90% | 44% | 31% | 38.2% | 30%
Centralized / HTTP | 4% | 57% | 69% | 29.1% |
Distributed / P2P | 5% | n/a | n/a | 2.3% | 70%
Other | 1% | n/a | n/a | 30.5% |
DNS-based Communications within Any Topology

Essentially, DNS records are abused to traffic data in and out of a network. Every type of record (NULL, TXT, SRV, MX, CNAME or A) can be used, but the speed of the connection differs by the amount of data that can be stored in a single record (see page 2).

The outbound phase starts with the bot on the compromised device requesting a response from the local host or network DNS server for a DNS query to [data].cnc-domain.tld. The data (base32-encoded) is split and placed in the third- and lower-level domain name labels of multiple queries. Since there will be no cached response on either local DNS server, the requests are forwarded to the ISP's recursive DNS servers, which in turn will get responses from the cybercriminal's authoritative name server. For the inbound phase, TXT records can store the most data (base64-encoded), as typically suggested in DNS tunnel implementations, reaching up to 110 kbps, but they may not be ideal for botnets trying to avoid detection by network devices, since these are not common records. Unfortunately, simply blocking TXT records as a defense method is insufficient, because it will break other protocols (e.g. SPF, DKIM), and because alternative DNS records such as CNAME are common and, used in series, can still transmit detailed instructions for the compromised host to act on. Alternatively, if two-way communication is not necessary, either the queries or the responses can exclude the encoded outbound or inbound data, respectively. This would make the transfer more inconspicuous to avoid anomaly detection systems.

At present, there are not many countermeasures cited by the security community that are "silver bullets" for detecting DNS-based botnet C&C communications. While some larger, security-aware organizations could use techniques such as "split horizon" DNS to force internal hosts to send their DNS requests only through the network DNS server, and then use statistical anomaly detection (aka. signatures) on this DNS traffic, there are unfortunately few to no readily available signatures that are well tested both to guarantee protection and to cause no false positives.

Notable Quote from Ed Skoudis, Founder of Counter Hack Challenges and SANS Fellow (Feb 2012)

"Number of malware threats that receive instructions from attackers through DNS is expected to increase, and most companies are not currently scanning for such activity on their networks, security experts said at the RSA Conference 2012 on Tuesday. While most malware-generated traffic passing through most channels used for communicating with botnets (such as TCP, IRC, HTTP or Twitter feeds and Facebook walls) can be detected and blocked, it's not the case for DNS (Domain Name System) and attackers are taking advantage of that."
http://www.circleid.com/posts/malware_increasingly_uses_dns_as_command_and_control_channel/
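To make the outbound phase in this section concrete, the Python sketch below shows what base32 data split across query labels looks like, together with one naive statistical check a monitor might apply to query names. It is a minimal sketch under stated assumptions: the function names, the chunk size and both thresholds are illustrative choices, not taken from this paper or from any specific DNS tunnel implementation, and the check treats the last two labels as the registered domain. As the text notes, no well-tested signature exists that avoids false positives, and this heuristic is no exception.

```python
import base64
import math
from collections import Counter

MAX_LABEL = 63  # DNS limits each label to 63 bytes (RFC 1035)


def split_into_labels(data, domain, chunk=30):
    """Illustrate the outbound encoding: one base32 label per query name."""
    names = []
    for i in range(0, len(data), chunk):
        encoded = base64.b32encode(data[i:i + chunk]).decode("ascii")
        label = encoded.rstrip("=").lower()
        if len(label) > MAX_LABEL:
            raise ValueError("chunk too large for a single DNS label")
        names.append(label + "." + domain)
    return names


def shannon_entropy(text):
    """Shannon entropy of the characters in `text`, in bits per character."""
    total = len(text)
    counts = Counter(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def looks_like_tunnel(qname, base_labels=2, min_len=24, min_entropy=3.5):
    """Flag query names whose subdomain part is long and high-entropy.

    `base_labels` assumes the registered domain is the last two labels;
    the length and entropy thresholds are illustrative assumptions.
    """
    labels = qname.lower().rstrip(".").split(".")
    sub = "".join(labels[:-base_labels])
    if len(sub) < min_len:
        return False
    return shannon_entropy(sub) >= min_entropy
```

For example, split_into_labels(b"hello", "cnc-domain.tld") yields the single query name nbswy3dp.cnc-domain.tld. Trickled queries with short labels would slip under the length threshold, so a check like this is at best one input to the statistical anomaly detection described above.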
Cloud-based Internet Security Trusted by millions around the world. The easiest way to prevent malware and phishing attacks, contain botnets, and make your Internet faster and more reliable. OpenDNS, Inc. • www.opendns.com • 1.877.811.2367