NetShield: Protocol Anomaly Detection with Datamining Against DDoS Attacks

Kai Hwang, Pinalkumar Dave, and Sapon Tanachaiwiwat
University of Southern California, Los Angeles, CA 90089

Abstract: This article presents a new defense system that protects network servers, network routers, and client hosts from becoming the handlers, Zombies, and victims of distributed denial-of-service (DDoS) flood attacks. The NetShield system was developed at USC to protect any IP-based public network on the Internet. We explore preventive and deterrent controls to remove system vulnerabilities on target machines. Adaptation techniques are suggested for launching protocol anomaly detection and corrective intrusion responses. In particular, we propose a new datamining approach to detecting protocol anomalies caused by DDoS flooding attacks. Suggestions are made for enforcing dynamic security policies on the NetShield security system. At present, NetShield is specially tailored for protecting network resources against DDoS flood attacks. Some analytical results on the detection performance are reported. Open research issues are identified for further work.

Index Terms: DDoS attacks, SYN/UDP/Smurf/ICMP flooding, anomaly detection, risk assessment, datamining for security, alarm classification, and adaptive response

1. Introduction

A recent CERT/CC report shows that reported security incidents are doubling each year [2]. Major victims including Yahoo, Amazon, CNN, eBay, and E*Trade were brought down by DDoS attacks at least once in the past. Network hackers attack client hosts, network servers, and Internet routers by exploiting vulnerabilities and known bugs, installing backdoor software, and covering their tracks with rootkits. Many available intrusion detection systems (IDS) cannot stop DDoS attacks effectively [1, 10, 20]. The problem is getting worse, because SYN, UDP, and ICMP attack tools are easy to use and act faster than the defenses deployed so far. Automatic blocking of DDoS attacks is simply not in use today [4, 17].

This paper reports a new approach adopted in the NetShield Project at the USC Internet and Wireless Security Laboratory [6, 24]. We present the architecture design choices made in NetShield. A datamining learning capability is developed for attack classification. We pursue protocol anomaly detection [14], which improves on signature-based detection by modeling the well-defined protocol standards. Instead of using a misuse detection model, this approach applies a normal-use defense model, which requires no signature updates, a major advantage in reducing detection overhead.

Manuscript submitted on March 31 to RAID 2003, the Sixth International Symposium on Recent Advances in Intrusion Detection, Pittsburgh, PA. Sept. 8-10, 2003. Corresponding author: K. Hwang, Internet and Wireless Security Lab., EEB 212, University of Southern California, Los Angeles, CA 90089. Email: [email protected]

March 31, 2003 (Hwang, Dave, Tanachaiwiwat, USC)

Denial-of-service (DoS) attacks are launched from a single source. These attacks either exploit logic bugs to cause system crashes and destroy files, or flood system resources (CPU/memory), as the SYN flood and Smurf attacks do. The result is to degrade the storage capability and disrupt network connectivity. In recent years, DoS attacks have largely evolved into DDoS attacks (Figure 1). The range of such an attack extends to a global scale with an alarming growth rate [2].

DDoS Attacks: A DoS attack disables the victim's resources from a single attacker. DDoS attacks are network flooding attacks launched from multiple machines simultaneously [16]. By filling the network pipe with an overwhelming number of packets, an attacker can consume all of a network's available bandwidth. A DDoS attack is launched by the attacker from a hidden site. The attack involves four major participants: the attacker, one or more handler nodes, multiple Zombie agents, and a target victim (Figure 1). The attacker first recruits one or more handlers (also known as masters), which in turn recruit innocent or unwilling Zombie hosts to serve as agents that attack the victim machine simultaneously.

[Figure 1 diagram: the Attacker controls the Handlers, each Handler controls multiple Zombies, and the Zombies attack the Target/Victim Network.]

Figure 1 Platforms involved in distributed denial-of-service (DDoS) attacks

The result is a massive flood of packets that crashes the host or bogs down the operation of the entire network. Very few networks or hosts can cope effectively with attacks of such a scale today. Most of the hosts drafted to play the handler and Zombie roles are completely unaware that they are part of a DDoS conspiracy. The hacker must be able to compromise the handler and Zombie nodes and install the attack software there. As a result, even if a victim is able to trace a connection to a particular Zombie host, the Zombie host will be of little help in catching the attacker.

DRDoS Attacks: These are a new generation of DDoS attacks that use a large number of the Internet's core infrastructure routers to launch powerful attacks on the victim host, whose IP address the hacker has forged as the source IP address. Gibson calls these "distributed reflection" (DRDoS) attacks [5]. Malicious SYN packets are reflected off innocent bystanding TCP servers. Their SYN/ACK responses are used to flood the innocent victim and consume the network bandwidth. Gibson has suggested the use of a reflection server list to deal with reflection attacks.

2. DDoS Attack Tools and Defense Approaches

To understand DDoS attacks better, we have to know the tools used by attackers. In this section, we introduce the attack tools, the protocols applied, the attack types, and their damaging impacts. The first sign of a DDoS attack is a large number of compromised systems being flooded with heavy packet streams. The first symptom is likely to be a network router crash. The traffic simply stops the packet flow between your host and the Internet. When the victim servers or the client hosts begin to slow down or come to a complete stop, the attack is almost surely of the DDoS type.

DDoS Attack Tools: Several attack software packages are available in the public domain. Any ordinary user can easily download them to exploit vulnerable hosts and launch DDoS attacks. These packages are summarized in Table 1, which compares the attack tools and their attack methods [16, 20]. The table entries reveal the distinctions among the attack types. They differ in attack characteristics, victim selection, attack tools and methods used, protocols applied, impact and damage, and suitable defense countermeasures.

Table 1 DDoS Attack Tools, Protocols and IDS used, and Damaging Impacts

Attack Tools | Attack Type, Agent Codes | Attack Profile and Applicable IDS and Traffic Monitor Tools
Trin00 (1999) | UDP flood. Handler code: master.c, Zombie code: ns.c | Flooding the network. Traffic monitor
TFN (1999), TFN 2K | UDP, ICMP, SYN, Smurf. Handler code: tribe.c, Zombie code: td.c | Break 3-way handshake TCP connections. Use DDS, RID, and ZombieZapper
Stacheldraht (2000) | UDP, ICMP, SYN, Smurf. Handler code: mserv.c, Zombie code: td.c | Break TCP connections, create a rootshell on an ephemeral port, automatically update Zombies. Use DDS, RID, and ZombieZapper
Shaft (2000) | UDP, ICMP, SYN, and combination of the above | Same as above. Uses symmetric key encryption, difficult to detect
Mstream (2000) | Multiple stream (TCP/ACK) | Heavy flooding attack on one or many destinations
Trinity (2000) | UDP, SYN, RST, ACK, Random Flag, Fragment | Uses Internet Relay Chat to flood the network

DDoS Defense Techniques: Seven classes of DDoS defense techniques are assessed below. References are given for the known techniques (a) to (e), along with our new approaches (f) and (g), which are reported in this paper.

a) Distributed firewalls and stateful IDS: Firewalls are preventive measures to block network attacks including DDoS attacks. We have suggested the use of distributed firewalls and stateful IDSs for hierarchical security control [6]. In this approach, dynamically shifting resources was suggested as the threats increase [10, 25].

b) Packet filtering techniques: Use ingress or egress routers or firewalls to solve the IP spoofing problem, reconfigure the routers to intercept SYN attacks, develop a better selected packet discard (SPD) strategy to reduce filtering overhead [8], use the pushback router to match aggregation signatures with a rate limiter [7], or use IP counters and timers to drop malicious packets more accurately without hurting normal traffic [24].

c) Harden the OS and reduce system vulnerability: Use Tripwire to help determine the state of the operating system, increase the connection queue, and decrease the timeout [11, 20]. Exploit Zombie flaws (e.g. with ZombieZapper) or imitate handlers. Use vigilantes to counterattack unarmed intruders or to stop the Zombies from being used [3, 16].

d) Trace-back and backscatter analysis: Use trace-back to reveal the routers and links traversed by the attack, use backscatter analysis to estimate worldwide DoS activities [17], and use a reflection server list to trace the TCP routers involved in DRDoS attacks [5].

e) Encryption, anti-virus, and security appliances: Encrypt all transmitted packets and use longer asymmetric keys, use anti-virus programs to scan for virus signatures in executable codes [2, 15], and deploy security appliances such as network monitors and sniffers, or mitigate DoS attacks through QoS regulations [4].

f) Anomaly detection and adaptation techniques: Stateful IDS [10] and reconfigurable intrusion detection sensors [25] are very useful for our purpose. We use an alarm-matrix model to distinguish real attacks from false alarms. We choose to implement a protocol anomaly detection system [14]. Our work extends the adaptive intrusion response work for active networks [21, 22] and the adaptation techniques for real-time intrusion detection and response reported in [6, 11, 18, 25].

g) Datamining support for intrusion detection: We propose to design a protocol anomaly detection system aided by datamining and training techniques. Datamining approaches for network security have previously been studied in [9, 12, 13, 19]. In particular, we find the work of Lee et al. [12] and Noel et al. [19] inspirational to our work in this direction.

3. The NetShield Security System at USC

Our NetShield defense system was designed to prevent, detect, and respond to malicious network worm or virus attacks of any type, not only DDoS attacks. At this time, we focus the use of the system on defending against massive DDoS attacks. The system generates a classified intrusion report and assesses the residual risks if certain countermeasures are enabled. Some of the risk from DDoS attacks can be completely removed and some can only be partially blocked, depending on the ability of the filters, firewalls, or IDS applied.

Testbed Network Environment: Figure 2 shows the positioning of the NetShield system in a network environment at USC. The testbed is built as part of a local-area network (LAN). There is a firewall gateway between the LAN and the network router to USC. The router is connected to the Internet backbone. All networks involved are IP-based. The NetShield system is essentially a runtime software library for anomaly-based intrusion detection and semi-automatic responses. The IDS is supported by the datamining features presented in Sections 4 and 5. The RAS is a risk assessment system. The IRS is used to generate intrusion responses. These subsystems are described in subsequent sections.

[Figure 2 diagram: blocks shown are the Internet, the ISP, the Network Router, the Firewall, the LAN, and the NetShield System, which comprises Datamining for Anomaly Intrusion Detection (IDS), the Risk Assessment System (RAS), and the Intrusion Response System (IRS).]

Figure 2 The coalition forces against flooding attacks, in which the NetShield plays a core role with the ISP and TCP router to protect the LAN

Roles of IPS, IDS, and ISP: A comprehensive intrusion defense system is often built with several subsystems, namely an intrusion prevention system (IPS), an IDS, and an IRS, along with the help of the Internet service providers (ISP). We assess below the roles of the IPS, IDS, and ISP. An IPS takes proactive measures to stop hackers before they can do any damage. Some host-based and network-based IPSs are commercially available. The IPS stops DDoS attacks by building defenses against unchanged signatures, behaviors, or patterns of those attacks. An IPS applies software-based heuristics, sandbox protection, and kernel-based protection approaches. Listed below are four techniques for stopping intruders outside the firewall gate.

Obtain the latest service patches for your network hosts and stop privilege escalations.

Understand the trends of target selection (routers, DNS, DHCP, etc.).

Prevent buffer overflow exploits and alteration of system resources.

Prohibit access to e-mail lists, system libraries, files/directories, registry settings, etc.

The IDS is generally installed in series with a firewall to enhance the intelligence of the security system. At the network level, an administrator searches for UDP and TCP packets with certain port numbers. A network-based IDS scans the network links to look for illicit traffic, such as UDP packets from a known port. Signature-based detection has few false positives but many false negatives. Anomaly-based detection demands protocol dynamics and multi-site correlation [14].

The ISP must consider the security effects of a shared network and be able to respond quickly to user needs. Blocking reflection attacks is possible by adding a filter to the aggregation router. The difficulty is that there are so many router ports to be blocked in successive waves of reflection attacks, and so many SYN/ACK packets to be blocked. An ordinary user cannot handle the blocking process. Gibson has proposed a blocking scheme based on a long reflection server list [5].

NetShield Building Blocks: The architecture of the NetShield system is detailed in Figure 3. The system consists of four major functional blocks: the IDS, the AMG (alarm matrix generator), the RAS (risk assessment system), and the IRS. All four subsystems need the support of a packet filter, a traffic monitor, a datamining unit, and a security database. The database stores all attack and response records, which are centrally administered and dynamically updated. The IDS is used to detect intrusions. The host intrusion detection system (HIDS) is used to detect handler/Zombie activities in the network traffic. The network intrusion detection system (NIDS) is used to detect flooding attacks in the incoming traffic. The traffic monitor helps identify traffic characteristics, including the irregular bursts of traffic that signal a DDoS attack. The AMG raises a distinct alarm (alert) after a particular attack type is confirmed.

[Figure 3 diagram: blocks shown are the Security Database, the Packet Filter, the HIDS, the NIDS with its Traffic Monitor, the AMG, the RAS, and the IRS, with incoming and outgoing traffic.]

Figure 3 The NetShield system designed at USC for intrusion detection (IDS), alarm generation (AMG), risk assessment (RAS), and intrusion responses (IRS) to protect a victim server/network from massive DDoS attacks

An attack and an alarm are really two sides of the same coin. When they match each other, the IDS has detected correctly with a hit. Otherwise, the AMG has made a misclassification. When the IDS fails to detect a real attack, it is called a missed detection or false negative. When an alert is wrongly raised with no attack present, the AMG has raised a false-positive alarm. Often, false positives are caused by the background traffic. The IDS/AMG should be able to distinguish among hits, misclassifications, and false alarms.

The RAS is used to assess the risk of a raised alarm and estimate potential damages. When the IDS/AMG fails to detect or classify a real attack, it is called a missed detection or false negative. When an alert is wrongly raised with no attack present, it is called a false-positive alarm. We assess the risk level from multiple attacks according to the potential damages from both real attacks and false alarms. The response cost is rather difficult to estimate accurately. An accurate account of the risk cost can only be produced after a production system has been running under test for some time. For example, we have only considered alarms raised by real attacks, not the alerts from background traffic. In reality, these cannot be ignored by any DDoS defense system.

4. Datamining Mechanisms for Anomaly Detection

Anomaly-based detection demands protocol dynamics and multi-site correlation [14, 23]. If the attack is detected correctly, then an effective response is enabled in time to stop the attack. All attacks and their responses are stored in the database. The recorded information is used to match attack profiles. An attack profile is the combination of the protocol used, packets per second, port used, time interval between packets sent, etc. We explain below how to construct the security database and how to perform the datamining process for anomaly-based intrusion detection.

Security Database Construction: The security database cannot be built overnight. It is constructed incrementally over the entire life span of the network platform being protected. For flooding detection, we must adjust the detection threshold frequently. Signature-based detection has few false positives but many false negatives. For new attacks, signature matching is ineffective at detecting DDoS attacks correctly. As shown in Figure 4, the packet filter eliminates some malicious packets. The security database stores all footprints of previous attacks, including all past attack patterns, the countermeasures deployed, and their effectiveness.

[Figure 4 diagram: blocks shown are the Packet Filter, Profile Matching, the Packet Classifier, the Database Trainer, the Security Database, the Alarm Generator, the Risk Assessment System, and the Intrusion Response System (IRS); labeled flows are Packets, Filtered Packets, Unknown Packet, Database Training, Normal Packets, Abnormal Packets, and Alarms.]

Figure 4 An anomaly-based intrusion detection system (IDS) using datamining to train attack profiles and to drop malformed packets

The database trainer sorts out the knowledge base and updates all attack profiles for use in detecting new attacks in the future. In an anomaly-based IDS, one can handle new attacks by checking the attack profiles and generating appropriate responses [12, 15]. Any significant deviation from the profile is reported as a suspicious attack. New network conditions should also be added to the profile. The traffic monitor is often an integrated part of the packet filter.

Traffic and System Parameters: Each profile entry in the security database contains the following traffic information, along with system information. We have selected only a few of the parameters related to attack detection. Database entries correspond to different attack patterns [8]. Table 2 contains the key traffic and system parameters used in designing the security database. Most table entries are connection counts in CPS, RNC, NOC, and MAXCON, as defined below the table. The two IP address lists, IPinSec and IPRange, are exemplified by IP addresses with the same prefixes or the same suffixes [23]. Any terminal acting as a Zombie or attacker will try to connect to the victim with one of the attack types shown in Table 2. If there is a very high rate of new connection attempts, RNC, from the same source IP, the source may be trying to attack the destination by engaging many new connections. Attempting many new connections from a single source creates a redundant attack pattern, which may repeat itself again and again. We call this phenomenon source IP redundancy.

Table 2 Traffic and System Parameters used in Security Database Design

Profile | CPS | RNC | NOC | IPinSec | IPRange | MAXCON
Entry 1 | 100 pps | 10 | 75 | 130.110.x.x | 128.125.x.x | 1000
Entry 2 | 190 pps | 23 | 50 | 132.23.34.x | 198.128.12.x | 100
Entry 3 | 1000 pps | 100 | 768 | 123.x.x.x | 198.182.x.x | 500

CPS: No. of connections/sec from a particular host, RNC: No. of new connections/sec established, NOC: No. of open connections, IPinSec: List of IP addresses of insecure hosts and Zombies, IPRange: Valid external and internal IP range, MAXCON: Max. no. of connections handled by a system
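As an illustration only, a Table 2 profile entry and the IPinSec lookup could be represented as in the minimal Python sketch below. The class name ProfileEntry, the helper names matches and insecure_source, and the reading of "x" as a wildcard octet are assumptions made for this example; the paper does not specify how the security database is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileEntry:
    """One row of Table 2 (hypothetical layout, field names follow the table)."""
    cps: int        # connections/sec from a particular host
    rnc: int        # new connections/sec established
    noc: int        # open connections
    ip_insec: str   # pattern for insecure hosts and Zombies, e.g. "130.110.x.x"
    ip_range: str   # pattern for the valid IP range, e.g. "128.125.x.x"
    maxcon: int     # max. connections handled by the system


def matches(ip: str, pattern: str) -> bool:
    """Compare dotted-quad octets; an 'x' in the pattern matches any octet (assumed)."""
    ip_parts, pat_parts = ip.split("."), pattern.split(".")
    if len(ip_parts) != 4 or len(pat_parts) != 4:
        return False
    return all(p == "x" or p == o for o, p in zip(ip_parts, pat_parts))


# Values copied from Table 2.
TABLE_2 = [
    ProfileEntry(100, 10, 75, "130.110.x.x", "128.125.x.x", 1000),
    ProfileEntry(190, 23, 50, "132.23.34.x", "198.128.12.x", 100),
    ProfileEntry(1000, 100, 768, "123.x.x.x", "198.182.x.x", 500),
]


def insecure_source(ip: str, entries=TABLE_2) -> bool:
    """True if the source IP matches any IPinSec pattern in the database."""
    return any(matches(ip, e.ip_insec) for e in entries)


if __name__ == "__main__":
    print(insecure_source("123.4.5.6"))    # matches the Entry 3 pattern
    print(insecure_source("128.125.1.1"))  # matches no IPinSec pattern
```

A frozen dataclass is used here so that a stored profile cannot be modified by accident; a real database trainer would presumably write updated entries rather than mutate existing ones.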

Attack Profile Collection: In our study, the raw attack profiles are collected from multiple sources. For anomaly detection, profiles of normal behavior are automatically discovered. This automatic discovery avoids the costly handcrafting of profiles found in a knowledge-based scheme. Stealthy attacks are rather difficult to detect, because a threshold-based detector sends an alarm only when the number of requests exceeds a threshold. It is hard to define abnormal deviations as attacks if they cannot be distinguished from variations of normal behavior. This is also part of our argument for applying a normal-use detection model in the anomaly detection process. The problem is solved by employing classifiers that are trained to learn the difference between normal and abnormal traffic patterns.

Presently, we are collecting the attack profile records from three sources. First, we collect attacks on local resources from the USC Information System Division (the Computer Center at USC). Second, we try to use some of the attack information released in the CERT/CC Security Statistics for 1998-2002. These come from the Computer Emergency Response Team Coordination Center at CMU [2]. Finally, we will collect security information from Cisco [3] and other open sources like the SANS Info Sec Training programs.

5. Protocol Anomaly Detection Process

We propose a new protocol anomaly detection algorithm using the security database, which is constructed through datamining and machine learning as explained below.

Datamining for Anomaly Detection: Datamining can help anomaly detection in many ways, as originally suggested by Noel et al. [19]. The idea is to discover new attack patterns rather than use pre-defined rules. This method creates general representations of various attacks. Datamining and classification are combined to implement the machine learning process, in particular for supporting protocol anomaly detection [14]. Datamining discovers strong associations with well-known protocol standards. The classifier induces classes of attacks, based on datamining results and other attributes from training data associated with known attacks.

There are two phases in the machine learning process: profile training and anomaly detection. The ultimate goal is to implement anomaly detection using a normal-use model instead of the misuse detection model. This approach requires no frequent signature updates, unlike a traditional IDS based on signature matching. In the training phase, the system builds the profile of normal user behavior instead of misuse signatures. This is done by mining rules from network traffic that is known to be attack-free. Also during training, the classifier learns attack classes from network traffic that has been labeled with known attacks. In the detection phase, a dynamic on-line mining algorithm is used to generate the rules of suspicion. Suspicious rules are those that exceed a particular support level and are missing from the profile of normal behavior.

Matching Protocols to Verify Attacks: Figure 5 gives the detailed profile matching and the subsequent IP address and traffic checking for abnormal behaviors.
Profile matching is the method by which we compare the profiles of the incoming packets with previous packet profiles stored in the security database. If they are equal, the received packet belongs to a previously known attack. For example, the HTTP protocol allows the application to use the shortest form of UTF-8 Unicode strings. In the case of a Nimda attack, the network worm exploits a directory traversal vulnerability using overlong UTF-8 characters. This is a typical protocol violation in the attack profile. If the source IP address of the incoming packet is from an insecure host listed in IPinSec, the attack is easily verified as the Nimda type. Due to high variations in the CPS, RNC, and NOC counts, the network traffic often shows irregular behavior, with sudden rises and drops. This is called unusual traffic. We check the following condition in this verification process:

$$\mathrm{CPS} + \mathrm{NOC} + \mathrm{RNC}(\theta) > \mathrm{MAXCON} \qquad (1)$$

The first two terms, CPS + NOC, account for the relevant connection numbers. The number of new connections in the third term, RNC(θ), is a function of the detection threshold θ applied. This threshold is determined by experience after long observation of the traffic spectrum. When the left-hand sum of the various connection counts exceeds the maximum that can be handled locally, the traffic is considered unusual. The unusual traffic pattern creates the flooding damage on the victims.

[Figure 5 flowchart: each packet is checked against the decision nodes "Is protocol profile found in the attack database?", "Source IP in the IPinSec listing?", "Source IP redundancy?", "Oversized fragmentation? (e.g. 46 octets < packet size < 576 octets)", and "Unusual traffic? CPS + NOC + RNC(θ) > MAXCON"; the outcomes are No Attack, Suspicious DDoS Attack, and Verified DDoS Attack.]

Figure 5 Protocol anomaly intrusion detection aided by datamining of attack profiles
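As a worked illustration of the unusual-traffic condition CPS + NOC + RNC(θ) > MAXCON, the short Python sketch below applies it to the three entries of Table 2. It assumes, purely for the example, that RNC(θ) can be taken as the tabulated RNC value; the paper does not give the functional form of RNC(θ), and the function name unusual_traffic is hypothetical.

```python
def unusual_traffic(cps: int, noc: int, rnc: int, maxcon: int) -> bool:
    """Unusual-traffic condition: CPS + NOC + RNC(theta) > MAXCON.

    rnc stands in for RNC(theta); how theta shapes it is not specified here.
    """
    return cps + noc + rnc > maxcon


# (CPS, RNC, NOC, MAXCON) copied from Table 2.
entries = {
    "Entry 1": (100, 10, 75, 1000),
    "Entry 2": (190, 23, 50, 100),
    "Entry 3": (1000, 100, 768, 500),
}

results = {
    name: unusual_traffic(cps, noc, rnc, maxcon)
    for name, (cps, rnc, noc, maxcon) in entries.items()
}

if __name__ == "__main__":
    for name, flagged in results.items():
        print(name, "unusual" if flagged else "normal")
```

Under this reading, Entry 1 sums to 185 against a limit of 1000 and is not flagged, while Entry 2 (263 against 100) and Entry 3 (1868 against 500) are flagged as unusual traffic.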

Alarm Classification: In Figure 5, there are three ways leading to a verified DDoS attack: (1) The protocol profile violates established protocol standards. (2) The source IP is in the insecure IP list. (3) Source IP redundancy is detected even though the source is not in the insecure IP list. A packet is classified as a suspicious DDoS attack in two possible ways: (1) The requested connections exceed the maximum allowed. (2) Oversized fragmentation has occurred, as explained below.

Hosts are not required to reassemble very large IP datagrams. The maximum datagram size that most hosts (servers or user workstations) accept is 576 octets. If an IP packet is larger than 576 octets, the host cannot reassemble it correctly. This is called oversized fragmentation, caused by an intended protocol violation. It confuses the OS and brings the system to a shutdown. Lack of source IP redundancy may lead to either a suspicious attack or no attack. When the traffic rate is normal and there is no oversized fragmentation, the packet is signaled as no attack.
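The classification rules of Figure 5 can be sketched in Python as follows. This is a minimal reading of the text, not the NetShield implementation: the type names Verdict and Observation, the order in which the checks are evaluated, and the use of pre-computed boolean signals are all assumptions of this example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    NO_ATTACK = "No Attack"
    SUSPICIOUS = "Suspicious DDoS Attack"
    VERIFIED = "Verified DDoS Attack"


@dataclass
class Observation:
    """Signals for one packet (hypothetical container for this sketch)."""
    profile_in_attack_db: bool   # protocol profile found in the attack database
    source_in_ipinsec: bool      # source IP is on the IPinSec list
    source_ip_redundancy: bool   # many new connections from the same source
    packet_size: int             # IP datagram size in octets
    cps: int
    noc: int
    rnc: int                     # stands in for RNC(theta)
    maxcon: int


MAX_DATAGRAM_OCTETS = 576  # largest datagram most hosts accept, per the text


def classify(obs: Observation) -> Verdict:
    # Three ways lead to a verified DDoS attack.
    if (obs.profile_in_attack_db or obs.source_in_ipinsec
            or obs.source_ip_redundancy):
        return Verdict.VERIFIED
    # Two ways lead to a suspicious DDoS attack.
    oversized = obs.packet_size > MAX_DATAGRAM_OCTETS
    unusual = obs.cps + obs.noc + obs.rnc > obs.maxcon
    if oversized or unusual:
        return Verdict.SUSPICIOUS
    # Normal traffic rate and no oversized fragmentation.
    return Verdict.NO_ATTACK


if __name__ == "__main__":
    quiet = Observation(False, False, False, 512, 100, 75, 10, 1000)
    print(classify(quiet).value)
```

Evaluating the verified-attack checks first reflects the three-way and two-way grouping in the text; the exact branch order in the original flowchart may differ.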

6. Alarm Matrix for Assessing Detection Results

A new concept, the alarm matrix, is introduced below. The alarm matrix is generated from the IDS report and verified by the datamining mechanisms after a long period of monitoring the traffic conditions. The matrix can be used as a predictor of future detection behavior, in order to distinguish correctly between true and false alarms. The designers of the anomaly detection and the security database must know the ground truth of the attacks, false alarms, and misses before this alarm matrix can be accurately generated. The alarm matrix is a powerful tool for performing risk assessment or evaluating the performance of IDS or filtering processes. We show below how to generate this matrix and then use it to assess the effectiveness of the IDS and filtering results in the next section. When some new attacks come into play, the detection and classification criteria may have to be changed. The matrix will help the adaptation process in changing the detection or filtering thresholds or updating the security database used. In what follows, we use a few DDoS attack types to illustrate the main ideas behind the NetShield defense system.

Alarm Matrix: The detection results of an IDS are represented by a square alarm matrix A = (aij) of order n + 1, where n is the number of distinguishable DDoS attack types as listed in Table 2. The rows correspond to the DDoS attacks and the columns to the alarms raised. The matrix element aij is the number of times that attack type i has triggered alarm type j. The last row is for no attacks, which lead to false-positive alarms. The last column records the false negatives caused by missed detection. For simplicity, we show below a 4 x 4 alarm matrix modeled for three DDoS attack types.

$$
\begin{array}{l}
\text{Attack Type 1} \\
\text{Attack Type 2} \\
\text{Attack Type 3} \\
\text{False positives for no attacks}
\end{array}
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{pmatrix}
= A \qquad (2)
$$

Columns 1 to 3: alarms for the three attack types. Column 4: alarm for false negatives from detection misses.
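To make the roles of the matrix entries concrete, the Python sketch below tallies an alarm matrix of order n + 1. It follows the paper's reading of the matrix (diagonal entries are hits, off-diagonal entries among the attack rows and alarm columns are misclassifications, the last column holds false negatives, and the last row holds false positives). The counts in the matrix A are invented for illustration and the function name summarize is hypothetical.

```python
def summarize(alarm_matrix):
    """Tally hits, misclassifications, false negatives, and false positives."""
    n = len(alarm_matrix) - 1  # number of attack types
    hits = sum(alarm_matrix[i][i] for i in range(n))
    misclassified = sum(
        alarm_matrix[i][j] for i in range(n) for j in range(n) if i != j
    )
    false_negatives = sum(alarm_matrix[i][n] for i in range(n))  # last column
    false_positives = sum(alarm_matrix[n][j] for j in range(n))  # last row
    return {
        "hits": hits,
        "misclassified": misclassified,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
    }


# Hypothetical 4 x 4 alarm matrix for three attack types (made-up counts).
A = [
    [40, 3, 2, 5],   # attack type 1
    [4, 30, 1, 5],   # attack type 2
    [1, 2, 25, 2],   # attack type 3
    [6, 3, 1, 0],    # no attack: false positives; corner entry is 0
]

if __name__ == "__main__":
    print(summarize(A))
```

Plain nested lists are used so that the sketch needs no third-party library; the same sums are one-liners with an array package if one is available.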

The four rows correspond to the 3 attack types and the no-attack case, as labeled. The first 3 columns are the alarms raised. The 4th column shows the false negatives from missed detection. The detection hits are represented by the diagonal elements. The misclassified alarms are recorded by the off-diagonal elements. The false positives correspond to the entries in the last row. Note that the corner entry a44 = 0, because no alarm is raised when there is no attack.

Handlers, Zombies vs. Victim Hosts: The first thing to do is to check whether the site was targeted as the handler, Zombie, or victim by the attacker. These targets are characterized by different alarm matrices. We use the signature-based Snort to detect the communication between the handler and the Zombies. Anomaly-based detection is often used in the router. Table 3 lists the communication ports used by the different attack tools and protocols. These are the signatures used to detect DDoS attacks of various types.

Table 3 Communication Port Numbers Used by the Attack Tools

Attack Tool     Communication Port Numbers
Trin00          1524 tcp, 27665 tcp, 27444 udp, 31335 udp
TFN             ICMP Echo, ICMP Reply
Stacheldraht    16660 tcp, 65000 tcp, ICMP Echo, ICMP Reply
Shaft           20432 tcp, 18753 udp, 20433 udp
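To illustrate how the port signatures of Table 3 might be consulted, the following Python sketch stores the table as a lookup structure and returns the attack tools whose default signature matches an observed protocol and port (or ICMP message type). It is not a Snort rule set. The names `PORT_SIGNATURES` and `candidate_tools` are hypothetical, and the 65000 entry for Stacheldraht is read as TCP.

```python
from typing import Dict, List, Set, Tuple, Union

# A signature is (protocol, port number) or ("icmp", message type).
Signature = Tuple[str, Union[int, str]]

# Default communication signatures transcribed from Table 3.
# Assumption: the 65000 entry for Stacheldraht is a TCP port.
PORT_SIGNATURES: Dict[str, Set[Signature]] = {
    "Trin00": {("tcp", 1524), ("tcp", 27665), ("udp", 27444), ("udp", 31335)},
    "TFN": {("icmp", "echo"), ("icmp", "reply")},
    "Stacheldraht": {("tcp", 16660), ("tcp", 65000),
                     ("icmp", "echo"), ("icmp", "reply")},
    "Shaft": {("tcp", 20432), ("udp", 18753), ("udp", 20433)},
}


def candidate_tools(protocol: str, port_or_type: Union[int, str]) -> List[str]:
    """Return, in alphabetical order, the tools whose signature matches."""
    key = (protocol.lower(), port_or_type)
    return sorted(tool for tool, sigs in PORT_SIGNATURES.items() if key in sigs)


print(candidate_tools("udp", 27444))    # a port unique to one tool
print(candidate_tools("icmp", "echo"))  # a signature shared by two tools
print(candidate_tools("tcp", 80))       # ordinary web traffic, no match
```

The second query returns two tools, which shows why a signature that several tools share cannot by itself separate their alarms.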

Four Attack Scenarios: To demonstrate the use of the alarm matrix for reporting IDS results, we consider below four example alarm matrices, namely H, Z, R, and V. The matrix entries are hypothetical, based on our understanding of the detection behaviors of the handler, the zombie, the router, and the victim platforms, respectively. In Table 4, we summarize the platform vulnerabilities, the IDS behavior, and the alarm matrix characteristics.

Table 4 Attack Scenarios and IDS Behavior on Four Network Platforms (FP: false positive, FN: false negative)

Target Platform: Handler exploited to command the Zombies to attack
Intrusion Detection System (IDS): A signature-based Snort is used. The TFN attack can be distinguished from the Trin00 and Shaft attacks, which are difficult to distinguish from each other. Both FP and FN alarms exist.
Alarm Matrix Characteristics: Matrix H: High detection rate, moderate false alarm rates, and confusion only between the Trin00 and Shaft attacks.

Target Platform: Zombie exploited to launch attacks on the victim
Intrusion Detection System (IDS): The Snort is less effective at distinguishing among the three attack types, but both FP and FN alarms are fewer. The Stacheldraht attack is more easily confused with the other two attack types.
Alarm Matrix Characteristics: Matrix Z: Low detection hits and lower FP and FN rates, with a high misclassification rate between Stacheldraht and the other two attacks.

Target Platform: Internet router used to launch DRDoS attacks
Intrusion Detection System (IDS): An anomaly-based IDS uses a low threshold to enforce high security at the expense of router throughput. There are no misclassifications, because the 3 attack types use different protocols.
Alarm Matrix Characteristics: Matrix R: High detection hit rate, no missed detections, and no confused detection among the 3 attack types.

Target Platform: Server or client host targeted as the attack victim
Intrusion Detection System (IDS): A low-cost, signature-based IDS is used. There are no FP alarms, because no signature is matched when there is no attack, and no misclassified alarms, because the 3 attack types use different protocols.
Alarm Matrix Characteristics: Matrix V: High detection hit rate, no misclassification and no FP alarms, but a high FN alarm rate from using a low-cost IDS.

Each case is specified separately below. The matrix entries shown are hypothetical values that reflect the typical behavior of the IDS applied on the different platforms or routers traversed.

Case 1: Detection Result on the Handler (Alarm Matrix H)

The assumption here is that the handler machine is vulnerable and has been compromised by the attacker. The handler controls a large number of Zombies recruited in the attacking process. It has installed an IDS, which can detect most of the attacks but is not good enough to distinguish between the Trin00 and Shaft attacks. Thus, the detection hit rate is high, and the only misclassifications are a few between these two attack types. There is no confusion (zero entries in H) between the Trin00 and TFN attacks or between the TFN and Shaft attacks. This distinctive behavior is reflected below by the alarm matrix H with respect to 3 DDoS attack types: Trin00, TFN, and Shaft.

$$
H \;=\;
\begin{array}{l}
\text{Trin00} \\
\text{TFN} \\
\text{Shaft} \\
\text{False positive}
\end{array}
\begin{bmatrix}
20 & 0 & 2 & 4 \\
0 & 8 & 0 & 2 \\
6 & 0 & 14 & 4 \\
5 & 3 & 2 & 0
\end{bmatrix}
\tag{3}
$$

Both false positives and false negatives occur often, as shown by the entries in the bottom row and the rightmost column, respectively. The Trin00 and Shaft attack tools share the same communication protocols, UDP and TCP, whereas TFN uses the ICMP protocol. This explains why the IDS was confused between the Trin00 and Shaft attacks, with nonzero entries (6 and 2) on the off-diagonal elements of matrix H. The communication ports given in Table 3 are set by default. The handler sees the communication links with its zombies, but not conversely.

Case 2: Detection Results on the Zombie or Agent Machine (Alarm Matrix Z)

The zombie machine is also compromised, with an attack program embedded by the attacker. The Zombie IDS behavior is shown below by the alarm matrix Z. This IDS is less effective, with fewer detection hits than reported in matrix H. The zombie IDS is confused among the attack types, as shown by the nonzero entries on the off-diagonal elements. However, both false positives and false negatives are fewer in matrix Z.

$$
Z \;=\;
\begin{array}{l}
\text{Trin00} \\
\text{TFN} \\
\text{Stacheldraht} \\
\text{False positive}
\end{array}
\begin{bmatrix}
5 & 0 & 10 & 1 \\
0 & 3 & 6 & 2 \\
8 & 20 & 4 & 2 \\
1 & 1 & 1 & 0
\end{bmatrix}
\tag{4}
$$

The Zombie IDS is unable to detect the communications between the zombies and the handler that commands their operations. The control commands consist of a small number of packets, which are difficult to detect amid heavy normal traffic. The zombies passively obey orders from the handler. They are not the victims of the attack; they just serve as intermediate agents. The Stacheldraht attack tool is easily confused by the IDS, because it uses the same ICMP and TCP protocols used by the Trin00 and TFN tools. However, if a large number of zombie machines reside in the network, all commanded by the same handler, it may be easier to identify the misbehavior.

Case 3: Detection Results on the Network Router (Alarm Matrix R)

In the case of an Internet router, we assume that the security demand has higher priority than the throughput performance. This IDS behavior is represented by the alarm matrix R below. The router uses an anomaly-based IDS on its traffic monitor. This monitor sets the detection threshold low to facilitate the detection of traffic bursts. We assume that suspicious attack events, instead of the number of packets, are counted in the intrusion detection process.

$$
R \;=\;
\begin{array}{l}
\text{SYN Flood} \\
\text{UDP Flood} \\
\text{Smurf} \\
\text{False positive}
\end{array}
\begin{bmatrix}
10 & 0 & 0 & 0 \\
0 & 13 & 0 & 0 \\
0 & 0 & 24 & 0 \\
5 & 2 & 5 & 0
\end{bmatrix}
\tag{5}
$$

Each attack event corresponds to 1000 pps to alert the system (see Table 3). This case has no misclassification, because the three attack types apply different protocols on different ports. One property worth mentioning is that the router IDS reports many false positives. These are caused by the background traffic. This scheme benefits security-sensitive applications. However, the normal traffic may suffer to some extent from the low detection threshold set in the IDS. In other words, many normal packets could be blocked or dropped with the high security threshold applied.

Case 4: Detection of DDoS Attacks on the Victim Server (Alarm Matrix V)

On the victim server or the client host, we assume a signature-based IDS, chosen for its low cost. The victim IDS generates the alarm matrix V below. This case has high detection hits, as seen by the large diagonal entries. There are no false positives and no misclassifications. The signature-based IDS may have outdated signatures and may therefore miss many detections. This coincides with practical situations, where the web server has to serve a large number of requests and has to handle the detection itself. Thus the server may drop some packets when using an ordinary host-based IDS. This explains the large number of missed detections in the rightmost column of matrix V.

$$
V \;=\;
\begin{array}{l}
\text{SYN Flood} \\
\text{UDP Flood} \\
\text{Smurf} \\
\text{False positive}
\end{array}
\begin{bmatrix}
10 & 0 & 0 & 12 \\
0 & 33 & 0 & 14 \\
0 & 0 & 24 & 8 \\
0 & 0 & 0 & 0
\end{bmatrix}
\tag{6}
$$

Using the above 4 alarm matrices, we evaluate the performance of the IDS defense systems applied on four target platforms in the next section. The IDSs used, the assumptions, and alarm matrix entries are based on the characterization given in Table 4.
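The qualitative properties that Table 4 attributes to matrices R and V can be checked directly from their entries. The Python sketch below, with the entries of R and V read row by row from Eqs. (5) and (6), totals the hits, misclassifications, false negatives, and false positives of an alarm matrix. The function name `summarize` and the dictionary keys are illustrative choices, not NetShield terminology.

```python
from typing import Dict, List

Matrix = List[List[int]]

# Entries read row by row from Eqs. (5) and (6).
# The last row is the no-attack case; the last column holds the misses.
R: Matrix = [[10, 0, 0, 0], [0, 13, 0, 0], [0, 0, 24, 0], [5, 2, 5, 0]]
V: Matrix = [[10, 0, 0, 12], [0, 33, 0, 14], [0, 0, 24, 8], [0, 0, 0, 0]]


def summarize(matrix: Matrix) -> Dict[str, int]:
    """Total the four kinds of entries in an (n + 1) x (n + 1) alarm matrix."""
    n = len(matrix) - 1
    return {
        # diagonal entries of the attack rows
        "hits": sum(matrix[i][i] for i in range(n)),
        # off-diagonal entries among the attack rows and alarm columns
        "misclassified": sum(
            matrix[i][j] for i in range(n) for j in range(n) if i != j
        ),
        # last column of the attack rows
        "false_negatives": sum(matrix[i][n] for i in range(n)),
        # last row of the alarm columns
        "false_positives": sum(matrix[n][j] for j in range(n)),
    }


router_summary = summarize(R)
victim_summary = summarize(V)
print("router:", router_summary)
print("victim:", victim_summary)
```

For these two matrices the totals agree with Table 4: the router shows false positives but no misses or misclassifications, while the victim shows misses but no false positives or misclassifications.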

7. Analytical Results on Detection Performance

With datamining, the detection performance can be systematically assessed. This improvement results directly from the classification of the attacks into three classes in Figure 5. It benefits the normal traffic flow and makes the dropping of malicious packets more accurate. Anomaly detection complements the static firewalls and IDS to provide a reinforced defense against DDoS attacks. In this section, we show how to use the alarm matrix to project the joint datamining and intrusion detection performance.

Consider attack type i, where i = 1, 2, and 3 represent the three DDoS attacks in the above alarm matrices. The attack frequency Fi = ai1 + ai2 + ai3 + ai4 equals the sum of all matrix entries in the i-th row. The false positives are not part of the attack frequency. The alarm frequency Gj = a1j + a2j + a3j + a4j for j = 1, 2, and 3 counts all alarms of type j that were raised, including the false positives. The relative magnitude of the two varies with the IDS and platform applied. If G > F, the IDS may be so cautiously designed that it overkills, raising many false-positive alarms. If G < F, the IDS is ineffective and has missed many attacks. We derive 4 performance expressions from the above alarm matrices.

Detection and Alarm Rates: For each attack type i = 1, 2, and 3, we compute the detection hit rate and the detection miss rate by the following two ratios, respectively:

$$
H_i = \frac{a_{ii}}{F_i}
\qquad \text{and} \qquad
M_i = \frac{a_{i4}}{F_i}
\tag{7}
$$

The false positive rate for j = 1, 2, and 3 is given by the expression:

$$
S_j = \frac{a_{4j}}{G_j} = \frac{a_{4j}}{a_{1j} + a_{2j} + a_{3j} + a_{4j}}
\tag{8}
$$

The misclassification rate is defined for the 3 attack types as follows:

$$
T_1 = \frac{a_{21} + a_{31}}{G_1}, \qquad
T_2 = \frac{a_{12} + a_{32}}{G_2}, \qquad
T_3 = \frac{a_{13} + a_{23}}{G_3}
\tag{9}
$$
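The four rates of Eqs. (7) to (9) can be computed mechanically from any alarm matrix. The following Python sketch does so for the router matrix R of Eq. (5). The function name `detection_metrics`, the dictionary keys, and the convention of returning 0.0 when a frequency is zero are assumptions of this sketch, not part of the paper.

```python
from typing import Dict, List

Matrix = List[List[int]]

# Router alarm matrix R from Eq. (5); last row = no attack, last column = misses.
R: Matrix = [[10, 0, 0, 0], [0, 13, 0, 0], [0, 0, 24, 0], [5, 2, 5, 0]]


def _ratio(numerator: int, denominator: int) -> float:
    # Assumption of this sketch: a rate over an empty row or column is 0.0.
    return numerator / denominator if denominator else 0.0


def detection_metrics(a: Matrix) -> Dict[str, list]:
    """Compute F_i, G_j and the rates H_i, M_i, S_j, T_j of Eqs. (7)-(9)."""
    n = len(a) - 1                                     # number of attack types
    f = [sum(a[i]) for i in range(n)]                  # attack frequency F_i (row sums)
    g = [sum(row[j] for row in a) for j in range(n)]   # alarm frequency G_j (column sums)
    return {
        "F": f,
        "G": g,
        "hit": [_ratio(a[i][i], f[i]) for i in range(n)],             # H_i
        "miss": [_ratio(a[i][n], f[i]) for i in range(n)],            # M_i
        "false_positive": [_ratio(a[n][j], g[j]) for j in range(n)],  # S_j
        "misclassification": [                                        # T_j
            _ratio(sum(a[i][j] for i in range(n) if i != j), g[j])
            for j in range(n)
        ],
    }


router_metrics = detection_metrics(R)
for name, values in router_metrics.items():
    print(name, values)
```

For R, every alarm frequency G_j exceeds the corresponding attack frequency F_j, which is the cautious G > F situation described above: all attacks are detected, at the cost of false positives.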

[Figure 6 consists of four bar charts, each with a vertical axis from 0% to 100% and one group of bars per attack type: (a) Handler machine (Matrix H), (b) Zombie machine (Matrix Z), (c) Router machine (Matrix R) with SYN Flood, UDP Flood, and Smurf, and (d) Victim machine (Matrix V) with SYN Flood, UDP Flood, and Smurf.]

Figure 6. Intrusion detection rate, miss rate, and false alarm rates on four target platforms involved in DDoS attacks

Analytical Performance Results: Figure 6 plots the detection rate and the various alarm rates for three attack tools in each of the 4 attack scenarios specified above. Examining the 4 plots in Figure 6, we make the following observations on their relative performance.

• Among the four cases, the network router has a perfect detection rate, with 100% hits for all 3 attacks (SYN flood, UDP flood, and Smurf). The handler also has a high detection rate (58-80%). The Zombie has a lower detection rate of less than 26%. This is considered a typical performance rating of innocent hosts involved as unwilling agents in DDoS attacks. Of course, the rating depends on the quality of service and the capability of the IDS used.

• The Zombie machine has the highest rate of misclassification. This implies that the Zombie is quite confused among the three attack types: Trin00, TFN, and Stacheldraht. The false alarm rates in all cases are relatively low. The router has false positives but no false negatives or misclassifications.

• The victim machine has a moderate detection rate in the range (45%, 75%) and the highest miss rate in the range (25%, 55%), but no false alarms or misclassifications at all. The victim may appear as a client host (PC or workstation) or a web server. This performance is caused by the large number of false negatives encountered on victim machines. This implies that most victims are vulnerable and should be equipped with a host-based IDS to overcome the false negatives and misclassifications of DDoS attacks.
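As a worked check of the ranges quoted for the victim machine, the hit and miss rates of Eq. (7) can be evaluated on the entries of matrix V in Eq. (6). The arithmetic below is only an illustration of how the quoted ranges follow from the hypothetical matrix entries.

```latex
\[
\begin{aligned}
\text{SYN Flood:} \quad H_1 &= \frac{10}{10 + 0 + 0 + 12} = \frac{10}{22} \approx 45.5\%, &
M_1 &= \frac{12}{22} \approx 54.5\% \\
\text{UDP Flood:} \quad H_2 &= \frac{33}{0 + 33 + 0 + 14} = \frac{33}{47} \approx 70.2\%, &
M_2 &= \frac{14}{47} \approx 29.8\% \\
\text{Smurf:} \quad H_3 &= \frac{24}{0 + 0 + 24 + 8} = \frac{24}{32} = 75\%, &
M_3 &= \frac{8}{32} = 25\%
\end{aligned}
\]
```

The hit rates span roughly 45% to 75% and the miss rates roughly 25% to 55%. Because the last row of V is all zeros and there are no off-diagonal entries, every false positive rate and every misclassification rate of the victim is zero.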

8. Conclusions and Extended Research

The architectural design and feature innovations of the NetShield defense system are reported above. The NetShield software is being developed through a simulator at the USC Internet and Wireless Security Laboratory. Our continued effort is to build the complete security testbed with a Linux cluster. The NetShield simulator will be converted into a production software suite. Benchmark evaluation experiments on the NetShield system are still in progress at this time.

At present, NetShield is specially tailored to stop DDoS flooding attacks. However, the design can be easily extended to defend against other attacks from network worms or viruses. We proposed an alarm-matrix framework to model the platform/IDS behavior. Risk assessment is carried out by profile datamining and protocol anomaly detection. This alarm-matrix model is shown to be effective for evaluating various attack scenarios on the different platforms involved in DDoS attacks.

Lessons Learned: To sum up, we highlight below the major findings from this work and the important lessons we have learned. The proposed NetShield architecture is based on profile datamining, risk assessment, and proactive defenses, as specified in previous sections.

a) Both reactive and proactive countermeasures are needed to fight against DDoS attacks. These attacks may paralyze a wide range of Internet and user resources if adaptive countermeasures, such as those suggested in the NetShield system, cannot be deployed in time. Dynamic intrusion response must be based on the risk assessment results and must rely on frequent security updates, by adjusting the intrusion detection thresholds and following the traffic patterns closely.

b) New vulnerabilities could be introduced by the patches applied to block DDoS attacks. Prevention strategies are needed to block newly introduced vulnerabilities in the network. A successful response that protects one host may be extended across all vulnerable hosts at a resource site. Patching all hosts may prevent R2L (remote-to-local) and U2R (user-to-root) attacks, but it is not necessarily effective in stopping DoS attacks. Across-site attacks must be addressed in future research.

c) We have proposed a new datamining approach to automate the protocol anomaly detection process. The alarm-matrix model is a handy tool for evaluating any IDS-based security system. The idea is to perform risk assessment through datamining of logged security records. These logs should cover historical attack patterns, the corresponding responses, and the effectiveness of those responses as assessed in the past. The security database should be able to learn from the past in order to face new attacks in the future.

Extended Research: Designing an effective NetShield system is quite involved. In addition to the attack and detection models presented above, we identify below a number of other outstanding issues that still await satisfactory solutions. Any new solution must be verified by actual implementation and a long period of benchmarking experiments. Listed below are several good open research problems to be solved.

d) Our current NetShield functionalities are being simulated in the USC labs. We have only used security breach data collected locally from the USC Information System Division. We plan to explore the large sets of breach data collected from business or government communities, once a full security testbed is built by the end of 2003. We have planned major security benchmark experiments on the testbed in the next few years.

e) The response timing and repeated responses must be evaluated for their effectiveness in blocking different DDoS attack types. The effects of blocking malicious packets at the expense of blocking normal packets should be studied. Neither overkill nor undercut is desired. The defense infrastructure should never become a victim itself.

f) We have presented the attack profile matching algorithm (Figure 5) and functionally specified the corresponding architecture requirements in Figures 2 - 4. The security database and the DDoS datamining experiments both take a long time to develop. We will report real benchmark results on NetShield in the future [24]. Much more performance data is needed before a final conclusion can be drawn on the advantages and relative merits of protocol anomaly detection as compared with signature-based detection.

References:

1. Axelsson, S., “The Base-Rate Fallacy and its Implications for the Difficulty of Intrusion Detection”, Proc. of the 6th ACM Conference on Computer and Communication Security, Singapore, Nov. 1999, pp. 1-7.

2. CERT/CC Security Statistics during 1988-2002, Computer Emergency Response Team, Coordination Center at Carnegie Mellon Univ., Pittsburgh, PA, Oct. 20, 2002, http://www.cert.org/stats/cert_atates.html

3. Cisco QoS and DDoS Engineering Issues for Adaptive Defense Network, MITRE, 7/25/2001, http://www.mitre.org/support/papers/tech_papers_01/moore_cisco/index.shtml

4. Garg, A. and Reddy, A.L., “Mitigation of DoS Attacks Through QoS Regulation”, Proc. of IWQoS Workshop, May 2002.

5. Gibson, S., “Distributed Reflection Denial-of-Service Attacks”, Gibson Research Corporation, Feb. 2002, http://grc.com/dos/drdos.htm

6. Hwang, K. and Gangadharan, K., “Micro-Firewalls for Dynamic Network Security with Distributed Intrusion Detection”, Proc. of the IEEE Int’l Symp. on Network Computing and Applications, Cambridge, MA, October 18, 2001.


7. Ioannidis, J. and Bellovin, S., “Pushback: Router-Based Defense Against DDoS Attacks”, Proc. of Network and Distributed System Security Symposium, Feb. 2002.

8. Kuri, J., Navarro, G., Mé, L., and Heye, L., “A Pattern Matching Based Filter for Audit Reduction and Fast Detection of Potential Intrusions”, Proceedings of Third International Symposium on Recent Advances in Intrusion Detection, 2000, pp. 17-21.

9. Kantardzic, M., Data Mining: Concepts, Models, Methods, and Algorithms, John Wiley & Sons, New York, 2002.

10. Kruegel, C., Valeur, F., Vigna, G., and Kemmerer, R., “Stateful Intrusion Detection for High-Speed Networks”, Proceedings of 2002 IEEE Symp. on Security and Privacy, May 2002, pp. 162-172.

11. Lee, W., et al., “Performance Adaptation in Real-Time Intrusion Detection Systems”, Proc. of RAID 2002, Lecture Notes in Computer Science No. 2516, Springer-Verlag, Berlin, 2002, pp. 252-273.

12. Lee, W., Stolfo, S. J., and Mok, K., “A Data Mining Framework for Adaptive Intrusion Detection”, in Artificial Intelligence Review, Kluwer Academic Publishers, 1999.

13. Lin, T. Y., Hinke, T. H., Marks, D. G., and Thuraisingham, B., “Security and Data Mining”, Proceedings of the Ninth Annual IFIP TC11 Working Conference on Database Security, Rensselaerville, N.Y., 1996.

14. Lemonnier, E., “Protocol Anomaly Detection in Network-based IDSs”, June 2001, http://erwan.lemonnier.free.fr/exjobb/report/protocol_anomaly_detection.pdf

15. Almgren, M. and Lindqvist, U., “Application-Integrated Data Collection for Security Monitoring”, RAID 2001, Proceedings of Fourth International Symposium on Recent Advances in Intrusion Detection, Davis, CA, October 10, 2001.

16. Mirkovic, J. and Reiher, P., “A Taxonomy of DDoS Attacks and DDoS Defense Mechanisms”, Technical Report No. 020018, Computer Science Dept., UCLA, 2002.

17. Moore, D., Voelker, G., and Savage, S., “Inferring Internet Denial of Service Activity”, Proc. of the 2001 USENIX Security Symposium, Washington, D.C., August 2001.

18. Morin, B., Mé, L., Debar, H., and Ducassé, M., “M2D2: A Formal Data Model for IDS Alert Correlation”, RAID 2002, Fifth Int’l Symposium on Recent Advances in Intrusion Detection, Zurich, Switzerland, October 16-18, 2002.

19. Noel, S., Wijesekera, D., and Youman, C., “Modern Intrusion Detection, Datamining, and Degree of Attack Guilt”, in Application of Datamining in Computer Security, Kluwer Academic Publishers, 2002.

20. NSS, Intrusion Detection Systems Group Test (Edition 3), NSS Group Report, Oakwood House, Cambridgeshire, PE28 2LX, England, U.K., 2002.

21. Petkac, M. and Badger, L., “Security Agility in Response to Intrusion Detection”, 16th Annual Computer Security Applications Conference, New Orleans, Louisiana, 2000.

22. Ragsdale, D., Carver, C. A., Humphries, J. W., and Pooch, U. W., “Adaptation Techniques for Intrusion Detection and Intrusion Response Systems”, Proceedings of the IEEE Int’l Conf. on Systems, Man, and Cybernetics, Nashville, TN, Oct. 8-11, 2000, pp. 2344-2349.


23. Tan, K., Killourhy, K. S., and Maxion, R. A., “Undermining an Anomaly-Based Intrusion Detection System Using Common Exploits”, Fifth International Symposium on Recent Advances in Intrusion Detection (RAID-2002), Zurich, Switzerland, October 2002, pp. 54-73.

24. Tanachaiwiwat, S. and Hwang, K., “Adaptive Packet Filtering Against DDoS Attacks”, Technical Report, Dept. of EE-Systems, University of Southern California, April 2003.

25. Vigna, G., Kemmerer, R. A., and Blix, P., “Designing a Web of Highly Configurable Intrusion Detection Sensors”, Proc. of the 4th Int’l Symp. on Recent Advances in Intrusion Detection (RAID 2001).

Biographical Sketches:

Kai Hwang is a Professor of Electrical Engineering and Computer Science and the Director of the Internet and Wireless Security Lab at the University of Southern California. An IEEE Fellow, he specializes in computer architecture, parallel processing, Internet security, and grid computing. Dr. Hwang has published numerous papers and books in these areas. Presently, he leads a research group at USC developing new Internet security architectures and distributed intrusion detection and response systems for cluster and grid computing. He can be reached at [email protected]

Pinalkumar Dave is completing his M.S. degree in Electrical Engineering at the University of Southern California in May 2003. He will continue pursuing the Ph.D. degree at USC. He received his B.S. degree from the Department of Electronics and Communication, Gujarat University, India. His current research interests include wireless and ad hoc network security and distributed grid computing. He can be reached at [email protected].

Sapon Tanachaiwiwat is presently pursuing the Ph.D. degree in the Electrical Engineering Department at the University of Southern California. He received his B.S. degree in Electrical Engineering from Mahidol University in Thailand and his M.S. degree in EE from USC. His current research interests include Internet security and distributed intrusion detection and response. He can be reached at [email protected].

March 31, 2003 (Hwang, Dave, Tanachaiwiwat, USC)
