Performance of Packet Capturing Systems

Author: Ronald Gibbs
Hardware Selection for Monitoring

Fabian Schneider
Technische Universität Berlin
Deutsche Telekom Laboratories

11.12.2006

Fabian Schneider (TU Berlin/DT Labs)

Performance of Packet Capturing Systems

11.12.2006

1 / 34

Introduction

Motivation

• High-speed networks → high data and packet rates
• Network security tools need to capture this traffic
• Two choices:
  • expensive special-purpose hardware
  • cheap commodity systems

⇒ Is it feasible to capture the traffic with commodity hardware?

Outline

1 Monitoring 10 Gigabit
2 Measurement Setup
  Systems under Test
  Topology
  Procedure
  Profiling
3 Workload
  Workload Generation
  Packet Size Distribution
  Output
4 Results
  Using multiple processors?
  Increasing the buffer size
  Additional filtering
  Additional copy operations
  mmapped pcap (Linux)
  Write to disk
  Further Results
5 Conclusion
  Summary
  Future Work
  Resources

Monitoring 10 Gigabit

• Monitoring 10 Gigabit of traffic requires approx. 2500 MBytes/s (both directions)
• No recent bus or disk system can handle this!
• The traffic needs to be split up:
  • use a switch, e.g. its link bundling feature (Cisco: EtherChannel)
  • use specialized hardware
• But be careful: do not split up data that belongs together
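
The 2500 MBytes/s figure and the "keep related data together" rule can be illustrated with a small sketch. It is not taken from the talk: the function names and the CRC-based hash are assumptions chosen for illustration, and a real switch or splitter uses its own distribution scheme. The point is only that a direction-independent flow key sends both directions of a connection to the same monitor.

```python
import zlib


def required_mbyte_per_s(link_gbit_per_s, directions=2):
    """Capture bandwidth in MByte/s for a link monitored in `directions` directions."""
    return link_gbit_per_s * directions * 1000 / 8


def monitor_for_flow(src_ip, src_port, dst_ip, dst_port, n_monitors):
    """Map a flow to a monitor index (hypothetical scheme, not a switch feature).

    Both directions of the same connection yield the same index.
    """
    # Sorting the two endpoints makes the key independent of the direction.
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}".encode()
    return zlib.crc32(key) % n_monitors


# 10 Gbit/s in both directions: 10 * 2 * 1000 / 8 MByte/s
print(required_mbyte_per_s(10))
```

With four monitors, `monitor_for_flow("10.0.0.1", 1234, "10.0.0.2", 80, 4)` and the same call with source and destination swapped return the same index.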

Measurement Setup

Systems under Test

Opterons: 2x AMD Opteron 244 (1 MB cache, 1.8 GHz), 2 GB RAM, Intel 82544EI optical GigE; disk system: ATA-RAID on 3ware 7000 controller
Xeons: 2x Intel Xeon (512 kB cache, 3.06 GHz), 2 GB RAM, Intel 82544EI optical GigE; disk system: ATA-RAID on 3ware 7000 controller
Dual-Core Opterons: 2x2 AMD Opteron 270 (1 MB cache, 2.0 GHz), 2 GB RAM, Intel 82544EI optical GigE; disk system: SCSI-RAID on Compaq Smart Array 64xx and external RAID (easyRAID, SATA based) attached via SCSI

Two machines of each system type: one installed with Linux and the other with FreeBSD.

Topology

[Figure: measurement topology. Labeled elements: workload generator gen (interfaces eth0, eth1, eth2), SNMP interface counter queries, Cisco C3500XL switch, workload path, splitter, the four sniffers swan, moorhen, flamingo and snipe, and the control network.]

Procedure

Measurement categories:
• Capturing rate
• System load

Measurement sequence:
1 Log in to the four sniffers → start the capturing and profiling applications (save the process IDs).
2 Log in to gen → read the SNMP packet counters of the switch.
3 Log in to gen → start packet generation.
4 Log in to gen → read the SNMP packet counters of the switch.
5 Log in to the four sniffers → stop the applications (via the saved process IDs).

Measurement Specification
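
The two SNMP counter readings bracket the generation phase, so their difference gives the number of packets the switch forwarded. A minimal sketch of how a capturing rate could be derived from them follows. The function name and the 32-bit wrap handling are assumptions, not part of the talk.

```python
COUNTER_MODULUS = 2 ** 32  # assumption: the switch exposes 32-bit packet counters


def capturing_rate(counter_before, counter_after, captured_packets):
    """Percentage of forwarded packets that the sniffer actually captured."""
    # The modulo handles a single counter wrap between the two readings.
    sent = (counter_after - counter_before) % COUNTER_MODULUS
    if sent == 0:
        raise ValueError("no packets were counted between the two readings")
    return 100.0 * captured_packets / sent


# One million packets forwarded, 950,000 captured
print(capturing_rate(1_000, 1_001_000, 950_000))
```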

Profiling

Goal: record CPU usage while capturing
• based on the mechanisms used by top
• CPU accounting information (user, system, idle, interrupt, ...) written to a file twice per second
• additional identification of minimum/maximum/average
• "under load" condition and resulting averages identified by an awk script
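
The profiling idea can be sketched as follows: cumulative CPU tick counters are sampled periodically, consecutive samples are turned into percentages, and only the samples taken under load are averaged. This is an illustration and not the cpusage or trimusage.awk code. The function names and the idle threshold used to detect "under load" are assumptions.

```python
def cpu_percentages(prev, curr):
    """Turn two cumulative tick snapshots into per-state percentages."""
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total == 0:
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


def under_load_average(samples, idle_threshold=95.0):
    """Average CPU usage over the samples whose idle share is below the threshold.

    The threshold is a hypothetical criterion for the "under load" condition.
    """
    loaded = [100.0 - s["idle"] for s in samples if s["idle"] < idle_threshold]
    return sum(loaded) / len(loaded) if loaded else 0.0


prev = {"user": 100, "system": 50, "idle": 850, "interrupt": 0}
curr = {"user": 110, "system": 70, "idle": 910, "interrupt": 10}
print(cpu_percentages(prev, curr))
```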

Workload

Workload Generation

• Requirements:
  Speed: line speed (1 Gbit/s) is desired
  Reproducibility: of the load, and to avoid unrepeatable failures
  Realism: at least the packet sizes should match
• Checked different existing tools → none was sufficient
• Best candidate: the Linux Kernel Packet Generator, but it can only generate packets of a fixed size
⇒ Need to add generation of different packet sizes → identify distributions.
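
Generating packets whose sizes follow an observed distribution can be sketched in user space as weighted sampling from a size histogram. The patched kernel packet generator itself is not shown here. The function name and the histogram values below are hypothetical, and the fixed seed stands in for the reproducibility requirement.

```python
import random


def make_size_sampler(histogram, seed=0):
    """Return a function that draws packet sizes according to `histogram`.

    `histogram` maps packet size in bytes to an observed packet count.
    A fixed seed makes the generated sequence reproducible.
    """
    sizes = sorted(histogram)
    weights = [histogram[size] for size in sizes]
    rng = random.Random(seed)

    def sample(n):
        return rng.choices(sizes, weights=weights, k=n)

    return sample


# Hypothetical counts, for illustration only
example_histogram = {40: 50, 52: 20, 576: 5, 1500: 25}
sampler = make_size_sampler(example_histogram, seed=42)
print(sampler(10))
```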

Packet Size Distribution

Observed Packet Size Distribution

[Figure: number of packets per size in a 24h trace (logarithmic axis, 10^1 to 10^9) versus packet size (0 to 1500 bytes); the sizes 40, 52 and 1500 are marked. Inset: percentage and cumulated percentage of packets per size, sorted by percentage descending, with axis labels for the sizes 1500, 40, 1420, 52, 552, 1492, 48, 1300, 576, 64, 1400, 60, 1440, 1480, 1452, 44, 1454, 57, 1470 and 1460, followed by "rest".]

75 % of all packets are in the 13 most frequent sizes!

Only a few frequent sizes

Implementation
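
A statement such as "75 % of all packets are in the 13 most frequent sizes" is a cumulative share over a size histogram sorted by count. The sketch below shows that computation. The function name and the example counts are made up for illustration and do not reproduce the 24h trace.

```python
def coverage_of_top_sizes(histogram, k):
    """Percentage of all packets that fall into the k most frequent sizes."""
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("histogram contains no packets")
    top_counts = sorted(histogram.values(), reverse=True)[:k]
    return 100.0 * sum(top_counts) / total


# Hypothetical counts, for illustration only
example_histogram = {40: 50, 1500: 30, 52: 10, 576: 6, 1420: 4}
print(coverage_of_top_sizes(example_histogram, 2))
```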

Output

Output: Packet Size Distribution

[Figure: number of packets (logarithmic axis, 10^1 to 10^9) for the originally captured and the generated traffic; packets classified by size, sorted descending by quantity of packets.]

Output: Data and Packet Rate

[Figure: Generator: median generation rate (with min/max errors). Packet rate (kpps) and data rate (Mbit/s) for three workloads: min pktsize (40 bytes), distribution, max pktsize (1500 bytes).]

Results

Using multiple processors?

Only one processor

[Figure: (32) no-improvement: no SMP, no HT, 1 app, traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]

Multiprocessor (SMP)

[Figure: (31) no-improvement: SMP, no HT, 1 app, traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]

Increasing the buffer size

[Figure: increased buffers. (17) increased-buffers: SMP, no HT, 1 app, traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]

Additional filtering

[Figure: additional filtering (BPF/LSF). (21) filter: SMP, no HT, 1 app, traffic: generated, 50 BPF instr., no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]

Additional copy operations

[Figure: 50 additional copy ops. (27) memcpy-50: SMP, no HT, 1 app, traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]

mmapped pcap (Linux)

[Figure: mmap patch (Linux only). (19) mmapped pcap: SMP, no HT, 1 app, traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD mmap (swan), Linux/AMD old (swan), Linux/Intel mmap (snipe) and Linux/Intel old (snipe).]

Write to disk

[Figure: Dual Core: writing to disk. (2-8) write to disk: SMP, 1 app, traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for 32-bit FreeBSD/Opteron, 64-bit FreeBSD/Opteron, 32-bit Linux/Opteron and 64-bit Linux/Opteron.]

Further Results

• Running multiple capturing applications concurrently leads to bad performance.
• Measurements with additional compression show some advantage for Intel systems.
• Intel Hyper-Threading does not change the performance.
• FreeBSD 5.4 performs better than FreeBSD 5.2.1 (no comparable measurements for FreeBSD 6 at the moment).
• Using 4 processors (2x dual core) is only minimally better than using 2 processors (dual core).

Conclusion

Summary

• The FreeBSD/AMD Opteron combination generally performs best
• Choosing the right buffer size is important
• Filtering is cheap with respect to its benefit
• Using the memory-map patch from Phil Woods does help
• 64-bit systems drop more packets
• Capturing full traces to disk is feasible up to about 600 Mbit/s

Future Work

• 10 Gigabit Ethernet
• Future operating system versions / direct comparison of different versions on the same machine (e.g. FreeBSD 4.x, 5.x, 6.x)
• New Intel I/O Acceleration Technology
• Implement mmapped packet reception for FreeBSD
• (Windows platform)

Questions?

Thanks for the attention!

Resources

Software

Profiling
• cpusage: available at http://www.net.in.tum.de/~schneifa/sources/cpusage-0.2.tar.gz
• trimusage.awk script: http://www.net.in.tum.de/~schneifa/sources/trimusage.awk

Capturing
• createDist: available at http://www.net.in.tum.de/~schneifa/sources/createDist-0.1.tar.gz
• tcpdump: available at www.tcpdump.org

Workload
• patched LKPG: available at http://www.net.in.tum.de/~schneifa/pktgen-lkpg-dist-0.1.tar.gz

Further Reading

F. Schneider. Best Packet Capture System. http://www.net.t-labs.tu-berlin.de/research/bpcs/
F. Schneider. Performance evaluation of packet capturing systems for high-speed networks. Diploma thesis (Diplomarbeit), http://www.net.in.tum.de/~schneifa/papers/da.ps

Measurement Specification

• Seven similar measurements → to avoid errors
• One million packets per run
• 26 different inter-packet gaps per measurement → increasing data and packet rate
• Average of the different runs, with error bars for the min and max values
• No filter, in order to capture all packets
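
The aggregation described above (average over the repeated runs, error bars from the minimum and maximum) can be sketched as follows. The function name and the example values are hypothetical. The error-bar lengths are expressed relative to the mean, as plotting tools usually expect.

```python
from statistics import mean


def summarize_runs(rates):
    """Summarize the capturing rates of repeated runs at one data rate."""
    average = mean(rates)
    return {
        "mean": average,
        "min": min(rates),
        "max": max(rates),
        # error bar lengths below and above the mean
        "err_low": average - min(rates),
        "err_high": max(rates) - average,
    }


# Seven hypothetical capturing rates [%] for one inter-packet gap
print(summarize_runs([98.0, 99.0, 100.0, 97.0, 99.0, 100.0, 100.0]))
```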

Workload – Implementation

OS insights

FreeBSD

Packet Reception in FreeBSD

• interrupt context
• double buffer as interface to userspace
• one buffer pair per capturing session
• 3 packet copy operations
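
The double-buffer interface can be pictured with a simplified user-space model. It is only a sketch under assumptions: the class and method names are invented, the names "store" and "hold" follow common BPF terminology, and the real kernel code differs in many details. The model shows why packets are dropped when the reader has not yet emptied the second buffer.

```python
class DoubleBuffer:
    """Toy model of a store/hold buffer pair for one capturing session."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.store = []       # filled by the receiving side
        self.hold = []        # waiting to be read by the application
        self.store_bytes = 0
        self.dropped = 0

    def _rotate(self):
        # The full store buffer becomes the hold buffer; start a fresh store.
        self.hold, self.store = self.store, []
        self.store_bytes = 0

    def enqueue(self, packet):
        """Add a packet; return False if it had to be dropped."""
        if len(packet) > self.capacity:
            self.dropped += 1
            return False
        if self.store_bytes + len(packet) > self.capacity:
            if self.hold:
                # Both buffers are occupied: the packet is lost.
                self.dropped += 1
                return False
            self._rotate()
        self.store.append(packet)
        self.store_bytes += len(packet)
        return True

    def read(self):
        """Return all packets of the hold buffer (rotating first if it is empty)."""
        if not self.hold and self.store:
            self._rotate()
        packets, self.hold = self.hold, []
        return packets
```

A larger `capacity_bytes` lets the model absorb longer bursts before `dropped` starts to grow, which matches the intuition behind increasing the buffer size.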

Linux

Packet Reception in Linux

• soft interrupts used
• central memory block for all packets handled in the kernel
• pointer queue as interface to userspace
• 2 packet copy operations
