Calculate bandwidth usage by IP address using scapy, iftop-style

I use scapy to sniff a mirror port and generate a list of the top 10 "talkers", i.e. the hosts using the most bandwidth on my network. I know about existing tools such as iftop and ntop, but I need more control over the output.

The following script samples traffic for 30 seconds and then prints a list of the top 10 talkers in the format "source host → destination host: bytes". This works well, but how can I calculate the average bytes per second?

I realized that changing sample_interval to 1 second does not give me a good sample of the traffic, so it seems I need to average it instead. So I tried this at the end of the script:

    bytes per second = (total bytes / sample_interval)

but the resulting bytes/s figure seems far too low. For example, I ran an rsync between two hosts throttled to 1.5 MB/s, but with the averaging above my script kept reporting the rate between these hosts as around 200 KB/s, much lower than the 1.5 MB/s I expected. I can confirm with iftop that 1.5 MB/s is in fact the rate between the two hosts.

Am I summing the packet lengths incorrectly with scapy (see the traffic_monitor_callbak function)? Or is this a bad approach overall :)?

    from scapy.all import *
    from collections import defaultdict
    import socket
    from pprint import pprint
    from operator import itemgetter

    sample_interval = 30  # how long to capture traffic, in seconds

    # initialize traffic dict
    traffic = defaultdict(list)

    # return human readable units given bytes
    def human(num):
        for x in ['bytes','KB','MB','GB','TB']:
            if num < 1024.0:
                return "%3.1f %s" % (num, x)
            num /= 1024.0

    # callback function to process each packet
    # get total packets for each source->destination combo
    def traffic_monitor_callbak(pkt):
        if IP in pkt:
            src = pkt.sprintf("%IP.src%")
            dst = pkt.sprintf("%IP.dst%")
            size = pkt.sprintf("%IP.len%")

            # initialize
            if (src, dst) not in traffic:
                traffic[(src, dst)] = 0
            else:
                traffic[(src, dst)] += int(size)

    sniff(iface="eth1", prn=traffic_monitor_callbak, store=0, timeout=sample_interval)

    # sort by total bytes, descending
    traffic_sorted = sorted(traffic.iteritems(), key=itemgetter(1), reverse=True)

    # print top 10 talkers
    for x in range(0, 10):
        src = traffic_sorted[x][0][0]
        dst = traffic_sorted[x][0][1]
        host_total = traffic_sorted[x][3]

        # get hostname from IP
        try:
            src_hostname = socket.gethostbyaddr(src)
        except:
            src_hostname = src

        try:
            dst_hostname = socket.gethostbyaddr(dst)
        except:
            dst_hostname = dst

        print "%s: %s (%s) -> %s (%s)" % (human(host_total), src_hostname[0], src, dst_hostname[0], dst)

I'm not sure whether this is a programming issue (scapy / python) or a more general networking issue, so I'm calling it a network programming question.

2 answers

Hello,

First of all, you have an error in the code that you posted: instead of host_total = traffic_sorted[x][3] you probably mean host_total = traffic_sorted[x][1] .

Second, you have another mistake: you forgot to divide host_total by the value of sample_interval.
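To make those two fixes concrete, here is a minimal sketch of the rate calculation on its own. The byte totals and addresses are made up for illustration and stand in for the dict filled by the sniff callback; the helper name bytes_per_second is hypothetical; and the snippet is written so it also runs on Python 3, unlike the Python 2 scripts in this thread.

```python
from operator import itemgetter

sample_interval = 30  # seconds of capture, as in the question

# Hypothetical byte totals per (src, dst) pair, standing in for the
# `traffic` dict that the sniff callback fills in.
traffic = {
    ("10.0.0.1", "10.0.0.2"): 45000000,
    ("10.0.0.3", "10.0.0.1"): 1500000,
}

def bytes_per_second(total_bytes, interval):
    """Average rate over the capture window (hypothetical helper)."""
    return float(total_bytes) / interval

# Each sorted item is ((src, dst), total): index 0 is the key and
# index 1 is the byte total, so there is no index 3.
traffic_sorted = sorted(traffic.items(), key=itemgetter(1), reverse=True)

rates = [(pair, bytes_per_second(total, sample_interval))
         for pair, total in traffic_sorted]

for (src, dst), rate in rates:
    print("%s -> %s: %.1f bytes/s" % (src, dst, rate))
```

The float() conversion matters on Python 2, where dividing two integers would truncate the result.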

Since you also want to count receiver-to-sender traffic together with sender-to-receiver traffic, I think the best way is to use an "ordered" tuple as the key for a Counter object (the order itself does not really matter here; lexicographic order would be fine, but you can also use arithmetic order, since IP addresses are 4-octet integers). This seems to work just fine:

    #! /usr/bin/env python

    sample_interval = 10
    interface="eth1"

    import socket
    from scapy.all import *
    from collections import Counter

    # Counter is a *much* better option for what you're doing here. See
    # http://docs.python.org/2/library/collections.html#collections.Counter
    traffic = Counter()
    # You should probably use a cache for your IP resolutions
    hosts = {}

    def human(num):
        for x in ['', 'k', 'M', 'G', 'T']:
            if num < 1024.: return "%3.1f %sB" % (num, x)
            num /= 1024.
        # just in case!
        return "%3.1f PB" % (num)

    def traffic_monitor_callback(pkt):
        if IP in pkt:
            pkt = pkt[IP]
            # You don't want to use sprintf here, particularly as you're
            # converting .len after that!
            # Here is the first place where you're happy to use a Counter!
            # We use a tuple(sorted()) because a tuple is hashable (so it
            # can be used as a key in a Counter) and we want to sort the
            # addresses to count sender-to-receiver traffic together
            # with receiver-to-sender
            traffic.update({tuple(sorted(map(atol, (pkt.src, pkt.dst)))): pkt.len})

    sniff(iface=interface, prn=traffic_monitor_callback, store=False,
          timeout=sample_interval)

    # ... and now comes the second place where you're happy to use a
    # Counter!
    # Plus you can use value unpacking in your for statement.
    for (h1, h2), total in traffic.most_common(10):
        # Let's factor out some code here
        h1, h2 = map(ltoa, (h1, h2))
        for host in (h1, h2):
            if host not in hosts:
                try:
                    rhost = socket.gethostbyaddr(host)
                    hosts[host] = rhost[0]
                except:
                    hosts[host] = None
        # Get a nice output
        h1 = "%s (%s)" % (hosts[h1], h1) if hosts[h1] is not None else h1
        h2 = "%s (%s)" % (hosts[h2], h2) if hosts[h2] is not None else h2
        print "%s/s: %s - %s" % (human(float(total)/sample_interval), h1, h2)
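
If you want to try the direction-independent key without capturing anything, the following is a small sketch of the same idea. It assumes nothing from Scapy: ip_to_int and int_to_ip are stdlib stand-ins for atol and ltoa, pair_key is a hypothetical helper name, and the packet tuples are invented sample data. It is written so it also runs on Python 3.

```python
import socket
import struct
from collections import Counter

def ip_to_int(addr):
    """Dotted-quad IPv4 string -> integer (stand-in for Scapy's atol)."""
    return struct.unpack("!I", socket.inet_aton(addr))[0]

def int_to_ip(num):
    """Integer -> dotted-quad IPv4 string (stand-in for Scapy's ltoa)."""
    return socket.inet_ntoa(struct.pack("!I", num))

def pair_key(src, dst):
    """Same key for A->B and B->A: both addresses in numeric order."""
    return tuple(sorted((ip_to_int(src), ip_to_int(dst))))

# Hypothetical (src, dst, IP length) samples instead of sniffed packets.
packets = [
    ("192.168.1.10", "192.168.1.2", 1500),
    ("192.168.1.2", "192.168.1.10", 52),
    ("192.168.1.10", "10.0.0.1", 400),
]

traffic = Counter()
for src, dst, length in packets:
    # Counter.update() with a mapping adds to the existing count.
    traffic.update({pair_key(src, dst): length})

top = [(int_to_ip(a), int_to_ip(b), total)
       for (a, b), total in traffic.most_common(10)]
for h1, h2, total in top:
    print("%s - %s: %d bytes" % (h1, h2, total))
```

The first two sample packets travel in opposite directions between the same hosts and end up under a single key, which is the behaviour the sorted tuple is meant to give.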

Scapy may not be fast enough to do the job. To be sure, you can capture your traffic to a file for sample_interval seconds with, e.g., tcpdump -w, and then run the script below (by the way, note how the function is applied to the packets; I think that is good to know if you use Scapy often):

    #! /usr/bin/env python

    sample_interval = 10
    filename="capture.cap"

    import socket
    from scapy.all import *
    from collections import Counter

    traffic = Counter()
    hosts = {}

    def human(num):
        for x in ['', 'k', 'M', 'G', 'T']:
            if num < 1024.: return "%3.1f %sB" % (num, x)
            num /= 1024.
        return "%3.1f PB" % (num)

    def traffic_monitor_callback(pkt):
        if IP in pkt:
            pkt = pkt[IP]
            traffic.update({tuple(sorted(map(atol, (pkt.src, pkt.dst)))): pkt.len})

    # A trick I like: don't use rdpcap() that would waste your memory;
    # iterate over a PcapReader object instead.
    for p in PcapReader(filename):
        traffic_monitor_callback(p)

    for (h1, h2), total in traffic.most_common(10):
        h1, h2 = map(ltoa, (h1, h2))
        for host in (h1, h2):
            if host not in hosts:
                try:
                    rhost = socket.gethostbyaddr(host)
                    hosts[host] = rhost[0]
                except:
                    hosts[host] = None
        h1 = "%s (%s)" % (hosts[h1], h1) if hosts[h1] is not None else h1
        h2 = "%s (%s)" % (hosts[h2], h2) if hosts[h2] is not None else h2
        print "%s/s: %s - %s" % (human(float(total)/sample_interval), h1, h2)

This may not be the case, but could you be mixing up megabits per second (Mb/s) and megabytes per second (MB/s)? You seem to be measuring the amount of data sent in bytes and then converting it to MB/s, but I wonder whether the rate limit you set in rsync was actually specified in bits, i.e. 1.5 Mb/s. If so, the 200 kB/s your script reports is at least in the right ballpark for 1.5 Mb/s.
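
As a quick check of that arithmetic, here is a small sketch of the conversion. It assumes decimal units (1 Mb = 1,000,000 bits, 1 kB = 1,000 bytes), and the function name is made up for this example.

```python
def megabits_to_kilobytes_per_second(mbps):
    """Convert Mb/s to kB/s using decimal units (hypothetical helper)."""
    bits_per_second = mbps * 1000000
    bytes_per_second = bits_per_second / 8.0  # 8 bits per byte
    return bytes_per_second / 1000.0

limit_kB_s = megabits_to_kilobytes_per_second(1.5)
print("1.5 Mb/s = %.1f kB/s" % limit_kB_s)
```

Under those assumptions 1.5 Mb/s comes out at 187.5 kB/s, which is close to the roughly 200 KB/s the script reported.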
