I'm using GroundWork (a monitoring framework built upon Nagios) to monitor some network devices via SNMP, but I'm stuck on a problem with bandwidth usage.
Most routers, including the Cisco 2800 used here, can be queried via SNMP for network traffic information; however, they return it as a counter, meaning "how many bytes have gone in or out of a given interface since the router was turned on". To get something meaningful out of it, you need to, for example, query the counter every second and see how much each reading differs from the previous one.
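As a rough illustration of that calculation, the sketch below turns two counter readings into an average rate. The function name rate_bytes_per_sec and the sample numbers are made up for the example, and it ignores counter wrap-around.

```bash
#!/bin/bash
# Hypothetical helper: average bytes/second between two counter samples.
# Args: previous counter, current counter, seconds elapsed between samples.
rate_bytes_per_sec() {
    local prev=$1 cur=$2 seconds=$3
    echo $(( (cur - prev) / seconds ))
}

# Made-up readings taken 60 seconds apart; prints 48000 (i.e. 384000 bps).
rate_bytes_per_sec 1000000 3880000 60
```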
GroundWork/Nagios can manage this automatically for performance data, because they store that data in RRD, and RRD supports computing deltas between values.
But how can I generate an alert when bandwidth usage exceeds a certain limit? Nagios can send alerts only when a value is above a given threshold, not based on the difference between two distinct measures of the same value.
I need a way to check if bandwidth usage is above a certain threshold, and generate a Nagios alert (thus sending an e-mail) if this happens; I can't rely only on the admin looking at performance data to see if something is wrong with network bandwidth.
Answer
I did this with a cron script that stores the current counter value in a temp file and, on the next run, uses it to calculate bandwidth utilization since the last run.
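#!/bin/bash
# Run from cron: compares the interface byte counter with the value saved by the previous run.
email_address=""
router_ip=""
# 80% BANDWIDTH [ (384000bps) 48,000Bps ] - 20% = 38,400 Bps
# Threshold in bytes per second (set this to match your link; the example above works out to 38,400)
alertBW="76800"
lastBWFile="/var/log/ciscoGW.log"
# snmap_name stands in for the SNMP community string
curBW=$(snmpget -c snmap_name -v 1 "$router_ip" IF-MIB::ifOutOctets.2 | awk '{print $4}')
timeNow=$(date +%s)
# First run: nothing to compare against yet, so just record the sample
if [ ! -s "$lastBWFile" ]; then
    echo "$timeNow $curBW" > "$lastBWFile"
    exit 0
fi
lastTime=$(awk '{print $1}' "$lastBWFile")
lastBW=$(awk '{print $2}' "$lastBWFile")
diffBW=$(( curBW - lastBW ))
# ifOutOctets is a 32-bit counter; a negative difference means it wrapped
if [ "$diffBW" -lt 0 ]; then
    diffBW=$(( diffBW + 4294967296 ))
fi
#echo "Diff BW: $diffBW"
diffTime=$(( timeNow - lastTime ))
# Scale the per-second threshold to the bytes allowed over the elapsed interval
alertBW=$(( alertBW * diffTime ))
echo "$timeNow $curBW" > "$lastBWFile"
if [ "$diffBW" -gt "$alertBW" ]; then
    # echo "Over limit!"
    echo "Bandwidth used over $diffTime seconds: $diffBW" | mail -s "BANDWIDTH OVER LIMIT!!!!" "$email_address"
fi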
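If the alert should come from Nagios itself rather than from cron and mail, the comparison can report through the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). What follows is only a sketch: the function name check_rate, the two thresholds, and the sample numbers are my own assumptions, and the SNMP query and state file handling are left out.

```bash
#!/bin/bash
# Sketch only: check_rate and its thresholds are illustrative assumptions.
# Prints a status line and returns the matching Nagios plugin exit code.
# Args: measured rate, warning threshold, critical threshold (all bytes/s).
check_rate() {
    local rate=$1 warn=$2 crit=$3
    if [ "$rate" -ge "$crit" ]; then
        echo "CRITICAL - ${rate} B/s"
        return 2
    elif [ "$rate" -ge "$warn" ]; then
        echo "WARNING - ${rate} B/s"
        return 1
    fi
    echo "OK - ${rate} B/s"
    return 0
}

# Made-up sample: 2,400,000 bytes sent in 60 seconds is 40,000 B/s.
status=0
check_rate $(( 2400000 / 60 )) 38400 48000 || status=$?
# A real plugin would finish with: exit "$status"
```

With a check like this, Nagios handles the notification e-mail through its normal contact configuration, so the script does not need to call mail itself.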
Because I was more interested in actual peaks, I have since moved to using rrdtool:
# $FROM holds the path to the RRD file
# start 15 minutes ago
# end 5 minutes ago, since the RRD is updated every 5 minutes
rrdtool fetch "$FROM" MAX -s -900 -e -300
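To turn that output into a single peak figure, the rows can be filtered with awk. The sketch below assumes the usual rrdtool fetch layout (a header line of data source names, then "timestamp: value ..." rows, with nan for missing samples) and reads the first data source only. The function name peak_from_fetch and the sample rows are invented for the example.

```bash
#!/bin/bash
# Hypothetical helper: read `rrdtool fetch` output on stdin and print the
# largest value of the first data source, skipping rows without data.
peak_from_fetch() {
    awk '
        /^[0-9]+:/ && $2 !~ /nan/ {
            v = $2 + 0                      # awk parses the e-notation
            if (!seen || v > max) { max = v; seen = 1 }
        }
        END { if (seen) printf "%.0f\n", max }
    '
}

# Made-up sample in the shape rrdtool fetch prints; the peak here is 48000.
peak_from_fetch <<'EOF'
                          ds0

1300000200: 3.1200000000e+04
1300000500: 4.8000000000e+04
1300000800: nan
EOF
```

In practice the fetch command would be piped in directly, e.g. rrdtool fetch "$FROM" MAX -s -900 -e -300 | peak_from_fetch, and the result compared against the alert threshold.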