I was asked to provide a way of alerting from within Nagios that was more active and direct than the usual email or SMS messages. So I came up with a simple way to have a Nagios notification place a phone call to our off-hours tier 3 support line to report certain very rare but serious problems.
This was actually a two-part solution. We wanted to watch the Broadsoft audit log for certain config changes that were important enough to wake someone up in the middle of the night. So I wrote a daemonized script that tails the Broadsoft audit log, interprets it looking for these config changes, and reports them to a special RT3 queue. It also notifies Nagios (a push notification to a passive service check) over a socket connection. A listener script on the Nagios box (using Net::Server) validates the syntax of the alert and passes it on to Nagios, which in turn triggers the outgoing phone call(s) through Asterisk and manages their scheduling.
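To make the listener step a bit more concrete, here is a minimal sketch in Python (my scripts are Perl, so this is an illustration, not the real code). The pipe-separated wire format and the function names are made up for the example; the `PROCESS_SERVICE_CHECK_RESULT` line is the standard Nagios external command for submitting a passive service check.

```python
import time

# Hypothetical wire format for this sketch: "host|service|state|message".
VALID_STATES = {"0", "1", "2", "3"}  # OK, WARNING, CRITICAL, UNKNOWN


def parse_alert(line):
    """Validate one alert line and return (host, service, state, message)."""
    parts = line.strip().split("|", 3)
    if len(parts) != 4:
        raise ValueError("expected 4 pipe-separated fields")
    host, service, state, message = parts
    if not host or not service:
        raise ValueError("host and service must be non-empty")
    if ";" in host or ";" in service:
        # ';' is the field separator in Nagios external commands.
        raise ValueError("host and service must not contain ';'")
    if state not in VALID_STATES:
        raise ValueError("state must be 0, 1, 2 or 3")
    # A newline would terminate the external command early.
    message = message.replace("\n", " ")
    return host, service, int(state), message


def passive_check_command(host, service, state, message, now=None):
    """Format a passive service check result for the Nagios command file."""
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{state};{message}\n"
```

The resulting line would be written to the Nagios external command file (often `/usr/local/nagios/var/rw/nagios.cmd` on source installs, but the path depends on your setup).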
I used the Google TTS engine to record certain fixed statements that would be common across calls, converting the audio to the proper format for Asterisk (SLN16) with sox. For the specific alert text I'm using 'flite', a lightweight speech synthesizer derived from Festival that runs from the command line. The Asterisk part is done entirely in a Perl AGI script and allows the called person to repeat the alert or acknowledge it within Nagios. If they don't answer or don't ack the alert, Nagios will initiate another call in a few minutes, and I'm able to use service escalations to notify different people if it goes too long without a response.
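One way the call and the acknowledgement could be wired up is sketched below in Python. This assumes the call is started with an Asterisk call file dropped into the outgoing spool directory, which is only one option and not necessarily how my scripts do it. The channel string, AGI script name, and variable names are hypothetical. `ACKNOWLEDGE_SVC_PROBLEM` is the standard Nagios external command for acknowledging a service problem.

```python
import os
import tempfile
import time


def build_call_file(channel, agi_script, host, service, alert_text):
    """Return the text of an Asterisk call file that runs an AGI script."""
    lines = [
        f"Channel: {channel}",
        "MaxRetries: 0",  # no Asterisk retries; Nagios re-notifies instead
        "WaitTime: 30",   # seconds to wait for an answer
        "Application: AGI",
        f"Data: {agi_script}",
        # Channel variables the AGI script can read (names are hypothetical).
        f"Setvar: NAGIOS_HOST={host}",
        f"Setvar: NAGIOS_SERVICE={service}",
        "Setvar: ALERT_TEXT=" + alert_text.replace("\n", " "),
    ]
    return "\n".join(lines) + "\n"


def spool_call(content, staging_dir, spool_dir):
    """Write the call file in a staging dir, then move it into the spool."""
    fd, tmp_path = tempfile.mkstemp(suffix=".call", dir=staging_dir)
    with os.fdopen(fd, "w") as fh:
        fh.write(content)
    final_path = os.path.join(spool_dir, os.path.basename(tmp_path))
    # Atomic only when both directories are on the same filesystem.
    os.replace(tmp_path, final_path)
    return final_path


def ack_command(host, service, author, comment, now=None):
    """Format a service acknowledgement for the Nagios command file."""
    ts = int(time.time() if now is None else now)
    # Fields after the service: sticky=1, notify=1, persistent=0.
    return f"[{ts}] ACKNOWLEDGE_SVC_PROBLEM;{host};{service};1;1;0;{author};{comment}\n"
```

Moving the finished file into the spool (commonly `/var/spool/asterisk/outgoing`) rather than writing it there directly keeps Asterisk from picking up a half-written call file.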
This project has more moving pieces than I usually like to use, but it was interesting just how easy it was to get working. It's gotten me thinking about building a more full-featured voice front end to Nagios that I could release.
Download the scripts here.