Situation:
Nagios is a widely used alerting system
Complication:
Sometimes you’re out to dinner and get an alert that you can’t act on until you finish dessert
Question:
Can you ack the alert without having to patch in and ack it through the Nagios UI?
Answer:
Yes! You can ack alerts with a simple email reply containing the word “ACK”
Avleen Vig wrote a great Python script that polls the Nagios inbox, parses the alert info, and acknowledges the problem if ACK is in the message
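To make the "is ACK in the message" check concrete, here is a minimal sketch using Python's standard email module. It is not taken from Avleen's script: the function name is_ack_reply is hypothetical, and it deliberately uses a stricter rule (the first non-empty line of the plain-text body must start with "ack", in any case) so that quoted alert text further down the reply cannot trigger an acknowledgement.

```python
import email


def is_ack_reply(raw_message):
    """Return True if the first non-empty plain-text line starts with 'ack'.

    Hypothetical helper for illustration; the real script may match differently.
    """
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        # Skip multipart containers and non-text parts (HTML, attachments).
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        body = payload.decode(charset, errors="replace")
        for line in body.splitlines():
            if line.strip():
                # Only the first non-empty line decides the outcome.
                return line.strip().lower().startswith("ack")
    return False
```

Calling is_ack_reply on the raw text of a reply whose body begins with ACK returns True; a reply that begins with any other text returns False.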
- Install Nagios
- The base Nagios install does not include a home directory and login for the nagios user, so create the home directory manually
mkdir /home/nagios
- Create an IMAP inbox for Nagios to use (for both sending and receiving). This can be done through Gmail or any other IMAP server you have access to
- Copy Avleen’s script from GitHub to /home/nagios/
- chmod the script to be executable
chmod 760 nagios_email_handler.py
- Edit nagios_email_handler.py so that CMD_FILE matches the Nagios command file in your environment
CMD_FILE = '/usr/local/nagios/var/rw/nagios.cmd'
OR
CMD_FILE = '/etc/nagios/var/rw/nagios.cmd'
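If you are not sure which path applies, a quick check like the one below can help. This is a sketch, not part of Avleen's script: check_cmd_file is a hypothetical helper, and it relies on the fact that Nagios creates its command file as a named pipe when it starts, so the path should exist, be a FIFO, and be writable by the user that will run the handler.

```python
import os
import stat


def check_cmd_file(path):
    """Return a list of problems found with a Nagios command file path.

    Hypothetical helper for illustration; an empty list means the path looks usable.
    """
    if not os.path.exists(path):
        return ["%s does not exist (is Nagios running?)" % path]
    problems = []
    # Nagios creates the command file as a named pipe (FIFO).
    if not stat.S_ISFIFO(os.stat(path).st_mode):
        problems.append("%s is not a named pipe" % path)
    # The handler has to be able to write commands into it.
    if not os.access(path, os.W_OK):
        problems.append("%s is not writable by this user" % path)
    return problems
```

Run it as the nagios user against each candidate path and use the one that comes back with no problems.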
- Put your IMAP information into the script
# IMAP server, username and password
IMAP_SERVER = 'imap.example.com'
IMAP_USER = 'imapuser@example.com'
IMAP_PASS = "imap_password"
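Before wiring up cron, it can save time to confirm the credentials actually work. The sketch below uses Python's standard imaplib and assumes your server accepts IMAP over SSL on the default port, which is true for Gmail but may not be for yours. The function name check_imap_login and the factory parameter are hypothetical, added only so the check is easy to try out.

```python
import imaplib


def check_imap_login(server, user, password, factory=imaplib.IMAP4_SSL):
    """Log in, open INBOX, and report whether the server answered OK.

    Hypothetical helper; assumes IMAP over SSL unless another factory is passed.
    Raises imaplib.IMAP4.error if the login is rejected.
    """
    conn = factory(server)
    try:
        conn.login(user, password)
        # select() returns a (status, data) pair; status is 'OK' on success.
        status, _data = conn.select("INBOX")
        return status == "OK"
    finally:
        conn.logout()
```

Call it with the same values you put in IMAP_SERVER, IMAP_USER and IMAP_PASS. A True result means the handler should be able to read the inbox.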
- If your host names in Nagios are longer than ~15 characters, Gmail (and potentially other mail providers) will automatically wrap the Subject onto a new line, even though the Subject is a single line. Get around this by making the script handle newlines: add \n after ACK at the end of each command string
if alert_class == 'Host':
    msg = '[%s] ACKNOWLEDGE_HOST_PROBLEM;%s;1;1;1;%s;ACK\n' % \
        (now, server, fromaddr)
elif alert_class == 'Service':
    msg = '[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;1;1;1;%s;ACK\n' % \
        (now, server, service, fromaddr)
open(CMD_FILE, 'w').write(msg)
LOGGER.info('ACKed alert: From: %s, Host: %s, Service: %s\n' % \
    (fromaddr, server, service))
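The same idea can be pulled out into two small functions, which makes it easier to see what ends up in the command file. This is a sketch under assumptions, not the script's own code: unfold and build_ack_command are hypothetical names, and now is assumed to be a Unix timestamp in seconds. The command formats themselves are copied from the excerpt above.

```python
import time


def unfold(text):
    """Collapse line breaks and runs of whitespace into single spaces.

    Hypothetical helper for a Subject that a mail provider wrapped onto two lines.
    """
    return " ".join(text.split())


def build_ack_command(alert_class, server, service, fromaddr, now=None):
    """Build the newline-terminated external command for a Host or Service ack."""
    if now is None:
        now = int(time.time())  # assumption: epoch seconds
    if alert_class == "Host":
        return "[%s] ACKNOWLEDGE_HOST_PROBLEM;%s;1;1;1;%s;ACK\n" % (
            now, server, fromaddr)
    if alert_class == "Service":
        return "[%s] ACKNOWLEDGE_SVC_PROBLEM;%s;%s;1;1;1;%s;ACK\n" % (
            now, server, service, fromaddr)
    raise ValueError("unknown alert class: %r" % (alert_class,))
```

Running unfold over the Subject before parsing it means a wrapped host name is seen as one line again, and the trailing \n in each format string marks the end of the command.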
- Cron the script to run every minute to search for new acknowledgements
crontab -e
SHELL=/bin/bash
* * * * * /usr/bin/python $HOME/nagios_email_handler.py >> /var/log/nagios/email_ack.log 2>&1
- Test by purposely triggering a Nagios alert, then reply to the alert email with just the word “ACK”. Look in /var/log/nagios/email_ack.log and make sure the information is being parsed correctly. You should see something like this:
Service, user@example.com, hostname, disk_usage, ack
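If you want to check that line programmatically rather than by eye, a small parser like this one can split it into fields. It is only a sketch: it assumes the log line is exactly five comma-separated values in the order shown above, and both the function name parse_ack_log_line and the field names are hypothetical labels, not something the script defines.

```python
def parse_ack_log_line(line):
    """Split a five-field, comma-separated log line into a dict.

    Hypothetical helper; returns None if the line does not have five fields.
    """
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 5:
        return None
    # Field names are illustrative labels for the sample line in this post.
    keys = ("alert_class", "from_address", "host", "service", "action")
    return dict(zip(keys, fields))
```

A None result, or a host or service value that looks cut off, is a hint that the Subject was wrapped and not parsed as a single line.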