What is Nagios?
Nagios Server
- Runs the host and service checks that you define, holds the configuration definitions, sends notifications (email, third-party phone calls and SMS), and serves the web interface
NRPE – Nagios Remote Plugin Executor
- Local agent that lets the Nagios server run a check command on the host you’re checking and get the result back (disk usage, CPU load)
Let’s Build It!
Requirements:
sudo setenforce 0
sudo yum install epel-release
sudo yum install nagios nagios-plugins-all nagios-plugins-nrpe nrpe php httpd vim
** If you have iptables or firewalld running, you’ll want to open ports 80 and 5666. CentOS minimal does not come with these installed **
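If firewalld is what you're running, opening those two ports could look something like the sketch below. The RUN variable is a made-up convenience, not part of firewalld: set it to echo to preview the commands, or to sudo to actually apply them.

```shell
# Sketch: open the web UI port (80) and the NRPE port (5666) with firewalld.
# RUN is a hypothetical helper variable: RUN=echo previews, RUN=sudo applies.
open_nagios_ports() {
    for port in 80 5666; do
        $RUN firewall-cmd --permanent --add-port="${port}/tcp"
    done
    # Permanent rules only take effect after a reload.
    $RUN firewall-cmd --reload
}

# Preview what would be run without touching the firewall.
RUN=echo
open_nagios_ports
```

Once the preview looks right, set RUN=sudo and call open_nagios_ports again.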
sudo htpasswd /etc/nagios/passwd nagiosadmin
sudo chkconfig httpd on && sudo chkconfig nagios on
sudo service httpd start
sudo service nagios start
Now Let’s Look at configuration
sudo su
cd /etc/nagios
ls
cgi.cfg conf.d/ nagios.cfg objects/ passwd private/
cd objects/
ls
commands.cfg hosts.cfg printer.cfg switch.cfg timeperiods.cfg
contacts.cfg localhost.cfg services.cfg templates.cfg windows.cfg
Host Checks
Service Checks
- Current Load, Current Users, HTTP, PING, Root Partition, SSH, Swap Usage, Total Processes
Image via: https://www.rittmanmead.com/blog/2012/09/an-introduction-to-monitoring-obiee-with-nagios/
Let’s Define Contacts First
vim contacts.cfg
define contactgroup {
    contactgroup_name    ops
    alias                Ops Team
    members              nagiosadmin
}
email nagios@localhost ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
- name: The name that you call in other .cfg configuration files
- service_notification_period: The time that you want *service* alerts to fire. Can configure work-hours, 24x7 etc
- host_notification_period: The time you want *host* alerts to fire
- service_notification_options: Warning, Unknown, Critical, Recovery, Flapping, Scheduled downtime
- host_notification_options: Down, Unreachable, Recovery, Flapping, Scheduled downtime
- service_notification_commands: Define email alerts, third party integrations (VictorOps, PagerDuty, OpsGenie)
- host_notification_commands: Define alerts for host notifications
- register: Whether this is a full definition or a partial one (a template)
Let’s Define A Host To Alert On
define host {
    host_name    sofree
    alias        Software Freedom School
    address      sofree.us
    use          generic-host
    contacts     ops    ; The contact we just made
}
Now Define What The Parameters Of That Host Check Should Be
vim templates.cfg
define host {
    name                    sofree-host
    use                     generic-host        ; This grabs the notification period, notifications enabled, flap detection etc
    check_period            24x7                ; What hours this should check
    check_interval          5                   ; How often to check, in minutes
    retry_interval          1                   ; How often to retry when it fails
    max_check_attempts      10                  ; How many times to retry until it alerts. In this config, you will get an alert after roughly 10 minutes of the server being down
    check_command           check-host-alive    ; Another template for how to check for the host, currently a template for a simple ping. You may make a different host check for http host alive, etc.
    notification-options    d,u,r               ; When should notify happen - Down, Unreachable, Recovery
    contacts                ops                 ; Who to alert to, options are contacts or contact groups
    register                0                   ; Make this a template
}
service nagios restart
nagios -v /etc/nagios/nagios.cfg
Error: Invalid host object directive ' '.
Error: Could not add object property in file '/etc/nagios/objects/templates.cfg' on line 199.
Error processing object config files!
This is because the directive should be notification_options, with an underscore, not a dash.
notification_options d,u,r ; When should notify happen - Down, Unreachable, Recovery
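To see where the 10-minute figure in the template comment comes from, here is a back-of-the-envelope sketch. It assumes the first failed check counts as attempt 1 and the remaining attempts run at retry_interval. The function name is made up for illustration, and the result leaves out the wait (up to check_interval) before the first failure is noticed.

```shell
# minutes_to_hard_state RETRY_INTERVAL MAX_CHECK_ATTEMPTS
# Hypothetical helper: minutes between the first failed check and the
# final attempt that triggers the alert.
minutes_to_hard_state() {
    retry_interval=$1
    max_check_attempts=$2
    # Attempt 1 is the first failure, so only (attempts - 1) retries remain.
    echo $(( (max_check_attempts - 1) * retry_interval ))
}

# Values from the sofree-host template: retry_interval 1, max_check_attempts 10.
minutes_to_hard_state 1 10   # prints 9
```

Lowering max_check_attempts or retry_interval shortens the time to alert, at the cost of more noise from short blips.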
Tell the Main Config to Include Your New Config files
vim /etc/nagios/nagios.cfg
cfg_file=/etc/nagios/objects/hosts.cfg
service nagios restart
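Since a typo like the dash above keeps Nagios from starting, it can help to validate the config before every restart. A minimal sketch, assuming the paths used so far. NAGIOS_BIN and RESTART_CMD are hypothetical override variables added here only so the function can be tried without a real Nagios install.

```shell
# safe_restart [CONFIG_FILE]
# Restart Nagios only if "nagios -v" accepts the configuration.
# NAGIOS_BIN and RESTART_CMD are hypothetical overrides, not Nagios settings.
safe_restart() {
    cfg=${1:-/etc/nagios/nagios.cfg}
    if "${NAGIOS_BIN:-nagios}" -v "$cfg" >/dev/null; then
        # Left unquoted on purpose so the command splits into words.
        ${RESTART_CMD:-service nagios restart}
    else
        echo "Config check failed, not restarting. Run: nagios -v $cfg" >&2
        return 1
    fi
}
```

Run safe_restart as root in place of a bare service nagios restart. When it refuses, nagios -v prints the file and line to fix.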
Install NRPE on a separate host
Disable SELinux
setenforce 0
yum install epel-release wget gcc openssl-devel
cd /tmp
wget http://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
tar -xzf nagios-plugins-2.2.1.tar.gz
cd nagios-plugins-2.2.1
./configure
make
make install
yum install xinetd
cd ..
wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz
tar -xzf nrpe-3.2.1.tar.gz
cd nrpe-nrpe-3.2.1
./configure
make all
make install-groups-users
chown -R nagios.nagios /usr/local/nagios
make install
make install-config
make install-init
service xinetd restart
chkconfig nrpe on && service nrpe start
vim /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=<ip addr of server>
Verify NRPE is running
/usr/local/nagios/libexec/check_nrpe -H localhost
vim /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
sysctl -p
service nrpe restart
Let’s look through the different plugins
ls /usr/local/nagios/libexec
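Every plugin in that directory reports back the same way: one line of output plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Here is a small sketch for trying a plugin by hand. state_name is a made-up helper, and the thresholds in the commented example are arbitrary.

```shell
# state_name EXIT_CODE
# Hypothetical helper: translate a plugin exit code into its Nagios state.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "unexpected exit code: $1" ;;
    esac
}

# Example on the NRPE host (arbitrary thresholds):
#   /usr/local/nagios/libexec/check_load -w 5,4,3 -c 10,8,6; state_name $?
state_name 2   # prints CRITICAL
```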
Install NRPE on Nagios Server
wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz
tar -xzf nrpe-3.2.1.tar.gz
cd nrpe-nrpe-3.2.1
./configure
make check_nrpe
make install-plugin
/usr/local/nagios/libexec/check_nrpe -H <ip addr of host> -4 -c check_load
vim /etc/nagios/nagios.cfg
cfg_file=/etc/nagios/objects/services.cfg
vim services.cfg
define service {
    use                    sofree-service
    host_name              nrpe_test
    service_description    check_load
    check_command          check_nrpe!check_load
}
define service {
    use                    sofree-service
    host_name              nrpe_test
    service_description    check_xvda1
    check_command          check_nrpe!check_hda1
}
NRPE commands need to be defined in 3 places
1) On server –> services.cfg, or other .cfg file
check_nrpe!check_load
check_nrpe!check_hda1
2) On server –> commands.cfg
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -u -H $HOSTADDRESS$ -c $ARG1$
}
3) On host –> /etc/nagios/nrpe.cfg (package install) or /usr/local/nagios/etc/nrpe.cfg (source install)
On the host machine, match up the command with the argument you’re passing
command[check_load]=/usr/local/nagios/libexec/check_load -r -w .15,.10,.05 -c .30,.25,.20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/xvda1
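With three places to keep in sync, a name used on the server but never defined on the host is an easy mistake. Here is a rough sketch that lists such names, assuming you have copies of both files on one machine. The function name is hypothetical, and the match is a plain text search, so a commented-out command[...] line would still count as defined.

```shell
# missing_nrpe_commands SERVICES_CFG NRPE_CFG
# Hypothetical helper: print every NAME referenced as check_nrpe!NAME in
# SERVICES_CFG that has no command[NAME]= entry in NRPE_CFG.
missing_nrpe_commands() {
    services_cfg=$1
    nrpe_cfg=$2
    # Pull NAME out of each "check_nrpe!NAME" reference, de-duplicated.
    grep -o 'check_nrpe![A-Za-z0-9_-]*' "$services_cfg" | cut -d'!' -f2 | sort -u |
    while read -r name; do
        # -F treats the brackets literally instead of as a regex class.
        grep -qF "command[$name]=" "$nrpe_cfg" || echo "$name"
    done
}
```

For example, missing_nrpe_commands services.cfg nrpe.cfg prints nothing when every check_nrpe argument has a matching command on the host.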
Now you go and define a couple services on your host
Logs are located at /var/log/nagios/nagios.log
Additional material:
• nagios_email_ack
• Nagdash
• Converting epoch time:
cat /usr/local/nagios/var/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
• Nagios dynamic
• Custom commands (plugins, external scripts etc). API calls are a great use of external scripts.
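As an illustration of that last bullet, a custom plugin only has to print one line and exit with the right code. Here is a minimal sketch. check_value and its arguments are hypothetical, it handles whole numbers only, and a real plugin would measure the value itself (for instance by calling an API).

```shell
# check_value VALUE WARN CRIT
# Hypothetical plugin core: compare an integer against two thresholds and
# report in the usual "STATE - text | perfdata" form.
check_value() {
    value=$1
    warn=$2
    crit=$3
    perf="value=$value;$warn;$crit"
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value | $perf"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value | $perf"
        return 1
    fi
    echo "OK - value is $value | $perf"
    return 0
}

check_value 5 10 20   # prints: OK - value is 5 | value=5;10;20
```

Saved as a script that ends with exit $?, this could be wired in through a command[...] line in nrpe.cfg like the built-in plugins.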