Nagios installation and configuration
Let us discuss the overview, installation and configuration of Nagios, a powerful open source monitoring solution for hosts and services.
I. Overview of nagios
II. 8 steps for installing nagios on Linux:
- Download the nagios and plugins
- Take care of the prerequisites
- Create user and group for nagios
- Install nagios
- Configure the web interface
- Compile and install nagios plugins
- Start Nagios
- Login to web interface
III. Configuration files overview
I. Overview of Nagios
Nagios is a host and service monitoring tool. Following are some of its features.
- Monitor equipment such as servers, switches, routers, firewalls, power supplies etc.
- Monitor services such as disk space, CPU usage, memory usage, temperature of the equipment, HTTP, Mail, SSH etc.
- Nagios can monitor pretty much anything, e.g. hosts, services, databases, applications etc.
- Nagios has an extensible plugin interface for monitoring user-defined services. There are a lot of plugins available for Nagios. Visit NagiosPlugins and NagiosExchange to review the available user-developed plugins.
- It can send out various notifications (email, pager etc.) when a problem occurs and when it gets resolved.
- Web interface to view current status, notifications, problem history, log files etc.
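The plugin interface mentioned in the list above is simple: a plugin is any executable that prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Here is a minimal sketch of that contract in shell. The function name check_threshold and its three arguments are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch of the Nagios plugin contract: print one status line and
# return 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).
# check_threshold is a hypothetical helper, not part of Nagios.
check_threshold() {
    value=${1-} warn=${2-} crit=${3-}
    # Missing or non-numeric arguments are reported as UNKNOWN.
    for arg in "$value" "$warn" "$crit"; do
        case "$arg" in
            ''|*[!0-9]*)
                echo "UNKNOWN - usage: check_threshold VALUE WARN CRIT"
                return 3
                ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return 1
    fi
    echo "OK - value is $value"
    return 0
}
```

A real plugin script would end with something like check_threshold "$@"; exit $? so that the return code reaches Nagios as the process exit status.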
Fig: Nagios Web UI (partial screenshot of the Nagios web dashboard)
II. 8 steps for installing nagios on Linux:
1. Download the nagios and plugins
Download the following files from Nagios.org and move them to /home/downloads:
- nagios-3.0.1.tar.gz
- nagios-plugins-1.4.11.tar.gz
2. Take care of the prerequisites
- Make sure apache is working on the server by verifying from browser: http://localhost
- Verify whether gcc is installed
[root@localhost]# rpm -qa | grep gcc
gcc-3.4.6-8
compat-gcc-32-3.2.3-47.3
libgcc-3.4.6-8
compat-libgcc-296-2.96-132.7.2
compat-gcc-32-c++-3.2.3-47.3
gcc-c++-3.4.6-8
- Verify whether GD is installed
[root@localhost]# rpm -qa gd
gd-2.0.28-5.4E
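The prerequisite checks above query the RPM database. As a complementary sketch, the build tools can also be checked by looking for their commands on the PATH. The helper name missing_tools is hypothetical, and the tool names in the example are assumptions about what the build needs. GD is a library rather than a command, so this approach does not cover it.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on the PATH.
# missing_tools is a hypothetical helper for illustration.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
# Example (tool names are assumptions): missing_tools gcc make httpd
```

An empty result means every listed command was found.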
3. Create user and group for nagios
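[root@localhost]# useradd nagios
[root@localhost]# passwd nagios
[root@localhost]# groupadd nagcmd
[root@localhost]# usermod -a -G nagcmd nagios
[root@localhost]# usermod -a -G nagcmd apache
Note: The -a option appends nagcmd to the existing supplementary groups. Without it, usermod -G replaces the group list of the user.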
4. Install nagios
[root@localhost]# tar xvf nagios-3.0.1.tar.gz
[root@localhost]# cd nagios-3.0.1
[root@localhost]# ./configure --with-command-group=nagcmd
[root@localhost]# make all
[root@localhost]# make install
[root@localhost]# make install-config
[root@localhost]# make install-commandmode
Following are some additional parameters that you can pass to ./configure to customize your installation. I used only --with-command-group as shown above.
--prefix=/opt/nagios             Where to put the Nagios files
--with-cgiurl=/nagios/cgi-bin    Web server URL where the CGIs will be available
--with-htmurl=/nagios            Web server URL where Nagios will be available
--with-nagios-user=nagios        User account under which Nagios will run
--with-nagios-group=nagios       Group account under which Nagios will run
--with-command-group=nagcmd      Group account which will allow the apache user to submit commands to Nagios
At the end of the configure output, it will display a summary as shown below:
*** Configuration summary for nagios 3.0.1 05-28-2008 ***:

General Options:
-------------------------
Nagios executable: nagios
Nagios user/group: nagios,nagios
Command user/group: nagios,nagcmd
Embedded Perl: no
Event Broker: yes
Install ${prefix}: /usr/local/nagios
Lock file: ${prefix}/var/nagios.lock
Check result directory: ${prefix}/var/spool/checkresults
Init directory: /etc/rc.d/init.d
Apache conf.d directory: /etc/httpd/conf.d
Mail program: /bin/mail
Host OS: linux-gnu

Web Interface Options:
------------------------
HTML URL: http://localhost/nagios/
CGI URL: http://localhost/nagios/cgi-bin/
Traceroute (used by WAP): /bin/traceroute
5. Configure the web interface.
[root@localhost]# make install-webconf
[root@localhost]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
6. Compile and install nagios plugins
[root@localhost]# tar xvf nagios-plugins-1.4.11.tar.gz
[root@localhost]# cd nagios-plugins-1.4.11
[root@localhost]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@localhost]# make
[root@localhost]# make install
Note: On Red Hat, the ./configure command mentioned above did not work; it hung while displaying the message: checking for redhat spopen problem… Add --enable-redhat-pthread-workaround to the ./configure command as a workaround, as shown below.
[root@localhost]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-redhat-pthread-workaround
7. Start Nagios
- Add Nagios to the startup routine:
[root@localhost]# chkconfig --add nagios
[root@localhost]# chkconfig nagios on
- Verify that there are no errors in the Nagios configuration file:
[root@localhost]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
- Start Nagios:
[root@localhost]# service nagios start
Starting nagios: done.
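Because a broken configuration can keep Nagios from starting, it can be useful to start or restart the service only when the pre-flight check is clean. The following is a minimal sketch, assuming the summary line format shown in this article. The helper name preflight_ok is hypothetical.

```shell
#!/bin/sh
# Read the output of "nagios -v" on stdin and succeed only when it
# reports "Total Errors: 0". preflight_ok is a hypothetical helper.
preflight_ok() {
    grep -Eq '^Total Errors:[[:space:]]+0[[:space:]]*$'
}
# Example, using the paths from this article:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok \
#       && service nagios start
```

The sketch only looks at errors. Warnings are ignored here, which may or may not suit your environment.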
8. Login to web interface
Nagios Web URL: http://localhost/nagios/
Use the user ID and password that were created in step 5 above.
III. Configuration files overview
The first configuration change to make is to replace the default email address in the /usr/local/nagios/etc/objects/contacts.cfg file with your own email address.
Following are the three major configuration files located under /usr/local/nagios/etc:
- nagios.cfg – This is the primary Nagios configuration file, where a lot of global parameters that control Nagios can be defined.
- cgi.cfg – This file has configuration information for the Nagios web interface.
- resource.cfg – If you have to pass some sensitive information (username, password etc.) to a plugin to monitor a specific service, you can define it here. This file is readable only by the nagios user and group.
Following are the other configuration files under the /usr/local/nagios/etc/objects directory:
- contacts.cfg – All the contacts who need to be notified should be defined here. You can specify the name, email address, what type of notifications they need to receive, and the time period during which this particular contact should receive notifications.
- commands.cfg – All the commands to check services are defined here. You can use the $HOSTNAME$ and $HOSTADDRESS$ macros in the command line; Nagios substitutes the corresponding hostname or host IP address automatically.
- timeperiods.cfg – Defines the time periods. For example, if you want a service to be monitored only during business hours, define a time period called businesshours and specify the hours that you would like to monitor.
- templates.cfg – Multiple host or service definitions that have similar characteristics can use a template, where all the common characteristics are defined. Using templates is a time saver.
- localhost.cfg – Defines the monitoring for the local host. This is a sample configuration file that comes with the Nagios installation; you can use it as a baseline to define other hosts that you would like to monitor.
- printer.cfg – Sample config file for printer
- switch.cfg – Sample config file for switch
- windows.cfg – Sample config file for a windows machine
Monitor Remote Linux Host using Nagios
I. Overview
II. 6 steps to install Nagios plugin and NRPE on remote host.
- Download Nagios Plugins and NRPE Add-on
- Create nagios account
- Install Nagios Plugins
- Install NRPE
- Setup NRPE to run as daemon
- Modify the /usr/local/nagios/etc/nrpe.cfg
III. 4 Configuration steps on the Nagios monitoring server to monitor remote host:
- Download NRPE Add-on
- Install check_nrpe
- Create host and service definition for remote host
- Restart the nagios service
I. Overview:
The following three steps happen, at a very high level, when Nagios (installed on the nagios-server) monitors a service (e.g. disk space usage) on the remote Linux host.
- Nagios executes the check_nrpe command on the nagios-server and asks it to monitor disk usage on the remote host using the check_disk command.
- check_nrpe on the nagios-server contacts the NRPE daemon on the remote host and asks it to execute check_disk on the remote host.
- The NRPE daemon returns the results of the check_disk command to check_nrpe on the nagios-server.
Following flow summarizes the above explanation:
Nagios Server (check_nrpe) ----> Remote host (NRPE daemon) ----> check_disk
Nagios Server (check_nrpe) <---- Remote host (NRPE daemon) <---- check_disk (returns disk space usage)
II. 6 steps to install Nagios Plugins and NRPE on the remote host
1. Download Nagios Plugins and NRPE Add-on
Download the following files from Nagios.org and move them to /home/downloads:
- nagios-plugins-1.4.11.tar.gz
- nrpe-2.12.tar.gz
2. Create nagios account
[remotehost]# useradd nagios
[remotehost]# passwd nagios
3. Install Nagios Plugins
[remotehost]# cd /home/downloads
[remotehost]# tar xvfz nagios-plugins-1.4.11.tar.gz
[remotehost]# cd nagios-plugins-1.4.11
[remotehost]# export LDFLAGS=-ldl
[remotehost]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-redhat-pthread-workaround
[remotehost]# make
[remotehost]# make install
[remotehost]# chown nagios:nagios /usr/local/nagios
[remotehost]# chown -R nagios:nagios /usr/local/nagios/libexec/
Note: On Red Hat, the ./configure command hung for me with the message "checking for redhat spopen problem…". Adding --enable-redhat-pthread-workaround to the ./configure command, as done above, works around the problem.
4. Install NRPE
[remotehost]# cd /home/downloads
[remotehost]# tar xvfz nrpe-2.12.tar.gz
[remotehost]# cd nrpe-2.12
[remotehost]# ./configure
[remotehost]# make all
[remotehost]# make install-plugin
[remotehost]# make install-daemon
[remotehost]# make install-daemon-config
[remotehost]# make install-xinetd
5. Setup NRPE to run as daemon (i.e as part of xinetd):
- Modify /etc/xinetd.d/nrpe to add the IP address of the Nagios monitoring server to the only_from directive. Note that 127.0.0.1 and the monitoring server IP address are separated by a space (in this example, the monitoring server IP address is 192.168.1.2).
only_from = 127.0.0.1 192.168.1.2
- Modify the /etc/services and add the following at the end of the file.
nrpe 5666/tcp # NRPE
- Start the service
[remotehost]# service xinetd restart
- Verify whether NRPE is listening
[remotehost]# netstat -at | grep nrpe
tcp 0 0 *:nrpe *:* LISTEN
- Verify that NRPE is functioning properly
[remotehost]# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12
6. Modify the /usr/local/nagios/etc/nrpe.cfg
The nrpe.cfg file located on the remote host contains the commands that are needed to check the services on the remote host. By default, nrpe.cfg comes with a few standard check commands as samples. check_users and check_load are shown below as examples.
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
In all the check commands, "-w" stands for "Warning" and "-c" stands for "Critical". For example, in the check_disk command below, if the available disk space drops to 20% or less, Nagios will send a warning message. If it drops to 10% or less, Nagios will send a critical message. Change the values of the "-c" and "-w" parameters below depending on your environment.
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
Note: You can execute any of the commands shown in nrpe.cfg on the command line of the remote host and see the results for yourself. For example, when I executed the check_disk command on the command line, it displayed the following:
[remotehost]# /usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
DISK CRITICAL - free space: / 6420 MB (10% inode=98%);| /=55032MB;51792;58266;0;64741
In the above example, since the free disk space on /dev/hda1 is only 10%, it displays the CRITICAL message, which will be returned to the Nagios server.
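The check_disk output above has two parts separated by a "|": the human-readable status on the left, and performance data on the right in the form label=value;warn;crit;min;max. Here is a minimal sketch that splits such a line. The helper name perfdata_fields is hypothetical, and the sketch assumes a single metric with an unquoted label.

```shell
#!/bin/sh
# Read one line of plugin output on stdin and print the fields of the
# performance data that follows the "|". Lines without "|" print nothing.
# perfdata_fields is a hypothetical helper; it handles one metric only.
perfdata_fields() {
    awk -F'|' 'NF > 1 {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim spaces around the perfdata
        split($2, f, ";")                 # label=value ; warn ; crit ; min ; max
        split(f[1], kv, "=")              # label and value (with unit)
        printf "label=%s value=%s warn=%s crit=%s min=%s max=%s\n",
               kv[1], kv[2], f[2], f[3], f[4], f[5]
    }'
}
```

Fed the sample line from this article, it reports the label "/" with the value 55032MB and the four numbers that follow it.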
III. 4 Configuration steps on the Nagios monitoring server to monitor remote host:
1. Download NRPE Add-on
Download nrpe-2.12.tar.gz from Nagios.org and move it to /home/downloads.
2. Install check_nrpe on the nagios monitoring server
[nagios-server]# tar xvfz nrpe-2.12.tar.gz
[nagios-server]# cd nrpe-2.12
[nagios-server]# ./configure
[nagios-server]# make all
[nagios-server]# make install-plugin
./configure will give a configuration summary as shown below:
*** Configuration summary for nrpe 2.12 05-31-2008 ***:
General Options:
-------------------------
NRPE port: 5666
NRPE user: nagios
NRPE group: nagios
Nagios user: nagios
Nagios group: nagios
Note: I got the “checking for SSL headers… configure: error: Cannot find ssl headers” error message while performing ./configure. Install openssl-devel as shown below and run the ./configure again to fix the problem.
[nagios-server]# rpm -ivh openssl-devel-0.9.7a-43.16.i386.rpm krb5-devel-1.3.4-47.i386.rpm zlib-devel-1.2.1.2-1.2.i386.rpm e2fsprogs-devel-1.35-12.5.el4.i386.rpm
warning: openssl-devel-0.9.7a-43.16.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing… ########################################### [100%]
1:e2fsprogs-devel ########################################### [ 25%]
2:krb5-devel ########################################### [ 50%]
3:zlib-devel ########################################### [ 75%]
4:openssl-devel ########################################### [100%]
Verify whether nagios monitoring server can talk to the remotehost.
[nagios-server]# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.3
NRPE v2.12
Note: 192.168.1.3 is the IP address of the remotehost where NRPE and the Nagios plugins were installed as explained in Section II above.
3. Create host and service definition for remotehost
Create a new configuration file /usr/local/nagios/etc/objects/remotehost.cfg to define the host and service definition for this particular remotehost. It is good to take the localhost.cfg and copy it as remotehost.cfg and start modifying it according to your needs.
Host definition sample:
define host{
        use             linux-server
        host_name       remotehost
        alias           Remote Host
        address         192.168.1.3
        contact_groups  admins
        }
Service definition sample:
define service{
        use                     generic-service
        host_name               remotehost
        service_description     Root Partition
        contact_groups          admins
        check_command           check_nrpe!check_disk
        }
Note: In all the above examples, replace remotehost with the corresponding hostname of your remotehost.
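The service definition calls check_nrpe!check_disk, which only works if a command named check_nrpe is defined on the monitoring server. This article does not show that definition, so the following is a sketch that assumes the command definition commonly documented with the NRPE add-on. The helper names has_command and print_check_nrpe are hypothetical.

```shell
#!/bin/sh
# Succeed if FILE ($1) already contains "command_name NAME" ($2).
# has_command is a hypothetical helper for illustration.
has_command() {
    grep -Eq "^[[:space:]]*command_name[[:space:]]+$2[[:space:]]*$" "$1"
}

# Print a check_nrpe command definition (assumed, not from this article).
# $ARG1$ receives the text after "!" in check_command, e.g. check_disk.
print_check_nrpe() {
    cat <<'EOF'
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
EOF
}
# Example, using the path from this article:
#   cfg=/usr/local/nagios/etc/objects/commands.cfg
#   has_command "$cfg" check_nrpe || print_check_nrpe >> "$cfg"
```

Appending only when the command is missing avoids a duplicate definition, which the Nagios pre-flight check would reject.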
4. Restart the nagios service
Reload Nagios as shown below and log in to the Nagios web interface (http://nagios-server/nagios/) to verify the status of the remote Linux server that was added to Nagios for monitoring.
[nagios-server]# service nagios reload
Monitor Remote Windows Machine Using Nagios on Linux
I. Overview
II. 4 steps to install nagios on remote windows host
- Install NSClient++ on the remote windows server
- Modify the NSClient++ Service
- Modify the NSC.ini
- Start the NSClient++ Service
III. 6 configuration steps on nagios monitoring server
- Verify check_nt command and windows-server template
- Uncomment windows.cfg in /usr/local/nagios/etc/nagios.cfg
- Modify /usr/local/nagios/etc/objects/windows.cfg
- Define windows services that should be monitored.
- Enable Password Protection
- Verify Configuration and Restart Nagios.
I. Overview
The following three steps happen, at a very high level, when Nagios (installed on the nagios-server) monitors a service (e.g. disk space usage) on the remote Windows host.
- Nagios executes the check_nt command on the nagios-server and asks it to monitor disk usage on the remote Windows host.
- check_nt on the nagios-server contacts the NSClient++ service on the remote Windows host and asks it to execute USEDDISKSPACE on the remote host.
- The NSClient++ service returns the results of the USEDDISKSPACE command to check_nt on the nagios-server.
Following flow summarizes the above explanation:
Nagios Server (check_nt) ----> Remote host (NSClient++) ----> USEDDISKSPACE
Nagios Server (check_nt) <---- Remote host (NSClient++) <---- USEDDISKSPACE (returns disk space usage)
II. 4 steps to setup nagios on remote windows host
1. Install NSClient++ on the remote windows server
Download NSCP 0.3.1 (NSClient++-Win32-0.3.1.msi) from NSClient++ Project. NSClient++ is an open source windows service that allows performance metrics to be gathered by Nagios for windows services. Go through the following five NSClient++ installation steps to get the installation completed.
(1) NSClient++ Welcome Screen
(2) License Agreement Screen
(3) Select Installation option and location. Use the default option and click next.

(4) Ready to Install Screen. Click on Install to get it started.
(5) Installation completed Screen.
2. Modify the NSClient++ Service
Go to Control Panel -> Administrative Tools -> Services. Double click on the "NSClientpp (Nagios) 0.3.1.14 2008-03-12 w32" service and select the check-box that says "Allow service to interact with desktop".
3. Modify the NSC.ini
(1) Modify NSC.ini and uncomment *.dll: Edit the C:\Program Files\NSClient++\NSC.ini file and uncomment everything under [modules] except RemoteConfiguration.dll and CheckWMI.dll
[modules]
;# NSCLIENT++ MODULES
;# A list with DLLs to load at startup.
; You will need to enable some of these for NSClient++ to work.
; ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
; * *
; * N O T I C E ! ! ! - Y O U H A V E T O E D I T T H I S *
; * *
; ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
FileLogger.dll
CheckSystem.dll
CheckDisk.dll
NSClientListener.dll
NRPEListener.dll
SysTray.dll
CheckEventLog.dll
CheckHelpers.dll
;CheckWMI.dll
;
; RemoteConfiguration IS AN EXTREM EARLY IDEA SO DONT USE FOR PRODUCTION ENVIROMNEMTS!
;RemoteConfiguration.dll
; NSCA Agent is a new beta module use with care!
NSCAAgent.dll
; LUA script module used to write your own "check deamon" (sort of) early beta.
LUAScript.dll
; Script to check external scripts and/or internal aliases, early beta.
CheckExternalScripts.dll
; Check other hosts through NRPE extreme beta and probably a bit dangerous!
NRPEClient.dll
(2) Modify NSC.ini and uncomment allowed_hosts. Edit the C:\Program Files\NSClient++\NSC.ini file, uncomment allowed_hosts under [Settings] and add the IP address of the nagios-server.
;# ALLOWED HOST ADDRESSES
; This is a comma-delimited list of IP address of hosts that are allowed to talk to the all daemons.
; If leave this blank anyone can access the deamon remotly (NSClient still requires a valid password).
; The syntax is host or ip/mask so 192.168.0.0/24 will allow anyone on that subnet access
allowed_hosts=192.168.1.2/255.255.255.0
Note: allowed_hosts appears under the [Settings], [NSClient] and [NRPE] sections. Make sure to change allowed_hosts under [Settings] for this purpose.
(3) Modify NSC.ini and uncomment port. Edit the C:\Program Files\NSClient++\NSC.ini file and uncomment the port# under [NSClient] section
;# NSCLIENT PORT NUMBER
; This is the port the NSClientListener.dll will listen to.
port=12489
(4) Modify NSC.ini and specify password. You can also specify a password the nagios server needs to use to remotely access the NSClient++ agent.
[Settings]
;# OBFUSCATED PASSWORD
; This is the same as the password option but here you can store the password in an obfuscated manner.
; *NOTICE* obfuscation is *NOT* the same as encryption, someone with access to this file can still figure out the
; password. Its just a bit harder to do it at first glance.
;obfuscated_password=Jw0KAUUdXlAAUwASDAAB
;
;# PASSWORD
; This is the password (-s) that is required to access NSClient remotely. If you leave this blank everyone will be able to access the daemon remotly.
password=My2Secure$Password
4. Start the NSClient++ Service
Start the NSClient++ service either from Control Panel -> Administrative Tools -> Services (select "NSClientpp (Nagios) 0.3.1.14 2008-03-12 w32" and click on Start) or from Start -> All Programs -> NSClient++ -> Start NSClient++ (Win32). Please note that this will start NSClient++ as a Windows service.
If you later modify anything in the NSC.ini file, restart the "NSClientpp (Nagios) 0.3.1.14 2008-03-12 w32" service from the Windows Services panel.
III. 6 configuration steps on nagios monitoring server
1. Verify check_nt command and windows-server template
Verify that the check_nt is enabled under /usr/local/nagios/etc/objects/commands.cfg
# 'check_nt' command definition
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }
Verify that the windows-server template is enabled under /usr/local/nagios/etc/objects/templates.cfg
# Windows host definition template - This is NOT a real host, just a template!
define host{
        name                    windows-server  ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, Windows servers are monitored round the clock
        check_interval          5               ; Actively check the server every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each server 10 times (max)
        check_command           check-host-alive ; Default command to check if servers are "alive"
        notification_period     24x7            ; Send notification out at any time - day or night
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        hostgroups              windows-servers ; Host groups that Windows servers should be a member of
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }
2. Uncomment windows.cfg in /usr/local/nagios/etc/nagios.cfg
# Definitions for monitoring a Windows machine
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
3. Modify /usr/local/nagios/etc/objects/windows.cfg
By default a sample host definition for a windows server is given under windows.cfg, modify this to reflect the appropriate windows server that needs to be monitored through nagios.
# Define a host for the Windows machine we'll be monitoring
# Change the host_name, alias, and address to fit your situation
define host{
        use             windows-server          ; Inherit default values from a template
        host_name       remote-windows-host     ; The name we're giving to this host
        alias           Remote Windows Host     ; A longer name associated with the host
        address         192.168.1.4             ; IP address of the remote windows host
        }
4. Define windows services that should be monitored.
Following are the default windows services that are already enabled in the sample windows.cfg. Make sure to update the host_name on these services to reflect the host_name defined in the above step.
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     NSClient++ Version
        check_command           check_nt!CLIENTVERSION
        }
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     C:\ Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
        }
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     W3SVC
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
        }
define service{
        use                     generic-service
        host_name               remote-windows-host
        service_description     Explorer
        check_command           check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
        }
5. Enable Password Protection
If you specified a password in the NSC.ini file on the Windows machine, you'll need to modify the check_nt command definition to include the password. Modify the /usr/local/nagios/etc/objects/commands.cfg file and add the password as shown below. Because Nagios treats $ as a macro delimiter in command_line, a literal $ in the password is written as $$, and the single quotes keep the shell from expanding it.
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s 'My2Secure$$Password' -v $ARG1$ $ARG2$
        }
6. Verify Configuration and Restart Nagios.
Verify the Nagios configuration files as shown below.
[nagios-server]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Restart Nagios as shown below.
[nagios-server]# /etc/rc.d/init.d/nagios stop
Stopping nagios: .done.
[nagios-server]# /etc/rc.d/init.d/nagios start
Starting nagios: done.
Verify the status of the various services running on the remote Windows host from the Nagios web UI (http://nagios-server/nagios).
Monitor Network Switch and Ports Using Nagios
Nagios is hands-down the best monitoring tool for hosts and network equipment. Using Nagios plugins you can monitor pretty much anything.
I use Nagios extensively and it gives me peace of mind knowing that I will get an alert on my phone when there is a problem. More than that, if warning levels are set up properly, Nagios will proactively alert you before a problem becomes critical.
Earlier I wrote about how to set up Nagios to monitor a Linux host, a Windows host and a VPN device.
In this article, I'll explain how to configure Nagios to monitor a network switch and its active ports.
1. Enable switch.cfg in nagios.cfg
Uncomment the switch.cfg line in /usr/local/nagios/etc/nagios.cfg as shown below.
[nagios-server]# grep switch.cfg /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/switch.cfg
2. Add new hostgroup for switches in switch.cfg
Add the following switches hostgroup to the /usr/local/nagios/etc/objects/switch.cfg file.
define hostgroup{
        hostgroup_name  switches
        alias           Network Switches
        }
3. Add a new host for the switch to be monitored
In this example, I've defined a host to monitor the core switch in the /usr/local/nagios/etc/objects/switch.cfg file. Change the address directive to your switch IP address accordingly.
define host{
        use             generic-switch
        host_name       core-switch
        alias           Cisco Core Switch
        address         192.168.1.50
        hostgroups      switches
        }
4. Add common services for all switches
Displaying the uptime of the switch and verifying whether switch is alive are common services for all switches. So, define these services under the switches hostgroup_name as shown below.
# Service definition to ping the switch using check_ping
define service{
        use                     generic-service
        hostgroup_name          switches
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        normal_check_interval   5
        retry_check_interval    1
        }
# Service definition to monitor switch uptime using check_snmp
define service{
        use                     generic-service
        hostgroup_name          switches
        service_description     Uptime
        check_command           check_snmp!-C public -o sysUpTime.0
        }
5. Add service to monitor port bandwidth usage
check_local_mrtgtraf uses the Multi Router Traffic Grapher (MRTG). So, you need to install MRTG for this to work properly. The *.log file mentioned below should point to the MRTG log file on your system.
define service{
        use                     generic-service
        host_name               core-switch
        service_description     Port 1 Bandwidth Usage
        check_command           check_local_mrtgtraf!/var/lib/mrtg/192.168.1.11_1.log!AVG!1000000,2000000!5000000,5000000!10
        }
6. Add service to monitor an active switch port
Use check_snmp to monitor a specific port as shown below. The following two services monitor port#1 and port#5. To add additional ports, change the value ifOperStatus.n accordingly, where n is the port#.
# Monitor status of port number 1 on the Cisco core switch
define service{
        use                     generic-service
        host_name               core-switch
        service_description     Port 1 Link Status
        check_command           check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB
        }
# Monitor status of port number 5 on the Cisco core switch
define service{
        use                     generic-service
        host_name               core-switch
        service_description     Port 5 Link Status
        check_command           check_snmp!-C public -o ifOperStatus.5 -r 1 -m RFC1213-MIB
        }
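Since the per-port definitions differ only in the port number, they can be generated rather than typed by hand. The following is a minimal sketch. The helper name gen_port_services is hypothetical, and the community string, template and check_snmp arguments are simply copied from this article.

```shell
#!/bin/sh
# Print one "Link Status" service definition per port number given.
# gen_port_services is a hypothetical helper for illustration.
# Usage: gen_port_services HOST PORT [PORT...]
gen_port_services() {
    host=$1; shift
    for port in "$@"; do
        cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     Port $port Link Status
        check_command           check_snmp!-C public -o ifOperStatus.$port -r 1 -m RFC1213-MIB
        }
EOF
    done
}
# Example: gen_port_services core-switch 1 5 >> switch.cfg
```

Review the generated text before appending it, and run the Nagios pre-flight check afterwards.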
7. Add services to monitor multiple switch ports together
Sometimes you may need to monitor the status of multiple ports together, i.e. Nagios should send you an alert even if only one of the ports is down. In this case, define the following service to monitor multiple ports.
# Monitor ports 1 - 6 on the Cisco core switch.
define service{
        use                     generic-service
        host_name               core-switch
        service_description     Ports 1-6 Link Status
        check_command           check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB, -o ifOperStatus.2 -r 1 -m RFC1213-MIB, -o ifOperStatus.3 -r 1 -m RFC1213-MIB, -o ifOperStatus.4 -r 1 -m RFC1213-MIB, -o ifOperStatus.5 -r 1 -m RFC1213-MIB, -o ifOperStatus.6 -r 1 -m RFC1213-MIB
        }
8. Validate configuration and restart nagios
Verify the nagios configuration to make sure there are no warnings and errors.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Restart the Nagios server to start monitoring the switch.
# /etc/rc.d/init.d/nagios stop
Stopping nagios: .done.
# /etc/rc.d/init.d/nagios start
Starting nagios: done.
Verify the status of the switch from the Nagios web UI: http://{nagios-server}/nagios as shown below:
Fig: Nagios GUI displaying status of a Network Switch
9. Troubleshooting
Issue1: Nagios GUI displays “check_mrtgtraf: Unable to open MRTG log file” error message for the Port bandwidth usage
Solution1: make sure the *.log file defined in the check_local_mrtgtraf service is pointing to the correct location.
Issue2: Nagios UI displays “Return code of 127 is out of bounds – plugin may be missing” error message for Port Link Status.
Solution2: Make sure both the net-snmp and net-snmp-utils packages are installed. In my case, I was missing the net-snmp-utils package and installing it resolved this issue, as shown below.
[nagios-server]# rpm -qa | grep net-snmp
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
[nagios-server]# rpm -ivh net-snmp-utils-5.1.2-11.EL4.10.i386.rpm
Preparing... ########################################### [100%]
1:net-snmp-utils ########################################### [100%]
[nagios-server]# rpm -qa | grep net-snmp
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
net-snmp-utils-5.1.2-11.EL4.10
Note: After you’ve installed net-snmp and net-snmp-utils, re-compile and re-install nagios plugins as explained in “6. Compile and install nagios plugins” in the Nagios 3.0 jumpstart guide.
Set Up URL or Website Monitoring in Nagios Server
First of all, create a configuration directory for the rules. You can also create the rules in localhost.cfg, but I recommend creating a separate directory and keeping the files in it.
#mkdir /etc/nagios/monitor_websites
#cd /etc/nagios/monitor_websites
Suppose I want to monitor three sites:
www.abc.com, www.xyz.com, www.pqr.com
Create the file host.cfg in this directory to define the URLs, and configure it as below.
#vi host.cfg
define host{
host_name abc.com
alias abc
address www.abc.com
use generic-host
}
define host{
host_name xyz.com
alias xyz
address www.xyz.com
use generic-host
}
define host{
host_name pqr.com
alias pqr
address www.pqr.com
use generic-host
}
#Defining group of urls - you should add this if you want to set up an HTTP check service.
#The members must match the host_name values defined above.
define hostgroup {
hostgroup_name monitor_websites
alias monitor_urls
members abc.com, xyz.com, pqr.com
}
:wq #save it
Now create the file services.cfg to define the service (check_http).
#vi services.cfg
## Hostgroups services ##
define service {
hostgroup_name monitor_websites
service_description HTTP
check_command check_http
use generic-service
notification_interval 0
}
Now give the permissions for directory and configuration files.
#chown -R nagios:nagios monitor_websites
List and check.
[root@mail nagios]# ll monitor_websites
total 16
-rw-r--r-- 1 nagios nagios 669 Apr 25 23:13 host.cfg
-rw-r--r-- 1 nagios nagios 253 Apr 25 23:15 services.cfg
[root@mail nagios]#
Now give the configuration directory path in main nagios configuration file.
#vi /etc/nagios/nagios.cfg
cfg_dir=/etc/nagios/monitor_websites
:wq
Now restart the nagios service.
#service nagios restart