Nagios SNMP checks

When setting up Nagios we wanted to monitor our Linux hosts over SNMP. Nagios includes other checks that can do this, but we chose SNMP to reduce the overhead of using SSH. These are the steps to configure a Linux host to give information over SNMP and to set up Nagios to poll it.

Note: We are still learning, so there may be easier ways to do this.

Install and Configure SNMP on Linux

  1. The first step is to install net-snmp on the host
    [root@web1 tmp]# yum install net-snmp -y
    
  2. Next, move the automatically generated configuration out of the way and create a new configuration file
    [root@web1 tmp]# mv /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.bak
    [root@web1 tmp]# vim /etc/snmp/snmpd.conf
    
  3. Next add the following to the snmpd.conf file
    rocommunity public 192.168.100.0/24
    

    We are using public as the SNMP community name, and all monitoring requests will come from the 192.168.100.0/24 network. Modify the line to suit your environment.

  4. Next start snmpd and set it to start on boot
    [root@web1 tmp]# service snmpd start
    [root@web1 tmp]# chkconfig snmpd on
    

The server should now respond to SNMP requests. Note: SNMP is not a secure protocol, so make sure your community name is hard to guess (public is a well-known default) and limit where SNMP requests can come from.
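
Before moving on to Nagios, it can help to confirm that the host really answers SNMP queries from the monitoring server. The sketch below is one way to do that. It assumes the snmpget utility is available on the monitoring server (on CentOS/RHEL it is part of the net-snmp-utils package) and that the agent accepts SNMP v2c; the function name probe_load is made up for this example.

```shell
# Query the 1-minute load average from a host over SNMP v2c.
# Usage: probe_load HOST [COMMUNITY]
probe_load() {
    host=$1
    community=${2:-public}   # falls back to the community used in this guide
    if ! command -v snmpget >/dev/null 2>&1; then
        echo "snmpget not found (on CentOS/RHEL it ships in net-snmp-utils)" >&2
        return 127
    fi
    # Same OID that the CPU load check later in this guide polls
    snmpget -v 2c -c "$community" "$host" .1.3.6.1.4.1.2021.10.1.3.1
}
```

Running probe_load web1.onemetric.com.au public from the monitoring server should print the load average as a string. A timeout usually points at the community name, the allowed source network in snmpd.conf, or a firewall blocking UDP port 161.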

Set Up Nagios to Do SNMP Checks

Make sure you have the Nagios plugins installed. Note that net-snmp needs to be installed before you install nagios-plugins, otherwise the check_snmp plugin will not be installed (check_snmp relies on the snmpget client, which on CentOS/RHEL is in the net-snmp-utils package). If you need help, refer to the guide here

This guide assumes that Nagios is installed in /usr/local/nagios/; please adjust the paths for your installation. Check that libexec/check_snmp exists, as this is the plugin used to do the SNMP checks.
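
The check for the plugin can be scripted. This is only a small sketch, assuming the /usr/local/nagios layout described above; NAGIOS_HOME and have_check_snmp are names invented for the example.

```shell
# Assumption: Nagios lives in /usr/local/nagios, as in the rest of this guide.
NAGIOS_HOME=${NAGIOS_HOME:-/usr/local/nagios}

# Print the plugin path and return 0 if check_snmp exists and is executable.
have_check_snmp() {
    plugin="$NAGIOS_HOME/libexec/check_snmp"
    if [ -x "$plugin" ]; then
        echo "$plugin"
    else
        echo "check_snmp not found in $NAGIOS_HOME/libexec" >&2
        return 1
    fi
}
```

If have_check_snmp fails, install the SNMP packages first and then reinstall nagios-plugins so that check_snmp gets built.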

CPU Load

  1. Open etc/objects/commands.cfg in your text editor
    [root@monitoring nagios]# vim etc/objects/commands.cfg
    
  2. Find the command for snmp monitoring
    define command{
            command_name    check_snmp
            command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
    }
    
  3. We are going to use this as the basis for three new commands: 1 Minute Average, 5 Minute Average and 15 Minute Average. Copy the following and paste it below the check_snmp command
    define command{
            command_name    snmp_1minute_load
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.10.1.3.1 -H $HOSTADDRESS$ $ARG1$
    }
    
    define command{
            command_name    snmp_5minute_load
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.10.1.3.2 -H $HOSTADDRESS$ $ARG1$
    }
    
    define command{
            command_name    snmp_15minute_load
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.10.1.3.3 -H $HOSTADDRESS$ $ARG1$
    }
    

    The above creates three new commands that use check_snmp to poll the appropriate OIDs for the load averages.

  4. Now, in the configuration file where you define the checks for a host, you can create the following service checks
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     CPU 1 Minute Average
            check_command           snmp_1minute_load!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     CPU 5 Minute Average
            check_command           snmp_5minute_load!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     CPU 15 Minute Average
            check_command           snmp_15minute_load!-C public
    }
    

    Adjust the service definitions as required. Everything after the ! is passed to the command as $ARG1$. The -C option specifies the community name, in our case public; change the community to suit your installation.

  5. Restart Nagios for the changes to take effect.
    [root@monitoring nagios]# /etc/init.d/nagios restart
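
A typo in commands.cfg can prevent Nagios from starting, so it can be worth verifying the configuration before each of the restarts in this guide. The following is a minimal sketch of that idea, assuming the /usr/local/nagios layout used here and a main configuration file at etc/nagios.cfg; the function name safe_restart and the NAGIOS_HOME variable are made up for this example.

```shell
# Assumption: Nagios lives in /usr/local/nagios, as in the rest of this guide.
NAGIOS_HOME=${NAGIOS_HOME:-/usr/local/nagios}

# Restart Nagios only if "nagios -v" reports a valid configuration.
safe_restart() {
    if "$NAGIOS_HOME/bin/nagios" -v "$NAGIOS_HOME/etc/nagios.cfg"; then
        /etc/init.d/nagios restart
    else
        echo "Configuration errors found; Nagios was not restarted" >&2
        return 1
    fi
}
```

Calling safe_restart in place of the plain restart prints the verification report first, and leaves the running Nagios untouched if that report contains errors.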
    

RAM/Swap Usage

  1. Open etc/objects/commands.cfg in your text editor
    [root@monitoring nagios]# vim etc/objects/commands.cfg
    
  2. Find the command for snmp monitoring
    define command{
            command_name    check_snmp
            command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
    }
    
  3. We are going to create four commands: RAM Total/Free and Swap Total/Free. Copy the following and paste it below the check_snmp command
    define command{
            command_name    snmp_SwapSize
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.4.3.0 -H $HOSTADDRESS$ $ARG1$
    }
    
    define command{
            command_name    snmp_SwapFree
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.4.4.0 -H $HOSTADDRESS$ $ARG1$
    }
    
    define command{
            command_name    snmp_RamSize
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.4.5.0 -H $HOSTADDRESS$ $ARG1$
    }
    
    # memAvailReal (.4.6.0) is free physical RAM; memTotalFree (.4.11.0) also counts free swap
    define command{
            command_name    snmp_RamFree
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.4.6.0 -H $HOSTADDRESS$ $ARG1$
    }
    
  4. Now, in the configuration file where you define the checks for a host, you can create the following service checks
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Swap Size
            check_command           snmp_SwapSize!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Swap Free
            check_command           snmp_SwapFree!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     RAM Size
            check_command           snmp_RamSize!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     RAM Free
            check_command           snmp_RamFree!-C public
    }
    

    Adjust the service definitions as required. The -C option specifies the community name, in our case public; change the community to suit your installation.

  5. Restart Nagios for the changes to take effect.
    [root@monitoring nagios]# /etc/init.d/nagios restart
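
The checks above return raw sizes in kB, so working out how full the RAM or swap is takes one more calculation. As a rough illustration, the helper below turns a total and a free value, such as those reported by the RAM Size and RAM Free checks, into a percentage used. The function name mem_used_percent is hypothetical, and the result is rounded down because the shell only does integer arithmetic.

```shell
# Percentage in use, given a total and a free value in kB.
# Usage: mem_used_percent TOTAL_KB FREE_KB
mem_used_percent() {
    total=$1
    free=$2
    # Guard against division by zero, e.g. a host with no swap configured
    if [ "$total" -le 0 ]; then
        echo "total must be greater than zero" >&2
        return 1
    fi
    echo $(( (total - free) * 100 / total ))
}
```

For example, mem_used_percent 8000000 2000000 prints 75.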
    

System Uptime

  1. Open etc/objects/commands.cfg in your text editor
    [root@monitoring nagios]# vim etc/objects/commands.cfg
    
  2. Find the command for snmp monitoring
    define command{
            command_name    check_snmp
            command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
    }
    
  3. Copy the following and paste it below the check_snmp command
    define command{
            command_name    snmp_Uptime
            command_line    $USER1$/check_snmp -o .1.3.6.1.2.1.1.3.0 -H $HOSTADDRESS$ $ARG1$
    }
    
  4. Now, in the configuration file where you define the checks for a host, you can create the following service check
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Uptime
            check_command           snmp_Uptime!-C public
    }
    

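    Adjust the service definition as required. The -C option specifies the community name, in our case public; change the community to suit your installation. Note that this OID (sysUpTime) counts from when the SNMP agent started, so the value resets whenever snmpd is restarted. If you need the uptime of the host itself, hrSystemUptime (.1.3.6.1.2.1.25.1.1.0) is the usual alternative.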

  5. Restart Nagios for the changes to take effect.
    [root@monitoring nagios]# /etc/init.d/nagios restart
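
It may help to see how a check_command such as snmp_Uptime!-C public ends up as a real command line. The sketch below imitates the macro substitution with sed. It is not how Nagios does it internally, only an illustration. It assumes $USER1$ is set to /usr/local/nagios/libexec (the conventional value in resource.cfg) and uses the host name in place of the host's address; USER1_DIR and expand_command_line are names invented for the example.

```shell
# Assumption: $USER1$ points at the plugin directory used throughout this guide.
USER1_DIR=${USER1_DIR:-/usr/local/nagios/libexec}

# Expand the three macros that appear in the command definitions.
# $1 = command_line template, $2 = host address, $3 = text after the "!"
# Note: values containing "|", "&" or "\" would need escaping for sed.
expand_command_line() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|'"$USER1_DIR"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$2"'|g' \
        -e 's|\$ARG1\$|'"$3"'|g'
}

expand_command_line \
    '$USER1$/check_snmp -o .1.3.6.1.2.1.1.3.0 -H $HOSTADDRESS$ $ARG1$' \
    'web1.onemetric.com.au' \
    '-C public'
```

The printed line is what you can run by hand on the monitoring server to debug a check that misbehaves in Nagios.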
    

Monitor Disk Usage

Monitoring disk usage over SNMP is slightly more complicated. In your snmpd.conf you need to specify the disks that you want to be able to monitor.

  1. On the host open snmpd.conf with your text editor
    [root@web1 tmp]# vim /etc/snmp/snmpd.conf
    
  2. Now we need to add the disks that we want to monitor. The directive to use is disk. In our case we want to monitor the root volume (/) and a volume mounted at /backups.
    disk /
    disk /backups
    

    Add all the mount points that you want to monitor, then restart snmpd (service snmpd restart) so that it picks up the new configuration.

  3. Now we need to define commands to check the mount point, disk size, disk usage and percentage used. Because we are checking two partitions we need to create a set of commands for each; you will need as many sets as you have partitions that you want to check.
    [root@monitoring nagios]# vim etc/objects/commands.cfg
    
    #Get the mount point of the first disk
    define command{
            command_name    snmp_Disk1_Mount
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.2.1 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the size of the first disk
    define command{
            command_name    snmp_Disk1_Size
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.6.1 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the usage of the first disk
    define command{
            command_name    snmp_Disk1_Usage
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.8.1 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the usage as a percentage of the first disk
    define command{
            command_name    snmp_Disk1_UsedPercentage
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.9.1 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the mount point of the second disk
    define command{
            command_name    snmp_Disk2_Mount
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.2.2 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the size of the second disk
    define command{
            command_name    snmp_Disk2_Size
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.6.2 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the usage of the second disk
    define command{
            command_name    snmp_Disk2_Usage
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.8.2 -H $HOSTADDRESS$ $ARG1$
    }
    
    #Get the usage as a percentage of the second disk
    define command{
            command_name    snmp_Disk2_UsedPercentage
            command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.9.2 -H $HOSTADDRESS$ $ARG1$
    }
    

    The thing to know is that the last number of the OID is the index of the disk, starting at 1. So for our first disk the OID ends in 1, and for our second disk the OID ends in 2. If you need to add more partitions, copy the commands, update the command names and increment the number at the end of the OID.

  4. Now, in the configuration file where you define the checks for a host, you can create the following service checks
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 1 Mountpoint
            check_command           snmp_Disk1_Mount!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 2 Mountpoint
            check_command           snmp_Disk2_Mount!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 1 Size
            check_command           snmp_Disk1_Size!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 2 Size
            check_command           snmp_Disk2_Size!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 1 Usage
            check_command           snmp_Disk1_Usage!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 2 Usage
            check_command           snmp_Disk2_Usage!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 1 Usage Percentage
            check_command           snmp_Disk1_UsedPercentage!-C public
    }
    
    define service{
            use                     generic-service
            host_name               web1.onemetric.com.au
            service_description     Disk 2 Usage Percentage
            check_command           snmp_Disk2_UsedPercentage!-C public
    }
    
    

    Adjust the service definitions as required. The -C option specifies the community name, in our case public; change the community to suit your installation.

  5. Restart Nagios for the changes to take effect.
    [root@monitoring nagios]# /etc/init.d/nagios restart
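
Since the disk commands in step 3 differ only in the command name and the last number of the OID, they can be generated instead of copied by hand. This is a sketch of one way to do it, following the naming used above; the function name disk_commands is hypothetical.

```shell
# Print the four command definitions for the disk at SNMP index $1.
disk_commands() {
    index=$1
    # Pairs of "command name suffix:OID column", as used in step 3
    for pair in Mount:2 Size:6 Usage:8 UsedPercentage:9; do
        name=${pair%%:*}
        column=${pair##*:}
        printf 'define command{\n'
        printf '        command_name    snmp_Disk%s_%s\n' "$index" "$name"
        printf '        command_line    $USER1$/check_snmp -o .1.3.6.1.4.1.2021.9.1.%s.%s -H $HOSTADDRESS$ $ARG1$\n' "$column" "$index"
        printf '}\n\n'
    done
}

# Example: print the definitions for a third disk
disk_commands 3
```

Review the output before appending it to etc/objects/commands.cfg. The index normally follows the order of the disk lines in snmpd.conf; if in doubt, an snmpwalk of .1.3.6.1.4.1.2021.9.1.2 on the host should list the mount point for each index.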