Debian Clusters for Education and Research: The Missing Manual

Nagios NRPE Addon Installation and Configuration


Nagios is a service that, by default, runs on only one node. That host node can check various services on other nodes over the network, including SSH, ping, web services, and many others, but it can't execute commands on remote machines. To do that, you need to install the Nagios NRPE add-on, sometimes called the Nagios client. It has two components: a simple plugin for Nagios on the host machine, and an NRPE daemon. The daemon needs to be installed on every machine that the host will be running remote commands on.


Installation on the Host Node

First, you'll need to install the NRPE plugin on the Nagios host node. This is the host that runs the web server and that you've already set up Nagios on. (If you haven't already set up Nagios, this tutorial will help you, and you should do that first.)

To install the host side of NRPE, issue

apt-get install nagios-nrpe-plugin

Installation on the Clients

Each client needs the NRPE daemon installed and configured. On each machine that the Nagios host will contact to execute plugins remotely, issue

apt-get install nagios-nrpe-server

During the installation you may be asked two Samba-related questions:

  • If prompted for "Workgroup/Domain Name", enter your internal domain name.
  • If asked "Modify smb.conf to use WINS settings from DHCP?", keep the default of no.

Depending on the version of Debian you're running, this may or may not install all of the default Nagios plugins. If you want them (if you're not just running your own custom plugins), install them with

apt-get install nagios-plugins

Configuration on the Clients

Next, /etc/nagios/nrpe_local.cfg needs to be edited on each one of the clients. Add the line

  • allowed_hosts=<Nagios host IP>

and put in the IP address of the machine that runs Nagios. Then, for each plugin you want to be able to run remotely, add a line like this:

  • command[<command name>]=<full path to plugin and any arguments>
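
As a rough sketch, the two kinds of lines can be combined into one file like the output of the function below. The function name render_nrpe_local, the address 192.168.1.254, and the warning and critical thresholds are illustrative assumptions, not values from this tutorial; substitute your own.

```shell
#!/bin/sh
# Sketch: print the contents of an nrpe_local.cfg for one client.
# ASSUMPTIONS: the function name, the example IP address, and the thresholds
# are placeholders chosen for illustration only.
render_nrpe_local() {
    nagios_ip="$1"                               # IP address of the Nagios host
    plugin_dir="${2:-/usr/lib/nagios/plugins}"   # plugin directory used elsewhere on this page
    printf 'allowed_hosts=%s\n' "$nagios_ip"
    printf 'command[check_users]=%s/check_users -w 5 -c 10\n' "$plugin_dir"
    printf 'command[check_load]=%s/check_load -w 15,10,5 -c 30,25,20\n' "$plugin_dir"
}

# Preview the file for a Nagios host at 192.168.1.254 (hypothetical address).
render_nrpe_local 192.168.1.254
```

Once the preview looks right, the output can be redirected (as root) into /etc/nagios/nrpe_local.cfg on the client.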

Then you'll need to restart the NRPE daemon with

/etc/init.d/nagios-nrpe-server restart

If you create this file once, you can use my Cluster Time-saving Tricks to copy it over to the rest of the nodes and restart NRPE on all of them.
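
If you would rather do the copy by hand, a minimal sketch looks like the following. It assumes root SSH access to every node, and the node names (node01 and so on) and the function name nrpe_push_commands are hypothetical. The function only prints the commands so you can review them before running anything.

```shell
#!/bin/sh
# Sketch: print the commands that would copy nrpe_local.cfg to each node
# and restart the NRPE daemon there.
# ASSUMPTIONS: root SSH access to every node; node names are placeholders.
nrpe_push_commands() {
    cfg=/etc/nagios/nrpe_local.cfg
    for node in "$@"; do
        printf 'scp %s root@%s:%s\n' "$cfg" "$node" "$cfg"
        printf 'ssh root@%s /etc/init.d/nagios-nrpe-server restart\n' "$node"
    done
}

# Dry run: review the generated commands first.
nrpe_push_commands node01 node02 node03
# To actually execute them, pipe the output to a shell:
#   nrpe_push_commands node01 node02 node03 | sh
```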

Configuration on the Head Node and Sanity Check

Once you've finished with all of the clients, you should be ready to implement a remote check on one of the user nodes. I'm going to use one of the built-in plugins, so I'm assuming you've installed those. If you haven't, a custom plugin will work just as well.

First, you'll need to edit /etc/nagios-plugins/config/check_nrpe.cfg on the Nagios host. I changed this file because I wanted the plain check_nrpe command to work without an extra argument, which was being passed incorrectly. To do that, I commented out the first command definition (check_nrpe) and renamed the second one from check_nrpe_1arg to just check_nrpe.
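
For orientation, here is roughly what the file might look like after that edit. This is a sketch under an assumption: that the stock file defines check_nrpe with a trailing "-a $ARG2$" and check_nrpe_1arg without it. Your copy may differ, so compare before changing anything. The function name edited_check_nrpe_cfg is made up for this example and only prints the text.

```shell
#!/bin/sh
# Sketch: print check_nrpe.cfg as it might look after the edit described above.
# ASSUMPTION: the stock file defines check_nrpe with "-a $ARG2$" and
# check_nrpe_1arg without it; verify against your own copy.
edited_check_nrpe_cfg() {
    cat <<'EOF'
# Original two-argument definition, commented out:
#define command {
#        command_name    check_nrpe
#        command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
#}

# Formerly check_nrpe_1arg: runs the remote command $ARG1$ with no arguments
define command {
        command_name    check_nrpe
        command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
EOF
}

edited_check_nrpe_cfg
```

With this change, a service that uses check_nrpe!check_users passes only the remote command name to the plugin.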

Then, restart Nagios:

/etc/init.d/nagios2 restart

Now you're ready to check it. Whatever command you're going to run, you need to have this command set up on the client side as shown above. Some of these are already defined in /etc/nagios/nrpe.cfg, but if the plugins weren't automatically installed for you (and you didn't apt-get install them), they might not work. I'm going to run the command check_users. To do this, issue

/usr/lib/nagios/plugins/check_nrpe -H <fully qualified host name> -c check_users

If it finishes correctly, you should see the number of users currently logged into the system.
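
To run the same sanity check against several clients at once, something like the sketch below may help. The function name check_nodes and the node names are hypothetical. It relies on the usual Nagios plugin convention that an exit status of 0 means OK and anything else means a warning, a critical state, or an error.

```shell
#!/bin/sh
# Sketch: run one NRPE command against several clients and report each result.
# CHECK_NRPE may be overridden; the default is the plugin path used above.
CHECK_NRPE="${CHECK_NRPE:-/usr/lib/nagios/plugins/check_nrpe}"

check_nodes() {
    remote_cmd="$1"; shift      # command name as defined on the clients
    not_ok=0
    for node in "$@"; do
        if output="$("$CHECK_NRPE" -H "$node" -c "$remote_cmd" 2>&1)"; then
            printf '%s: %s\n' "$node" "$output"
        else
            status=$?           # non-zero plugin exit status
            printf '%s: NOT OK (exit %s): %s\n' "$node" "$status" "$output"
            not_ok=$((not_ok + 1))
        fi
    done
    [ "$not_ok" -eq 0 ]         # succeed only if every node reported OK
}

# Example (hypothetical node names):
#   check_nodes check_users node01 node02 node03
```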

Possible Problems

There are a number of potential problems at this point. You'll see an error like this,

Connection refused by host

if you didn't put the correct IP address in the nrpe_local.cfg file on the client (see above), or if you did it correctly but forgot to restart NRPE on the client.

Another error,

NRPE: Command 'check_users' not defined

indicates that you didn't define the commands you wanted, again on the client, or you didn't restart NRPE on the client after you defined them.

Finally,

NRPE: Unable to read output

usually means that the path to the plugin to run is incorrect on the client. If you change it, remember to restart NRPE again.
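
For the last case, a small script run on the client can flag bad plugin paths before you go back to the host. This is only a sketch: the function name check_plugin_paths is invented here, and it assumes each command line has the form shown earlier, with the plugin path as the first word after the equals sign.

```shell
#!/bin/sh
# Sketch: for each command[...] line in an NRPE config file, report whether
# the plugin path (first word after "=") exists and is executable.
check_plugin_paths() {
    cfg="${1:-/etc/nagios/nrpe_local.cfg}"
    grep '^command\[' "$cfg" | while IFS= read -r line; do
        name="${line#command\[}"   # drop the leading "command["
        name="${name%%]*}"         # keep the text before the first "]"
        rest="${line#*=}"          # everything after the first "="
        path="${rest%% *}"         # first word: the plugin path
        if [ -x "$path" ]; then
            printf 'ok       %s -> %s\n' "$name" "$path"
        else
            printf 'MISSING  %s -> %s\n' "$name" "$path"
        fi
    done
}

# Example, run on a client:
#   check_plugin_paths /etc/nagios/nrpe_local.cfg
```

Any line marked MISSING points at a path to correct before restarting NRPE.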

Bringing it All Together

Once you get it working, you're finally ready to add the command to Nagios on the Nagios host. You'll do this by defining a service. Either in /etc/nagios2/conf.d/services_nagios2.cfg or in a file of your own creation in the same directory, add a section like this:

define service {
        hostgroup_name <hostgroup to monitor>
        service_description <short description>
        check_command check_nrpe!<command to run remotely>
        use generic-service
}
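
If you have several remote commands to add, a small generator can save some typing. The sketch below fills in the template above. The function name render_nrpe_service, the hostgroup compute-nodes, and the description are made-up examples; check_users is the command tested earlier.

```shell
#!/bin/sh
# Sketch: print a service definition that runs one remote NRPE command.
# ASSUMPTIONS: the hostgroup name and description below are placeholders.
render_nrpe_service() {
    hostgroup="$1"      # existing Nagios hostgroup to monitor
    description="$2"    # short description shown in the web interface
    remote_cmd="$3"     # command name defined on the clients
    cat <<EOF
define service {
        hostgroup_name          $hostgroup
        service_description     $description
        check_command           check_nrpe!$remote_cmd
        use                     generic-service
}
EOF
}

render_nrpe_service compute-nodes "Logged-in users" check_users
```

The output can be appended to a .cfg file in /etc/nagios2/conf.d/ once it looks right.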

Restart Nagios once more so it picks up the new service, then open your web interface and enjoy the new functionality!
