Creating custom active checks in Nagios

One of the most common areas where Nagios can be tailored to fit your needs is that of active checks. These are the checks that are scheduled and run by the Nagios daemon. This functionality is described in more detail in Chapter 2, Installing Nagios 4.

Nagios has a project that ships the commonly-used plugins and comes with a large variety of checks that can be performed. Before thinking of writing anything on your own, it is best to check for standard plugins (described in detail in Chapter 6, Using the Nagios Plugins).

Note

The Nagios Exchange (http://exchange.nagios.org) website contains many ready-to-use plugins for performing active checks. It is recommended that you check whether somebody has already written a plugin that meets your needs.

The reason for this is that even though active checks are quite easy to implement, sometimes a complete implementation that handles errors and command-line options parsing is not very easy to create. Typically, proper error handling can take a lot of time to implement. Another thing is that plugins that have already existed for some time have often been thoroughly tested by others. Typical errors will have already been identified and fixed; sometimes the plugins will have been tested in a larger environment, under a wider variety of conditions. Writing check plugins on your own should be preceded by an investigation to find out whether anybody has encountered and solved a similar problem.

Active check commands are very simple to implement. They simply require a plugin to return one or more lines of check output to the standard output stream and return one of the predefined exit codes—OK (code 0), WARNING (code 1), CRITICAL (code 2), or UNKNOWN (code 3). How active check plugins work is described in more detail at the beginning of Chapter 6, Using the Nagios Plugins.
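The contract described above can be sketched in a few lines of Node.js. This is only a minimal illustration, not part of any Nagios library: STATES, formatResult, and report are hypothetical helper names introduced here.

```javascript
// Exit codes that Nagios expects from an active check plugin.
var STATES = { OK: 0, WARNING: 1, CRITICAL: 2, UNKNOWN: 3 };

// Builds the single line of check output, for example
// "check_demo WARNING: disk almost full".
function formatResult(name, state, message) {
  return name + ' ' + state + ': ' + message;
}

// Prints the check output and exits with the matching code.
// Unrecognized state names are reported as UNKNOWN.
function report(name, state, message) {
  if (!Object.prototype.hasOwnProperty.call(STATES, state)) {
    state = 'UNKNOWN';
  }
  console.log(formatResult(name, state, message));
  process.exit(STATES[state]);
}
```

A plugin built this way would finish with a single call such as report('check_demo', 'OK', 'everything is fine').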

Testing MySQL database correctness

Let’s start with a simple plugin that performs a simple active check. It connects to a MySQL database and verifies whether the specified tables are structurally correct. It accepts the connection information from the command line as a series of arguments.

From a technical point of view, the check is quite simple—all that’s needed is to connect to a server, choose the database, and run the CHECK TABLE (http://dev.mysql.com/doc/refman/5.7/en/check-table.html) command over SQL.

The plugin uses the mysql driver for Node.js (https://github.com/mysqljs/mysql). The driver is available via npm, so the first thing to do is to define the package.json file as follows:

{ 
  "dependencies": { 
    "mysql": "^2.11.1" 
  } 
} 

The contents of this file are rather straightforward: they state all dependencies—in our case, it is the driver mentioned earlier. In order for npm to download the libraries, once package.json is created, run the following command:

# npm i

We will also need a working MySQL database that we can connect to for testing purposes. It is a good idea to install MySQL server on your local machine and set up a dummy database with tables for testing.

In order to set up a MySQL database server on Ubuntu Linux, install the mysql-server package as follows:

# apt-get install mysql-server

In Red Hat and Fedora Linux, the package is also called mysql-server, and the command to install it is:

# yum install mysql-server

After that, you will be able to connect to the database locally as root, either without a password or with the password supplied during the database installation.

If you do not have any other databases to run the script against, you can use mysql as the database name as this is a database that all instances of MySQL have.

The following is a sample script that performs the test. It should be saved as the index.js file, and needs to be run with the hostname, username, password, database name, and the list of tables to be checked as arguments. The table names should be separated by a comma:

var mysql      = require('mysql'); 
var args = process.argv.slice(2); 
var connection = mysql.createConnection({ 
  host     : args[0], 
  user     : args[1], 
  password : args[2], 
  database : args[3] 
}); 
 
var tables = args[4]; 
var errors = []; 
var count = 0; 
 
connection.connect(); 
 
tables = tables.split(',').map(function (string) {return string.trim();}); 
var queriesLeft = tables.length; 
 
var onResult = function (table, msg) { 
  if (msg === 'OK') { 
    count++; 
  } else { 
   errors.push(table.trim()); 
  } 
  if (--queriesLeft === 0) { 
    connection.end(); 
    if (errors.length === 0) { 
      console.log('check_mysql_table: OK', count, 'table(s) checked'); 
      process.exit(0); 
    } else { 
      console.log('check_mysql_table: CRITICAL: errors in', errors.join(', ')); 
      process.exit(2); 
    } 
  }  
}; 
 
tables.forEach(function (table) { 
  connection.query('CHECK TABLE ' + table.trim(), function(err, rows, fields) { 
    if (!err) { 
      onResult(table, rows[0].Msg_text); 
    } else { 
      console.log('Error while performing Query.', err); 
    } 
  }); 
}); 

The code consists of four parts: initializing, argument parsing, connecting, and checking each table. The first part loads the mysql driver. In the second part, the arguments passed by the user are mapped to variables, and a connection to the database is then made. If the connection succeeds, a CHECK TABLE command (http://dev.mysql.com/doc/refman/5.7/en/check-table.html) is run for each table specified on the command line. This makes MySQL verify that the table structure is correct.
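Two details of the table check can be made more robust. CHECK TABLE may return several rows for one table, with the final status in the last row, and a table name taken from the command line is safer when quoted as an identifier. The following is a minimal sketch under those assumptions; quoteIdentifier, buildCheckQuery, and isTableOk are hypothetical helpers, and the sample rows are made-up data shaped like a CHECK TABLE result. The mysql driver also provides its own escapeId function, which could be used instead of the hand-written quoting shown here.

```javascript
// Quotes a table name as a MySQL identifier so that unusual characters in
// the name cannot change the meaning of the statement.
function quoteIdentifier(name) {
  return '`' + String(name).replace(/`/g, '``') + '`';
}

// Builds the statement for one table.
function buildCheckQuery(table) {
  return 'CHECK TABLE ' + quoteIdentifier(table.trim());
}

// CHECK TABLE may return several rows per table; the last one carries the
// final status. Returns true when that status reports a healthy table.
function isTableOk(rows) {
  if (!rows || rows.length === 0) {
    return false;
  }
  var last = rows[rows.length - 1];
  return last.Msg_type === 'status' &&
    (last.Msg_text === 'OK' || last.Msg_text === 'Table is already up to date');
}

// Made-up sample rows: a warning followed by the final status row.
var sample = [
  { Table: 'databasename.tbl1', Op: 'check', Msg_type: 'warning',
    Msg_text: '1 client is using or hasn\'t closed the table properly' },
  { Table: 'databasename.tbl1', Op: 'check', Msg_type: 'status', Msg_text: 'OK' }
];
console.log(buildCheckQuery(' tbl1 '), '->', isTableOk(sample));
```

Looking at the last row rather than the first avoids reporting a table as broken only because MySQL emitted an informational row before the status.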

To use it, let’s run it by specifying the connection information, and tables tbl1, tbl2, and tbl3.

root@ubuntu:~# node index.js \
127.0.0.1 mysqluser secret1 databasename tbl1,tbl2,tbl3
check_mysql_table: OK 3 table(s) checked

As you can see, the script is short and already usable.

Monitoring local time against a time server

The next task is to create a check plugin that compares the local time with the time on a remote machine and issues a warning or critical state if the difference exceeds a specified number.

We’ll use npm’s ntp-client package (https://github.com/moonpyk/node-ntp-client) to communicate with remote machines. The package.json file should contain the following:

{ 
  "dependencies": { 
    "ntp-client": "^0.5.3" 
  } 
} 

The script will accept the host name and the warning and critical thresholds, each given as a number of seconds. The script will use these to decide on the exit status. It will also output the difference in seconds for informational purposes.

The following is a script to perform a check of the time on a remote machine:

var ntpClient = require('ntp-client'); 
 
var args = process.argv.slice(2); 
var host = args[0]; 
var warnDiff = args[1]; 
var critDiff = args[2]; 
 
ntpClient.getNetworkTime(host, 123, function(err, date) { 
    if(err) { 
        console.error(err); 
        return; 
    } 
    var states = ['OK', 'WARNING', 'CRITICAL']; 
    // getTime() returns milliseconds, so convert the difference to seconds. 
    var diff = Math.round(Math.abs(new Date().getTime() - date.getTime()) / 1000); 
    var i = diff < warnDiff ? 0 : (diff < critDiff ? 1 : 2); 
    console.log('check_time', states[i] + ':', diff, 'seconds difference'); 
    process.exit(i); 
}); 

This command is split into three parts: initializing, parsing arguments, and checking status. The first part loads the ntp-client module and the second maps the arguments to variables. After that, a connection to the remote host is made, the time on the remote machine is received, and this remote time is compared with the local time. Based on what the difference is, the command returns either a CRITICAL, WARNING, or OK status.
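The comparison step can be isolated from the network call and tried with fixed timestamps. The sketch below is illustrative only; offsetSeconds and classify are hypothetical helper names, and the timestamps are invented. Note that getTime() returns milliseconds, so the difference is converted to seconds before it is compared with the thresholds.

```javascript
var STATES = ['OK', 'WARNING', 'CRITICAL'];

// Returns the absolute difference between two timestamps in whole seconds.
// Both arguments are milliseconds since the epoch, as returned by getTime().
function offsetSeconds(localMs, remoteMs) {
  return Math.round(Math.abs(localMs - remoteMs) / 1000);
}

// Maps a difference in seconds to an exit code:
// 0 (OK), 1 (WARNING), or 2 (CRITICAL).
function classify(diffSeconds, warnSeconds, critSeconds) {
  if (diffSeconds < warnSeconds) { return 0; }
  if (diffSeconds < critSeconds) { return 1; }
  return 2;
}

// Invented timestamps: the local clock is 76 seconds ahead of the server.
var remote = new Date('2016-01-01T12:00:00Z').getTime();
var local = remote + 76000;
var diff = offsetSeconds(local, remote);
var code = classify(diff, 60, 120);
console.log('check_time', STATES[code] + ':', diff, 'seconds difference');
```

With thresholds of 60 and 120 seconds, a difference of 76 seconds falls between the two and maps to WARNING.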

And now let’s run it against a sample machine:

root@ubuntu:~# node index.js \
ntp2a.mcc.ac.uk 60 120
check_time WARNING: 76 seconds difference

As shown, the script works properly and returns a WARNING state as the difference is higher than 60 but lower than 120.

Writing plugins the right way

We have already created a few sample scripts, and they’re working. So it is possible to use them from Nagios. But these checks are very far from being complete. They lack error control, parsing, and argument verification.

It is recommended that you write all the commands in a more user-friendly way. The reason is that, in most cases, after some time, someone else will take over using and/or maintaining your custom check commands. You might also come back to your own code after a year of working on completely different things. In such cases, having a check command that is user friendly, has proper comments in the code, and allows debugging will save a lot of time. The standard Nagios plugins guidelines (available at https://nagios-plugins.org/doc/guidelines.html) document good practices for developers of the standard Nagios plugins package. While some parts may be specific to the C language, the guidelines are worth reading when developing in other languages as well.

The first thing that should be done is to provide proper handling of arguments. This means using a library such as yargs or node-getopt for Node.js (https://github.com/yargs/yargs, https://github.com/jiangmiao/node-getopt), the getopt package for Python (http://www.python.org/doc/2.5/lib/module-getopt.html), or the cmdline package for Tcl (http://tcllib.sourceforge.net/doc/cmdline.html) to parse the arguments. This way, functionality such as the --help parameter will work properly and in a more user-friendly way. The majority of programming languages provide such libraries, and it is always recommended that you use them.
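As an illustration of the same idea without a third-party dependency, recent Node.js releases include util.parseArgs. The sketch below uses it in place of the libraries named above, so it is an alternative rather than the approach used in this chapter. It assumes a Node.js version that provides parseArgs, and parseCheckArgs is a hypothetical helper name.

```javascript
var parseArgs = require('util').parseArgs;

// Option names mirror the MySQL check: every option is required, and
// --tables may be repeated to check several tables.
var REQUIRED = ['hostname', 'username', 'password', 'dbname', 'tables'];

// Parses the given argument list and throws when an option is unknown
// or a required option is missing.
function parseCheckArgs(args) {
  var parsed = parseArgs({
    args: args,
    options: {
      hostname: { type: 'string', short: 'h' },
      username: { type: 'string', short: 'u' },
      password: { type: 'string', short: 'p' },
      dbname:   { type: 'string', short: 'd' },
      tables:   { type: 'string', short: 't', multiple: true }
    }
  });
  var missing = REQUIRED.filter(function (name) {
    return parsed.values[name] === undefined;
  });
  if (missing.length > 0) {
    throw new Error('Missing required arguments: ' + missing.join(', '));
  }
  return parsed.values;
}

var options = parseCheckArgs(['-h', '127.0.0.1', '-u', 'mysqluser',
  '-p', 'secret1', '-d', 'databasename', '-t', 'tbl1', '-t', 'tbl2']);
console.log(options.hostname, options.tables.join(','));
```

In a real plugin, the thrown error would be caught, printed as the check output, and turned into an UNKNOWN (code 3) exit status.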

Another thing worth considering is proper error handling. If connectivity to a remote machine is not possible, the check command should exit with a critical or unknown status. In addition, all other pieces of the code should be wrapped to catch errors, and the reported status should depend on whether an error suggests a failure in the service being checked or a problem outside that service.
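One way to apply this distinction is to map each caught error to a status in a single place. The following is a minimal sketch, assuming that network-level failures point at the checked service and that everything else is a problem in the plugin or its environment. classifyError and CONNECTIVITY_CODES are hypothetical names, and the choice of CRITICAL versus UNKNOWN is a policy decision that may differ per site.

```javascript
// Error codes that Node.js reports when a remote host cannot be reached.
var CONNECTIVITY_CODES = ['ECONNREFUSED', 'ECONNRESET', 'ETIMEDOUT',
  'EHOSTUNREACH', 'ENOTFOUND'];

// Maps an error to the exit code, state name, and message to report.
function classifyError(err) {
  if (err && CONNECTIVITY_CODES.indexOf(err.code) !== -1) {
    // The service itself is unreachable: report a failure of the service.
    return { code: 2, state: 'CRITICAL',
      message: 'cannot reach service (' + err.code + ')' };
  }
  // Anything else is treated as a problem outside the checked service.
  var text = err && err.message ? err.message : String(err);
  return { code: 3, state: 'UNKNOWN', message: 'plugin error: ' + text };
}

var refused = new Error('connect ECONNREFUSED 127.0.0.1:3306');
refused.code = 'ECONNREFUSED';
console.log(classifyError(refused).state);
console.log(classifyError(new TypeError('tables is undefined')).state);
```

A callback that receives an error can then print the message and call process.exit with the returned code, which keeps the status policy out of the individual checks.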

Using the example of the first check plugin, we can redesign the beginning of the script to parse the arguments correctly. The reworked plugin defines all the required parameters, so, whenever any of them is missing, the usage information will be printed.

The following code extract shows the rewritten index.js script that uses yargs to parse arguments:

var mysql = require('mysql'); 
var argv = require('yargs') 
  .demand(['h', 'u', 'p', 'd', 't']) 
  .alias('h', 'hostname') 
  .alias('u', 'username') 
  .alias('p', 'password') 
  .alias('d', 'dbname') 
  .alias('t', 'tables') 
  .array('t') 
  .argv; 
 
var connection = mysql.createConnection({ 
  host     : argv.hostname, 
  user     : argv.username, 
  password : argv.password, 
  database : argv.dbname 
}); 
 
var errors = []; 
var count = 0; 
 
connection.connect(); 
var queriesLeft = argv.tables.length; 
 
var onResult = function (table, msg) { 
  if (msg === 'OK') { 
    count++; 
  } else { 
   errors.push(table.trim()); 
  } 
  if (--queriesLeft === 0) { 
    connection.end(); 
    if (errors.length === 0) { 
      console.log('check_mysql_table: OK', count, 'table(s) checked'); 
      process.exit(0); 
    } else { 
      console.log('check_mysql_table: CRITICAL: errors in', errors.join(', ')); 
      process.exit(2); 
    } 
  } 
}; 
 
argv.tables.forEach(function (table) { 
  connection.query('CHECK TABLE ' + table, function(err, rows, fields) { 
    if (!err) { 
      onResult(table, rows[0].Msg_text); 
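    } else { 
      // A failed query means the check could not be completed: report UNKNOWN 
      // instead of leaving the plugin waiting with the connection open. 
      console.log('check_mysql_table: UNKNOWN: error while performing query:', err.message); 
      process.exit(3); 
    } 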
  }); 
}); 

The dependencies must now include the yargs library:

{ 
  "dependencies": { 
    "mysql": "^2.11.1", 
    "yargs": "^4.8.0" 
  } 
} 

If we run our code without arguments, it will automatically print out usage information:

root@ubuntu:~# node index.js
Options:
  -h, --hostname                                       [required]
  -u, --username                                       [required]
  -p, --password                                       [required]
  -d, --dbname                                         [required]
  -t, --tables                                 [array] [required]
Missing required arguments: h, u, p, d, t

As another example, we can update our time checking code as follows:

var ntpClient = require('ntp-client'); 
 
var argv = require('yargs') 
  .help('H') 
  .alias('H', 'help') 
  .options({ 
    h: { 
      alias: 'hostname', 
      describe: 'NTP server', 
      default: 'ntp2a.mcc.ac.uk', 
      nargs: 1 
    }, 
    w: { 
      alias: 'warning', 
      describe: 'positive number of seconds', 
      default: '300', 
      type: 'number', 
      nargs: 1 
    }, 
    c: { 
      alias: 'critical', 
      describe: 'positive number of seconds', 
      default: '600', 
      type: 'number', 
      nargs: 1 
    } 
  }) 
  .argv; 
 
['warning', 'critical'].forEach(function (param) { 
  // The negated comparison also rejects values that are not numbers (NaN). 
  if (!(argv[param] > 0)) { 
    console.log('Invalid', param, 'time specified'); 
    process.exit(3); 
  } 
}); 
 
ntpClient.getNetworkTime(argv.hostname, 123, function(err, date) { 
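    if(err) { 
        // Without a reply the time cannot be compared, so report UNKNOWN. 
        console.log('check_time UNKNOWN:', String(err.message || err)); 
        process.exit(3); 
    } 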
    var states = ['OK', 'WARNING', 'CRITICAL']; 
    // getTime() returns milliseconds, so convert the difference to seconds. 
    var diff = Math.round(Math.abs(new Date().getTime() - date.getTime()) / 1000); 
    var i = diff < argv.warning ? 0 : (diff < argv.critical ? 1 : 2); 
    console.log('check_time', states[i] + ':', diff, 'seconds difference'); 
    process.exit(i); 
});  

In this case, we use yargs more deeply, not only to define arguments but also to specify their default values along with the types for numeric values. As a result, we not only have better value checking but the help information is much more descriptive.

root@ubuntu:~# node index.js -H
Options:
  -H, --help      Show help                                            [boolean]
  -h, --hostname  NTP server              [default: "ntp2a.mcc.ac.uk"]
  -w, --warning   positive number of seconds   [number] [default: "300"]
  -c, --critical  positive number of seconds   [number] [default: "600"]

The code also verifies that the passed numbers are positive and prints out the relevant message if any of them is not:

root@ubuntu:~# node index.js -w -10
Invalid warning time specified
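The threshold validation can be taken one step further. The sketch below is a hypothetical validateThresholds helper, not part of yargs. It rejects values that are not positive numbers and, as an extra assumption beyond the original check, also rejects a warning threshold that is higher than the critical one.

```javascript
// Returns an error message for unusable thresholds, or null when both
// values are valid. Values may be numbers or numeric strings.
function validateThresholds(warning, critical) {
  var values = { warning: warning, critical: critical };
  var names = Object.keys(values);
  for (var i = 0; i < names.length; i++) {
    var value = Number(values[names[i]]);
    // The negated comparison also rejects NaN, for example from "-w abc".
    if (!(value > 0)) {
      return 'Invalid ' + names[i] + ' time specified';
    }
  }
  if (Number(warning) > Number(critical)) {
    return 'warning threshold must not exceed critical threshold';
  }
  return null;
}

console.log(validateThresholds(-10, 600));
console.log(validateThresholds(300, 600));
```

A plugin would print any returned message and exit with the UNKNOWN status (code 3), as the code shown earlier does for non-positive values.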

Of course, the changes mentioned here are just small examples of how plugins should be written. It’s not possible to cover all the possible aspects of what plugins should take into account. It’s your responsibility as the command’s author to make sure that all scenarios are covered in your plugin.

Typically, this means correct error handling—usually related to catching all of the exceptions that the underlying functions might throw. There are also additional things to take into account. For example, if you are writing a networked plugin, the remote server can return error messages that also need to be handled properly.

An important thing worth considering is handling timeouts properly.

Usually, a plugin tries to connect in the background, and if the connection is not established within a specified period of time, the plugin will abandon the check and report an error status. This is usually done through the use of child threads or child processes. In languages that are event driven, this can be done by scheduling an event that exits with a timeout message after a specified time interval.
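In Node.js, which is event driven, the scheduled event can be a timer created with setTimeout. The following is a minimal sketch of that approach; runWithTimeout is a hypothetical helper, and reporting a timeout as UNKNOWN is an assumption (some sites prefer CRITICAL).

```javascript
// Exit codes expected by Nagios.
var OK = 0, UNKNOWN = 3;

// Runs check(callback) and guarantees that done(exitCode, message) is
// called exactly once, either with the check result or after timeoutMs.
function runWithTimeout(check, timeoutMs, done) {
  var finished = false;

  function finish(code, message) {
    if (finished) { return; }      // ignore results that arrive too late
    finished = true;
    clearTimeout(timer);           // stop the timer so the process can end
    done(code, message);
  }

  // Scheduled event that fires only if the check has not finished in time.
  var timer = setTimeout(function () {
    finish(UNKNOWN, 'check timed out after ' + timeoutMs + ' ms');
  }, timeoutMs);

  check(function (err, message) {
    if (err) {
      finish(UNKNOWN, err.message);
    } else {
      finish(OK, message);
    }
  });
}

// Demonstration with a check that completes immediately.
runWithTimeout(function (callback) {
  callback(null, 'check_demo OK: finished in time');
}, 10000, function (code, message) {
  console.log(message);            // a real plugin would call process.exit(code)
});
```

In a real plugin, the check function would start the network operation and the final callback would print the message and call process.exit with the code, so a hung connection can never keep the plugin running past the timeout.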
