-rw-r--r-- | doc/README | 3 |
-rw-r--r-- | doc/developer-guidelines.html | 931 |
-rw-r--r-- | doc/developer-guidelines.sgml | 483 |
3 files changed, 1417 insertions, 0 deletions
diff --git a/doc/README b/doc/README new file mode 100644 index 00000000..388bc1d7 --- /dev/null +++ b/doc/README @@ -0,0 +1,3 @@ +The developer documentation here is generated from the DocBook format. + + diff --git a/doc/developer-guidelines.html b/doc/developer-guidelines.html new file mode 100644 index 00000000..efac605f --- /dev/null +++ b/doc/developer-guidelines.html @@ -0,0 +1,931 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> +<HTML +><HEAD +><TITLE +>Nagios plug-in development guidelines</TITLE +><META +NAME="GENERATOR" +CONTENT="Modular DocBook HTML Stylesheet Version 1.64 +"></HEAD +><BODY +CLASS="BOOK" +BGCOLOR="#FFFFFF" +TEXT="#000000" +LINK="#0000FF" +VLINK="#840084" +ALINK="#0000FF" +><DIV +CLASS="BOOK" +><A +NAME="AEN1" +></A +><DIV +CLASS="TITLEPAGE" +><H1 +CLASS="TITLE" +><A +NAME="AEN3" +>Nagios plug-in development guidelines</A +></H1 +><H3 +CLASS="AUTHOR" +><A +NAME="AEN5" +>Karl DeBisschop</A +></H3 +><DIV +CLASS="AFFILIATION" +><DIV +CLASS="ADDRESS" +><P +CLASS="ADDRESS" +>karl@debisschop.net</P +></DIV +></DIV +><H3 +CLASS="AUTHOR" +><A +NAME="AEN11" +>Ethan Galstad</A +></H3 +><DIV +CLASS="AFFILIATION" +><DIV +CLASS="ADDRESS" +><P +CLASS="ADDRESS" +>netsaint@linuxbox.com</P +></DIV +></DIV +><H3 +CLASS="AUTHOR" +><A +NAME="AEN21" +>Hugo Gayosso</A +></H3 +><DIV +CLASS="AFFILIATION" +><DIV +CLASS="ADDRESS" +><P +CLASS="ADDRESS" +>hgayosso@gnu.org</P +></DIV +></DIV +><H3 +CLASS="AUTHOR" +><A +NAME="AEN27" +>Subhendu Ghosh</A +></H3 +><DIV +CLASS="AFFILIATION" +><DIV +CLASS="ADDRESS" +><P +CLASS="ADDRESS" +>sghosh@sourceforge.net</P +></DIV +></DIV +><H3 +CLASS="AUTHOR" +><A +NAME="AEN33" +>Stanley Hopcroft</A +></H3 +><DIV +CLASS="AFFILIATION" +><DIV +CLASS="ADDRESS" +><P +CLASS="ADDRESS" +>stanleyhopcroft@sourceforge.net</P +></DIV +></DIV +><P +CLASS="COPYRIGHT" +>Copyright © 2000 2001 2002 by Karl DeBisschop, Ethan Galstad, + Hugo Gayosso, Stanley Hopcroft, Subhendu Ghosh</P +><HR></DIV +><DIV +CLASS="TOC" +><DL +><DT +><B 
+>Table of Contents</B +></DT +><DT +><A +HREF="#PREFACE" +>About the guidelines</A +></DT +><DD +><DL +><DT +><A +HREF="#AEN51" +>Copyright</A +></DT +></DL +></DD +><DT +><A +HREF="#AEN56" +></A +></DT +><DD +><DL +><DT +><A +HREF="#PLUGOUTPUT" +>Plugin Output for Nagios</A +></DT +><DD +><DL +><DT +><A +HREF="#AEN60" +>Print only one line of text</A +></DT +><DT +><A +HREF="#AEN63" +>Screen Output</A +></DT +><DT +><A +HREF="#AEN67" +>Return the proper status code</A +></DT +><DT +><A +HREF="#AEN71" +>Plugin Return Codes</A +></DT +></DL +></DD +><DT +><A +HREF="#SYSCMDAUXFILES" +>System Commands and Auxiliary Files</A +></DT +><DD +><DL +><DT +><A +HREF="#AEN117" +>Don't execute system commands without specifying their + full path</A +></DT +><DT +><A +HREF="#AEN121" +>Use spopen() if external commands must be executed</A +></DT +><DT +><A +HREF="#AEN125" +>Don't make temp files unless absolutely required</A +></DT +><DT +><A +HREF="#AEN128" +>Don't be tricked into following symlinks</A +></DT +><DT +><A +HREF="#AEN131" +>Validate all input</A +></DT +></DL +></DD +><DT +><A +HREF="#PERLPLUGIN" +>Perl Plugins</A +></DT +><DT +><A +HREF="#RUNTIME" +>Runtime Timeouts</A +></DT +><DD +><DL +><DT +><A +HREF="#AEN165" +>Use DEFAULT_SOCKET_TIMEOUT</A +></DT +><DT +><A +HREF="#AEN168" +>Add alarms to network plugins</A +></DT +></DL +></DD +><DT +><A +HREF="#PLUGOPTIONS" +>Plugin Options</A +></DT +><DD +><DL +><DT +><A +HREF="#AEN174" +>Option Processing</A +></DT +><DT +><A +HREF="#AEN187" +>Plugins with more than one type of threshold, or with + threshold ranges</A +></DT +></DL +></DD +><DT +><A +HREF="#SUBMITTINGCHANGES" +>New submissions and patches</A +></DT +></DL +></DD +></DL +></DIV +><DIV +CLASS="PREFACE" +><HR><H1 +><A +NAME="PREFACE" +>About the guidelines</A +></H1 +><P +>The purpose of these guidelines is to provide a reference for + plug-in developers and to encourage the standardization of the + different kinds of plug-ins: C, shell, perl, python, 
etc.</P +><DIV +CLASS="SECTION" +><HR><H1 +CLASS="SECTION" +><A +NAME="AEN51" +>Copyright</A +></H1 +><P +>Nagios Plug-in Development Guidelines Copyright (C) 2000 2001 + 2002 + Karl DeBisschop, Ethan Galstad, Hugo Gayosso, Stanley Hopcroft, + Subhendu Ghosh</P +><P +>Permission is granted to make and distribute verbatim + copies of this manual provided the copyright notice and this + permission notice are preserved on all copies.</P +><P +>The plugins themselves are copyrighted by their respective + authors.</P +></DIV +></DIV +><DIV +CLASS="ARTICLE" +><DIV +CLASS="TOC" +><DL +><DT +><B +>Table of Contents</B +></DT +><DT +><A +HREF="#PLUGOUTPUT" +>Plugin Output for Nagios</A +></DT +><DT +><A +HREF="#SYSCMDAUXFILES" +>System Commands and Auxiliary Files</A +></DT +><DT +><A +HREF="#PERLPLUGIN" +>Perl Plugins</A +></DT +><DT +><A +HREF="#RUNTIME" +>Runtime Timeouts</A +></DT +><DT +><A +HREF="#PLUGOPTIONS" +>Plugin Options</A +></DT +><DT +><A +HREF="#SUBMITTINGCHANGES" +>New submissions and patches</A +></DT +></DL +></DIV +><DIV +CLASS="SECTION" +><H1 +CLASS="SECTION" +><A +NAME="PLUGOUTPUT" +>Plugin Output for Nagios</A +></H1 +><P +>You should always print something to STDOUT that tells if the + service is working or why it's failing. Try to keep the output short - + probably less than 80 characters. Remember that you ideally would like + the entire output to appear in a pager message, which will get chopped + off after a certain length.</P +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN60" +>Print only one line of text</A +></H2 +><P +>Nagios will only grab the first line of text from STDOUT + when it notifies contacts about potential problems. If you print + multiple lines, you're out of luck. Remember, keep it short and + to the point.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN63" +>Screen Output</A +></H2 +><P +>The plug-in should print the diagnostic and just the + synopsis part of the help message. 
A well written plugin would + then have --help as a way to get the verbose help.</P +><P +>Code and output should try to respect the 80x25 size of a + crt (remember when fixing stuff in the server room!)</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN67" +>Return the proper status code</A +></H2 +><P +>See <A +HREF="#RETURNCODES" +>Table 1 in the section called <I +>Plugin Return Codes</I +></A +> below + for the numeric values of status codes and their + description. Remember to return an UNKNOWN state if bogus or + invalid command line arguments are supplied or if you are unable + to check the service.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN71" +>Plugin Return Codes</A +></H2 +><P +>The return codes below are based on the POSIX spec of returning + a positive value. Netsaint prior to v0.0.7 supported a non-POSIX + compliant return code of "-1" for unknown. Nagios supports POSIX return + codes by default.</P +><P +>Note: Some plugins will on occasion print on STDOUT that an error + occurred and error code is 138 or 255 or some such number. These + are usually caused by plugins using system commands and not having + enough checks to catch unexpected output. Developers should include a + default catch-all for system command output that returns an UNKNOWN + return code.</P +><DIV +CLASS="TABLE" +><A +NAME="RETURNCODES" +></A +><P +><B +>Table 1. 
Plugin Return Codes</B +></P +><TABLE +BORDER="1" +BGCOLOR="#E0E0E0" +CELLSPACING="0" +CELLPADDING="4" +CLASS="CALSTABLE" +><THEAD +><TR +><TH +ALIGN="LEFT" +VALIGN="TOP" +><P +>Numeric Value</P +></TH +><TH +ALIGN="LEFT" +VALIGN="TOP" +><P +>Service Status</P +></TH +><TH +ALIGN="LEFT" +VALIGN="TOP" +><P +>Status Description</P +></TH +></TR +></THEAD +><TBODY +><TR +><TD +ALIGN="CENTER" +VALIGN="TOP" +><P +>0</P +></TD +><TD +ALIGN="LEFT" +VALIGN="MIDDLE" +><P +>OK</P +></TD +><TD +ALIGN="LEFT" +VALIGN="TOP" +><P +>The plugin was able to check the service and it + appeared to be functioning properly</P +></TD +></TR +><TR +><TD +ALIGN="CENTER" +VALIGN="TOP" +><P +>1</P +></TD +><TD +ALIGN="LEFT" +VALIGN="MIDDLE" +><P +>Warning</P +></TD +><TD +ALIGN="LEFT" +VALIGN="TOP" +><P +>The plugin was able to check the service, but it + appeared to be above some "warning" threshold or did not appear + to be working properly</P +></TD +></TR +><TR +><TD +ALIGN="CENTER" +VALIGN="TOP" +><P +>2</P +></TD +><TD +ALIGN="LEFT" +VALIGN="MIDDLE" +><P +>Critical</P +></TD +><TD +ALIGN="LEFT" +VALIGN="TOP" +><P +>The plugin detected that either the service was not + running or it was above some "critical" threshold</P +></TD +></TR +><TR +><TD +ALIGN="CENTER" +VALIGN="TOP" +><P +>3</P +></TD +><TD +ALIGN="LEFT" +VALIGN="MIDDLE" +><P +>Unknown</P +></TD +><TD +ALIGN="LEFT" +VALIGN="TOP" +><P +>Invalid command line arguments were supplied to the + plugin or the plugin was unable to check the status of the given + hosts/service</P +></TD +></TR +></TBODY +></TABLE +></DIV +></DIV +></DIV +><DIV +CLASS="SECTION" +><HR><H1 +CLASS="SECTION" +><A +NAME="SYSCMDAUXFILES" +>System Commands and Auxiliary Files</A +></H1 +><DIV +CLASS="SECTION" +><H2 +CLASS="SECTION" +><A +NAME="AEN117" +>Don't execute system commands without specifying their + full path</A +></H2 +><P +>Don't use exec(), popen(), etc. 
to execute external + commands without explicitly using the full path of the external + program.</P +><P +>Doing otherwise makes the plugin vulnerable to hijacking + by a trojan horse earlier in the search path. See the main + plugin distribution for examples on how this is done.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN121" +>Use spopen() if external commands must be executed</A +></H2 +><P +>If you have to execute external commands from within your + plugin and you're writing it in C, use the spopen() function + that Karl DeBisschop has written.</P +><P +>The code for spopen() and spclose() is included with the + core plugin distribution.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN125" +>Don't make temp files unless absolutely required</A +></H2 +><P +>If temp files are needed, make sure that the plugin will + fail cleanly if the file can't be written (e.g., too few file + handles, out of disk space, incorrect permissions, etc.) and + delete the temp file when processing is complete.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN128" +>Don't be tricked into following symlinks</A +></H2 +><P +>If your plugin opens any files, take steps to ensure that + you are not following a symlink to another location on the + system.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN131" +>Validate all input</A +></H2 +><P +>Use the routines in utils.c or utils.pm and write more as needed.</P +></DIV +></DIV +><DIV +CLASS="SECTION" +><HR><H1 +CLASS="SECTION" +><A +NAME="PERLPLUGIN" +>Perl Plugins</A +></H1 +><P +>Perl plugins are coded a little more defensively than other + plugins because of embedded Perl. When configured as such, embedded + Perl Nagios (ePN) requires stricter use of some of Perl's features. 
+ This section outlines some of the steps needed to use ePN + effectively.</P +><P +></P +><OL +TYPE="1" +><LI +><P +> Do not use BEGIN and END blocks since, with Embedded Perl (ePN), they will be called + only the first time the plugin runs and when Nagios shuts down. In + particular, do not use BEGIN blocks to initialize variables.</P +></LI +><LI +><P +>To use utils.pm, you need to provide a full path to the + module in order for it to work with ePN.</P +><P +CLASS="LITERALLAYOUT" +> e.g.<br> + use lib "/usr/local/nagios/libexec";<br> + use utils qw(...);<br> + </P +></LI +><LI +><P +>Perl scripts should be called with "-w"</P +></LI +><LI +><P +>All Perl plugins must compile cleanly under "use strict" - i.e. at + least use explicit package names, as in "$main::x", or predeclare every + variable. </P +><P +>Explicitly initialize each variable in use. Otherwise, with + caching enabled, the plugin will not be recompiled each time, and + therefore Perl will not reinitialize all the variables. All old + variable values will still be in effect.</P +></LI +><LI +><P +>Do not use < DATA > (these simply do not compile under ePN).</P +></LI +><LI +><P +>Do not use named subroutines</P +></LI +><LI +><P +>If writing to a file (perhaps recording + performance data) explicitly close it. The plugin never + calls <I +CLASS="EMPHASIS" +>exit</I +>; that is caught by + p1.pl, so output streams are never closed.</P +></LI +><LI +><P +>As in <A +HREF="#RUNTIME" +>the section called <I +>Runtime Timeouts</I +></A +> all plugins need + to monitor their runtime, especially if they are using network + resources. Use of the <I +CLASS="EMPHASIS" +>alarm</I +> is recommended. + Plugins may import a default time out ($TIMEOUT) from utils.pm. 
+ </P +></LI +><LI +><P +>Perl plugins should import %ERRORS from utils.pm + and then "exit $ERRORS{'OK'}" rather than "exit 0" + </P +></LI +></OL +></DIV +><DIV +CLASS="SECTION" +><HR><H1 +CLASS="SECTION" +><A +NAME="RUNTIME" +>Runtime Timeouts</A +></H1 +><P +>Plugins have a very limited runtime - typically 10 sec. + As a result, it is very important for plugins to maintain internal + code to exit if runtime exceeds a threshold. </P +><P +>All plugins should timeout gracefully, not just networking + plugins. For instance, df may lock if you have automounted + drives and your network fails - but on first glance, who'd think + df could lock up like that. Plus, it should just be more error + resistant to be able to time out rather than consume + resources.</P +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN165" +>Use DEFAULT_SOCKET_TIMEOUT</A +></H2 +><P +>All network plugins should use DEFAULT_SOCKET_TIMEOUT to timeout</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN168" +>Add alarms to network plugins</A +></H2 +><P +>If you write a plugin which communicates with another + networked host, you should make sure to set an alarm() in your + code that prevents the plugin from hanging due to abnormal + socket closures, etc. Nagios takes steps to protect itself + against unruly plugins that timeout, but any plugins you create + should be well behaved on their own.</P +></DIV +></DIV +><DIV +CLASS="SECTION" +><HR><H1 +CLASS="SECTION" +><A +NAME="PLUGOPTIONS" +>Plugin Options</A +></H1 +><P +>A well written plugin should have --help as a way to get + verbose help. Code and output should try to respect the 80x25 size of a + crt (remember when fixing stuff in the server room!)</P +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN174" +>Option Processing</A +></H2 +><P +>For plugins written in C, we recommend the C standard + getopt library for short options. 
If using getopt_long, check to + be sure that HAVE_GETOPT_H is defined (configure checks this and + sets the #define in common/config.h).</P +><P +>For plugins written in Perl, we recommend the Getopt::Long module.</P +><P +>Positional arguments are strongly discouraged.</P +><P +>There are a few reserved options that should not be used + for other purposes:</P +><P +CLASS="LITERALLAYOUT" +> -V version (--version)<br> + -h help (--help)<br> + -t timeout (--timeout)<br> + -w warning threshold (--warning)<br> + -c critical threshold (--critical)<br> + -H hostname (--hostname)<br> + </P +><P +>In addition to the reserved options above, some other standard options are:</P +><P +CLASS="LITERALLAYOUT" +> -C SNMP community (--community)<br> + -a authentication password (--authentication)<br> + -l login name (--logname)<br> + -p port or password (--port or --passwd/--password)<br> + -u url or username (--url or --username)<br> + </P +><P +>Look at check_pgsql and check_procs to see how I currently + think this can work. Standard options are:</P +><P +>The option -V or --version should be present in all + plugins. For C plugins it should result in a call to print_revision, a + function in utils.c which takes two character arguments, the + command name and the plugin revision.</P +><P +>The -? option, or any other unparsable set of options, + should print out a short usage statement. Character width should + be 80 or less and no more than 23 lines should be printed (it + should display cleanly on a dumb terminal in a server + room).</P +><P +>The option -h or --help should be present in all plugins. + In C plugins, it should result in a call to print_help (or + equivalent). The function print_help should call print_revision, + then print_usage, then should provide detailed + help. 
Help text should fit on an 80-character wide display, but + may run as many lines as needed.</P +></DIV +><DIV +CLASS="SECTION" +><HR><H2 +CLASS="SECTION" +><A +NAME="AEN187" +>Plugins with more than one type of threshold, or with + threshold ranges</A +></H2 +><P +>Old style was to do things like -ct for critical time and + -cv for critical value. That goes out the window with POSIX + getopt. The allowable alternatives are:</P +><P +></P +><OL +TYPE="1" +><LI +><P +>long options like -critical-time (or -ct and -cv, I + suppose).</P +></LI +><LI +><P +>repeated options like `check_load -w 10 -w 6 -w 4 -c + 16 -c 10 -c 10`</P +></LI +><LI +><P +>for brevity, the above can be expressed as `check_load + -w 10,6,4 -c 16,10,10`</P +></LI +><LI +><P +>ranges are expressed with colons as in `check_procs -C + httpd -w 1:20 -c 1:30` which will warn above 20 instances, + and go critical at 0 and above 30</P +></LI +><LI +><P +>lists are expressed with commas, so Jacob's check_nmap + uses constructs like '-p 1000,1010,1050:1060,2000'</P +></LI +><LI +><P +>If possible when writing lists, use tokens to make the + list easy to remember and non-order dependent - so + check_disk uses '-c 10000,10%' so that it is clear which is + the percentage and which is the KB value (note that due to + my own lack of foresight, that used to be '-c 10000:10%' but + such constructs should all be changed for consistency, + though providing backward compatibility is fairly + easy).</P +></LI +></OL +><P +>As always, comments are welcome - making this consistent + without a host of long options was quite a hassle, and I would + suspect that there are flaws in this strategy. 
Perhaps clear + long options are the most important of the above choices, but not + all POSIX systems have C libraries for long options, so the + short forms must exist as well.</P +></DIV +></DIV +><DIV +CLASS="SECTION" +><HR><H1 +CLASS="SECTION" +><A +NAME="SUBMITTINGCHANGES" +>New submissions and patches</A +></H1 +><P +>If you would like others to use your plugins and have them included in + the standard distribution, please include patches for the relevant + configuration files, in particular "configure.in". Otherwise submitted + plugins will be included in the contrib directory.</P +><P +>Plugins in the contrib directory are going to be migrated to the + standard plugins/plugin-scripts directory as time permits and per user + requests.</P +><P +>Patches should be submitted via SourceForge and be announced to + the mailing list.</P +><P +>For new plugins, provide a diff to add to the EXTRAS list (configure.in) + unless you are fairly sure that the plugin will work for all platforms with + no non-standard software added.</P +><P +>If possible please submit a test harness. Documentation on sample + tests coming soon.</P +></DIV +></DIV +></DIV +></BODY +></HTML +>
\ No newline at end of file diff --git a/doc/developer-guidelines.sgml b/doc/developer-guidelines.sgml new file mode 100644 index 00000000..42ad8964 --- /dev/null +++ b/doc/developer-guidelines.sgml @@ -0,0 +1,483 @@ +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN"> +<book> + <title>Nagios Plug-in Developer Guidelines</title> + + <bookinfo> + <authorgroup> + <author> + <firstname>Karl</firstname> + <surname>DeBisschop</surname> + <affiliation> + <address><email>karl@debisschop.net</email></address> + </affiliation> + </author> + + <author> + <firstname>Ethan</firstname> + <surname>Galstad</surname> + <authorblurb> + <para>Author of Nagios</para> + <para><ulink url="http://www.nagios.org"></ulink></para> + </authorblurb> + <affiliation> + <address><email>netsaint@linuxbox.com</email></address> + </affiliation> + </author> + + <author> + <firstname>Hugo</firstname> + <surname>Gayosso</surname> + <affiliation> + <address><email>hgayosso@gnu.org</email></address> + </affiliation> + </author> + + + <author> + <firstname>Subhendu</firstname> + <surname>Ghosh</surname> + <affiliation> + <address><email>sghosh@sourceforge.net</email></address> + </affiliation> + </author> + + <author> + <firstname>Stanley</firstname> + <surname>Hopcroft</surname> + <affiliation> + <address><email>stanleyhopcroft@sourceforge.net</email></address> + </affiliation> + </author> + + </authorgroup> + + <pubdate>2002</pubdate> + <title>Nagios plug-in development guidelines</title> + + <revhistory> + <revision> + <revnumber>0.4</revnumber> + <date>2 May 2002</date> + </revision> + </revhistory> + + <copyright> + <year>2000 2001 2002</year> + <holder>Karl DeBisschop, Ethan Galstad, + Hugo Gayosso, Stanley Hopcroft, Subhendu Ghosh</holder> + </copyright> + +</bookinfo> + + + <preface id=preface> + <title>About the guidelines</title> + + <para>The purpose of these guidelines is to provide a reference for + plug-in developers and to encourage the standardization of the + different kinds of 
plug-ins: C, shell, perl, python, etc.</para> + + + <section> <title>Copyright</title> + + <para>Nagios Plug-in Development Guidelines Copyright (C) 2000 2001 + 2002 + Karl DeBisschop, Ethan Galstad, Hugo Gayosso, Stanley Hopcroft, + Subhendu Ghosh</para> + + <para>Permission is granted to make and distribute verbatim + copies of this manual provided the copyright notice and this + permission notice are preserved on all copies.</para> + + <para>The plugins themselves are copyrighted by their respective + authors.</para> + + </section> +</preface> + +<article> +<section id="PlugOutput"><title>Plugin Output for Nagios</title> + + <para>You should always print something to STDOUT that tells if the + service is working or why it's failing. Try to keep the output short - + probably less than 80 characters. Remember that you ideally would like + the entire output to appear in a pager message, which will get chopped + off after a certain length.</para> + + <section><title>Print only one line of text</title> + <para>Nagios will only grab the first line of text from STDOUT + when it notifies contacts about potential problems. If you print + multiple lines, you're out of luck. Remember, keep it short and + to the point.</para> + </section> + + <section><title>Screen Output</title> + <para>The plug-in should print the diagnostic and just the + synopsis part of the help message. A well written plugin would + then have --help as a way to get the verbose help.</para> + <para>Code and output should try to respect the 80x25 size of a + crt (remember when fixing stuff in the server room!)</para> + </section> + + <section><title>Return the proper status code</title> + <para>See <xref linkend="ReturnCodes"> below + for the numeric values of status codes and their + description. 
Remember to return an UNKNOWN state if bogus or + invalid command line arguments are supplied or if you are unable + to check the service.</para> + </section> + + <section><title>Plugin Return Codes</title> + <para>The return codes below are based on the POSIX spec of returning + a positive value. Netsaint prior to v0.0.7 supported a non-POSIX + compliant return code of "-1" for unknown. Nagios supports POSIX return + codes by default.</para> + + <para>Note: Some plugins will on occasion print on STDOUT that an error + occurred and error code is 138 or 255 or some such number. These + are usually caused by plugins using system commands and not having + enough checks to catch unexpected output. Developers should include a + default catch-all for system command output that returns an UNKNOWN + return code.</para> + + <table id="ReturnCodes"><title>Plugin Return Codes</title> + <tgroup cols="3"> + <thead> + <row> + <entry><para>Numeric Value</para></entry> + <entry><para>Service Status</para></entry> + <entry><para>Status Description</para></entry> + </row> + </thead> + <tbody> + <row> + <entry align=center><para>0</para></entry> + <entry valign=middle><para>OK</para></entry> + <entry><para>The plugin was able to check the service and it + appeared to be functioning properly</para></entry> + </row> + <row> + <entry align=center><para>1</para></entry> + <entry valign=middle><para>Warning</para></entry> + <entry><para>The plugin was able to check the service, but it + appeared to be above some "warning" threshold or did not appear + to be working properly</para></entry> + </row> + <row> + <entry align=center><para>2</para></entry> + <entry valign=middle><para>Critical</para></entry> + <entry><para>The plugin detected that either the service was not + running or it was above some "critical" threshold</para></entry> + </row> + <row> + <entry align=center><para>3</para></entry> + <entry valign=middle><para>Unknown</para></entry> + <entry><para>Invalid command line arguments 
were supplied to the + plugin or the plugin was unable to check the status of the given + hosts/service</para></entry> + </row> + </tbody> + </tgroup> + </table> + + + </section> + + +</section> + +<section id="SysCmdAuxFiles"><title>System Commands and Auxiliary Files</title> + + <section><title>Don't execute system commands without specifying their + full path</title> + <para>Don't use exec(), popen(), etc. to execute external + commands without explicitly using the full path of the external + program.</para> + + <para>Doing otherwise makes the plugin vulnerable to hijacking + by a trojan horse earlier in the search path. See the main + plugin distribution for examples on how this is done.</para> + </section> + + <section><title>Use spopen() if external commands must be executed</title> + + <para>If you have to execute external commands from within your + plugin and you're writing it in C, use the spopen() function + that Karl DeBisschop has written.</para> + + <para>The code for spopen() and spclose() is included with the + core plugin distribution.</para> + </section> + + <section><title>Don't make temp files unless absolutely required</title> + + <para>If temp files are needed, make sure that the plugin will + fail cleanly if the file can't be written (e.g., too few file + handles, out of disk space, incorrect permissions, etc.) and + delete the temp file when processing is complete.</para> + </section> + + <section><title>Don't be tricked into following symlinks</title> + + <para>If your plugin opens any files, take steps to ensure that + you are not following a symlink to another location on the + system.</para> + </section> + + <section><title>Validate all input</title> + + <para>Use the routines in utils.c or utils.pm and write more as needed.</para> + </section> + +</section> + + + + +<section id="PerlPlugin"><title>Perl Plugins</title> + + <para>Perl plugins are coded a little more defensively than other + plugins because of embedded Perl. 
When configured as such, embedded + Perl Nagios (ePN) requires stricter use of some of Perl's features. + This section outlines some of the steps needed to use ePN + effectively.</para> + + <orderedlist> + + <listitem><para> Do not use BEGIN and END blocks since, with Embedded Perl (ePN), they will be called + only the first time the plugin runs and when Nagios shuts down. In + particular, do not use BEGIN blocks to initialize variables.</para> + </listitem> + + <listitem><para>To use utils.pm, you need to provide a full path to the + module in order for it to work with ePN.</para> + + <literallayout> + e.g. + use lib "/usr/local/nagios/libexec"; + use utils qw(...); + </literallayout> + </listitem> + + <listitem><para>Perl scripts should be called with "-w"</para> + </listitem> + + <listitem><para>All Perl plugins must compile cleanly under "use strict" - i.e. at + least use explicit package names, as in "$main::x", or predeclare every + variable. </para> + + + <para>Explicitly initialize each variable in use. Otherwise, with + caching enabled, the plugin will not be recompiled each time, and + therefore Perl will not reinitialize all the variables. All old + variable values will still be in effect.</para> + </listitem> + + <listitem><para>Do not use < DATA > (these simply do not compile under ePN).</para> + </listitem> + + <listitem><para>Do not use named subroutines</para> + </listitem> + + <listitem><para>If writing to a file (perhaps recording + performance data) explicitly close it. The plugin never + calls <emphasis role=strong>exit</emphasis>; that is caught by + p1.pl, so output streams are never closed.</para> + </listitem> + + <listitem><para>As in <xref linkend="runtime"> all plugins need + to monitor their runtime, especially if they are using network + resources. Use of the <emphasis>alarm</emphasis> is recommended. + Plugins may import a default time out ($TIMEOUT) from utils.pm. 
+ </para> + </listitem> + + <listitem><para>Perl plugins should import %ERRORS from utils.pm + and then "exit $ERRORS{'OK'}" rather than "exit 0" + </para> + </listitem> + + </orderedlist> + +</section> + +<section id="runtime"><title>Runtime Timeouts</title> + + <para>Plugins have a very limited runtime - typically 10 sec. + As a result, it is very important for plugins to maintain internal + code to exit if runtime exceeds a threshold. </para> + + <para>All plugins should timeout gracefully, not just networking + plugins. For instance, df may lock if you have automounted + drives and your network fails - but on first glance, who'd think + df could lock up like that. Plus, it should just be more error + resistant to be able to time out rather than consume + resources.</para> + + <section><title>Use DEFAULT_SOCKET_TIMEOUT</title> + + <para>All network plugins should use DEFAULT_SOCKET_TIMEOUT to timeout</para> + + </section> + + + <section><title>Add alarms to network plugins</title> + + <para>If you write a plugin which communicates with another + networked host, you should make sure to set an alarm() in your + code that prevents the plugin from hanging due to abnormal + socket closures, etc. Nagios takes steps to protect itself + against unruly plugins that timeout, but any plugins you create + should be well behaved on their own.</para> + + </section> + + + +</section> + +<section id="PlugOptions"><title>Plugin Options</title> + + <para>A well written plugin should have --help as a way to get + verbose help. Code and output should try to respect the 80x25 size of a + crt (remember when fixing stuff in the server room!)</para> + + <section><title>Option Processing</title> + + <para>For plugins written in C, we recommend the C standard + getopt library for short options. 
If using getopt_long, check to + be sure that HAVE_GETOPT_H is defined (configure checks this and + sets the #define in common/config.h).</para> + + <para>For plugins written in Perl, we recommend the Getopt::Long module.</para> + + <para>Positional arguments are strongly discouraged.</para> + + <para>There are a few reserved options that should not be used + for other purposes:</para> + + <literallayout> + -V version (--version) + -h help (--help) + -t timeout (--timeout) + -w warning threshold (--warning) + -c critical threshold (--critical) + -H hostname (--hostname) + </literallayout> + + <para>In addition to the reserved options above, some other standard options are:</para> + + <literallayout> + -C SNMP community (--community) + -a authentication password (--authentication) + -l login name (--logname) + -p port or password (--port or --passwd/--password) + -u url or username (--url or --username) + </literallayout> + + <para>Look at check_pgsql and check_procs to see how I currently + think this can work. Standard options are:</para> + + + <para>The option -V or --version should be present in all + plugins. For C plugins it should result in a call to print_revision, a + function in utils.c which takes two character arguments, the + command name and the plugin revision.</para> + + <para>The -? option, or any other unparsable set of options, + should print out a short usage statement. Character width should + be 80 or less and no more than 23 lines should be printed (it + should display cleanly on a dumb terminal in a server + room).</para> + + <para>The option -h or --help should be present in all plugins. + In C plugins, it should result in a call to print_help (or + equivalent). The function print_help should call print_revision, + then print_usage, then should provide detailed + help. 
Help text should fit on an 80-character wide display, but + may run as many lines as needed.</para> + + </section> + + <section> + <title>Plugins with more than one type of threshold, or with + threshold ranges</title> + + <para>Old style was to do things like -ct for critical time and + -cv for critical value. That goes out the window with POSIX + getopt. The allowable alternatives are:</para> + + <orderedlist> + <listitem> + <para>long options like -critical-time (or -ct and -cv, I + suppose).</para> + </listitem> + + <listitem> + <para>repeated options like `check_load -w 10 -w 6 -w 4 -c + 16 -c 10 -c 10`</para> + </listitem> + + <listitem> + <para>for brevity, the above can be expressed as `check_load + -w 10,6,4 -c 16,10,10`</para> + </listitem> + + <listitem> + <para>ranges are expressed with colons as in `check_procs -C + httpd -w 1:20 -c 1:30` which will warn above 20 instances, + and go critical at 0 and above 30</para> + </listitem> + + <listitem> + <para>lists are expressed with commas, so Jacob's check_nmap + uses constructs like '-p 1000,1010,1050:1060,2000'</para> + </listitem> + + <listitem> + <para>If possible when writing lists, use tokens to make the + list easy to remember and non-order dependent - so + check_disk uses '-c 10000,10%' so that it is clear which is + the percentage and which is the KB value (note that due to + my own lack of foresight, that used to be '-c 10000:10%' but + such constructs should all be changed for consistency, + though providing backward compatibility is fairly + easy).</para> + </listitem> + + </orderedlist> + + <para>As always, comments are welcome - making this consistent + without a host of long options was quite a hassle, and I would + suspect that there are flaws in this strategy. 
Perhaps clear + long options are the most important of the above choices, but not + all POSIX systems have C libraries for long options, so the + short forms must exist as well.</para> + </section> +</section> + +<section id="SubmittingChanges"><title>New submissions and patches</title> + + <para>If you would like others to use your plugins and have them included in + the standard distribution, please include patches for the relevant + configuration files, in particular "configure.in". Otherwise submitted + plugins will be included in the contrib directory.</para> + + <para>Plugins in the contrib directory are going to be migrated to the + standard plugins/plugin-scripts directory as time permits and per user + requests.</para> + + <para>Patches should be submitted via SourceForge and be announced to + the mailing list.</para> + + <para>For new plugins, provide a diff to add to the EXTRAS list (configure.in) + unless you are fairly sure that the plugin will work for all platforms with + no non-standard software added.</para> + + <para>If possible please submit a test harness. Documentation on sample + tests coming soon.</para> + +</section> +</article> + +</book>
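As an illustration of the threshold-range syntax and the return codes described in the guidelines this commit adds, here is a minimal C sketch. It assumes integer thresholds written as `min:max`, and the names `parse_range` and `range_status` are hypothetical: they are not functions from utils.c, only one way the convention might be implemented.

```c
#include <stdio.h>

/* Numeric return codes from Table 1 ("Plugin Return Codes"). */
enum {
    STATE_OK = 0,
    STATE_WARNING = 1,
    STATE_CRITICAL = 2,
    STATE_UNKNOWN = 3
};

/*
 * Hypothetical helper (not part of utils.c): parse a "min:max" range
 * such as "1:20". Returns 0 on success and -1 if the text is malformed.
 */
int parse_range(const char *spec, long *min, long *max)
{
    char trailing;

    if (spec == NULL)
        return -1;
    /* Accept exactly two integers separated by a colon, nothing after. */
    if (sscanf(spec, "%ld:%ld%c", min, max, &trailing) != 2)
        return -1;
    return (*min <= *max) ? 0 : -1;
}

/*
 * Hypothetical helper: map a measured value onto a return code.
 * A value outside the critical range is CRITICAL; otherwise a value
 * outside the warning range is WARNING. Unparsable thresholds yield
 * UNKNOWN, as the guidelines ask for invalid command line arguments.
 */
int range_status(long value, const char *warn_spec, const char *crit_spec)
{
    long wmin, wmax, cmin, cmax;

    if (parse_range(warn_spec, &wmin, &wmax) != 0 ||
        parse_range(crit_spec, &cmin, &cmax) != 0)
        return STATE_UNKNOWN;
    if (value < cmin || value > cmax)
        return STATE_CRITICAL;
    if (value < wmin || value > wmax)
        return STATE_WARNING;
    return STATE_OK;
}
```

With the check_procs thresholds quoted in the guidelines, `range_status(25, "1:20", "1:30")` gives the WARNING code and `range_status(0, "1:20", "1:30")` gives CRITICAL. A real plugin would print its single line of output and then return this value from `main`.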