Friday 21 October 2016

In this tutorial we will learn how to install and configure Nagios NRPE on CentOS and Red Hat. We will also make some configuration changes on the Nagios server so that it can monitor the client.
In this scenario we will add one server to monitor.
Operating System : CentOS 6.3
Nagios Server : hostname: nagios-server , ip-address: 192.168.122.22
Nagios Client : hostname : web-node01 , ip-address: 192.168.122.94
Installing and configuring NRPE in nagios-client
Step 1: Download the EPEL repo package, then install the Nagios plugins and NRPE
wget dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
rpm -ivh epel-release-6-8.noarch.rpm

yum install -y nrpe nagios-plugins-all openssl
Step 2: By default, all the Nagios plugins are installed in /usr/lib/nagios/plugins/
ls -l /usr/lib/nagios/plugins/
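Each plugin in this directory reports its result through its exit code, following the usual Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The sketch below is only an illustration of that convention: the helper names nagios_state and run_plugin are hypothetical and not part of Nagios, and any code other than 0, 1 or 2 is treated as UNKNOWN here.

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into its Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3 by convention; other codes are lumped in here
    esac
}

# Hypothetical helper: run a plugin with its arguments, then print the state
# that corresponds to its exit code on the last line.
run_plugin() {
    "$@"
    nagios_state "$?"
}

# Example, using a plugin path and thresholds that appear in this tutorial:
# run_plugin /usr/lib/nagios/plugins/check_users -w 5 -c 10
```

Running a plugin by hand like this on the client is a quick way to see what NRPE will later report back to the server.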
Step 3: Take a backup of the nrpe.cfg file located in /etc/nagios
[root@web-node01 ~]# cd /etc/nagios/
[root@web-node01 nagios]# ls -l
total 16
-rw-r--r-- 1 root root 7296 Mar 2 15:49 nrpe.cfg
[root@web-node01 nagios]# pwd
/etc/nagios
[root@web-node01 nagios]# cp -p nrpe.cfg nrpe.cfg.orig
Step 4: Configure nrpe.cfg
Add the Nagios server IP to allowed_hosts, as shown in the nrpe.cfg file below.
[root@web-node01 nagios]# egrep -v '^#|^$' nrpe.cfg
log_facility=daemon
pid_file=/var/run/nrpe/nrpe.pid
server_port=5666
nrpe_user=nrpe
nrpe_group=nrpe
allowed_hosts=127.0.0.1,192.168.122.22

dont_blame_nrpe=0
debug=0
command_timeout=60
connection_timeout=300
include_dir=/etc/nrpe.d/
command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200
[root@web-node01 nagios]#
In the block above you can see that some commands are already defined,
for example: command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
Keep a note of them. On the Nagios client, NRPE will only run the commands or plugins that are defined in the nrpe.cfg file.
If you want to add any other command or plugin, you have to edit the nrpe.cfg file in the same manner,
for example:
command[check_pluginName]=/usr/lib/nagios/plugins/check_pluginName -w <value>  -c <value>
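Adding such a line can also be scripted. The following is a minimal sketch, assuming you want to append a command only when no active (uncommented) definition with the same name exists; the function name add_nrpe_command is hypothetical, and check_swap with these thresholds is just an illustrative choice.

```shell
#!/bin/sh
# Hypothetical helper: add_nrpe_command FILE NAME COMMAND_LINE
# Appends "command[NAME]=COMMAND_LINE" to FILE unless NAME is already defined.
add_nrpe_command() {
    cfg_file="$1"
    cmd_name="$2"
    cmd_line="$3"
    # Anchor at line start so commented-out examples (#command[...]) are ignored.
    if grep -q "^command\[$cmd_name\]=" "$cfg_file"; then
        echo "command[$cmd_name] is already defined in $cfg_file" >&2
        return 1
    fi
    printf 'command[%s]=%s\n' "$cmd_name" "$cmd_line" >> "$cfg_file"
}

# Example (run as root on the client, then restart nrpe):
# add_nrpe_command /etc/nagios/nrpe.cfg check_swap \
#     '/usr/lib/nagios/plugins/check_swap -w 20% -c 10%'
```

Refusing to add a duplicate name keeps the file unambiguous about which definition NRPE will use.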
Step 5: Restart the nrpe service.
/etc/init.d/nrpe restart
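Once the daemon has been restarted, it can be worth confirming from the Nagios server that it answers. The sketch below wraps the check_nrpe plugin, using only the -H and -c options that appear later in this post. It assumes check_nrpe is installed on the server under /usr/lib/nagios/plugins (with EPEL this is normally provided by the nagios-plugins-nrpe package, which is an assumption since this post does not install it explicitly). The function name probe_nrpe is hypothetical.

```shell
#!/bin/sh
# Path to the check_nrpe plugin on the Nagios server (assumed location).
CHECK_NRPE="${CHECK_NRPE:-/usr/lib/nagios/plugins/check_nrpe}"

# Hypothetical helper: probe_nrpe HOST [NRPE_COMMAND]
# Without NRPE_COMMAND, check_nrpe just asks the daemon to identify itself.
probe_nrpe() {
    if [ -n "$2" ]; then
        "$CHECK_NRPE" -H "$1" -c "$2"
    else
        "$CHECK_NRPE" -H "$1"
    fi
}

# Examples, run on nagios-server:
# probe_nrpe 192.168.122.94              # should print the NRPE version
# probe_nrpe 192.168.122.94 check_load   # runs command[check_load] on the client
```

If the first call fails, the usual suspects are allowed_hosts in nrpe.cfg or a firewall blocking port 5666.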

Now we add web-node01 to the Nagios server so that we can monitor the server’s services.

OK, let me explain the scenario. Here we will monitor the server web-node01 of a company called companyA, so we will make a few small changes on the Nagios server. In the same way, you can customize the setup depending on the number of different networks you want to monitor.
For companyA I will create a directory named companyA inside /etc/nagios/objects/. Then I will edit the nagios.cfg file so that Nagios reads the configuration from the directory /etc/nagios/objects/companyA.
Mainly, I will create two important files (hosts.cfg and services.cfg) for monitoring companyA’s web-node01 server.
So here we go with the following steps.
Log in to nagios-server and do the configuration below.
Step 1: Create a directory inside /etc/nagios/objects/
mkdir -p /etc/nagios/objects/companyA
Step 2: Now edit nagios.cfg to point to the new configuration directory.
Add the line shown below to the /etc/nagios/nagios.cfg file, just below the commented-out cfg_dir examples. (That placement only makes the line easy to find later; you can add it anywhere in nagios.cfg. I will also show the complete nagios.cfg file at the end of this post.)
vi /etc/nagios/nagios.cfg

cfg_dir=/etc/nagios/objects/companyA/
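With a cfg_dir entry, Nagios processes every file ending in .cfg under that directory, including subdirectories. A small sketch for listing what would be picked up is shown here; the function name list_cfg_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list_cfg_files DIR
# Prints every *.cfg file below DIR, i.e. the files a cfg_dir entry would load.
list_cfg_files() {
    find "$1" -type f -name '*.cfg' | sort
}

# Example, run on nagios-server:
# list_cfg_files /etc/nagios/objects/companyA
```

Files with any other extension (for example a backup named hosts.cfg.orig) are not listed, which matches how cfg_dir ignores them.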
Step 3: Now add the host and service config files inside /etc/nagios/objects/companyA. I am using separate files so that they stay easy to manage as more servers are added.
Let's create the hosts.cfg file first for web-node01. (This file exists only to keep all host information in one place, so more hosts can be added to it later.)
vi /etc/nagios/objects/companyA/hosts.cfg

define host{
use linux-server
host_name web-node01
alias web-node01
address 192.168.122.94
}

Save and exit by pressing the ESC key, then typing :wq and Enter.
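Since hosts.cfg is meant to collect more hosts over time, the host block can also be generated instead of typed. This is a minimal sketch; the function name make_host_definition is hypothetical, and web-node02 with its address in the usage comment is an invented example host, not part of this setup.

```shell
#!/bin/sh
# Hypothetical helper: make_host_definition HOST_NAME ADDRESS [TEMPLATE]
# Prints a host object that uses the linux-server template unless told otherwise.
make_host_definition() {
    host_template="${3:-linux-server}"
    cat <<EOF
define host{
    use                     $host_template
    host_name               $1
    alias                   $1
    address                 $2
}
EOF
}

# Example (web-node02 and 192.168.122.95 are made up for illustration):
# make_host_definition web-node02 192.168.122.95 >> /etc/nagios/objects/companyA/hosts.cfg
```

The alias is simply set to the host name here, mirroring the web-node01 definition above.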
Step 4: Now we will add all the services to the services.cfg file. Here I am adding only three: the PING service does not use NRPE, while the other two services are checked through NRPE.
vi /etc/nagios/objects/companyA/services.cfg

define service{
use generic-service
host_name web-node01
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}

define service{
use generic-service
host_name web-node01
service_description Current Load
check_command check_nrpe!check_load
}

define service{
use generic-service
host_name web-node01
service_description Total Processes
check_command check_nrpe!check_total_procs
}

Save and exit by pressing the ESC key, then typing :wq and Enter.
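Every name that follows check_nrpe! in services.cfg has to match a command[...] entry in the client's nrpe.cfg, otherwise the check cannot run. The sketch below compares the two files. It assumes you have a copy of the client's nrpe.cfg available next to services.cfg (they normally live on different machines), and the function name missing_nrpe_commands is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: missing_nrpe_commands SERVICES_CFG NRPE_CFG
# Prints each NRPE command referenced in SERVICES_CFG that NRPE_CFG does not define.
missing_nrpe_commands() {
    services_cfg="$1"
    nrpe_cfg="$2"
    # Extract the word after "check_nrpe!" from every check_command line.
    sed -n 's/^[[:space:]]*check_command[[:space:]]\{1,\}check_nrpe!\([A-Za-z0-9_-]*\).*/\1/p' "$services_cfg" |
    sort -u |
    while read -r nrpe_cmd; do
        # Only active (uncommented) definitions count.
        grep -q "^command\[$nrpe_cmd\]=" "$nrpe_cfg" || echo "$nrpe_cmd"
    done
}

# Example (the second path is a local copy of the client's file):
# missing_nrpe_commands /etc/nagios/objects/companyA/services.cfg /tmp/web-node01-nrpe.cfg
```

No output means every NRPE-based service has a matching command on the client.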
Step 5: Now we will edit the commands.cfg file so that the Nagios server can use check_nrpe to run commands on web-node01 and fetch the data.
For NRPE, add the definition given below to the commands.cfg file.
vi /etc/nagios/objects/commands.cfg

define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

Save and exit by pressing the ESC key, then typing :wq and Enter.
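To make the macros in this definition concrete: for a service with check_command check_nrpe!check_load, Nagios replaces $HOSTADDRESS$ with the host's address and $ARG1$ with the text after the "!". The sketch below imitates that substitution in shell. It is an illustration only, not how Nagios is implemented; it assumes $USER1$ points at /usr/lib/nagios/plugins (the value normally set in resource.cfg), and the function name expand_check_nrpe is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: expand_check_nrpe CHECK_COMMAND HOSTADDRESS [USER1_DIR]
# Prints the command line that results from the check_nrpe definition above.
expand_check_nrpe() {
    check_command="$1"
    host_address="$2"
    user1_dir="${3:-/usr/lib/nagios/plugins}"   # assumed value of $USER1$
    arg1="${check_command#*!}"                  # text after the first "!" becomes $ARG1$
    echo "$user1_dir/check_nrpe -H $host_address -c $arg1"
}

# Example:
# expand_check_nrpe 'check_nrpe!check_load' 192.168.122.94
```

The printed line is what you could run by hand on nagios-server to reproduce a single check.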
Note: At the end of this post I show the entire commands.cfg file. If anything is unclear, you can use it as a reference.
Step 6: Now we will restart the nagios, nrpe and httpd (Apache) services on nagios-server.
/etc/init.d/nagios restart
/etc/init.d/nrpe restart
/etc/init.d/httpd restart
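Before restarting Nagios it is a good habit to let it verify the configuration with its -v option, so that a typo in hosts.cfg or services.cfg does not stop the daemon. The following is a sketch of that idea. It assumes the nagios binary is at /usr/sbin/nagios, and the function name verify_then_restart and the three override variables are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: restart Nagios only if the configuration check passes.
# NAGIOS_BIN, NAGIOS_CFG and RESTART_CMD may be overridden; defaults are assumptions.
verify_then_restart() {
    nagios_bin="${NAGIOS_BIN:-/usr/sbin/nagios}"
    nagios_cfg="${NAGIOS_CFG:-/etc/nagios/nagios.cfg}"
    restart_cmd="${RESTART_CMD:-/etc/init.d/nagios restart}"
    if check_output=$("$nagios_bin" -v "$nagios_cfg" 2>&1); then
        # Intentionally unquoted so "script restart" splits into command and argument.
        $restart_cmd
    else
        echo "$check_output" >&2
        echo "configuration check failed; nagios was not restarted" >&2
        return 1
    fi
}

# Example, run as root on nagios-server:
# verify_then_restart
```

On failure the verification output is printed, which usually names the file and line that need fixing.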
We are done here. Now check the Nagios dashboard and wait about 90 seconds. The web-node01 server will be displayed with its 3 services being monitored.
Note: In this setup iptables and SELinux are disabled. If you keep iptables enabled, you have to open TCP port 5666 on both the Nagios server and the client.
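If iptables stays enabled on the client, one possible rule is to accept TCP port 5666 only from the Nagios server. The sketch below just prints such a rule so that it can be reviewed before it is applied; the function name nrpe_firewall_rule is hypothetical, and inserting at the top of the INPUT chain is an assumption about your existing ruleset.

```shell
#!/bin/sh
# Hypothetical helper: nrpe_firewall_rule NAGIOS_SERVER_IP [PORT]
# Prints (does not run) an iptables command that admits NRPE traffic from one source.
nrpe_firewall_rule() {
    echo "iptables -I INPUT -p tcp -s $1 --dport ${2:-5666} -j ACCEPT"
}

# Example, on web-node01 as root:
# nrpe_firewall_rule 192.168.122.22        # review the printed rule
# eval "$(nrpe_firewall_rule 192.168.122.22)"
# service iptables save                    # persist the rule on CentOS 6
```

Limiting the source address complements the allowed_hosts setting in nrpe.cfg rather than replacing it.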

Below are all the configuration files that we edited.

A: /etc/nagios/nagios.cfg
[root@nagios-server nagios]# egrep -v '^#|^$' nagios.cfg
log_file=/var/log/nagios/nagios.log
cfg_file=/etc/nagios/objects/commands.cfg
cfg_file=/etc/nagios/objects/contacts.cfg
cfg_file=/etc/nagios/objects/timeperiods.cfg
cfg_file=/etc/nagios/objects/templates.cfg
cfg_file=/etc/nagios/objects/localhost.cfg
cfg_dir=/etc/nagios/objects/companyA/
object_cache_file=/var/log/nagios/objects.cache
precached_object_file=/var/log/nagios/objects.precache
resource_file=/etc/nagios/private/resource.cfg
status_file=/var/log/nagios/status.dat
status_update_interval=10
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/var/spool/nagios/cmd/nagios.cmd
external_command_buffer_slots=4096
lock_file=/var/run/nagios.pid
temp_file=/var/log/nagios/nagios.tmp
temp_path=/tmp
event_broker_options=-1
log_rotation_method=d
log_archive_path=/var/log/nagios/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=10
max_check_result_reaper_time=30
check_result_path=/var/log/nagios/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/var/log/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
check_for_updates=1
bare_update_check=0
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
service_check_timeout_state=c
check_host_freshness=0
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/sbin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
illegal_macro_output_chars=`~$&|'"<>
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=nagios@localhost
admin_pager=pagenagios@localhost
daemon_dumps_core=0
use_large_installation_tweaks=0
enable_environment_macros=1
debug_level=0
debug_verbosity=1
debug_file=/var/log/nagios/nagios.debug
max_debug_file_size=1000000
[root@nagios-server nagios]#
B: /etc/nagios/objects/commands.cfg
[root@nagios-server objects]# egrep -v '^#|^$' commands.cfg
define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}
define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
define command{
command_name check_local_load
command_line $USER1$/check_load -w $ARG1$ -c $ARG2$
}
define command{
command_name check_local_procs
command_line $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
}
define command{
command_name check_local_users
command_line $USER1$/check_users -w $ARG1$ -c $ARG2$
}
define command{
command_name check_local_swap
command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
}
define command{
command_name check_local_mrtgtraf
command_line $USER1$/check_mrtgtraf -F $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -e $ARG5$
}
define command{
command_name check_ftp
command_line $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_hpjd
command_line $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_snmp
command_line $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_http
command_line $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_ssh
command_line $USER1$/check_ssh $ARG1$ $HOSTADDRESS$
}
define command{
command_name check_dhcp
command_line $USER1$/check_dhcp $ARG1$
}
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
define command{
command_name check_pop
command_line $USER1$/check_pop -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_imap
command_line $USER1$/check_imap -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_smtp
command_line $USER1$/check_smtp -H $HOSTADDRESS$ $ARG1$
}
define command{
command_name check_tcp
command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}
define command{
command_name check_udp
command_line $USER1$/check_udp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}
define command{
command_name check_nt
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
}
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
define command{
command_name process-host-perfdata
command_line /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$\n" >> /var/log/nagios/host-perfdata.out
}
define command{
command_name process-service-perfdata
command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /var/log/nagios/service-perfdata.out
}
[root@nagios-server objects]#
C: /etc/nagios/objects/companyA/hosts.cfg
[root@nagios-server companyA]# cat hosts.cfg 
##### in this file only list host information  only add hosts of companyA#######

define host{
    use                     linux-server
    host_name               web-node01
    alias                   web-node01
    address                 192.168.122.94
}
[root@nagios-server companyA]# 
D: /etc/nagios/objects/companyA/services.cfg

[root@nagios-server companyA]# cat services.cfg 
### services running in companyA servers

define service{
    use                     generic-service
    host_name               web-node01
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}

define service{
    use                     generic-service
    host_name               web-node01
    service_description     Current Load
    check_command           check_nrpe!check_load
}

define service{
    use                     generic-service
    host_name               web-node01
    service_description     Total Processes
    check_command           check_nrpe!check_total_procs
}