I have been testing various monitoring systems lately. This month I am looking at Sensu. When I deployed Quagga, it comes with a little monit type application called watchquagga. This application, within a few seconds, restarts a failed routing protocol, like ospf6d or bgpd. Watching the routing protocol processes using check-proc.rb does not help.

So I needed to write a triggered Sensu alert within the BASH script that restarts the routing protocol. Because watchquagga does not currently support issuing a script when it reloads a routing protocol, I had to hack the /etc/init.d/quagga startup script. Not sure how this hack would work on a systemd system.

Here is a snippet of the appropriate section, that was modified.


# Starts the server if it's not alrady running according to the pid file.
# The Quagga daemons creates the pidfile when starting.
start()
{
  if [ "$1" = "watchquagga" ]; then

    # We may need to restart watchquagga if new daemons are added and/or
    # removed
    if started "$1" ; then
      stop watchquagga
    else
      # Echo only once. watchquagga is printed in the stop above
      echo -n " $1"
    fi


    start-stop-daemon \
      --start \
      --pidfile=`pidfile $1` \
      --exec "$D_PATH/$1" \
      -- \
      "${watchquagga_options[@]}"

    
    /etc/sensu/plugins/triggers/quagga-reload-event.rb --handler pagerduty,logstash "${watchquagga_options[@]}"
    
  elif [ -n "$2" ]; then
    echo -n " $1-$2"
    if ! check_daemon $1 $2

...
.......

The quagga-reload-event.rb script takes the watchquagga options and strips all options leaving only the routing protocols that were reloaded in the argument string list. This is then passed as a message to the send_event() function so that the user can be told exactly which routing protocols were restarted.

In the quagga-reload-event.rb file, the magic to send a Sensu event without using a check is shown below

require 'socket'
require 'json'

def send_event(metric_name, options, msg, check_type='standard')
  data = {
    'name'      => metric_name,
    'type'      => check_type,
    'output'    => msg,
    # options is an OpenStruct object
    'handlers'  => options.handler,
    'status'    => 2
  }
  # Dump the data to the socket
  socket = TCPSocket.new '127.0.0.1', 3030
  socket.print data.to_json
  socket.close
end

Here is an image of the alert on Uchiwa Sensu Quagga