Thread: Nagios Plugin for pgAgent Failures?

Nagios Plugin for pgAgent Failures?

From
"David E. Wheeler"
Date:
pgAdmins,

Anyone written a Nagios plugin to report pgAgent failures? If not, I expect I’ll be writing one, or perhaps submitting
oneto check_postgres.pl. 

Thanks,

David



Re: Nagios Plugin for pgAgent Failures?

From
Guillaume Lelarge
Date:
On Tue, 2012-01-03 at 09:21 -0800, David E. Wheeler wrote:
> pgAdmins,
>
> Anyone written a Nagios plugin to report pgAgent failures? If not, I expect I’ll be writing one, or perhaps
submittingone to check_postgres.pl. 
>

I don't know of any, but if you do write one, please, submit it to
check_postgres.pl.

Thank you.


--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com
PostgreSQL Sessions #3: http://www.postgresql-sessions.org



Re: Nagios Plugin for pgAgent Failures?

From
"David E. Wheeler"
Date:
On Jan 3, 2012, at 9:37 AM, Guillaume Lelarge wrote:

> I don't know of any, but if you do write one, please, submit it to
> check_postgres.pl.

Yeah, I’m a committer on check_postgres.pl, so the other check_postgres.pl folks think it’s not a good place for it for
somereason I can’t imagine right now, you can expect that. 

David



Re: Nagios Plugin for pgAgent Failures?

From
Guillaume Lelarge
Date:
On Tue, 2012-01-03 at 09:44 -0800, David E. Wheeler wrote:
> On Jan 3, 2012, at 9:37 AM, Guillaume Lelarge wrote:
>
> > I don't know of any, but if you do write one, please, submit it to
> > check_postgres.pl.
>
> Yeah, I’m a committer on check_postgres.pl, so the other check_postgres.pl folks think it’s not a good place for it
forsome reason I can’t imagine right now, you can expect that. 
>

Oh, you too? funny :)

Anyway, I don't recall any thread on that issue on the check_postgres.pl
list.


--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com
PostgreSQL Sessions #3: http://www.postgresql-sessions.org



Re: Nagios Plugin for pgAgent Failures?

From
"David E. Wheeler"
Date:
On Jan 3, 2012, at 9:50 AM, Guillaume Lelarge wrote:

Yeah, I’m a committer on check_postgres.pl, so the other check_postgres.pl folks think it’s not a good place for it for some reason I can’t imagine right now, you can expect that.


Oh, you too? funny :)

Anyway, I don't recall any thread on that issue on the check_postgres.pl
list.

Here is the implementation I’m working on. Comments? Does the query look right?

sub check_pgagent_jobs {    ## Check for failed pgAgent jobs.    ## Supports: Nagios    ## Critical and warning are intervals.    ## Example: --critical="1 hour"    ## Example: --warning="2 hours"
    my ($warning, $critical) = validate_range({ type => 'time' });
    # Determine critcal warning columns.    my $is_crit = $critical        ? "CASE WHEN NOW() - (jlog.jlgstart + jlog.jlgduration) < '$critical'::interval THEN 1 ELSE 0 END"        : 'false';
    # Determine WHERE clause    my $where = do {        if ($critical && $warning) {            "GREATEST('$critical}'::interval, '$opt{warning}'::interval)";        } else {            my $val = $critical || $warning or ndie msg('range-noopt-orboth');            "'$val'::interval";        }    };
    $SQL = qq{
        SELECT jlog.jlgid
             , job.jobname
             , step.jstname
             , slog.jslresult
             , slog.jsloutput
             , $is_crit AS critical
          FROM pgagent.pga_job job
          JOIN pgagent.pga_joblog     jlog ON job.jobid  = jlog.jlgjobid
          JOIN pgagent.pga_jobstep    step ON job.jobid  = step.jstjobid
          JOIN pgagent.pga_jobsteplog slog ON jlog.jlgid = slog.jsljlgid AND step.jstid = slog.jsljstid
         WHERE slog.jslresult <> 0
           AND NOW() - (jlog.jlgstart + jlog.jlgduration) < $where;
    };
    my $info = run_command($SQL);
    for $db (@{$info->{db}}) {        my @rows = @{ $db->{slurp} } or do {            add_ok msg('pgagent-jobs-ok');            next;        };
        if ($rows[0]{critical} !~ /^[01]$/) {            add_unknown msg('invalid-query', $db->{slurp});            next;        }
        my ($is_crit, @msg);        my $log_id = -1;        for my $step (@rows) {            my $output = $step->{jsloutput} || '(NO OUTPUT)';            push @msg => "$step->{jslresult} $step->{jobname}/$step->{jstname}: $output";            $is_crit ||= $step->{critical};        }
        (my $msg = join '; ' => @msg) =~ s{\r?\n}{ }g;        if ($is_crit) {            add_critical $msg;        } else {            add_warning $msg;        }    }
    return;
}

Thanks,

David