Thread: pg_stat_statements : how to catch non successfully finished statements ?

pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
Hello all,

I was wondering if there is a hook to collect non successfully finished SQL
statements in pg_stat_statements (timed-out, cancelled, killed, or simply
errored) ?

Thanks in advance
Regards
PAscal





Re: pg_stat_statements : how to catch non successfully finished statements ?

From
Arthur Zakirov
Date:
Hello,

On Thu, Apr 26, 2018 at 01:24:25PM -0700, legrand legrand wrote:
> Hello all,
> 
> I was wondering if there is a hook to collect non successfully finished SQL
> statements in pg_stat_statements (timed-out, cancelled, killed, or simply
> errored) ?

Some time ago I looked for such a hook. My use case was to log failed
access-permission checks. I didn't find a way to do it, except by adding a
narrow-focused hook within ExecCheckRTPerms().

There is ErrorContextCallback. Context callbacks are called while an
error is being reported. But I don't think that is a good approach for
the pg_stat_statements case.
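
For reference, using it looks roughly like this (a minimal sketch; the
callback, its argument and the surrounding function are only illustrative,
the registration pattern itself is the usual one from utils/elog.h):

#include "postgres.h"

static void
sketch_error_callback(void *arg)
{
    /* runs while an error is being reported; keep it simple and safe */
    errcontext("while executing: %s", (const char *) arg);
}

static void
sketch_run_guarded(char *query_text)
{
    ErrorContextCallback errcallback;

    /* push our callback onto the error context stack */
    errcallback.callback = sketch_error_callback;
    errcallback.arg = (void *) query_text;
    errcallback.previous = error_context_stack;
    error_context_stack = &errcallback;

    /* ... work that may ereport(ERROR) ... */

    /* pop it again on the normal (non-error) path */
    error_context_stack = errcallback.previous;
}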

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
OK I see ...

Such a callback would have to act only on ERROR (not WARNING or NOTICE ...),
and the only really interesting piece of information available there is
  ErrorData->internalquery; /* text of internally-generated query */

That doesn't allow rebuilding the link to the queryid
(which is computed from the parse tree, not available at that point).

So the only solution is to add queryId to ErrorData in this callback,
or to create a new hook fired on ERROR that carries the queryId ?
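
To illustrate the first option, a reduced sketch (most fields of the real
ErrorData in src/include/utils/elog.h are omitted, the struct is renamed on
purpose, and the queryId member is purely hypothetical):

/* reduced and renamed on purpose -- not the real declaration */
typedef struct ErrorDataSketch
{
    int         elevel;         /* error level (ERROR, WARNING, ...) */
    int         sqlerrcode;     /* encoded ERRSTATE */
    char       *message;       /* primary error message */
    char       *internalquery; /* text of internally-generated query */
    uint64      queryId;        /* hypothetical: id of the failing statement */
} ErrorDataSketch;               /* uint64 as typedef'd in c.h */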

Regards
PAscal





legrand legrand <legrand_legrand@hotmail.com> writes:
> So the only solution is to add queryId to ErrorData in this callback,
> or to create a new hook fired on ERROR that carries the queryId ?

I see no particular need for a new hook.  What's needed here is for
pgss_ExecutorRun (and maybe some other existing functions in
pg_stat_statements) to go ahead and record the statement when they catch
an error thrown out of standard_ExecutorRun, rather than just updating
the module's nested_level variable and re-throwing.
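
For reference, the existing function is currently little more than this
(abridged):

static void
pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction,
                 uint64 count, bool execute_once)
{
    nested_level++;
    PG_TRY();
    {
        if (prev_ExecutorRun)
            prev_ExecutorRun(queryDesc, direction, count, execute_once);
        else
            standard_ExecutorRun(queryDesc, direction, count, execute_once);
        nested_level--;
    }
    PG_CATCH();
    {
        /* on error, only restore the nesting depth and re-throw */
        nested_level--;
        PG_RE_THROW();
    }
    PG_END_TRY();
}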

The hard part here is that you have to be really careful what you do in
a PG_CATCH block, because the only thing you know for sure about the
backend's state is that it's not good.  Catalog fetches are right out,
and anything that might itself throw an error had best be avoided as
well.  (Which, among other things, means that examining executor state
would be a bad idea, and I'm not even sure you'd want to traverse the plan
tree.)

I'm not convinced that it's practical for pg_stat_statements to make a new
shared hashtable entry under those constraints.  But figuring out how to
minimize the risks around that is the stumbling block, not lack of a hook.

            regards, tom lane


Re: pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
OK, no need for a new hook.
I'll try to implement what you suggest here, though it clearly exceeds my
(poor) development skills.

As you may also have noticed, in addition to collecting SQL statement
counters for the corresponding QueryId,
I would have been interested in its PlanId ... but that's another subject.

Regards
PAscal






Re: pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
Progress report on this subject:

1/ Some clarifications: 

What is expected here is to update pgss counters for ALL the queries that
have been executed,
taking into account both queries finished in SUCCESS and those finished with
ERROR.

The main interest is to catch queries that are cancelled or interrupted
by a timeout after a long running period, which can be hours in BI
reporting environments.

Errors during parsing are not caught, because there is no hook available
before parsing
and because, without a parse tree available, the QueryId cannot be calculated.
This is not a problem here, since the objective is not to count every error.

There was an off-list remark that cumulating SUCCESS and ERROR counters
for the same query could be a problem for those monitoring AVG indicators
(the averages becoming smaller than the SUCCESS-only ones).

One proposal is to add a Boolean flag (success) to the pgss key:

dbid  userid  queryid  success  calls  total_time ...
1     1       123      t          100     100 000
1     1       123      f           10   1 000 000
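
In code terms that would mean roughly this (existing fields as in
pg_stat_statements.c, the success member being the proposed addition):

typedef struct pgssHashKey
{
    Oid         userid;     /* user OID */
    Oid         dbid;       /* database OID */
    uint64      queryid;    /* query identifier */
    bool        success;    /* proposed: did the execution complete? */
} pgssHashKey;

A given statement could then have up to two entries, one for successful
executions and one for failed ones, so pg_stat_statements.max would be
consumed somewhat faster.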

  

2/ Modifying pgss_ExecutorRun (as suggested by Tom Lane) with:


    PG_CATCH();
    {
        /* Added part to get counters on errors */
        EState     *estate;

        if (queryDesc->totaltime && pgss_enabled())
        {
            estate = queryDesc->estate;
            InstrStopNode(queryDesc->totaltime, estate->es_processed);
            InstrEndLoop(queryDesc->totaltime);

            pgss_store(queryDesc->sourceText,
                       queryDesc->plannedstmt->queryId,
                       queryDesc->plannedstmt->stmt_location,
                       queryDesc->plannedstmt->stmt_len,
                       queryDesc->totaltime->total * 1000.0, /* convert to msec */
                       estate->es_processed,
                       &queryDesc->totaltime->bufusage,
                       NULL);
        }

        nested_level--;
        PG_RE_THROW();
    }
    PG_END_TRY();


permits catching simple queries in pgss (to be enhanced later for utility
statements, PL/pgSQL,
parallel queries, ...).
Would such code have a chance of being accepted ?

Feedback is welcome.
Regards
PAscal





Re: pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
Hello,

Here is a patch that:
- adds a new GUC: pg_stat_statements.track_errors, boolean (defaults to true) (a sketch of the registration is shown below),
- captures DML, DDL and PL/pgSQL commands that end in error into pgss.
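
The GUC registration would look roughly like this (sketch only;
pgss_track_errors is just the variable name assumed here):

static bool pgss_track_errors = true;   /* assumed variable name */

/* in _PG_init(), next to the existing pg_stat_statements.* GUCs */
DefineCustomBoolVariable("pg_stat_statements.track_errors",
                         "Also track statements that terminate with an error.",
                         NULL,
                         &pgss_track_errors,
                         true,
                         PGC_SUSET,
                         0,
                         NULL,
                         NULL,
                         NULL);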

There is always a risk that new code used in PG_CATCH (mainly pgss_store)
raises an error itself.
I'm not able to tell when that could occur or what the impact would be ...

see
pgss_with_errors.patch
<http://www.postgresql-archive.org/file/t348768/pgss_with_errors.patch>  

pgss_with_errors.txt
<http://www.postgresql-archive.org/file/t348768/pgss_with_errors.txt>  

Regards
PAscal





Re: pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
Tom Lane-2 wrote
> legrand legrand <legrand_legrand@hotmail.com> writes:
>> So the only solution is to add queryId to ErrorData in this callback,
>> or to create a new hook fired on ERROR that carries the queryId ?
> 
> I see no particular need for a new hook.  What's needed here is for
> pgss_ExecutorRun (and maybe some other existing functions in
> pg_stat_statements) to go ahead and record the statement when they catch
> an error thrown out of standard_ExecutorRun, rather than just updating
> the module's nested_level variable and re-throwing.
> 
> The hard part here is that you have to be really careful what you do in
> a PG_CATCH block, because the only thing you know for sure about the
> backend's state is that it's not good.  Catalog fetches are right out,
> and anything that might itself throw an error had best be avoided as
> well.  (Which, among other things, means that examining executor state
> would be a bad idea, and I'm not even sure you'd want to traverse the plan
> tree.)
> 
> I'm not convinced that it's practical for pg_stat_statements to make a new
> shared hashtable entry under those constraints.  But figuring out how to
> minimize the risks around that is the stumbling block, not lack of a hook.
> 
>             regards, tom lane

As far as I have tested this with *cancelled* queries (Cancel,
pg_cancel_backend(), statement_timeout, ...), I haven't found any problem.

Would limiting the PG_CATCH block to those *cancelled* queries,
and *no other error*, be an alternative solution ?

If yes, is there a way to identify the reason for the error when
entering the PG_CATCH block (and could you point me to an example) ?
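
Something like the following is the kind of thing I have in mind (untested
sketch, assuming geterrcode() and ERRCODE_QUERY_CANCELED from
utils/errcodes.h can be consulted inside the PG_CATCH block):

    PG_CATCH();
    {
        /*
         * Untested sketch: the error being propagated should still be on
         * the elog error-data stack here, so geterrcode() would give its
         * SQLSTATE; only react to query cancellation.
         */
        if (geterrcode() == ERRCODE_QUERY_CANCELED &&
            queryDesc->totaltime && pgss_enabled())
        {
            /* update the pg_stat_statements entry, as in the patch above */
        }

        nested_level--;
        PG_RE_THROW();
    }
    PG_END_TRY();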

Thanks in advance.

Regards
PAscal






legrand legrand <legrand_legrand@hotmail.com> writes:
> Tom Lane-2 wrote
>> The hard part here is that you have to be really careful what you do in
>> a PG_CATCH block, because the only thing you know for sure about the
>> backend's state is that it's not good.  Catalog fetches are right out,
>> and anything that might itself throw an error had best be avoided as
>> well.  (Which, among other things, means that examining executor state
>> would be a bad idea, and I'm not even sure you'd want to traverse the plan
>> tree.)
>> I'm not convinced that it's practical for pg_stat_statements to make a new
>> shared hashtable entry under those constraints.  But figuring out how to
>> minimize the risks around that is the stumbling block, not lack of a hook.

> As far as I have tested this with *cancelled* queries (Cancel,
> pg_cancel_backend(), statement_timeout, ...), I haven't found any problem.
> Would limiting the PG_CATCH block to those *cancelled* queries,
> and *no other error*, be an alternative solution ?

I do not see that that would make one iota of difference to the risk that
the executor state tree is inconsistent at the instant the error is
thrown.  You can't test your way to the conclusion that it's safe, either
(much less that it'd remain safe); your test cases surely haven't hit
every CHECK_FOR_INTERRUPTS call in the backend.

            regards, tom lane



Re: pg_stat_statements : how to catch non successfully finished statements ?

From
legrand legrand
Date:
Tom Lane-2 wrote
> legrand legrand <legrand_legrand@hotmail.com> writes:
>> Tom Lane-2 wrote
>>> The hard part here is that you have to be really careful what you do in
>>> a PG_CATCH block, because the only thing you know for sure about the
>>> backend's state is that it's not good.  Catalog fetches are right out,
>>> and anything that might itself throw an error had best be avoided as
>>> well.  (Which, among other things, means that examining executor state
>>> would be a bad idea, and I'm not even sure you'd want to traverse the
>>> plan
>>> tree.)
>>> I'm not convinced that it's practical for pg_stat_statements to make a
>>> new
>>> shared hashtable entry under those constraints.  But figuring out how to
>>> minimize the risks around that is the stumbling block, not lack of a
>>> hook.
> 
>> As far as I have tested this with *cancelled* queries (Cancel,
>> pg_cancel_backend(), statement_timeout, ...), I haven't found any problem.
>> Would limiting the PG_CATCH block to those *cancelled* queries,
>> and *no other error*, be an alternative solution ?
> 
> I do not see that that would make one iota of difference to the risk that
> the executor state tree is inconsistent at the instant the error is
> thrown.  You can't test your way to the conclusion that it's safe, either
> (much less that it'd remain safe); your test cases surely haven't hit
> every CHECK_FOR_INTERRUPTS call in the backend.
> 
>             regards, tom lane


New try:

 Considering that the executor state tree is limited to QueryDesc->estate,
 that would mean the processed row count cannot be trusted, but that
 queryid, buffers and *duration* (the most important one)
 can still be used ?
  
 Knowing that shared hashtable entries are now (in pg13) created at
 planning time, there is no need to create a new entry on an execution error:
 just update counters (the current ones, or new columns like "errors",
 "total_error_time", ... added to the pg_stat_statements view); a rough
 sketch of what those counters could look like follows.
 
Is that better ?
 
Regards
PAscal


