Thread: pg_stat_statements : how to catch non successfully finished statements?
Hello all,

I was wondering if there is a hook to collect non successfully finished SQL statements in pg_stat_statements (timed out, cancelled, killed, or simply errored)?

Thanks in advance
Regards
PAscal

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Re: pg_stat_statements : how to catch non successfully finished statements?
From
Arthur Zakirov
Date:
Hello,

On Thu, Apr 26, 2018 at 01:24:25PM -0700, legrand legrand wrote:
> Hello all,
>
> I was wondering if there is a hook to collect non successfully finished SQL
> statements in pg_stat_statements (timed-out, cancelled, killed, or simply
> errored) ?

Some time ago I looked for such a hook. My case was to log failed access permission checks. I didn't find a way to do it, except by adding a narrow-focused hook within ExecCheckRTPerms().

There is ErrorContextCallback. Context callbacks are called when an error is raised. But I don't think that it is a good approach for the pg_stat_statements case.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
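For reference, the callback mechanism mentioned above is a singly linked stack (error_context_stack in elog.c) that the error machinery walks when an error is reported. A minimal standalone sketch of that shape, with simplified stand-ins for the real backend pieces (the struct layout mirrors ErrorContextCallback in elog.h; report_error() is an illustrative reduction of errfinish):

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as PostgreSQL's ErrorContextCallback in elog.h:
 * a singly linked stack of callbacks walked when an error is reported. */
typedef struct ErrorContextCallback
{
    struct ErrorContextCallback *previous;
    void (*callback)(void *arg);
    void *arg;
} ErrorContextCallback;

static ErrorContextCallback *error_context_stack = NULL;
static int callbacks_run = 0;

/* Simplified errfinish(): walk the stack from innermost to outermost,
 * letting each callback contribute context (here they just count). */
static void report_error(void)
{
    for (ErrorContextCallback *c = error_context_stack; c != NULL; c = c->previous)
        c->callback(c->arg);
}

static void count_cb(void *arg)
{
    callbacks_run += *(int *) arg;
}
```

As the message notes, this only fires once an error is already being reported, which is why it is awkward as a pg_stat_statements collection point.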
Re: pg_stat_statements : how to catch non successfully finished statements?
From
legrand legrand
Date:
OK, I see...

This hook should be used only for ERROR (not WARNING nor NOTICE...), and the only really interesting information is:

    ErrorData -> internalquery;  /* text of internally-generated query */

This doesn't make it possible to (re)build the link to the queryid (which is based on the parse tree, not available here).

So the only solution is to add queryId to ErrorData in this hook, or to create a new hook fired on ERROR and containing queryId?

Regards
PAscal
legrand legrand <legrand_legrand@hotmail.com> writes:
> So the only solution is to add queryId to ErrorData in this hook
> or create a new hook fired on ERROR and containing queryId ?

I see no particular need for a new hook. What's needed here is for pgss_ExecutorRun (and maybe some other existing functions in pg_stat_statements) to go ahead and record the statement when they catch an error thrown out of standard_ExecutorRun, rather than just updating the module's nested_level variable and re-throwing.

The hard part here is that you have to be really careful what you do in a PG_CATCH block, because the only thing you know for sure about the backend's state is that it's not good. Catalog fetches are right out, and anything that might itself throw an error had best be avoided as well. (Which, among other things, means that examining executor state would be a bad idea, and I'm not even sure you'd want to traverse the plan tree.)

I'm not convinced that it's practical for pg_stat_statements to make a new shared hashtable entry under those constraints. But figuring out how to minimize the risks around that is the stumbling block, not lack of a hook.

			regards, tom lane
Re: pg_stat_statements : how to catch non successfully finished statements?
From
legrand legrand
Date:
OK, no need for a new hook. I'll try to implement what you suggest here, but this clearly exceeds my (poor) development skills.

As you also noticed, in addition to collecting SQL statement counters for the corresponding QueryId, I would have been interested in its PlanId... but that's another subject.

Regards
PAscal
Re: pg_stat_statements : how to catch non successfully finished statements?
From
legrand legrand
Date:
Progress report on this subject.

1/ Some clarifications:

What is expected here is to update pgss counters for ALL the queries that have been executed, taking into account queries finished in SUCCESS and those finished with ERROR. The main interest is to catch queries that are cancelled or interrupted by a timeout after a long running period, which can be hours in BI reporting environments.

Errors during parsing are not caught, because there is no hook available before parsing, and because without a parse tree available the QueryId cannot be calculated. This is not a problem yet, because the objective is not to count errors.

There was a remark off-list saying that cumulating SUCCESS and ERROR counters for the same query could be a problem for those monitoring AVG indicators (the combined averages no longer matching the SUCCESS-only ones). One proposal is to add a Boolean flag (success) to the key of pgss:

    dbid  userid  queryid  success  calls  total_time ...
    1     1       123      t          100    100 000
    1     1       123      f           10  1 000 000

2/ Modifying pgss_ExecutorRun (as suggested by Tom Lane) with:

    PG_CATCH();
    {
        /* Added part to get counters on errors */
        EState *estate;

        if (queryDesc->totaltime && pgss_enabled())
        {
            estate = queryDesc->estate;
            InstrStopNode(queryDesc->totaltime, estate->es_processed);
            InstrEndLoop(queryDesc->totaltime);
            pgss_store(queryDesc->sourceText,
                       queryDesc->plannedstmt->queryId,
                       queryDesc->plannedstmt->stmt_location,
                       queryDesc->plannedstmt->stmt_len,
                       queryDesc->totaltime->total * 1000.0, /* convert to msec */
                       queryDesc->estate->es_processed,
                       &queryDesc->totaltime->bufusage,
                       NULL);
        }
        nested_level--;
        PG_RE_THROW();
    }
    PG_END_TRY();

This permits catching simple queries in pgss (to be enhanced for utility statements, PL/pgSQL, parallel queries, ...).

Would such code have a chance of being validated? Feedback is welcome.

Regards
PAscal
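A standalone sketch of the proposed key change, using hypothetical struct and function names that mimic pg_stat_statements' find-or-create logic (not the real shared hash table code), showing that the success flag yields two separate counter rows for the same queryid:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the proposed pgss hash key; field names follow the example
 * layout in the message above, not the real pg_stat_statements struct. */
typedef struct pgssHashKey
{
    uint32_t dbid;
    uint32_t userid;
    uint64_t queryid;
    bool     success;       /* the proposed new key column */
} pgssHashKey;

typedef struct pgssEntry
{
    pgssHashKey key;
    long        calls;
    double      total_time; /* milliseconds */
} pgssEntry;

#define MAX_ENTRIES 64
static pgssEntry entries[MAX_ENTRIES];
static int n_entries = 0;

static bool keys_equal(const pgssHashKey *a, const pgssHashKey *b)
{
    /* Compare field by field: with success in the key, one statement
     * gets two distinct counter rows, one per outcome. */
    return a->dbid == b->dbid && a->userid == b->userid &&
           a->queryid == b->queryid && a->success == b->success;
}

/* Find-or-create the entry for this key and accumulate counters,
 * a toy version of what pgss_store does against its hash table. */
static pgssEntry *pgss_store_toy(pgssHashKey key, double elapsed_ms)
{
    for (int i = 0; i < n_entries; i++)
    {
        if (keys_equal(&entries[i].key, &key))
        {
            entries[i].calls++;
            entries[i].total_time += elapsed_ms;
            return &entries[i];
        }
    }
    assert(n_entries < MAX_ENTRIES);
    entries[n_entries].key = key;
    entries[n_entries].calls = 1;
    entries[n_entries].total_time = elapsed_ms;
    return &entries[n_entries++];
}
```

Replaying the example rows (100 successes at 1 000 ms each, 10 errors at 100 000 ms each) produces exactly the two-row layout shown in the message.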
Re: pg_stat_statements : how to catch non successfully finished statements?
From
legrand legrand
Date:
Hello,

Here is a patch that:
- adds a new GUC: pg_stat_statements.track_errors, boolean (defaults to true),
- captures DML, DDL and PL/pgSQL commands that end in error into pgss.

There is always a risk that new code used in PG_CATCH (mainly pgss_store) raises an error itself. I'm not able to tell when that could occur, nor what the impact would be...

pgss_with_errors.patch <http://www.postgresql-archive.org/file/t348768/pgss_with_errors.patch>
pgss_with_errors.txt <http://www.postgresql-archive.org/file/t348768/pgss_with_errors.txt>

Regards
PAscal
Re: pg_stat_statements : how to catch non successfully finished statements?
From
legrand legrand
Date:
Tom Lane-2 wrote
> legrand legrand <legrand_legrand@> writes:
>> So the only solution is to add queryId to ErrorData in this hook
>> or create a new hook fired on ERROR and containing queryId ?
>
> I see no particular need for a new hook. What's needed here is for
> pgss_ExecutorRun (and maybe some other existing functions in
> pg_stat_statements) to go ahead and record the statement when they catch
> an error thrown out of standard_ExecutorRun, rather than just updating
> the module's nested_level variable and re-throwing.
>
> The hard part here is that you have to be really careful what you do in
> a PG_CATCH block, because the only thing you know for sure about the
> backend's state is that it's not good. Catalog fetches are right out,
> and anything that might itself throw an error had best be avoided as
> well. (Which, among other things, means that examining executor state
> would be a bad idea, and I'm not even sure you'd want to traverse the
> plan tree.)
>
> I'm not convinced that it's practical for pg_stat_statements to make a
> new shared hashtable entry under those constraints. But figuring out
> how to minimize the risks around that is the stumbling block, not lack
> of a hook.

As far as I have been testing this with *cancelled* queries (Cancel, pg_cancel_backend(), statement_timeout, ...), I haven't found any problem.

Would limiting the PG_CATCH block to those *cancelled* queries, and *no other error*, be an alternate solution?

If yes, is there a way to identify the reason of the error when entering the PG_CATCH block (and could you point me to an example)?

Thanks in advance.

Regards
PAscal
legrand legrand <legrand_legrand@hotmail.com> writes:
> Tom Lane-2 wrote
>> The hard part here is that you have to be really careful what you do in
>> a PG_CATCH block, because the only thing you know for sure about the
>> backend's state is that it's not good. Catalog fetches are right out,
>> and anything that might itself throw an error had best be avoided as
>> well. (Which, among other things, means that examining executor state
>> would be a bad idea, and I'm not even sure you'd want to traverse the plan
>> tree.)
>> I'm not convinced that it's practical for pg_stat_statements to make a new
>> shared hashtable entry under those constraints. But figuring out how to
>> minimize the risks around that is the stumbling block, not lack of a hook.

> As far as I have been testing this with *cancelled* queries (Cancel,
> pg_cancel_backend(), statement_timeout, ...), I haven't found any problem.
> Would limiting the PG_CATCH block to those *cancelled* queries
> and *no other error*, be an alternate solution ?

I do not see that that would make one iota of difference to the risk that the executor state tree is inconsistent at the instant the error is thrown. You can't test your way to the conclusion that it's safe, either (much less that it'd remain safe); your test cases surely haven't hit every CHECK_FOR_INTERRUPTS call in the backend.

			regards, tom lane
Re: pg_stat_statements : how to catch non successfully finished statements?
From
legrand legrand
Date:
Tom Lane-2 wrote
> legrand legrand <legrand_legrand@> writes:
>> Tom Lane-2 wrote
>>> The hard part here is that you have to be really careful what you do in
>>> a PG_CATCH block, because the only thing you know for sure about the
>>> backend's state is that it's not good. Catalog fetches are right out,
>>> and anything that might itself throw an error had best be avoided as
>>> well. (Which, among other things, means that examining executor state
>>> would be a bad idea, and I'm not even sure you'd want to traverse the
>>> plan tree.)
>>> I'm not convinced that it's practical for pg_stat_statements to make a
>>> new shared hashtable entry under those constraints. But figuring out
>>> how to minimize the risks around that is the stumbling block, not lack
>>> of a hook.
>
>> As far as I have been testing this with *cancelled* queries (Cancel,
>> pg_cancel_backend(), statement_timeout, ...), I haven't found any
>> problem.
>> Would limiting the PG_CATCH block to those *cancelled* queries
>> and *no other error*, be an alternate solution ?
>
> I do not see that that would make one iota of difference to the risk that
> the executor state tree is inconsistent at the instant the error is
> thrown. You can't test your way to the conclusion that it's safe, either
> (much less that it'd remain safe); your test cases surely haven't hit
> every CHECK_FOR_INTERRUPTS call in the backend.
>
> regards, tom lane

New try:

Considering that the executor state tree access is limited to QueryDesc->estate, that would mean that rows processed cannot be trusted, but that queryid, buffers and *duration* (which is the most important one) can still be used?

Knowing that shared hashtable entries are now (in PG 13) created at planning time, there is no need to create a new one on execution error: just update the counters (the current ones, or new columns like "errors", "total_error_time", ... added to the pg_stat_statements view).

Is that better?
Regards
PAscal
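As a quick arithmetic check of the averaging concern raised earlier in the thread, reusing the example rows (100 successful calls totalling 100 000 ms, 10 failed calls totalling 1 000 000 ms): merging outcomes shifts the mean tenfold relative to the SUCCESS-only figure, which is what a success key column, or separate error columns as proposed above, avoids.

```c
#include <assert.h>

/* Mean latency helper: total time divided by call count. */
static double avg_ms(double total_time_ms, long calls)
{
    return total_time_ms / (double) calls;
}
```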