Thread: Is it possible to log long running recursive/internal postgresql queries?

Is it possible to log long running recursive/internal postgresql queries?

From: Hotmail
Date:
Hi all,


Postgres version: 11.17
OS: Linux

We recently had an interesting missing foreign key index issue that was very hard to detect. It involved a parent/child relationship where the data type of the parent key did not match the data type of the child key. (It's a longer story about why they don't match, so I won't get into it here.) The issue presented itself in our database as long running commits (> 1 minute). We immediately suspected an unindexed foreign key, but after running a query to identify any unindexed foreign keys we couldn't find one.
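
For illustration, the kind of check we ran was along these lines (a simplified sketch that only covers single-column foreign keys):

SELECT c.conrelid::regclass AS child_table,
       c.conname            AS fk_constraint
FROM pg_catalog.pg_constraint c
WHERE c.contype = 'f'
  AND cardinality(c.conkey) = 1
  AND NOT EXISTS (
        SELECT 1
        FROM pg_catalog.pg_index i
        WHERE i.indrelid = c.conrelid
          AND i.indkey[0] = c.conkey[1]   -- child FK column is the leading index column
      );

It reported nothing, because (as it turned out) the child column was indexed; the index just could not be used, as described below.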

After some investigating we learned that the long running commit was in fact being caused by the check for child keys when deleting a parent record as part of a multi-statement SQL transaction. The internal query that Postgres was running looked like this:


"SELECT 1 FROM ONLY “<app_schema>”.”<child_table>" x WHERE $1 OPERATOR(pg_catalog.=) “<child_column>"::pg_catalog."numeric" FOR KEY SHARE OF x"

The problem ended up being that the parent key data type (numeric) was different from the child key data type (bigint). So when we deleted the parent, the check for child records on the child table ignored the existing index on the bigint child column because of the differing data types (numeric vs bigint). The short-term workaround was to create an expression index casting the child column to numeric, which resolved the long running commits.
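
The index was something along these lines (illustrative index name; schema/table/column names elided as in the query above):

CREATE INDEX child_fk_numeric_idx
    ON "<app_schema>"."<child_table>" ((<child_column>::numeric));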

We have log_min_duration_statement set to log any query that runs longer than 2 seconds, but the long running internal statement was not logged. My question: is there any way to record an entry in the Postgres server log for the long running internal/recursive query that was checking for child records as a result of deleting the parent record?
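
(For reference, that is simply this line in postgresql.conf:)

log_min_duration_statement = 2000    # milliseconds; log statements that run longer than 2 s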

The only way we were able to identify the long running query that was checking for child records was to Ctrl-C out of the long running parent delete statement in a test system. We tried set log_statement = 'all' in psql before executing the long running delete in the test system, but the internal check for child records was not logged.

If it is possible to log long running internal queries as outlined above any guidance would be appreciated. 

Regards,

Craig Jackson


Re: Is it possible to log long running recursive/internal postgresql queries?

From: Tom Lane
Date:
Hotmail <crajac66@hotmail.com> writes:
> If it is possible to log long running internal queries as outlined above any guidance would be appreciated.

Either auto_explain or pg_stat_statements could have helped you
with that, I think.  I believe (without having tested) that if
configured to track nested statements they would perceive the
FK enforcement query as a separate statement to track.

            regards, tom lane



Re: Is it possible to log long running recursive/internal postgresql queries?

From: Laurenz Albe
Date:
On Fri, 2023-02-17 at 13:28 -0500, Tom Lane wrote:
> Hotmail <crajac66@hotmail.com> writes:
> > If it is possible to log long running internal queries as outlined above any guidance would be appreciated.
>
> Either auto_explain or pg_stat_statements could have helped you
> with that, I think.  I believe (without having tested) that if
> configured to track nested statements they would perceive the
> FK enforcement query as a separate statement to track.

Yes, they do.
You need to set "auto_explain.log_nested_statements = on" or
"pg_stat_statements.track = all".

Yours,
Laurenz Albe



Re: Is it possible to log long running recursive/internal postgresql queries?

From: Hotmail
Date:
Thanks for the guidance. FYI, this is what I observed after enabling auto_explain in my test system and tracking nested statements in pg_stat_statements.


>load 'auto_explain';
LOAD
>set auto_explain.log_min_duration=10000;
SET
>set auto_explain.log_nested_statements=true;
SET

>delete from parent_oids where parent_oid=281907;
DELETE 1
Time: 18554.906 ms (00:18.555)

Server log entries:

2023-02-20 08:38:22 MST [30367]: user=cjackson,db=test1,app=psql,client=[local] CONTEXT: SQL statement "SELECT 1 FROM ONLY "cjackson"."child_oids" x WHERE $1 OPERATOR(pg_catalog.=) "child_oid"::pg_catalog."numeric" FOR KEY SHARE OF x"
2023-02-20 08:38:22 MST [30367]: user=cjackson,db=test1,app=psql,client=[local] LOG: duration: 18621.383 ms plan:
    Query Text: delete from parent_oids where parent_oid=281907;
    Delete on parent_oids (cost=0.42..2.54 rows=1 width=6)
      ->  Index Scan using pk_parent_oid on parent_oids (cost=0.42..2.54 rows=1 width=6)
            Index Cond: (parent_oid = '281907'::numeric)

pg_stat_statements

>select total_time, mean_time, query from pg_stat_statements where mean_time > 10000 and query like '%FOR KEY SHARE OF x%' \gx
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------
total_time | 59550.947124
mean_time | 19850.315708
query | SELECT $2 FROM ONLY "cjackson"."child_oids" x WHERE $1 OPERATOR(pg_catalog.=) "child_oid"::pg_catalog."numeric" FOR KEY SHARE OF x


Craig

Re: Is it possible to log long running recursive/internal postgresql queries?

From: Laurenz Albe
Date:
On Mon, 2023-02-20 at 07:53 -0800, Hotmail wrote:
> this is what I observed after enabling auto_explain in my test system and tracking nested statements in pg_stat_statements.
>
> Server log entries:
>
> 2023-02-20 08:38:22 MST [30367]: user=cjackson,db=test1,app=psql,client=[local] CONTEXT: SQL statement "SELECT 1 FROM ONLY "cjackson"."child_oids" x WHERE $1 OPERATOR(pg_catalog.=) "child_oid"::pg_catalog."numeric" FOR KEY SHARE OF x"
> 2023-02-20 08:38:22 MST [30367]: user=cjackson,db=test1,app=psql,client=[local] LOG: duration: 18621.383 ms plan:
>     Query Text: delete from parent_oids where parent_oid=281907;
>     Delete on parent_oids (cost=0.42..2.54 rows=1 width=6)
>       ->  Index Scan using pk_parent_oid on parent_oids (cost=0.42..2.54 rows=1 width=6)
>             Index Cond: (parent_oid = '281907'::numeric)
>
> pg_stat_statements
>
> > select total_time, mean_time, query from pg_stat_statements where mean_time > 10000 and query like '%FOR KEY SHARE OF x%' \gx
> -[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------
> total_time | 59550.947124
> mean_time | 19850.315708
> query | SELECT $2 FROM ONLY "cjackson"."child_oids" x WHERE $1 OPERATOR(pg_catalog.=) "child_oid"::pg_catalog."numeric" FOR KEY SHARE OF x

You have some long running transactions that lock the row.
Long running transactions are a problem in the application (which forgot to close them).
This may be exacerbated by a SELECT ... FOR UPDATE, which conflicts with FOR KEY SHARE.
Often, SELECT ... FOR NO KEY UPDATE is the better choice.
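
For illustration, a hypothetical application transaction against the child table from your test (the FK check shown above takes FOR KEY SHARE locks on the matching child rows):

BEGIN;

-- FOR UPDATE conflicts with FOR KEY SHARE, so a concurrent delete of the
-- referenced parent row blocks until this transaction ends:
SELECT * FROM child_oids WHERE child_oid = 281907 FOR UPDATE;

-- FOR NO KEY UPDATE does not conflict with FOR KEY SHARE (fine as long as
-- the key column itself is not changed), so the parent delete can proceed:
-- SELECT * FROM child_oids WHERE child_oid = 281907 FOR NO KEY UPDATE;

COMMIT;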

Yours,
Laurenz Albe