[BUG] pg_stat_statements and extended query protocol - Mailing list pgsql-hackers

From Imseih (AWS), Sami
Subject [BUG] pg_stat_statements and extended query protocol
Date
Msg-id EBE6C507-9EB6-4142-9E4D-38B1673363A7@amazon.com
Whole thread Raw
Responses Re: [BUG] pg_stat_statements and extended query protocol  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers

Doing some work with extended query protocol, I encountered the same
issue that was discussed in [1]. It appears when a client is using
extended query protocol and sends an Execute message to a portal with
max_rows, and a portal is executed multiple times,
pg_stat_statements does not correctly track rows and calls.

Using the attached jdbc script, TEST.java, which can reproduce the issue
with setFetchSize of 100 with autocommit mode set to OFF. We can
see that although pg_class has 414 rows, the total call and
rows returned is 14. the first 4 * 100 fetches did not get accounted for,.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
1 | 14 | select * from pg_class
(1 row)

The execution work flow goes something like this:
ExecutorStart
ExecutorRun – which will be called multiple times to fetch from the
                           portal until the caller Closes the portal or the portal
                           runs out of rows.
ExecutorFinish
ExecutorEnd – portal is closed & pg_stat_statements stores the final rows processed

Where this breaks for pg_stat_statements is during ExecutorRun,
es_processed is reset to 0 every iteration. So by the time the portal
is closed, es_processed will only show the total from the last execute
message.

This appears to be only an issue for portals fetched
through extended query protocol and not explicit cursors
that go through simple query protocol (i.e. FETCH <cursor>)

I attached a JDBC script to repro the issue.

One potential fix I see is to introduce 2 new counters in the
ExecutionState which will track the total rows processed
and the number of calls. These counters can then be used
by pg_stat_statements. Attached is an experimental patch
which shows the correct number of rows and number of
calls.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
5 | 414 | select * from pg_class
(1 row)

[1] https://www.postgresql.org/message-id/flat/c90890e7-9c89-c34f-d3c5-d5c763a34bd8%40dunslane.net

Thanks


Sami Imseih
Amazon Web Services (AWS)

 

Attachment

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security
Next
From: Justin Pryzby
Date:
Subject: Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound (stop telling users to "vacuum that database in single-user mode")