Thread: Improve cursor handling in pg_stat_statements
Hi Hackers,
A recent observation in the way pg_stat_statements
handles CURSORS appears to be undesirable. This
was also recently brought up by Fujii Masao in the thread [1].
The findings are:
1. DECLARE CURSOR statements are not normalized.
2. The statistics are aggregated on the FETCH
statement.
3. Planning time is not tracked for DECLARE CURSOR
statements.
For #1, the concern is for applications that heavily
use cursors could end up seeing heavy pgss thrashing if
the query parameter change often.
For #2, since the FETCH statement only deals with a
cursor name, similar cursor names with different
underlying cursor statements will be lumped into
the same entry, which is absolutely incorrect from a
statistics perspective. Even if the same cursor name
is always for the same underlying statement, the pgss
user has to do extra parsing work to figure out which
underlying SQL statement is for the cursor.
For #3, planning time for cursors not being considered is
because pgss always sets queryId to 0 for utility statements,
and pgss_planner is taught to ignore queryId's = 0. This should
not be the case if the UTILITY statement has an underlying
optimizable statement.
I have attached v01-improve-cursor-tracking-in-pg_stat_statements.patch
which does the following:
## without the patch
postgres=# begin;
BEGIN
postgres=*# declare c1 cursor for select * from foo where id = 1;
DECLARE CURSOR
postgres=*# fetch c1; close c1;
id
----
1
(1 row)
CLOSE CURSOR
postgres=*# declare c1 cursor for select * from foo where id = 2;
DECLARE CURSOR
postgres=*# fetch c1; close c1;
id
----
2
(1 row)
CLOSE CURSOR
postgres=*# select query, calls from pg_stat_statements where query like '%c1%';
query | calls | plans
----------------------------------------------------------------------+--------+-------
declare c1 cursor for select * from foo where id = 1 | 1 | 0
declare c1 cursor for select * from foo where id = 2 | 1 | 0
close c1 | 2 | 0
fetch c1 | 2 | 0
(4 rows)
### with the patch
postgres=# begin;
BEGIN
postgres=*# declare c1 cursor for select * from foo where id = 1;
DECLARE CURSOR
postgres=*# fetch c1;
id
----
1
(1 row)
postgres=*# close c1;
CLOSE CURSOR
postgres=*# declare c1 cursor for select * from foo where id = 1;
DECLARE CURSOR
postgres=*# fetch c1;
id
----
1
(1 row)
postgres=*# close c1;
CLOSE CURSOR
postgres=*# select query, calls from pg_stat_statements where query like '%c1%';
postgres=*# select query, calls, plans from pg_stat_statements where query like '%c1%';
query | calls | plans
------------------------------------------------------------------------+------+-------
declare c1 cursor for select * from foo where id = $1 | 2 | 2
(1 row)
We can see that:
A. without the patch, planning stats were not considered,
but with the patch they are.
B. With the patch, the queries are normalized,
reducing the # of entries.
C. With the patch, the cursor stats are tracked by the top
level DECLARE CURSOR statement instead of the FETCH. This
means all FETCHes to the same cursor, regardless of the
FETCH options, will be tracked together. This to me is
more reasonable than lumping all FETCH stats for a cursor,
even if the cursors underlying statement is different.
D. The CLOSE <cursor> statement is also not tracked any
longer with the patch.
E. For queryId Jumbling, an underlying statement will not have
the same queryId as a similar statement that is not
run in a cursor. i.e. "select * from tab" will have a differet
queryId from "declare cursor c1 for select * from tab"
Feedback on this patch will be appreciated.
Regards,
[1] https://www.postgresql.org/message-id/37d32e91-4a08-afaf-a3a8-fd0578e4db50%40oss.nttdata.com
--
Sami Imseih
Amazon Web Services (AWS)