Thread: Improve cursor handling in pg_stat_statements

Improve cursor handling in pg_stat_statements

From
"Imseih (AWS), Sami"
Date:

Hi Hackers,

 

A recent observation in the way pg_stat_statements

handles CURSORS appears to be undesirable. This

was also recently brought up by Fujii Masao in the thread [1].

 

The findings are:

1. DECLARE CURSOR statements are not normalized.

2. The statistics are aggregated on the FETCH

     statement.

3. Planning time is not tracked for DECLARE CURSOR

     statements.

 

For #1, the concern is for applications that heavily

use cursors could end up seeing heavy pgss thrashing if

the query parameter change often.

 

For #2, since the FETCH statement only deals with a

cursor name, similar cursor names with different

underlying cursor statements will be lumped into

the same entry, which is absolutely incorrect from a

statistics perspective. Even if the same cursor name

is always for the same underlying statement, the pgss

user has to do extra parsing work to figure out which

underlying SQL statement is for the cursor.

 

For #3, planning time for cursors not being considered is

because pgss always sets queryId to 0 for utility statements,

and pgss_planner is taught to ignore queryId's = 0. This should

not be the case if the UTILITY statement has an underlying

optimizable statement.

 

I have attached v01-improve-cursor-tracking-in-pg_stat_statements.patch

which does the following:

 

## without the patch

 

postgres=# begin;

BEGIN

postgres=*# declare c1 cursor for select * from foo where id = 1;

DECLARE CURSOR

postgres=*# fetch c1; close c1;

id

----

  1

(1 row)

 

CLOSE CURSOR

postgres=*#  declare c1 cursor for select * from foo where id = 2;

DECLARE CURSOR

postgres=*# fetch c1; close c1;

id

----

  2

(1 row)

 

CLOSE CURSOR

postgres=*# select query, calls from pg_stat_statements where query like '%c1%';

                        query                                                            | calls | plans

----------------------------------------------------------------------+--------+-------

declare c1 cursor for select * from foo where id = 1 |     1  |  0

declare c1 cursor for select * from foo where id = 2 |     1  |  0

close c1                                                                              |     2  |  0

fetch c1                                                                              |     2  |  0

(4 rows)

 

### with the patch

 

postgres=# begin;

BEGIN

postgres=*# declare c1 cursor for select * from foo where id = 1;

DECLARE CURSOR

postgres=*# fetch c1;

id

----

  1

(1 row)

 

postgres=*# close c1;

CLOSE CURSOR

postgres=*# declare c1 cursor for select * from foo where id = 1;

DECLARE CURSOR

postgres=*# fetch c1;

id

----

  1

(1 row)

 

postgres=*# close c1;

CLOSE CURSOR

postgres=*# select query, calls from pg_stat_statements where query like '%c1%';

postgres=*# select query, calls, plans from pg_stat_statements where query like '%c1%';

                         query                                                            | calls | plans

------------------------------------------------------------------------+------+-------

declare c1 cursor for select * from foo where id = $1 |     2  |     2

(1 row)

 

We can see that:

A. without the patch, planning stats were not considered,

     but with the patch they are.

B. With the patch, the queries are normalized,

     reducing the # of entries.

C. With the patch, the cursor stats are tracked by the top

     level DECLARE CURSOR statement instead of the FETCH. This

     means all FETCHes to the same cursor, regardless of the

     FETCH options, will be tracked together. This to me is

     more reasonable than lumping all FETCH stats for a cursor,

     even if the cursors underlying statement is different.

D. The CLOSE <cursor> statement is also not tracked any

     longer with the patch.

E. For queryId Jumbling, an underlying statement will not have

     the same queryId as a similar statement that is not

     run in a cursor. i.e. "select * from tab" will have a differet

     queryId from "declare cursor c1 for select * from tab"

 

Feedback on this patch will be appreciated.

 

Regards,

 

[1]  https://www.postgresql.org/message-id/37d32e91-4a08-afaf-a3a8-fd0578e4db50%40oss.nttdata.com

 

--

Sami Imseih

Amazon Web Services (AWS)

Attachment