Thread: pgsql: Move pg_stat_statements query jumbling to core.

pgsql: Move pg_stat_statements query jumbling to core.

From

Bruce Momjian

Date:

07 April 2021, 17:06:59

Move pg_stat_statements query jumbling to core.

Add compute_query_id GUC to control whether a query identifier should be
computed by the core (off by default).  It's thefore now possible to
disable core queryid computation and use pg_stat_statements with a
different algorithm to compute the query identifier by using a
third-party module.

To ensure that a single source of query identifier can be used and is
well defined, modules that calculate a query identifier should throw an
error if compute_query_id specified to compute a query id and if a query
idenfitier was already calculated.

Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol

Author: Julien Rouhaud

Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5fd9dfa5f50e4906c35133a414ebec5b6d518493

Modified Files
--------------
contrib/pg_stat_statements/pg_stat_statements.c    | 805 +-------------------
contrib/pg_stat_statements/pg_stat_statements.conf |   1 +
doc/src/sgml/config.sgml                           |  25 +
doc/src/sgml/pgstatstatements.sgml                 |  20 +-
src/backend/parser/analyze.c                       |  14 +-
src/backend/tcop/postgres.c                        |   6 +-
src/backend/utils/misc/Makefile                    |   1 +
src/backend/utils/misc/guc.c                       |  10 +
src/backend/utils/misc/postgresql.conf.sample      |   1 +
src/backend/utils/misc/queryjumble.c               | 834 +++++++++++++++++++++
src/include/parser/analyze.h                       |   4 +-
src/include/utils/guc.h                            |   1 +
src/include/utils/queryjumble.h                    |  58 ++
13 files changed, 995 insertions(+), 785 deletions(-)

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Andres Freund

Date:

08 April 2021, 05:17:35

Hi,

On 2021-04-07 17:06:59 +0000, Bruce Momjian wrote:
> Move pg_stat_statements query jumbling to core.
>
> Add compute_query_id GUC to control whether a query identifier should be
> computed by the core (off by default).  It's thefore now possible to
> disable core queryid computation and use pg_stat_statements with a
> different algorithm to compute the query identifier by using a
> third-party module.
>
> To ensure that a single source of query identifier can be used and is
> well defined, modules that calculate a query identifier should throw an
> error if compute_query_id specified to compute a query id and if a query
> idenfitier was already calculated.
>
> Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol
>
> Author: Julien Rouhaud

This - or something in the vicinity - seems to break running the
regression tests with force_parallel_mode=regress. There's lots of
differences like

 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
                                     query                                     | calls | rows
 ------------------------------------------------------------------------------+-------+------
  SELECT (i + $2 + $3)::INTEGER                                                |     2 |    2
  SELECT (i + $2)::INTEGER LIMIT $3                                            |     2 |    2
- SELECT PLUS_ONE($1)                                                          |     2 |    2
- SELECT PLUS_TWO($1)                                                          |     2 |    2
- SELECT pg_stat_statements_reset()                                            |     1 |    1
+ SELECT PLUS_ONE($1)                                                          |     4 |    4
+ SELECT PLUS_TWO($1)                                                          |     4 |    4
+ SELECT pg_stat_statements_reset();                                           |     2 |    2
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" |     0 |    0
 (6 rows)

 SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
  calls |                                          query
 -------+------------------------------------------------------------------------------------------
-     1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id
+     2 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR KEY SHARE
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR NO KEY UPDATE
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR SHARE
      2 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE NOWAIT
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_a
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b, pgss_a
      1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE SKIP LOCKED
      0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
-     1 | SELECT pg_stat_statements_reset()
+     2 | SELECT pg_stat_statements_reset();
 (12 rows)


Greetings,

Andres Freund

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Julien Rouhaud

Date:

08 April 2021, 05:22:49

Le jeu. 8 avr. 2021 à 13:17, Andres Freund <andres@anarazel.de> a écrit :

This - or something in the vicinity - seems to break running the
regression tests with force_parallel_mode=regress. There's lots of
differences like

SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
query | calls | rows
------------------------------------------------------------------------------+-------+------
SELECT (i + $2 + $3)::INTEGER | 2 | 2
SELECT (i + $2)::INTEGER LIMIT $3 | 2 | 2
- SELECT PLUS_ONE($1) | 2 | 2
- SELECT PLUS_TWO($1) | 2 | 2
- SELECT pg_stat_statements_reset() | 1 | 1
+ SELECT PLUS_ONE($1) | 4 | 4
+ SELECT PLUS_TWO($1) | 4 | 4
+ SELECT pg_stat_statements_reset(); | 2 | 2
SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" | 0 | 0
(6 rows)

SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------------------------
- 1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id
+ 2 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR KEY SHARE
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR NO KEY UPDATE
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR SHARE
2 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE NOWAIT
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_a
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b, pgss_a
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE SKIP LOCKED
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
- 1 | SELECT pg_stat_statements_reset()
+ 2 | SELECT pg_stat_statements_reset();
(12 rows)

oh, I think it's because parallel workers now have the queryid of the main query. Probably ignoring parallel workers in the executor end hook will fix the problem. I'll look at it as soon as I will be back home.

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Julien Rouhaud

Date:

08 April 2021, 06:11:02

On Thu, Apr 08, 2021 at 01:22:49PM +0800, Julien Rouhaud wrote:
> Le jeu. 8 avr. 2021 à 13:17, Andres Freund <andres@anarazel.de> a écrit :
> 
> >
> > This - or something in the vicinity - seems to break running the
> > regression tests with force_parallel_mode=regress. There's lots of
> > differences like
> >
> >  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE
> > "C";
> >                                      query
> >      | calls | rows
> >
> >  ------------------------------------------------------------------------------+-------+------
> >   SELECT (i + $2 + $3)::INTEGER
> >     |     2 |    2
> >   SELECT (i + $2)::INTEGER LIMIT $3
> >     |     2 |    2
> > - SELECT PLUS_ONE($1)
> >     |     2 |    2
> > - SELECT PLUS_TWO($1)
> >     |     2 |    2
> > - SELECT pg_stat_statements_reset()
> >     |     1 |    1
> > + SELECT PLUS_ONE($1)
> >     |     4 |    4
> > + SELECT PLUS_TWO($1)
> >     |     4 |    4
> > + SELECT pg_stat_statements_reset();
> 
> oh, I think it's because parallel workers now have the queryid of the main
> query. Probably ignoring parallel workers in the executor end hook will fix
> the problem. I'll look at it as soon as I will be back home.

That was indeed the problem.  I think the best is to entirely ignore parallel
workers in pg_stat_statements, as done in attached patch, which fixes the
regression tests with force_parallel_mode = regress.

Attachment

v1-0001-Ignore-parallel-workers-in-pg_stat_statements.patch

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Amit Kapila

Date:

08 April 2021, 06:38:02

On Thu, Apr 8, 2021 at 11:40 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Apr 08, 2021 at 01:22:49PM +0800, Julien Rouhaud wrote:
> > Le jeu. 8 avr. 2021 à 13:17, Andres Freund <andres@anarazel.de> a écrit :
> >
> > >
> > > This - or something in the vicinity - seems to break running the
> > > regression tests with force_parallel_mode=regress. There's lots of
> > > differences like
> > >
> > >  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE
> > > "C";
> > >                                      query
> > >      | calls | rows
> > >
> > >  ------------------------------------------------------------------------------+-------+------
> > >   SELECT (i + $2 + $3)::INTEGER
> > >     |     2 |    2
> > >   SELECT (i + $2)::INTEGER LIMIT $3
> > >     |     2 |    2
> > > - SELECT PLUS_ONE($1)
> > >     |     2 |    2
> > > - SELECT PLUS_TWO($1)
> > >     |     2 |    2
> > > - SELECT pg_stat_statements_reset()
> > >     |     1 |    1
> > > + SELECT PLUS_ONE($1)
> > >     |     4 |    4
> > > + SELECT PLUS_TWO($1)
> > >     |     4 |    4
> > > + SELECT pg_stat_statements_reset();
> >
> > oh, I think it's because parallel workers now have the queryid of the main
> > query. Probably ignoring parallel workers in the executor end hook will fix
> > the problem. I'll look at it as soon as I will be back home.
>
> That was indeed the problem.  I think the best is to entirely ignore parallel
> workers in pg_stat_statements,
>

I haven't studied this patch but I have a question here. We normally
do accumulate the stats from parallel workers for the purpose of
Explain Analyze, so won't ignoring them in pg_stat_statements a bit
inconsistent but if that is the case even before this patch then that
might be fine?

--
With Regards,
Amit Kapila.

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Julien Rouhaud

Date:

08 April 2021, 06:49:19

On Thu, Apr 08, 2021 at 12:08:02PM +0530, Amit Kapila wrote:
> On Thu, Apr 8, 2021 at 11:40 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > That was indeed the problem.  I think the best is to entirely ignore parallel
> > workers in pg_stat_statements,
> >
> 
> I haven't studied this patch but I have a question here. We normally
> do accumulate the stats from parallel workers for the purpose of
> Explain Analyze, so won't ignoring them in pg_stat_statements a bit
> inconsistent but if that is the case even before this patch then that
> might be fine?

I think that this patch shouldn't change pg_stat_statements previous behavior,
and that the previous behavior was correct.

It doesn't change the way we get instrumentation data from parallel workers,
just avoid to have parallel workers call pgss_store and report multiple time
the same activity.  The only difference is that now pg_stat_statements won't
setup instrumentation flags, but those should be inherited in parallel worker
from the main process right?

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Amit Kapila

Date:

08 April 2021, 09:22:26

On Thu, Apr 8, 2021 at 12:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Apr 08, 2021 at 12:08:02PM +0530, Amit Kapila wrote:
> > On Thu, Apr 8, 2021 at 11:40 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > That was indeed the problem.  I think the best is to entirely ignore parallel
> > > workers in pg_stat_statements,
> > >
> >
> > I haven't studied this patch but I have a question here. We normally
> > do accumulate the stats from parallel workers for the purpose of
> > Explain Analyze, so won't ignoring them in pg_stat_statements a bit
> > inconsistent but if that is the case even before this patch then that
> > might be fine?
>
> I think that this patch shouldn't change pg_stat_statements previous behavior,
> and that the previous behavior was correct.
>
> It doesn't change the way we get instrumentation data from parallel workers,
> just avoid to have parallel workers call pgss_store and report multiple time
> the same activity.  The only difference is that now pg_stat_statements won't
> setup instrumentation flags, but those should be inherited in parallel worker
> from the main process right?
>

Yeah, we do pass instrumentation flags from leader to worker but not
the one allocated via pg_stat_staments (pgss_ExecutorStart {...
queryDesc->totaltime ...}. However, I guess prior to this patch those
were anyway not allocated separately because queryid was not passed to
workers. So, I think you are right it shouldn't change the previous
behavior.

I am not completely sure but it seems we were not accumulating the
instrumentation info by workers required by pg_stat_statements neither
before this patch nor after but I might be missing something.

-- 
With Regards,
Amit Kapila.

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Julien Rouhaud

Date:

08 April 2021, 09:42:21

On Thu, Apr 08, 2021 at 02:52:26PM +0530, Amit Kapila wrote:
> 
> Yeah, we do pass instrumentation flags from leader to worker but not
> the one allocated via pg_stat_staments (pgss_ExecutorStart {...
> queryDesc->totaltime ...}.

Ah indeed, I now remember that problem that I had to deal with in some
extensions [1].  For the record this was initially raised by Tomas in [2], but
unfortunately that thread got forgotten and probably all third-party extensions
ended up allocating some shmem for their need.

> However, I guess prior to this patch those
> were anyway not allocated separately because queryid was not passed to
> workers. So, I think you are right it shouldn't change the previous
> behavior.

I think that previously parallel workers always added the required
instrumentation if pg_stat_statements is loaded, but InstrEndLoop() wasn't
called by pgss itself so I'm not sure what happens in that case.

> I am not completely sure but it seems we were not accumulating the
> instrumentation info by workers required by pg_stat_statements neither
> before this patch nor after but I might be missing something.

Same here.  I'll try to dig to get an answer, but it seems to me that it's an
old bug and not really a new one.  It would probably still make sense to add an
open item for this in pg14 as it's evidently more broken that before.

[1]: https://github.com/powa-team/pg_qualstats/issues/9
[2]: https://www.postgresql.org/message-id/3f62f24e-51b3-175c-9222-95f25fd2a9d6@2ndquadrant.com

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Amit Kapila

Date:

08 April 2021, 10:22:02

On Thu, Apr 8, 2021 at 3:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Apr 08, 2021 at 02:52:26PM +0530, Amit Kapila wrote:
> >
> > Yeah, we do pass instrumentation flags from leader to worker but not
> > the one allocated via pg_stat_staments (pgss_ExecutorStart {...
> > queryDesc->totaltime ...}.
>
> Ah indeed, I now remember that problem that I had to deal with in some
> extensions [1].  For the record this was initially raised by Tomas in [2], but
> unfortunately that thread got forgotten and probably all third-party extensions
> ended up allocating some shmem for their need.
>
> > However, I guess prior to this patch those
> > were anyway not allocated separately because queryid was not passed to
> > workers. So, I think you are right it shouldn't change the previous
> > behavior.
>
> I think that previously parallel workers always added the required
> instrumentation if pg_stat_statements is loaded,
>

How? AFAICS, in pgss_ExecutorStart(), we allocate the instrumentation
structure only if queryid is non-zero which prior to this patch
shouldn't happen.


-- 
With Regards,
Amit Kapila.

Re: pgsql: Move pg_stat_statements query jumbling to core.

From

Julien Rouhaud

Date:

08 April 2021, 10:56:09

On Thu, Apr 08, 2021 at 03:52:02PM +0530, Amit Kapila wrote:
> On Thu, Apr 8, 2021 at 3:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > On Thu, Apr 08, 2021 at 02:52:26PM +0530, Amit Kapila wrote:
> > >
> > > Yeah, we do pass instrumentation flags from leader to worker but not
> > > the one allocated via pg_stat_staments (pgss_ExecutorStart {...
> > > queryDesc->totaltime ...}.
> >
> > Ah indeed, I now remember that problem that I had to deal with in some
> > extensions [1].  For the record this was initially raised by Tomas in [2], but
> > unfortunately that thread got forgotten and probably all third-party extensions
> > ended up allocating some shmem for their need.
> >
> > > However, I guess prior to this patch those
> > > were anyway not allocated separately because queryid was not passed to
> > > workers. So, I think you are right it shouldn't change the previous
> > > behavior.
> >
> > I think that previously parallel workers always added the required
> > instrumentation if pg_stat_statements is loaded,
> >
> 
> How? AFAICS, in pgss_ExecutorStart(), we allocate the instrumentation
> structure only if queryid is non-zero which prior to this patch
> shouldn't happen.

Doh right, I think I need some sleep :)

So I tried to run that scenario against pg13 and master + the ignore parallel
worker patch:

CREATE TABLE pgss AS SELECT generate_series(1, 10000) id;

ALTER TABLE pgss SET (max_parallel_workers = 10);
SET force_parallel_mode = on;
SET parallel_leader_participation = off;
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

SELECT version();
SELECT pg_prewarm('pgss');

EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS OFF) SELECT COUNT(*) FROM pgss;
SELECT pg_stat_statements_reset();
SELECT COUNT(*) FROM pgss;
SELECT query, shared_blks_hit FROM pg_stat_statements WHERE query LIKE '%pgss%';

On pg13, I get:

 pg_prewarm
------------
         45
(1 row)

                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 13.2@49076fd3ba on x86_64-pc-linux-gnu, compiled by gcc (Gentoo 10.2.0-r5 p6) 10.2.0, 64-bit
(1 row)

                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Gather (actual time=8.124..9.004 rows=1 loops=1)
   Output: (count(*))
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   Buffers: shared hit=45
   ->  Aggregate (actual time=3.649..3.650 rows=1 loops=1)
         Output: count(*)
         Buffers: shared hit=45
         Worker 0:  actual time=3.649..3.650 rows=1 loops=1
           Buffers: shared hit=45
         ->  Seq Scan on public.pgss (actual time=0.057..2.331 rows=10000 loops=1)
               Output: id
               Buffers: shared hit=45
               Worker 0:  actual time=0.057..2.331 rows=10000 loops=1
                 Buffers: shared hit=45
 Planning:
   Buffers: shared hit=15 read=5
 Planning Time: 0.211 ms
 Execution Time: 9.051 ms
(20 rows)

 pg_stat_statements_reset 
--------------------------
 
(1 row)

 count 
-------
 10000
(1 row)

           query           | shared_blks_hit 
---------------------------+-----------------
 SELECT COUNT(*) FROM pgss |              45
(1 row)


On head + patch:

 pg_prewarm
------------
         45
(1 row)

------------------------------------------------------------------------------------------------------>
 PostgreSQL 14devel-O0@45885bee58 on x86_64-pc-linux-gnu, compiled by gcc (Gentoo 10.2.0-r5 p6) 10.2.0,>
(1 row)

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Gather (actual time=39.278..42.912 rows=1 loops=1)
   Output: (count(*))
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   Buffers: shared hit=45
   ->  Aggregate (actual time=20.251..20.254 rows=1 loops=1)
         Output: count(*)
         Buffers: shared hit=45
         Worker 0:  actual time=20.251..20.254 rows=1 loops=1
           Buffers: shared hit=45
         ->  Seq Scan on public.pgss (actual time=0.214..13.746 rows=10000 loops=1)
               Output: id
               Buffers: shared hit=45
               Worker 0:  actual time=0.214..13.746 rows=10000 loops=1
                 Buffers: shared hit=45
 Query Identifier: 5605505304133078562
 Planning:
   Buffers: shared hit=21
 Planning Time: 0.857 ms
 Execution Time: 43.237 ms
(21 rows)

 pg_stat_statements_reset
--------------------------

(1 row)

 count
-------
 10000
(1 row)

           query           | shared_blks_hit
---------------------------+-----------------
 SELECT COUNT(*) FROM pgss |              45
(1 row)

And I get the exact same results if I change the scenario to SET
force_parallel_mode = off.  So it seems that it was working as expected as I
can see the 45 buffers being access, and is still working as expected, although
I'm not exactly sure why right now if queryDesc->totaltime isn't somehow
gathered by the parallel worker instrastructure. (Note that 45885bee58 is my
local commit for "Ignore parallel workers in pg_stat_statements").

Also note that the buffers shown in the planning part are due to the catcache
population, so they're not expected on the 2nd query planning.