Re: Sample rate added to pg_stat_statements - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: Sample rate added to pg_stat_statements
Date
Msg-id CAA5RZ0sBsajHdE_HPoJ-C5rp8QmfArd6a6yJo6bDMVpyW25E+Q@mail.gmail.com
In response to Re: Sample rate added to pg_stat_statements  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: Sample rate added to pg_stat_statements
List pgsql-hackers
>> You are right. This is absolutely unexpected for users. Thank you for
>> the review.
>>
>> To fix this, I suggest storing a random number in the [0, 1) range in a
>> separate variable, which will be used for comparisons in any place. We
>> will compare 'sample_rate' and random value not only in
>> pgss_post_parse_analyze(), but also in pgss_ProcessUtility() and
>> pgss_planner().
>>
>> I attached patch with test and fixes.

I still think this is not the correct approach. It seems that post_parse_analyze
should not have to deal with checking for a sample rate, because
post_parse_analyze, which is the only hook with access to jstate, is only
responsible for storing the normalized query text on disk and creating a
not-yet-user-visible entry in the hash. That is, pgss_store will never
increment counters when called from pgss_post_parse_analyze.

/* Increment the counts, except when jstate is not NULL */
if (!jstate)
{

What I think may be a better idea is to add something similar
to auto_explain.c, but it should only be added to the top of
pgss_planner, pgss_ExecutorStart and pgss_ProcessUtility.

if (nesting_level == 0)
{
    if (!IsParallelWorker())
        current_query_sampled = pg_prng_double(&pg_global_prng_state) <
                                pgss_sample_rate;
    else
        current_query_sampled = false;
}

This is needed for ExecutorStart and not ExecutorEnd because
ExecutorStart is where the instrumentation is initialized with
queryDesc->totaltime. The above code block could be
turned into a macro and reused in the routines mentioned.
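For illustration, here is a rough sketch of such a macro. The macro name is
made up for this example; current_query_sampled and pgss_sample_rate come from
the patch under discussion, while nesting_level, IsParallelWorker() and
pg_prng_double() already exist in pg_stat_statements/core:

/*
 * Sketch only: decide once per top-level statement whether to sample it.
 * PGSS_UPDATE_SAMPLING is an invented name, not part of any posted patch.
 */
#define PGSS_UPDATE_SAMPLING() \
    do { \
        if (nesting_level == 0) \
        { \
            if (!IsParallelWorker()) \
                current_query_sampled = \
                    pg_prng_double(&pg_global_prng_state) < pgss_sample_rate; \
            else \
                current_query_sampled = false; \
        } \
    } while (0)

Each of pgss_planner(), pgss_ExecutorStart() and pgss_ProcessUtility() would
then invoke it before doing anything else.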

However, it seems that with this change we would see a much higher likelihood
of non-normalized query texts being stored, because jstate is only available
during post_parse_analyze. Therefore, if the first time you sample the
statement is in ExecutorEnd, you will end up storing a non-normalized version
of the query text; see the example below with the attached v8.

postgres=# set pg_stat_statements.sample_rate = 0;
SET
postgres=# select pg_stat_statements_reset();
   pg_stat_statements_reset
-------------------------------
 2025-01-07 13:02:11.807952-06
(1 row)

postgres=# SELECT 1 \parse stmt

postgres=# \bind_named stmt \g
 ?column?
----------
        1
(1 row)

postgres=# \bind_named stmt \g
 ?column?
----------
        1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
 query | calls
-------+-------
(0 rows)

postgres=# set pg_stat_statements.sample_rate = 1;
SET
postgres=# \bind_named stmt \g
 ?column?
----------
        1
(1 row)

postgres=# \bind_named stmt \g
 ?column?
----------
        1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
 query | calls
-------+-------
(0 rows)

postgres=#  \bind_named stmt \g
 ?column?
----------
        1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
                    query                    | calls
---------------------------------------------+-------
 SELECT 1                                    |     1
 SELECT query, calls FROM pg_stat_statements |     1
(2 rows)

postgres=#  \bind_named stmt \g
 ?column?
----------
        1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
                    query                    | calls
---------------------------------------------+-------
 SELECT 1                                    |     2
 SELECT query, calls FROM pg_stat_statements |     2
(2 rows)

One idea is to make jstate available to all hooks and completely remove the
reliance on post_parse_analyze in pg_stat_statements. I can't find the thread
now, but I know this has come up in past discussions when troubleshooting gaps
in query normalization. My concern is that this feature will greatly increase
the likelihood of non-normalized query texts.
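For reference, these are the relevant core hook signatures (from
parser/analyze.h and optimizer/planner.h in recent releases); only the
post-parse-analyze hook is handed the JumbleState:

/* parser/analyze.h */
typedef void (*post_parse_analyze_hook_type) (ParseState *pstate,
                                              Query *query,
                                              JumbleState *jstate);

/* optimizer/planner.h */
typedef PlannedStmt *(*planner_hook_type) (Query *parse,
                                           const char *query_string,
                                           int cursorOptions,
                                           ParamListInfo boundParams);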

Also, with regard to the benchmarks, it seems sampling will be beneficial for
workloads that touch a small number of entries with high concurrency, which is
why we see a benefit for a standard pgbench workload. Sampling becomes less
beneficial when a large set of entries is being updated. I still think this is
a good approach for workloads that touch a small set of entries.

Regards,

Sami


