Re: Doc update for pg_stat_statements normalization - Mailing list pgsql-hackers

From Imseih (AWS), Sami
Subject Re: Doc update for pg_stat_statements normalization
Date
Msg-id 10015573-0539-43BC-B2EB-14E39AB7BC46@amazon.com
Whole thread Raw
In response to Re: Doc update for pg_stat_statements normalization  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Doc update for pg_stat_statements normalization
List pgsql-hackers
>    On Sat, Feb 25, 2023 at 01:59:04PM +0000, Imseih (AWS), Sami wrote:>
>    > The overhead of storing this additional private data for the life of the query
>    > execution may not be  desirable.

>    Okay, but why?

Additional memory to maintain the JumbleState data in other structs, and
the additional copy operations.

>    This seems like the key point to me here.  If we copy more information
>    into the Query structures, then we basically have no need for sticky
>    entries, which could be an advantage on its own as it simplifies the
>    deallocation and lookup logic.

Removing the sticky entry logic will be a big plus, and of course we can
eliminate a query not showing up properly normalized.

>    For a DML or a SELECT, the manipulation of the hash table would still
>    be a three-step process (post-analyze, planner and execution end), but
>    the first step would have no need to use an exclusive lock on the hash
>    table because we could just read and copy over the Query the
>    normalized query if an entry exists, meaning that we could actually
>    relax things a bit?  

No lock is held while normalizing, and a shared lock is held while storing,
so there is no apparent benefit from that aspect.

>    This relaxation has as cost the extra memory used
>    to store more data to allow the insertion to use a proper state of the
>    Query[Desc] coming from the JumbleState (this extra data has no need
>    to be JumbleState, just the results we generate from it aka the
>    normalized query).

Wouldn't be less in terms of memory usage to just store the required
JumbleState fields in Query[Desc]?

clocations_count,
highest_extern_param_id,
clocations_locations,
clocations_length

>    > In v14, we added a dealloc metric to pg_stat_statements_info, which is helpful.
>    > However, this only deals with the pgss_hash entry deallocation.
>    > I think we should also add a metric for the text file garbage collection.

>    This sounds like a good idea on its own.

I can create a separate patch for this.

Regards,

--
Sami Imseih
Amazon Web services


pgsql-hackers by date:

Previous
From: Kirk Wolak
Date:
Subject: Re: Proposal: :SQL_EXEC_TIME (like :ROW_COUNT) Variable (psql)
Next
From: Andres Freund
Date:
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)