Hi,
On Wed, Aug 10, 2022 at 10:42:31PM +0500, Ibrar Ahmed wrote:
> On Mon, Aug 8, 2022 at 10:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> > First, for now each entry contains its own
> > query text in the query file. There can already be some duplication, which
> > isn't great, but adding the application_name and/or IP address will make
> > things
> > way worse, so you would probably need to fix that first.
>
> I doubt that makes it worst because these (IP and Application) will be part
> of
> the key, not the query text.
It's because you want to add new elements to the key that it would make it
worse, as the exact same query text will be stored much more often.
> But yes, I agree that it will increase the
> footprint of rows,
> excluding query text.
I don't think that this part should be a concern.
> I am not 100% sure about the query text duplication but will look at that
> in detail,
> if you have more insight, then it will help to solve that.
You can refer to the mentioned thread for the (only?) discussion about that.
> > There has been some
> > discussion about it recently (1) but more work and benchmarking are needed.
> >
> > The other problem is the multiplication of entries. It's a well known
> > limitation that pg_stat_statements eviction are so costly that it makes it
> > unusable. The last numbers I saw about it was ~55% overhead (2). Adding
> > application_name or ip address to the key would probably make
> > pg_stat_statements unusable for anyone who would actually need those
> > metrics.
> >
>
> I am sure adding a new item in the key does not affect the performance of
> evictions of the row,
> as it will not be part of that area. I am doing some benchmarking and
> hacking to reduce that and will
> send results with the patch.
Sorry if that was unclear. I didn't meant that adding new items to the key
would make evictions way costlier (although it would have some impact), but
much more frequent. Adding application_name and the IP to the key can very
easily amplify the number of entries so much that you will either need an
unreasonable value for pg_stat_statements.max (which would likely bring its own
set of problems) if possible at all, or evict entries frequently.