Re: pg_stat_statements locking - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: pg_stat_statements locking
Date 2022-09-12 13:18:00
Msg-id 20220912131800.ttcgyorpc3wvzwsq@jrouhaud
In response to Re: pg_stat_statements locking  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
On Mon, Sep 12, 2022 at 05:32:55PM +0500, Andrey Borodin wrote:
>
> > On 12 Sep 2022, at 13:40, Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > I'm not a fan of that patch as it now silently ignores entries if the lwlock
> > can't be acquired *immediately*, without any way to avoid that if your
> > configuration and/or workload doesn't lead to this problem, or any way to know
> > that entries were ignored.
>
> Practically, a workload with this configuration is uncommon. At least I could
> not find any reports of such locking.  But theoretically, all the
> prerequisites for a disaster are very common (a variety of queries + some QPS
> of the pg_stat_statements view + small work_mem + an occasional reset() or GC).

Simply needing to evict entries regularly is already well known to be really
expensive.  See for instance [1].

> Maybe it's only a problem for programs that use pgss. pgwatch calls pgss on
> every DB in the cluster; that's how a check once a minute became some queries
> per second.

Ah, I wasn't sure if that's what you meant in your original message.  Calling
pg_stat_statements *for every database* doesn't sound like a good idea.

Also, ideally you shouldn't need to retrieve the query texts every time.  There's
now pg_stat_statements_info.dealloc, so between that and the number of rows
returned you can easily tell whether there are query texts you haven't seen
yet, and cache those on the application side rather than retrieving them again
and again.
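
For instance, here's a minimal sketch of the application side using libpq
(hypothetical code, assuming pg_stat_statements 1.9+ for pg_stat_statements_info
and the toplevel column; error handling and the cache itself are omitted):

#include <stdlib.h>
#include <libpq-fe.h>

static long prev_dealloc = -1;
static long prev_rows = -1;

/* Poll pgss, fetching the query texts only when new entries can exist. */
static void
sample_pgss(PGconn *conn)
{
    PGresult   *res;
    long        dealloc;
    long        rows;

    res = PQexec(conn,
                 "SELECT (SELECT dealloc FROM pg_stat_statements_info),"
                 " (SELECT count(*) FROM pg_stat_statements(false))");
    dealloc = atol(PQgetvalue(res, 0, 0));
    rows = atol(PQgetvalue(res, 0, 1));
    PQclear(res);

    if (dealloc != prev_dealloc || rows != prev_rows)
    {
        /*
         * Entries may have been added or evicted: fetch the texts once and
         * merge them into a local cache keyed by (userid, dbid, toplevel,
         * queryid).
         */
        res = PQexec(conn,
                     "SELECT userid, dbid, toplevel, queryid, query"
                     " FROM pg_stat_statements(true)");
        /* ... update the text cache ... */
        PQclear(res);
    }

    /* The counters can always be read without the query texts. */
    res = PQexec(conn,
                 "SELECT userid, dbid, toplevel, queryid, calls"
                 " FROM pg_stat_statements(false)");
    /* ... process the counters, joining the texts from the cache ... */
    PQclear(res);

    prev_dealloc = dealloc;
    prev_rows = rows;
}

That way each call only pays for the texts when something actually changed.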

> Personally, I'd prefer if I could configure a timeout to acquire the lock.
> That timeout would denote the maximum delay that pgss can incur on the query.
> But we would need to extend the LWLock API for this.

Yeah, that's what I meant by "immediately" in my previous message.  That being
said, I don't know whether adding a timeout would add too much overhead to the
lwlock infrastructure.
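
Just to illustrate the semantics, a rough sketch of what such an API could look
like (purely hypothetical, this function doesn't exist; a real implementation
should integrate with the lwlock wait queue rather than polling):

#include "postgres.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "utils/timestamp.h"

/*
 * Try to acquire the lock, giving up once timeout_ms has elapsed.
 * Polling LWLockConditionalAcquire() like this is naive, but it shows
 * the intended behavior.
 */
static bool
LWLockAcquireTimeout(LWLock *lock, LWLockMode mode, int timeout_ms)
{
    TimestampTz deadline;

    deadline = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), timeout_ms);

    for (;;)
    {
        if (LWLockConditionalAcquire(lock, mode))
            return true;
        if (GetCurrentTimestamp() >= deadline)
            return false;   /* caller skips the pgss accounting */
        pg_usleep(100L);    /* arbitrary 0.1ms retry interval */
    }
}

On a timeout the calling hook would skip the entry, and should probably also
bump some counter so that users can at least know that entries were ignored.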

> > On 12 Sep 2022, at 13:40, Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > I think that the better long term approach is to move pg_stat_statements and
> > the query texts to dynamic shared memory.
>
> BTW we don't even need dynamic memory. We just need shared memory, probably
> pre-allocated.  I agree that pgss must reside in main memory only, never on
> disk.

We still need dynamic shared memory to get rid of the query text file, which is
a big problem on its own.  For the main hash table, relying on dshash could
allow increasing the maximum number of entries without a restart, which could
be cool if you're in a situation where you have a finite number of entries
that's higher than pg_stat_statements.max (like after creating a new role or
something).
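
To be a bit more concrete, I'm thinking of something along those lines (all the
names, the entry layout and the tranche handling here are just assumptions for
the sake of the example, not an actual patch):

#include "postgres.h"
#include "lib/dshash.h"
#include "storage/lwlock.h"
#include "utils/dsa.h"

typedef struct pgssHashKey
{
    Oid         userid;
    Oid         dbid;
    uint64      queryid;
    bool        toplevel;
} pgssHashKey;

typedef struct pgssEntry
{
    pgssHashKey key;            /* dshash expects the key first */
    int64       calls;
    double      total_exec_time;
    dsa_pointer query_text;     /* text kept in DSA, no external file */
} pgssEntry;

static const dshash_parameters pgss_dshash_params = {
    sizeof(pgssHashKey),
    sizeof(pgssEntry),
    dshash_memcmp,
    dshash_memhash,
    LWTRANCHE_FIRST_USER_DEFINED    /* tranche id picked arbitrarily */
};

/*
 * Usage sketch: dshash locks a single partition at a time and grows on
 * demand, so pg_stat_statements.max could become a soft limit:
 *
 *     table = dshash_create(area, &pgss_dshash_params, NULL);
 *     entry = dshash_find_or_insert(table, &key, &found);
 *     ... update the entry's counters ...
 *     dshash_release_lock(table, entry);
 */
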
>
> But we would still have the possibility of long lock conflicts preventing
> queries from completing. And the ability to configure a timeout for the pgss
> hooks would be useful anyway.

I didn't look thoroughly at the new pgstats infrastructure, but from what I saw
it should be able to alleviate most of the current problems.

[1] https://twitter.com/AndresFreundTec/status/1105585237772263424


