Thread: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

PG Bug reporting form

Date:

13 June 2023, 23:49:27

The following bug has been logged on the website:

Bug reference:      17973
Logged by:          Will Mortensen
Email address:      will@extrahop.com
PostgreSQL version: 15.3
Operating system:   Ubuntu 22.04
Description:

My colleague Jacob Speidel (jacob@extrahop.com) and I have diagnosed a race
condition that, under certain conditions, can cause the autovacuum daemon to
stop launching autovacuum workers until the autovacuum daemon (or the whole
server) is restarted. This obviously causes serious problems for the
server.

We've reproduced this on both 15.2 and REL_15_STABLE (git commit
bd590d1fea1ba9245c791d589eea94d2dbad5a2b). We've never seen a similar issue
on 14 despite very similar conditions occurring frequently. We haven't yet
tried on 16.

In Jacob's repro, the problem begins when AutoVacWorkerMain() is in
InitPostgres() and some other backend drops the database. Specifically, the
autovacuum worker's InitPostgres() is just about to obtain the lock at

https://github.com/postgres/postgres/blob/bd590d1fea1ba9245c791d589eea94d2dbad5a2b/src/backend/utils/init/postinit.c#L1012
. The backend dropping the DB marks the DB's pgstats entry as dropped but
can’t free it because its refcount is nonzero, so
AtEOXact_PgStat_DroppedStats() calls pgstat_request_entry_refs_gc(). The
autovacuum worker's InitPostgres() proceeds to call GetDatabaseTuple() and
notices the database has been dropped, at some point calling
pgstat_gc_entry_refs() to release its reference to the DB's pgstats entry,
and the worker decides to exit with a fatal error. But while exiting, it
calls pgstat_report_stat(), which calls pgstat_update_dbstats() for the
dropped DB, which calls pgstat_prep_database_pending(), which calls
pgstat_prep_pending_entry(), which calls pgstat_get_entry_ref() with create
== true, which calls pgstat_reinit_entry() against the DB's pgstats entry.
This sets dropped to false on that entry. Finally, the autovacuum worker
exits.

The fact that a dropped database now indefinitely has a pgstats entry with
dropped == false seemingly violates some assumptions and confuses the
autovacuum daemon. In particular, rebuild_database_list() will forever
include it in DatabaseList and take it into account when computing the
adl_next_worker for each DB, but do_start_worker() won’t consider the DB
because it's never returned by get_database_list(). In our repros, this
mismatch causes do_start_worker() to get stuck never processing any DB: in
particular, it always sees that for each non-dropped database
adl_next_worker is in the future but within the next autovacuum_naptime,
i.e. skipit = true. This causes do_start_worker() to call
rebuild_database_list() at the end, which again miscomputes adl_next_worker
and pushes it further into the future, so that the situation repeats on the
next call to do_start_worker(), and so on indefinitely.

That’s the crux of our issue. Please let us know if any clarification or
more detailed repro steps are needed. Our repro patch just sleeps before and
after the LockSharedObject() call in InitPostgres() (to widen the race
windows) and adds a lot of logging. (Jacob did >90% of the debugging here; I
merely determined how the pgstats entry lost its dropped flag.)

We assume that one fix would be to somehow ensure that the dropped flag
remains true on a dropped database’s pgstats entry until it’s freed, but
also, it seems a bit fragile for autovacuum’s do_start_worker() to sometimes
call rebuild_database_list() and delay all the adl_next_worker times.
Without thinking about it too hard, we wonder if there would still be a
pattern of ongoing DB creates and drops that could cause it to misbehave in
a similar way, never deciding to autovacuum any database even if one lives
long enough that it should be autovacuumed.

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

14 June 2023, 21:37:24

Hi,

Good catch!

On 2023-06-13 23:49:27 +0000, PG Bug reporting form wrote:
> We've reproduced this on both 15.2 and REL_15_STABLE (git commit
> bd590d1fea1ba9245c791d589eea94d2dbad5a2b). We've never seen a similar issue
> on 14 despite very similar conditions occurring frequently. We haven't yet
> tried on 16.

The relevant code is new for 15, so that makes sense.

> In Jacob's repro, the problem begins when AutoVacWorkerMain() is in
> InitPostgres() and some other backend drops the database. Specifically, the
> autovacuum worker's InitPostgres() is just about to obtain the lock at
>
https://github.com/postgres/postgres/blob/bd590d1fea1ba9245c791d589eea94d2dbad5a2b/src/backend/utils/init/postinit.c#L1012
> . The backend dropping the DB marks the DB's pgstats entry as dropped but
> can’t free it because its refcount is nonzero, so
> AtEOXact_PgStat_DroppedStats() calls pgstat_request_entry_refs_gc(). The
> autovacuum worker's InitPostgres() proceeds to call GetDatabaseTuple() and
> notices the database has been dropped, at some point calling
> pgstat_gc_entry_refs() to release its reference to the DB's pgstats entry,
> and the worker decides to exit with a fatal error. But while exiting, it
> calls pgstat_report_stat(), which calls pgstat_update_dbstats() for the
> dropped DB, which calls pgstat_prep_database_pending(), which calls
> pgstat_prep_pending_entry(), which calls pgstat_get_entry_ref() with create
> == true, which calls pgstat_reinit_entry() against the DB's pgstats entry.
> This sets dropped to false on that entry. Finally, the autovacuum worker
> exits.

Hm. It's pretty nasty that we end up in the proc_exit() hooks with
MyDatabaseId set, even though we aren't actually validly connected to that
database.

ISTM that at the very least we should make "Recheck pg_database to make sure
the target database hasn't gone away." in InitPostgres() unset
MyDatabaseId, MyProc->databaseId.

We unfortunately can't just defer setting up MyProc->databaseId, I
think. Otherwise a concurrent vacuum could decide to remove database contents
we think we still can see.

If we had something like that it should be fairly easy to skip relevant parts
of pgstat_report_stat(), based on MyDatabaseId being invalid.

Greetings,

Andres Freund

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

14 June 2023, 21:57:44

Hi,

On 2023-06-14 14:37:24 -0700, Andres Freund wrote:
> On 2023-06-13 23:49:27 +0000, PG Bug reporting form wrote:
> > In Jacob's repro, the problem begins when AutoVacWorkerMain() is in
> > InitPostgres() and some other backend drops the database. Specifically, the
> > autovacuum worker's InitPostgres() is just about to obtain the lock at
> >
https://github.com/postgres/postgres/blob/bd590d1fea1ba9245c791d589eea94d2dbad5a2b/src/backend/utils/init/postinit.c#L1012
> > . The backend dropping the DB marks the DB's pgstats entry as dropped but
> > can’t free it because its refcount is nonzero, so
> > AtEOXact_PgStat_DroppedStats() calls pgstat_request_entry_refs_gc(). The
> > autovacuum worker's InitPostgres() proceeds to call GetDatabaseTuple() and
> > notices the database has been dropped, at some point calling
> > pgstat_gc_entry_refs() to release its reference to the DB's pgstats entry,
> > and the worker decides to exit with a fatal error. But while exiting, it
> > calls pgstat_report_stat(), which calls pgstat_update_dbstats() for the
> > dropped DB, which calls pgstat_prep_database_pending(), which calls
> > pgstat_prep_pending_entry(), which calls pgstat_get_entry_ref() with create
> > == true, which calls pgstat_reinit_entry() against the DB's pgstats entry.
> > This sets dropped to false on that entry. Finally, the autovacuum worker
> > exits.
>
> Hm. It's pretty nasty that we end up in the proc_exit() hooks with
> MyDatabaseId set, even though we aren't actually validly connected to that
> database.
>
> ISTM that at the very least we should make "Recheck pg_database to make sure
> the target database hasn't gone away." in InitPostgres() unset
> MyDatabaseId, MyProc->databaseId.
>
> We unfortunately can't just defer setting up MyProc->databaseId, I
> think. Otherwise a concurrent vacuum could decide to remove database contents
> we think we still can see.

Hm. On second thought, I don't see any reason to do

    /*
     * Now we can mark our PGPROC entry with the database ID.
     *
     * We assume this is an atomic store so no lock is needed; though actually
     * things would work fine even if it weren't atomic.  Anyone searching the
     * ProcArray for this database's ID should hold the database lock, so they
     * would not be executing concurrently with this store.  A process looking
     * for another database's ID could in theory see a chance match if it read
     * a partially-updated databaseId value; but as long as all such searches
     * wait and retry, as in CountOtherDBBackends(), they will certainly see
     * the correct value on their next try.
     */
    MyProc->databaseId = MyDatabaseId;

    /*
     * We established a catalog snapshot while reading pg_authid and/or
     * pg_database; but until we have set up MyDatabaseId, we won't react to
     * incoming sinval messages for unshared catalogs, so we won't realize it
     * if the snapshot has been invalidated.  Assume it's no good anymore.
     */
    InvalidateCatalogSnapshot();

before
    /*
     * Recheck pg_database to make sure the target database hasn't gone away.
     * If there was a concurrent DROP DATABASE, this ensures we will die
     * cleanly without creating a mess.
     */
    if (!bootstrap)
    {

as pg_database is a shared catalog. So we can safely access it at that
point. Or am I missing something?


If I am not, then we we should defer setting MyDatabaseId until after "Recheck
pg_database", e.g. by setting 'dboid' for the in_dbname path.

I think that would likely end up with *less* code than before, because we'd
need less redundant code.


It's a bit odd that we do the second lookup with dbname, rather the oid we
previously looked up.


Afaics there's no reason to look up MyDatabaseTableSpace before the recheck.

Greetings,

Andres Freund

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

14 June 2023, 22:30:12

Hi,

On 2023-06-14 14:57:44 -0700, Andres Freund wrote:
> If I am not, then we we should defer setting MyDatabaseId until after "Recheck
> pg_database", e.g. by setting 'dboid' for the in_dbname path.
> 
> I think that would likely end up with *less* code than before, because we'd
> need less redundant code.
> 
> 
> It's a bit odd that we do the second lookup with dbname, rather the oid we
> previously looked up.
> 
> 
> Afaics there's no reason to look up MyDatabaseTableSpace before the recheck.

Attached is a rough prototype implementing this idea.  Could you check if that
fixes the issue?

Greetings,

Andres Freund

Attachment

pgstat-mydatabase-fix-prototype.diff

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Will Mortensen

Date:

15 June 2023, 20:35:17

Thanks for the quick patch Andres! Jacob tested and that appears to
fix our scenario.


On Wed, Jun 14, 2023 at 3:30 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-06-14 14:57:44 -0700, Andres Freund wrote:
> > If I am not, then we we should defer setting MyDatabaseId until after "Recheck
> > pg_database", e.g. by setting 'dboid' for the in_dbname path.
> >
> > I think that would likely end up with *less* code than before, because we'd
> > need less redundant code.
> >
> >
> > It's a bit odd that we do the second lookup with dbname, rather the oid we
> > previously looked up.
> >
> >
> > Afaics there's no reason to look up MyDatabaseTableSpace before the recheck.
>
> Attached is a rough prototype implementing this idea.  Could you check if that
> fixes the issue?
>
> Greetings,
>
> Andres Freund

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

16 June 2023, 01:01:33

On Wed, Jun 14, 2023 at 03:30:12PM -0700, Andres Freund wrote:
> Attached is a rough prototype implementing this idea.  Could you check if that
> fixes the issue?

It requires a few manual steps, but I have been able to stuck the
autovacuum launcher schedule.  Nice investigation from the reporters.

I may be missing something here, but finishing with an inconsistent
database list (generated based on the pgstat database entries) in the
autovacuum launcher is not something that can happen only because of a
worker, right?  A normal backend would call pgstat_update_dbstats()
once it exists, re-creating a fresh entry with the dropped database
OID.  Is that right?

+   /*
+    * If we haven't connected to a database yet, don't attribute time to
+    * "shared state" (InvalidOid is used to track stats for shared relations
+    * etc).
+    */
+   if (!OidIsValid(MyDatabaseId))
+       return;

Hmm.  pgstat_report_stat() is called by standby_redo() for a
XLOG_RUNNING_XACTS record so this would prevent the startup process
from doing any stats updates for the shared db state, no?

Looking at the most recent things in this area, like bc49d93 or
ac23b71, I don't immediately see why your new ordering suggestion
would not be OK with MyDatabaseId getting set before acquiring the
shared lock of the database, so as the stat entries don't get messed
up.  The result feels cleaner in the initialization sequence,
additionally.  So, nice.
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Will Mortensen

Date:

16 June 2023, 01:36:41

On Thu, Jun 15, 2023 at 6:01 PM Michael Paquier <michael@paquier.xyz> wrote:
> It requires a few manual steps, but I have been able to stuck the
> autovacuum launcher schedule.  Nice investigation from the reporters.
>
> I may be missing something here, but finishing with an inconsistent
> database list (generated based on the pgstat database entries) in the
> autovacuum launcher is not something that can happen only because of a
> worker, right?  A normal backend would call pgstat_update_dbstats()
> once it exists, re-creating a fresh entry with the dropped database
> OID.  Is that right?

Yes, sorry, Jacob was able to repro with a normal backend just now. We
probably should have tried that earlier. :-)

I'm also unsure if reiniting the pgstats entry (as opposed to creating
a new one) is actually necessary or just what we happened to observe.
We're definitely not very familiar with these internals. :-)

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

16 June 2023, 01:58:38

On Thu, Jun 15, 2023 at 06:36:41PM -0700, Will Mortensen wrote:
> On Thu, Jun 15, 2023 at 6:01 PM Michael Paquier <michael@paquier.xyz> wrote:
>> It requires a few manual steps, but I have been able to stuck the
>> autovacuum launcher schedule.  Nice investigation from the reporters.
>>
>> I may be missing something here, but finishing with an inconsistent
>> database list (generated based on the pgstat database entries) in the
>> autovacuum launcher is not something that can happen only because of a
>> worker, right?  A normal backend would call pgstat_update_dbstats()
>> once it exists, re-creating a fresh entry with the dropped database
>> OID.  Is that right?
>
> Yes, sorry, Jacob was able to repro with a normal backend just now. We
> probably should have tried that earlier. :-)

Okay, thanks for confirming.  Yes, I was able to stuck the scheduler
for both.  FWIW, I have initially hardcoded a stop point with an
on-disk file around InitPostgres() before we take the shared lock.
But thinking harder, it is possible to be sneaky with the following
steps to get these incorrect stat entries:
1) LOCK pg_database in a first session, on a database different than
the one to drop.
2) Connect with a second session to the database to drop, stuck
because of the previous LOCK when scanning pg_database to find the
tuple OID, during connection startup, just before taking the shared
lock.
3) Third session executing DROP DATABASE, with a session that
connected before the pg_database lock.
4) Commit transaction of first session to release pg_database lock,
the database is dropped and the second session fails to connect with
the database renamed.

> I'm also unsure if reiniting the pgstats entry (as opposed to creating
> a new one) is actually necessary or just what we happened to observe.
> We're definitely not very familiar with these internals. :-)

The proposed patch does better than that, actually, and takes the
approach of not touching the stats for the database dropped/renamed
by delaying MyDatabaseId which is what the backend uses to update the
database-level pgstat entries.
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

19 June 2023, 17:45:23

Hi,

On 2023-06-16 10:01:33 +0900, Michael Paquier wrote:
> +   /*
> +    * If we haven't connected to a database yet, don't attribute time to
> +    * "shared state" (InvalidOid is used to track stats for shared relations
> +    * etc).
> +    */
> +   if (!OidIsValid(MyDatabaseId))
> +       return;
> 
> Hmm.  pgstat_report_stat() is called by standby_redo() for a
> XLOG_RUNNING_XACTS record so this would prevent the startup process
> from doing any stats updates for the shared db state, no?

Hm.  Effectively the startup process isn't doing that in a useful way
today. Commit/Rollback and session stats don't make sense, and block
read/write time are wrongly attributed.

I think it'd take a fair amount of work to track these stats in a more useful
manner for the startup process, by virtue of it effectively being connected to
multiple databases. We'd need to track
pgStatBlockReadTime/pgStatBlockWriteTime on a per-database level, which
wouldn't be easy to do without increasing overhead.

Greetings,

Andres Freund

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

20 June 2023, 06:41:22

On Mon, Jun 19, 2023 at 10:45:23AM -0700, Andres Freund wrote:
> I think it'd take a fair amount of work to track these stats in a more useful
> manner for the startup process, by virtue of it effectively being connected to
> multiple databases. We'd need to track
> pgStatBlockReadTime/pgStatBlockWriteTime on a per-database level, which
> wouldn't be easy to do without increasing overhead.

Is it really necessary to do this much amount of work for the scope of
this issue, though?  Relying on MyDatabaseId to control if these
updates should happen does not look like the right move to me, to be
honest, because this can be used to update shared stats.  In the
pgstat shutdown callback, shouldn't we try to check if the database
entry exists and/or has been dropped and just do nothing in this case?
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

21 June 2023, 03:59:06

Hi,

On 2023-06-20 15:41:22 +0900, Michael Paquier wrote:
> On Mon, Jun 19, 2023 at 10:45:23AM -0700, Andres Freund wrote:
> > I think it'd take a fair amount of work to track these stats in a more useful
> > manner for the startup process, by virtue of it effectively being connected to
> > multiple databases. We'd need to track
> > pgStatBlockReadTime/pgStatBlockWriteTime on a per-database level, which
> > wouldn't be easy to do without increasing overhead.
> 
> Is it really necessary to do this much amount of work for the scope of
> this issue, though?

No! That'd just be for actually making db-level stats for the startup process
work. In other words, I don't think there's anything we could break, because
it already doesn't work :)

> Relying on MyDatabaseId to control if these updates should happen does not
> look like the right move to me, to be honest, because this can be used to
> update shared stats.

Shared stats only make sense when you know what non-shared stats are. At the
moment that afaict only works (and afaict only ever has worked) for database
connected backends.

I guess we could start reporting everything to the shared entry if we aren't
connected to a database, but that doesn't really seem like an improvement.

> In the pgstat shutdown callback, shouldn't we try to check if the database
> entry exists and/or has been dropped and just do nothing in this case?

I don't think that'd be a good approach.

Greetings,

Andres Freund

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Jacob Speidel

Date:

11 August 2023, 21:15:34

On Thu, Jun 15, 2023 at 1:35 PM Will Mortensen <will@extrahop.com> wrote:

Thanks for the quick patch Andres! Jacob tested and that appears to
fix our scenario.

On Wed, Jun 14, 2023 at 3:30 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-06-14 14:57:44 -0700, Andres Freund wrote:
> > If I am not, then we we should defer setting MyDatabaseId until after "Recheck
> > pg_database", e.g. by setting 'dboid' for the in_dbname path.
> >
> > I think that would likely end up with *less* code than before, because we'd
> > need less redundant code.
> >
> >
> > It's a bit odd that we do the second lookup with dbname, rather the oid we
> > previously looked up.
> >
> >
> > Afaics there's no reason to look up MyDatabaseTableSpace before the recheck.
>
> Attached is a rough prototype implementing this idea. Could you check if that
> fixes the issue?
>
> Greetings,
>
> Andres Freund

Hello,

I don't think this is a surprise to anyone but this is just an update that the bug still exists in 15.4 and the patch from Andres still fixes the problem.

Thanks,

Jacob

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

12 August 2023, 00:23:35

On Fri, Aug 11, 2023 at 02:15:34PM -0700, Jacob Speidel wrote:
> I don't think this is a surprise to anyone but this is just an update that
> the bug still exists in 15.4 and the patch from Andres still fixes the
> problem.

Yes, this one has been sitting in my inbox for a few weeks now.  The
part I am not quite sure of is what we should do for the stats
gathered by startup process, but that's separate from this autovacuum
scheduling problem at the end.

Andres, were you planning to do something here or did you need some
help?
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

12 August 2023, 00:48:12

Hi,

On 2023-08-12 09:23:35 +0900, Michael Paquier wrote:
> On Fri, Aug 11, 2023 at 02:15:34PM -0700, Jacob Speidel wrote:
> > I don't think this is a surprise to anyone but this is just an update that
> > the bug still exists in 15.4 and the patch from Andres still fixes the
> > problem.
> 
> Yes, this one has been sitting in my inbox for a few weeks now.  The
> part I am not quite sure of is what we should do for the stats
> gathered by startup process, but that's separate from this autovacuum
> scheduling problem at the end.
> 
> Andres, were you planning to do something here or did you need some
> help?

I was basically still wondering whether I defused your startup process
concerns or not.

Greetings,

Andres Freund

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

14 August 2023, 00:11:50

On Fri, Aug 11, 2023 at 05:48:12PM -0700, Andres Freund wrote:
> I was basically still wondering whether I defused your startup process
> concerns or not.

My apologies for dropping the ball for so long here.

FWIW, I am still confused about what the long-term plan should be for
the startup process for these database-level stats, and how
significant it would be to get access to them (mostly for pg_stat_io,
right?  Still what does it mean for the startup process?).  I
understand that we should not undo what e3cb1a58 has tried to improve
post-feature freeze, still the POC patch of this thread does not touch
standby.c.  So would it be better to just undo e3cb1a58?  Or rework
the stat structures and track pgStatBlockWriteTime for each database,
studying cheaper methods to achieve that?  If we do so, what's there
to gain in terms of user experience?

+       /*
+        * If we haven't connected to a database yet, don't attribute time to
+        * "shared state" (InvalidOid is used to track stats for shared relations
+        * etc).
+        */
At least this commit is incorrect because pgstat_update_dbstats() can
be called via pgstat_report_stat() with an invalid MyDatabaseId on
purpose?  And I assume that standby.c should perhaps document a bit
better that this does not affect database-level stats.

Perhaps the top-comment of pgstat_report_stat() should document the
startup process part, but I am not completely sure how helpful such an
addition would be on its own.

At the end, after looking again at the patch and the thread, I'm OK
with pgstat_update_dbstats() doing nothing if we don't have
MyDatabaseId, because there is nothing to do in this case based on the
current structure of the database-level stats, but the expectations
surrounding pgstat_update_dbstats() need some clarifications.  Take it
as a "I'm OK with the logic of the patch, but it can be improved" ;)
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

01 September 2023, 01:08:57

On Mon, Aug 14, 2023 at 09:11:50AM +0900, Michael Paquier wrote:
> FWIW, I am still confused about what the long-term plan should be for
> the startup process for these database-level stats, and how
> significant it would be to get access to them (mostly for pg_stat_io,
> right?  Still what does it mean for the startup process?).  I
> understand that we should not undo what e3cb1a58 has tried to improve
> post-feature freeze, still the POC patch of this thread does not touch
> standby.c.  So would it be better to just undo e3cb1a58?  Or rework
> the stat structures and track pgStatBlockWriteTime for each database,
> studying cheaper methods to achieve that?  If we do so, what's there
> to gain in terms of user experience?

After studying more this one, improving the set of static
PgStat_Counters for the startup process so as these can be tracked for
each database is what would make the most sense in the long term.  The
extra overhead when dealing with a lot of databases is something that
we'd need measurements for, but not touching that now won't make
things worse than they are currently.  So I've let that aside,
focusing on the core issue.

> At the end, after looking again at the patch and the thread, I'm OK
> with pgstat_update_dbstats() doing nothing if we don't have
> MyDatabaseId, because there is nothing to do in this case based on the
> current structure of the database-level stats, but the expectations
> surrounding pgstat_update_dbstats() need some clarifications.  Take it
> as a "I'm OK with the logic of the patch, but it can be improved" ;)

I have spent quite a few eye hours on the patch, and I would like to
apply the attached down to v15 at the beginning of next week, likely
Monday, to take care of the autovacuum scheduler issue.

If there are any comments or opinions, please feel free.
--
Michael

Attachment

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Nathan Bossart

Date:

01 September 2023, 02:56:13

On Fri, Sep 01, 2023 at 10:08:57AM +0900, Michael Paquier wrote:
> On Mon, Aug 14, 2023 at 09:11:50AM +0900, Michael Paquier wrote:
>> At the end, after looking again at the patch and the thread, I'm OK
>> with pgstat_update_dbstats() doing nothing if we don't have
>> MyDatabaseId, because there is nothing to do in this case based on the
>> current structure of the database-level stats, but the expectations
>> surrounding pgstat_update_dbstats() need some clarifications.  Take it
>> as a "I'm OK with the logic of the patch, but it can be improved" ;)
> 
> I have spent quite a few eye hours on the patch, and I would like to
> apply the attached down to v15 at the beginning of next week, likely
> Monday, to take care of the autovacuum scheduler issue.

I've spent much less time staring at the patch than you, but it looks
logically sound to me.  IMO the changes to InitPostgres() are arguably
worthwhile on their own.

> +    /*
> +     * Now that we rechecked, we are certain to be connected to a database and
> +     * thus can set MyDatabaseId.
> +     */
> +    MyDatabaseId = dboid;

Perhaps we should expand this comment to make it clear that some code
depends on this to avoid races with concurrent database drops/renames.

One other thing that comes to mind is whether this patch creates the
opposite problem for some other code, i.e., it depends on MyDatabaseId
being set before the lock is taken.  But that seems pretty unlikely, and
I'm not aware of anything like that off the top of my head.

Also, IIUC we could simply move the recheck to just after the lock
aquisition and then check MyProc->databaseId in pgstat_update_dbstats(),
but as I mentioned above, the changes to InitPostgres() are kind of nice
independently.  Plus, it's nice to be able to depend on MyDatabaseId in the
same way.

In short, LGTM.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

01 September 2023, 03:31:09

On Thu, Aug 31, 2023 at 07:56:13PM -0700, Nathan Bossart wrote:
> On Fri, Sep 01, 2023 at 10:08:57AM +0900, Michael Paquier wrote:
>> +    /*
>> +     * Now that we rechecked, we are certain to be connected to a database and
>> +     * thus can set MyDatabaseId.
>> +     */
>> +    MyDatabaseId = dboid;
>
> Perhaps we should expand this comment to make it clear that some code
> depends on this to avoid races with concurrent database drops/renames.

Yeah, this could be a bit more detailed, mentioning the dependency
with the pgstat shutdown callback.  How about appending the following
sentence in a second paragraph, say:
"Setting MyDatabaseId after rechecking the existence of the database
matters for instance with the shutdown callback of the shared
statistics, as pgstat_report_stat() will update database statistics
depending on MyDatabaseId."
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Nathan Bossart

Date:

01 September 2023, 15:20:24

On Fri, Sep 01, 2023 at 12:31:09PM +0900, Michael Paquier wrote:
> On Thu, Aug 31, 2023 at 07:56:13PM -0700, Nathan Bossart wrote:
>> On Fri, Sep 01, 2023 at 10:08:57AM +0900, Michael Paquier wrote:
>>> +    /*
>>> +     * Now that we rechecked, we are certain to be connected to a database and
>>> +     * thus can set MyDatabaseId.
>>> +     */
>>> +    MyDatabaseId = dboid;
>> 
>> Perhaps we should expand this comment to make it clear that some code
>> depends on this to avoid races with concurrent database drops/renames.
> 
> Yeah, this could be a bit more detailed, mentioning the dependency
> with the pgstat shutdown callback.  How about appending the following
> sentence in a second paragraph, say:
> "Setting MyDatabaseId after rechecking the existence of the database
> matters for instance with the shutdown callback of the shared
> statistics, as pgstat_report_stat() will update database statistics
> depending on MyDatabaseId."

Something along those lines seems fine to me.  FWIW this is what I had in
my head:

    It is important that MyDatabaseId only be set once we are sure that the
    target database can no longer be concurrently dropped or renamed.  Some
    code depends on this.  For example, without this guarantee,
    pgstat_update_dbstats() could create entries for databases that were
    just dropped, which will lead to problems down the road.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

02 September 2023, 00:08:22

On Fri, Sep 01, 2023 at 08:20:24AM -0700, Nathan Bossart wrote:
> Something along those lines seems fine to me.  FWIW this is what I had in
> my head:
>
>     It is important that MyDatabaseId only be set once we are sure that the
>     target database can no longer be concurrently dropped or renamed.  Some
>     code depends on this.  For example, without this guarantee,
>     pgstat_update_dbstats() could create entries for databases that were
>     just dropped, which will lead to problems down the road.

I'm mostly OK with this formulation as well, but without the second
sentence and adding as extra that pgstat_update_dbstats() could be
specifically called in the shutdown callback of the shared statistics.
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Nathan Bossart

Date:

02 September 2023, 15:21:34

On Sat, Sep 02, 2023 at 09:08:22AM +0900, Michael Paquier wrote:
> On Fri, Sep 01, 2023 at 08:20:24AM -0700, Nathan Bossart wrote:
>> Something along those lines seems fine to me.  FWIW this is what I had in
>> my head:
>> 
>>     It is important that MyDatabaseId only be set once we are sure that the
>>     target database can no longer be concurrently dropped or renamed.  Some
>>     code depends on this.  For example, without this guarantee,
>>     pgstat_update_dbstats() could create entries for databases that were
>>     just dropped, which will lead to problems down the road.
> 
> I'm mostly OK with this formulation as well, but without the second
> sentence and adding as extra that pgstat_update_dbstats() could be
> specifically called in the shutdown callback of the shared statistics.

WFM

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Michael Paquier

Date:

03 September 2023, 23:09:41

On Sat, Sep 02, 2023 at 08:21:34AM -0700, Nathan Bossart wrote:
> WFM

Okay, applied with something slightly-reworded to 15~.
--
Michael

Attachment

signature.asc

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Will Mortensen

Date:

04 September 2023, 01:55:20

Thanks all for the fix!

On Sun, Sep 3, 2023 at 4:09 PM Michael Paquier <michael@paquier.xyz> wrote:

On Sat, Sep 02, 2023 at 08:21:34AM -0700, Nathan Bossart wrote:
> WFM

Okay, applied with something slightly-reworded to 15~.
--
Michael

Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon

From

Andres Freund

Date:

12 September 2023, 02:11:56

On 2023-09-04 08:09:41 +0900, Michael Paquier wrote:
> On Sat, Sep 02, 2023 at 08:21:34AM -0700, Nathan Bossart wrote:
> > WFM
> 
> Okay, applied with something slightly-reworded to 15~.

Thanks!