On Wed, Jun 14, 2023 at 03:30:12PM -0700, Andres Freund wrote:
> Attached is a rough prototype implementing this idea. Could you check if that
> fixes the issue?
It requires a few manual steps, but I have been able to get the
autovacuum launcher schedule stuck. Nice investigation from the reporters.
I may be missing something here, but ending up with an inconsistent
database list (generated from the pgstat database entries) in the
autovacuum launcher is not something that can happen only because of a
worker, right? A normal backend would call pgstat_update_dbstats()
once it exits, re-creating a fresh entry with the dropped database
OID. Is that right?
+ /*
+ * If we haven't connected to a database yet, don't attribute time to
+ * "shared state" (InvalidOid is used to track stats for shared relations
+ * etc).
+ */
+ if (!OidIsValid(MyDatabaseId))
+ return;
Hmm. pgstat_report_stat() is called by standby_redo() for an
XLOG_RUNNING_XACTS record, so this would prevent the startup process
from doing any stats updates for the shared db state, no?
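To make the concern concrete, here is a toy model and not PostgreSQL
code: ToyStats, toy_flush_dbstats() and toy_flush_dbstats_unguarded()
are hypothetical names invented for illustration, and only the
OidIsValid() guard mirrors the quoted hunk. It assumes that a process
without a database (as the startup process would be) reaches the flush
path with an invalid database OID.

```c
#include <stdbool.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

/* Toy counters standing in for pgstat database entries (hypothetical). */
typedef struct ToyStats
{
    int shared_flushes;     /* flushes into the InvalidOid ("shared") entry */
    int db_flushes;         /* flushes into a real database's entry */
} ToyStats;

/*
 * Toy flush with the proposed early return: a caller that has no
 * database records nothing at all.  Returns true if something was
 * recorded.
 */
static bool
toy_flush_dbstats(ToyStats *stats, Oid my_database_id)
{
    if (!OidIsValid(my_database_id))
        return false;
    stats->db_flushes++;
    return true;
}

/*
 * Toy flush without the guard: a caller that has no database still
 * updates the shared entry.
 */
static bool
toy_flush_dbstats_unguarded(ToyStats *stats, Oid my_database_id)
{
    if (OidIsValid(my_database_id))
        stats->db_flushes++;
    else
        stats->shared_flushes++;
    return true;
}
```

In this sketch, a caller passing InvalidOid gets nothing recorded from
the guarded variant, while the unguarded one bumps the shared counter.
Whether the real startup process depends on that path is exactly the
question above.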
Looking at the most recent changes in this area, like bc49d93 or
ac23b71, I don't immediately see why your new ordering suggestion
would not be OK, with MyDatabaseId getting set before acquiring the
shared lock on the database, so that the stat entries don't get messed
up. The result feels cleaner in the initialization sequence as well.
So, nice.
--
Michael