Re: "PANIC: could not open critical system index 2662" - twice - Mailing list pgsql-general

From Andres Freund
Subject Re: "PANIC: could not open critical system index 2662" - twice
Date
Msg-id 20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de
In response to Re: "PANIC: could not open critical system index 2662" - twice  (Andres Freund <andres@anarazel.de>)
Responses Re: "PANIC: could not open critical system index 2662" - twice
List pgsql-general
Hi,

On 2023-05-08 14:04:00 -0700, Andres Freund wrote:
> But perhaps a similar approach could be the solution? My gut says that the
> rough direction might allow us to keep dropdb() a single transaction.

I started to hack on the basic approach of committing after the catalog
changes. But then I started wondering if we're not tackling this all wrong.


We don't actually want to drop the catalog contents early as that prevents the
database from being dropped again, while potentially leaving behind contents
in the case of a crash / interrupt.

Similarly, the fact that we only commit the transaction at the end of
createdb() leads to interrupts / crashes orphaning the contents of that
[partial] database.  We also have similar issues with movedb(), I think.


These are non-transactional operations. I think we should copy the approach of
the CONCURRENTLY operations. Namely add a new column to pg_database,
indicating whether the database contents are valid. An invalid database can be
dropped, but not connected to.

Then we could have createdb() commit before starting to create the target
database directory (with invalid = true, of course). After all the filesystem
level stuff is complete, set invalid = false.

For dropping a database we'd use heap_inplace_update() to set invalid = true
just before the DropDatabaseBuffers(), preventing any connections after that
point.
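
To make the ordering concrete, here is a toy model of the idea, written in
Python rather than backend C. It is only a sketch: ToyCatalog, its methods,
and the crash flags are made up for illustration and are not PostgreSQL code.
Every dictionary write stands in for a committed catalog change.

```python
class ToyCatalog:
    """In-memory stand-in for pg_database; each write counts as committed."""

    def __init__(self):
        self.rows = {}        # datname -> {"invalid": bool}
        self.storage = set()  # names that have (possibly partial) files on disk

    def createdb(self, name, crash_during_copy=False):
        if name in self.rows:
            raise ValueError(f"database {name!r} already exists")
        # Step 1: commit the catalog row first, marked invalid.
        self.rows[name] = {"invalid": True}
        # Step 2: filesystem-level work. A crash here leaves a visible,
        # droppable row instead of orphaned files.
        self.storage.add(name)
        if crash_during_copy:
            raise RuntimeError("simulated crash while copying files")
        # Step 3: all files are in place, so mark the database valid.
        self.rows[name]["invalid"] = False

    def connect(self, name):
        if self.rows[name]["invalid"]:
            raise PermissionError(f"cannot connect to invalid database {name!r}")
        return f"connected to {name}"

    def dropdb(self, name, crash_after_mark=False):
        # Mark invalid before discarding buffers and files, so no new
        # connection can arrive after this point.
        self.rows[name]["invalid"] = True
        if crash_after_mark:
            raise RuntimeError("simulated crash during drop")
        self.storage.discard(name)
        del self.rows[name]
```

In this model a createdb() that crashes midway leaves a row with
invalid = true: connect() refuses it, and a later dropdb() still removes both
the row and the leftover files.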

Making movedb() safe is probably a bit harder - I think it'd temporarily
require two pg_database entries?


Of course we can't add a new column in the back branches. IIRC we had a
similar issue with CIC at some point, and just ended up misusing some other
column for the back branches?  We could e.g. use datconnlimit == -2 for that
purpose (but would need to make sure that ALTER DATABASE can't unset it).
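
As a rough illustration of that guard, again a Python toy and not the actual
ALTER DATABASE code: the constant and function names below are hypothetical.
The only facts assumed are that -1 means "no limit" for datconnlimit and that
-2 would be the sentinel.

```python
INVALID_DB_CONNLIMIT = -2  # sentinel value from the proposal; name is made up


def is_invalid(datconnlimit):
    """True if the connection limit carries the 'invalid database' sentinel."""
    return datconnlimit == INVALID_DB_CONNLIMIT


def alter_connection_limit(current, requested):
    """Return the new datconnlimit, never setting or clearing the sentinel."""
    if is_invalid(current):
        # ALTER DATABASE must not be able to make the database valid again.
        raise PermissionError("cannot alter invalid database")
    if requested < -1:
        # -1 is "no limit"; anything lower would collide with the sentinel.
        raise ValueError(f"invalid connection limit: {requested}")
    return requested
```

Both directions matter here: a user must not be able to reset -2 back to a
normal limit, and must not be able to set -2 by hand either.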


My current gut feeling is that we should use datconnlimit == -2 to prevent
connections after reaching DropDatabaseBuffers() in dropdb(), and use a new
column in 16, for both createdb() and dropdb().  In some ways handling
createdb() properly is a new feature, but it's also arguably a bug that we
leak the space - and I think the code will be better if we work on both
together.

Greetings,

Andres Freund


