Re: BUG #17744: Fail Assert while recoverying from pg_basebackup - Mailing list pgsql-bugs

From	Andres Freund
Subject	Re: BUG #17744: Fail Assert while recoverying from pg_basebackup
Date	February 1, 2023 18:32:52
Msg-id	20230201153252.l6kcfum7trdovw2b@alap3.anarazel.de
In response to	Re: BUG #17744: Fail Assert while recoverying from pg_basebackup (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses	Re: BUG #17744: Fail Assert while recoverying from pg_basebackup (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) Re: BUG #17744: Fail Assert while recoverying from pg_basebackup (Michael Paquier <michael@paquier.xyz>)
List	pgsql-bugs

Hi,

On 2023-01-13 18:36:05 +0900, Kyotaro Horiguchi wrote:
> At Tue, 10 Jan 2023 07:45:45 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in 
> > #2  0x0000000000b378e9 in ExceptionalCondition (
> >     conditionName=0xd13697 "TransactionIdIsValid(initial)", 
> >     errorType=0xd12df4 "FailedAssertion", fileName=0xd12de8 "procarray.c",
> >     lineNumber=1750) at assert.c:69
> > #3  0x0000000000962195 in ComputeXidHorizons (h=0x7ffe93de25e0)
> >     at procarray.c:1750
> > #4  0x00000000009628a3 in GetOldestTransactionIdConsideredRunning ()
> >     at procarray.c:2050
> > #5  0x00000000005972bf in CreateRestartPoint (flags=256) at xlog.c:7153
> > #6  0x00000000008cae37 in CheckpointerMain () at checkpointer.c:464
> 
> The function requires a valid value in
> ShmemVariableCache->latestCompleteXid. But it is not initialized and
> maintained in this case.  The attached quick hack seems working, but
> of course more decent fix is needed.

I might be missing something, but I suspect the problem here is that we
shouldn't have been creating a restart point. Afaict, the setup
instructions provided don't configure a recovery.signal, so we'll just
perform crash recovery.

And I don't think it'd ever make sense to create a restart point during
crash recovery?

Except that in this case it's not pure crash recovery, it's recovery
starting from a backup label, so it actually might make sense to create
restart points?  If you're doing PITR or such you don't really gain
anything by doing checkpoints until you've reached consistency, unless
you want to optimize for the case that you might need to start/stop the
instance multiple times?

So maybe it's the right thing to create restart points? Really not sure.

If we do want to do restartpoints, we definitely shouldn't try to
TruncateSUBTRANS() in the crash-recovery-like-restartpoint case: we've
not even done StartupSUBTRANS(), because that's guarded by
ArchiveRecoveryRequested.

The most obvious (but wrong!) fix would be to change

    if (EnableHotStandby)
        TruncateSUBTRANS(GetOldestTransactionIdConsideredRunning());

to

    if (standbyState != STANDBY_DISABLED)
        TruncateSUBTRANS(GetOldestTransactionIdConsideredRunning());

except that doesn't work, because we don't have working access to
standbyState. Nor to the other relevant variables. Gah.

We've really made a hash out of the state management for
xlog.c. ArchiveRecoveryRequested, InArchiveRecovery,
StandbyModeRequested, StandbyMode, EnableHotStandby,
LocalHotStandbyActive, ... :(.  We use InArchiveRecovery = true, even if
there's no archiving involved. Afaict ArchiveRecoveryRequested=false,
InArchiveRecovery=true isn't really something the comments around the
variables foresee.
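
To make the mismatch concrete, here is a small self-contained C sketch of
the two guards described above. It is only a model, not xlog.c code:
RecoveryFlags and all helper functions are hypothetical names invented
for illustration. The only facts taken from the discussion are that
StartupSUBTRANS() is guarded by ArchiveRecoveryRequested, that the
restartpoint path truncates SUBTRANS when EnableHotStandby is set, and
that a backup label without recovery.signal yields
ArchiveRecoveryRequested=false with InArchiveRecovery=true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the recovery flags; not a PostgreSQL type. */
typedef struct RecoveryFlags
{
    bool archive_recovery_requested; /* models ArchiveRecoveryRequested */
    bool in_archive_recovery;        /* models InArchiveRecovery */
    bool enable_hot_standby;         /* models EnableHotStandby */
} RecoveryFlags;

/* Models the startup-side guard: SUBTRANS is only started when
 * archive recovery was explicitly requested. */
static bool
subtrans_started(const RecoveryFlags *f)
{
    return f->archive_recovery_requested;
}

/* Models the restartpoint-side guard: truncation is attempted
 * whenever hot standby is enabled. */
static bool
restartpoint_truncates_subtrans(const RecoveryFlags *f)
{
    return f->enable_hot_standby;
}

/* True when a restartpoint would truncate SUBTRANS that was never started. */
static bool
guards_disagree(const RecoveryFlags *f)
{
    return restartpoint_truncates_subtrans(f) && !subtrans_started(f);
}

/* Exercises the two cases discussed in the mail. */
static void
check_backup_label_case(void)
{
    /* backup label present, no recovery.signal */
    RecoveryFlags backup_label_only = {
        .archive_recovery_requested = false,
        .in_archive_recovery = true,
        .enable_hot_standby = true,
    };
    /* recovery explicitly requested */
    RecoveryFlags signalled = {
        .archive_recovery_requested = true,
        .in_archive_recovery = true,
        .enable_hot_standby = true,
    };

    assert(guards_disagree(&backup_label_only));
    assert(!guards_disagree(&signalled));
}
```

Calling check_backup_label_case() passes both assertions in this model:
the two guards only line up when recovery was explicitly requested, which
is the combination the reported setup does not have.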

Greetings,

Andres Freund
