Re: prevent immature WAL streaming - Mailing list pgsql-hackers

From Robert Haas
Subject Re: prevent immature WAL streaming
Msg-id CA+TgmobsoA3nm8kKnCoh7bMi35ntQ8hv1X5C4fFxdy0U+7kQpg@mail.gmail.com
In response to Re: prevent immature WAL streaming  (Amul Sul <sulamul@gmail.com>)
Responses Re: prevent immature WAL streaming  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Wed, Oct 13, 2021 at 2:01 AM Amul Sul <sulamul@gmail.com> wrote:
> Instead of the abortedRecPtr point, isn't it enough to write the
> overwrite-contrecord at XLogCtl->lastReplayedEndRecPtr?  I think both
> point to the same location, so can't we use
> lastReplayedEndRecPtr instead of abortedRecPtr to write the
> overwrite-contrecord and remove the need for an extra global variable,
> as in the attached?

I think you mean missingContrecPtr, not abortedRecPtr. If I understand
correctly, abortedRecPtr is going to be the location in some WAL
segment which we replayed where a long record began, but
missingContrecPtr seems like it would have to point to the beginning
of the first segment we were unable to find to continue replay; and
thus it ought to be the same as lastReplayedEndRecPtr. But the
committed code doesn't seem to check that these are the same or verify
the relationship between them in any way, so I'm worried there is some
other case here. The comments in XLogReadRecord also suggest this:

         * We get here when a record that spans multiple pages needs to be
         * assembled, but something went wrong -- perhaps a contrecord piece
         * was lost.  If caller is WAL replay, it will know where the aborted

Saying that "perhaps" a contrecord piece was lost seems to imply that
other explanations are possible as well, but I'm not sure what those
would be.
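
To make the relationship I'm describing concrete, here is a minimal
sketch of the kind of cross-check that, as far as I can tell, the
committed code does not perform. It is not taken from the tree: the
XLogRecPtr typedef is a stand-in for the real one, the helper name
contrecord_pointers_consistent is hypothetical, and the conditions
encode only my reading above, which may itself be wrong.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper, not part of PostgreSQL.  Returns true if the three
 * pointers relate to each other the way the email above assumes:
 *   - with no aborted record, missingContrecPtr is unset as well;
 *   - otherwise the aborted record begins before the missing continuation;
 *   - and the missing continuation starts exactly where replay stopped.
 */
static inline bool
contrecord_pointers_consistent(XLogRecPtr abortedRecPtr,
                               XLogRecPtr missingContrecPtr,
                               XLogRecPtr lastReplayedEndRecPtr)
{
    /* Nothing was aborted, so nothing should be recorded as missing. */
    if (abortedRecPtr == InvalidXLogRecPtr)
        return missingContrecPtr == InvalidXLogRecPtr;

    /* Assumption: the long record began in WAL that precedes the gap. */
    if (abortedRecPtr >= missingContrecPtr)
        return false;

    /* Assumption: the gap begins at the end of the replayed WAL. */
    return missingContrecPtr == lastReplayedEndRecPtr;
}
```

If a check along these lines could fail in some scenario, that would be
the "other case" I'm worried about, and lastReplayedEndRecPtr could not
simply be substituted for the existing variable.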

-- 
Robert Haas
EDB: http://www.enterprisedb.com


