Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: logical decoding and replication of sequences
Msg-id CA+hUKGJOnWuQqj42Q8xKfWN9NxGYUvEDEvEd_a+9Y9dF=78Oyw@mail.gmail.com
In response to Re: logical decoding and replication of sequences  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: logical decoding and replication of sequences
List pgsql-hackers
On Mon, Aug 8, 2022 at 8:56 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Mon, 08 Aug 2022 17:33:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > If WaitForWALToBecomeAvailable returned by promotion, ReadPageInternal
> > misses the chance to invalidate reader-state.  That state is not an
> > error while in StandbyMode.
>
> Mmm... Maybe I wanted to say:  (Still I'm not sure the rewrite works..)
>
> If WaitForWALToBecomeAvailable returned by promotion, ReadPageInternal
> would miss the chance to invalidate reader-state.  When XLogPageRead
> is called in blocking mode while in StandbyMode (that is, the
> traditional condition), the function continues retrying until it
> succeeds, or returns XLREAD_FAIL if promote is triggered.  In other
> words, it was not supposed to return non-failure while the header
> validation is failing while in standby mode.  But while in nonblocking
> mode, the function can return non-failure with lastSourceFailed =
> true, which seems wrong.

New ideas:

0001:  Instead of figuring out when to invalidate the cache, let's
just invalidate it before every read attempt.  It is only marked valid
after success (i.e. state->readLen > 0).  No need to worry about error
cases.
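
A minimal sketch of the 0001 idea, using made-up names rather than the
real xlogreader.c structures.  ToyReader, toy_read_page() and
toy_flaky_read() are all hypothetical; the only point is the ordering:
invalidate first, attempt the read, mark valid only on success.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

/* Hypothetical stand-in for the reader's one-page cache. */
typedef struct ToyReader
{
    char buf[TOY_PAGE_SIZE];
    long cached_page;           /* page held in buf, or -1 if invalid */
    int  read_len;              /* valid bytes in buf, 0 if invalid */
} ToyReader;

/* Hypothetical read callback: returns bytes read, or <= 0 on failure. */
typedef int (*toy_page_read_cb) (long page_no, char *buf, void *arg);

static void
toy_invalidate(ToyReader *r)
{
    r->cached_page = -1;
    r->read_len = 0;
}

/* Returns the number of valid bytes, or -1 if the read failed. */
static int
toy_read_page(ToyReader *r, long page_no, int req_len,
              toy_page_read_cb cb, void *arg)
{
    int n;

    assert(req_len > 0 && req_len <= TOY_PAGE_SIZE);

    /* Cache hit: nothing to read, nothing to invalidate. */
    if (r->cached_page == page_no && r->read_len >= req_len)
        return r->read_len;

    /*
     * Invalidate before every read attempt.  Whatever the callback does
     * to buf, and however it fails, the cache cannot be left claiming
     * to hold a page it does not hold.
     */
    toy_invalidate(r);

    n = cb(page_no, r->buf, arg);
    if (n < req_len)
        return -1;              /* cache stays invalid */

    /* Only a successful read marks the cache valid again. */
    r->cached_page = page_no;
    r->read_len = n;
    return n;
}

/* Example callback: fails (after scribbling on buf) when *arg is true. */
static int
toy_flaky_read(long page_no, char *buf, void *arg)
{
    if (*(bool *) arg)
    {
        memset(buf, 0xFF, 16);  /* partial overwrite before failing */
        return -1;
    }
    memset(buf, (int) (page_no & 0x7F), TOY_PAGE_SIZE);
    return TOY_PAGE_SIZE;
}
```

With this shape, a failed attempt for page 8 also discards a previously
cached page 7, so a later request for page 7 goes back to the callback
instead of trusting a buffer that may have been partly overwritten.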

0002:  While here, I don't like xlogrecovery.c clobbering
xlogreader.c's internal error state, so I think we should have a
function for that with a documented purpose.  It was also a little
inconsistent that it didn't clear a flag (but not buggy AFAICS; kinda
wondering if I should just get rid of that flag, but that's for
another day).
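
Roughly what I mean for 0002, again as a sketch with hypothetical names
(ToyErrState and the toy_* functions are not the real API; the
"deferred" flag stands in for the flag mentioned above): callers outside
the reader go through one documented function instead of poking at the
message buffer directly, and that function clears everything that
belongs to the error state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_ERRMSG_SIZE 256

/* Hypothetical error state owned by the reader module. */
typedef struct ToyErrState
{
    char errormsg_buf[TOY_ERRMSG_SIZE];
    bool errormsg_deferred;     /* assumed flag: message not yet reported */
} ToyErrState;

/* Record an error message inside the reader. */
static void
toy_report_error(ToyErrState *s, const char *msg, bool deferred)
{
    assert(s != NULL && msg != NULL);
    snprintf(s->errormsg_buf, sizeof(s->errormsg_buf), "%s", msg);
    s->errormsg_deferred = deferred;
}

/*
 * Forget any error the reader has recorded.
 *
 * Intended for callers that have decided to retry or to ignore the
 * failure.  Clears the message and the flag together, so the two can
 * never disagree.
 */
static void
toy_reset_error(ToyErrState *s)
{
    assert(s != NULL);
    s->errormsg_buf[0] = '\0';
    s->errormsg_deferred = false;
}
```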

0003:  Thinking about your comments above made me realise that I don't
really want XLogPageRead() to be internally retrying for obscure
failures while reading ahead.  I think I prefer to give up on
prefetching as soon as anything tricky happens, and deal with
complexities once recovery catches up to that point.  I am still
thinking about this point.

Here's the patch set I'm testing.
