Re: .ready and .done files considered harmful - Mailing list pgsql-hackers

From Robert Haas
Subject Re: .ready and .done files considered harmful
Msg-id CA+TgmoY_Y+ZKf_mxEhELvE1sjJbuZUwMq+P8eHPDe_JoVHrLig@mail.gmail.com
In response to Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
On Thu, Sep 2, 2021 at 5:52 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> The pg_readyXlog() logic looks a bit like this:
>
>         1. Try to skip directory scan.  If that succeeds, we're done.
>         2. Do a directory scan.
>         3. If we found a regular WAL file, update PgArch and return
>            what we found.
>
> Let's say step 1 looks for WAL file 10, but 10.ready doesn't exist
> yet.  The following directory scan ends up finding 11.ready.  Just
> before we update the PgArch state, XLogArchiveNotify() is called and
> creates 10.ready.  However, pg_readyXlog() has already decided to
> return WAL segment 11 and update the state to look for 12 next.  If we
> just used '<', we won't force a directory scan, and segment 10 will
> not be archived until the next one happens.  If we use '<=', I don't
> think we have the same problem.

The latest post on this thread contained a link to this one, and it
made me want to rewind to this point in the discussion. Suppose we
have the following alternative scenario:

Let's say step 1 looks for WAL file 10, but 10.ready doesn't exist
yet.  The following directory scan ends up finding 12.ready.  Just
before we update the PgArch state, XLogArchiveNotify() is called and
creates 11.ready.  However, pg_readyXlog() has already decided to
return WAL segment 12 and update the state to look for 13 next.

Now, if I'm not mistaken, using <= doesn't help at all.
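As a toy model of the two scenarios (not the patch's actual code), suppose the check made when a .ready file is created compares the new segment number against the segment the archiver is currently looking for, and forces a directory scan when the comparison is true. That model is an assumption reconstructed from the description quoted above, and the names forces_scan and ARCHIVER_NEXT are made up for illustration.

```python
import operator

def forces_scan(new_ready_seg, archiver_next_seg, compare):
    """Assumed model: force a directory scan when the comparison holds."""
    return compare(new_ready_seg, archiver_next_seg)

# In both scenarios the archiver state still says "looking for 10"
# at the moment the late .ready file is created.
ARCHIVER_NEXT = 10
OPERATORS = (("<", operator.lt), ("<=", operator.le))

# Quoted scenario: the scan found 11.ready, then 10.ready shows up.
earlier = {name: forces_scan(10, ARCHIVER_NEXT, op) for name, op in OPERATORS}

# Alternative scenario: the scan found 12.ready, then 11.ready shows up.
alternative = {name: forces_scan(11, ARCHIVER_NEXT, op) for name, op in OPERATORS}

print(earlier)      # {'<': False, '<=': True}
print(alternative)  # {'<': False, '<=': False}
```

Under this model, '<=' rescues segment 10 in the quoted scenario, but neither operator forces a scan for segment 11 in the alternative one, so 11 waits for the next directory scan.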

In my opinion, the problem here is that the natural way to ask "is
this file being archived out of order?" is to ask yourself "is the
file that I'm marking as ready for archiving now the one that
immediately follows the last one I marked as ready for archiving?" and
then invert the result. That is, if I last marked 10 as ready, and now
I'm marking 11 as ready, then it's in order, but if I'm now marking
anything else whatsoever, then it's out of order. But that's not what
this does. Instead of comparing what it's doing now to what it did
last, it compares what it's doing now to what the archiver did last.

And it's really not obvious that that's correct. I think that the
above argument actually demonstrates a flaw in the logic, but even if
not, or even if it's too small a flaw to be a problem in practice, it
seems a lot harder to reason about.
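For comparison, the "immediately follows the last one I marked as ready" test described above could be sketched roughly as follows. This is only an illustration: ReadyTracker and mark_ready are hypothetical names, segment numbers are treated as plain integers, and treating the very first segment as out of order is an assumption made here to stay on the safe side.

```python
class ReadyTracker:
    """Hypothetical tracker kept on the side that creates .ready files."""

    def __init__(self):
        self.last_marked = None  # last segment this side marked as ready

    def mark_ready(self, segno):
        """Record segno as ready; return True if it is out of order."""
        in_order = (
            self.last_marked is not None
            and segno == self.last_marked + 1
        )
        self.last_marked = segno
        # Out of order means the archiver should do a directory scan.
        return not in_order


tracker = ReadyTracker()
for segno in (10, 11, 13, 14):
    print(segno, tracker.mark_ready(segno))
```

The decision here depends only on what this side did last, not on the archiver's state, so there is no window in which the two can disagree about what "in order" means.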

-- 
Robert Haas
EDB: http://www.enterprisedb.com


