Thread: Re: pgsql: Fix WAL replay in presence of an incomplete record

Re: pgsql: Fix WAL replay in presence of an incomplete record

From: Tom Lane
Date: 05 November 2021, 00:13:50

[ I'm working on the release notes ]

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Fix WAL replay in presence of an incomplete record
> ...
> Because a new type of WAL record is added, users should be careful to
> upgrade standbys first, primaries later. Otherwise they risk the standby
> being unable to start if the primary happens to write such a record.

Is there really any point in issuing such advice?  IIUC, the standbys
would be unable to proceed anyway in case of a primary crash at the
wrong time, because an un-updated primary would send them inconsistent
WAL.  If anything, it seems like it might be marginally better to
update the primary first, reducing the window for it to send WAL that
the standbys will *never* be able to handle.  Then, if it crashes, at
least the WAL contains something the standbys can process once you
update them.

Or am I missing something?

            regards, tom lane

Re: pgsql: Fix WAL replay in presence of an incomplete record

From: Alvaro Herrera
Date: 05 November 2021, 12:06:50

On 2021-Nov-04, Tom Lane wrote:

> Is there really any point in issuing such advice?  IIUC, the standbys
> would be unable to proceed anyway in case of a primary crash at the
> wrong time, because an un-updated primary would send them inconsistent
> WAL.  If anything, it seems like it might be marginally better to
> update the primary first, reducing the window for it to send WAL that
> the standbys will *never* be able to handle.  Then, if it crashes, at
> least the WAL contains something the standbys can process once you
> update them.

Yes -- in production settings, it is better to be able to shut down the
standbys in a scheduled manner than to find out, after updating the
primary, that your standbys are suddenly inaccessible until you take the
further action of updating them.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
If you don't know where you are going, you will very likely end up somewhere else.

Re: pgsql: Fix WAL replay in presence of an incomplete record

From: Alvaro Herrera
Date: 05 November 2021, 12:28:16

On 2021-Nov-05, Alvaro Herrera wrote:

> On 2021-Nov-04, Tom Lane wrote:
> 
> > the standbys
> > would be unable to proceed anyway in case of a primary crash at the
> > wrong time, because an un-updated primary would send them inconsistent
> > WAL.  If anything, it seems like it might be marginally better to
> > update the primary first, reducing the window for it to send WAL that
> > the standbys will *never* be able to handle.  Then, if it crashes, at
> > least the WAL contains something the standbys can process once you
> > update them.

I suppose the strategy is useless if the primary never crashes.  If the
situation does occur, users can handle it the same way they've handled
it thus far: manually delete the segment from the standby and restart.
At least they know what to do and may even have already automated it.
The other situation is new and would need somebody, possibly taken
abruptly from their sleep, to try to understand why their standbys
are refusing, in a novel way, to continue replication.

-- 
Álvaro Herrera              Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Because Kim did nothing, but, to be sure,
with extraordinary success" ("Kim", Kipling)
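
The ordering advised in the commit message (standbys first, primary last) can be sketched as a small planning helper. This is only an illustration of that ordering, not a tool from the thread: the `Node` type, the role strings, and the host names are all hypothetical, and Tom Lane's reply above questions whether this order is really the better one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str  # hypothetical host label
    role: str  # "primary" or "standby" (assumed role names)


def upgrade_order(nodes):
    """Return the nodes with every standby ahead of every primary.

    Input order is preserved within each group, so the caller can still
    express a preference among the standbys.
    """
    unknown = [n for n in nodes if n.role not in ("standby", "primary")]
    if unknown:
        raise ValueError(f"unknown role for: {[n.name for n in unknown]}")
    standbys = [n for n in nodes if n.role == "standby"]
    primaries = [n for n in nodes if n.role == "primary"]
    return standbys + primaries


cluster = [
    Node("pg1", "primary"),
    Node("pg2", "standby"),
    Node("pg3", "standby"),
]
plan = [n.name for n in upgrade_order(cluster)]
print(plan)
```

With this order, each standby already understands the new record type before the primary can start writing it, which is the scheduled-downtime argument made in the replies.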