Thread: Re: pgsql: Fix WAL replay in presence of an incomplete record
[ I'm working on the release notes ]

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Fix WAL replay in presence of an incomplete record
> ...
> Because a new type of WAL record is added, users should be careful to
> upgrade standbys first, primaries later. Otherwise they risk the standby
> being unable to start if the primary happens to write such a record.

Is there really any point in issuing such advice? IIUC, the standbys
would be unable to proceed anyway in case of a primary crash at the
wrong time, because an un-updated primary would send them inconsistent
WAL. If anything, it seems like it might be marginally better to
update the primary first, reducing the window for it to send WAL that
the standbys will *never* be able to handle. Then, if it crashes, at
least the WAL contains something the standbys can process once you
update them.

Or am I missing something?

regards, tom lane
On 2021-Nov-04, Tom Lane wrote:

> Is there really any point in issuing such advice? IIUC, the standbys
> would be unable to proceed anyway in case of a primary crash at the
> wrong time, because an un-updated primary would send them inconsistent
> WAL. If anything, it seems like it might be marginally better to
> update the primary first, reducing the window for it to send WAL that
> the standbys will *never* be able to handle. Then, if it crashes, at
> least the WAL contains something the standbys can process once you
> update them.

Yes -- in production settings, it is better to be able to shut down the
standbys in a scheduled manner than to find out, after updating the
primary, that your standbys are suddenly inaccessible until you take the
further action of updating them.

-- 
Álvaro Herrera PostgreSQL Developer - https://www.EnterpriseDB.com/
If you don't know where you are going, you will very likely end up somewhere else.
On 2021-Nov-05, Alvaro Herrera wrote:

> On 2021-Nov-04, Tom Lane wrote:
>
> > the standbys
> > would be unable to proceed anyway in case of a primary crash at the
> > wrong time, because an un-updated primary would send them inconsistent
> > WAL. If anything, it seems like it might be marginally better to
> > update the primary first, reducing the window for it to send WAL that
> > the standbys will *never* be able to handle. Then, if it crashes, at
> > least the WAL contains something the standbys can process once you
> > update them.

I suppose the strategy is useless if the primary never crashes.

If the situation does occur, users can handle it the same way they've
handled it thus far: manually delete the segment from the standby and
restart. At least they know what to do and may even have already
automated it.

The other situation is new and would need somebody, possibly taken
abruptly from their sleep, to try to understand why their standbys
refuse to proceed with replication in a novel way.

-- 
Álvaro Herrera Valdivia, Chile - https://www.EnterpriseDB.com/
"Because Kim did nothing, but, mind you, with extraordinary success" ("Kim", Kipling)