Re: Fail hard if xlogreader.c fails on out-of-memory - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: Fail hard if xlogreader.c fails on out-of-memory
Date	September 26, 2023 23:14:15
Msg-id	ZRNlx1__idb_R_-S@paquier.xyz
In response to	Re: Fail hard if xlogreader.c fails on out-of-memory (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-hackers

On Wed, Sep 27, 2023 at 11:06:37AM +1300, Thomas Munro wrote:
> I don't have an opinion yet on your other thread about making this
> stuff configurable for replicas, but for the simple crash recovery
> case shown here, hard failure makes sense to me.

Also, if we conclude that we're OK with just failing hard all the time
for crash recovery and archive recovery on OOM, the other patch is not
really required.  That would be disruptive for standbys in some cases,
but perhaps still OK in the long term.  I am wondering if people have
actually lost data because of this problem on production systems.  It
would not be possible to know that it happened until you see a page on
disk that has a somewhat valid LSN, yet still an LSN older than the
position currently being inserted, and that could show up in various
forms.  Even that could get hidden quickly if WAL is written at a fast
pace after a crash recovery.  A standby promotion at an older LSN
would be unlikely, as monitoring solutions discard standbys lagging
behind by N bytes.
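
To make the failure mode concrete, here is a toy model of the two
behaviors being discussed.  This is only a sketch and not PostgreSQL
code: the names (replay, read_record, OutOfMemory) and the
(lsn, page, value) record layout are assumptions made up for the
illustration.  It shows that when an allocation failure in the reader
is taken for the end of WAL, replay stops early without any error,
while failing hard surfaces the problem.

```python
class OutOfMemory(Exception):
    """Stands in for a failed allocation inside the WAL reader (hypothetical)."""


def read_record(wal, idx, oom_at):
    """Return the record at idx, or None at end of WAL.

    Raises OutOfMemory when idx == oom_at to simulate an allocation failure.
    """
    if idx == oom_at:
        raise OutOfMemory(f"cannot allocate buffer for record {idx}")
    return wal[idx] if idx < len(wal) else None


def replay(wal, oom_at=None, fail_hard=True):
    """Replay (lsn, page, value) records; return (pages, last replayed LSN)."""
    pages = {}
    end_lsn = 0
    idx = 0
    while True:
        try:
            rec = read_record(wal, idx, oom_at)
        except OutOfMemory:
            if fail_hard:
                raise          # hard failure: recovery stops with an error
            rec = None         # soft mode: OOM is mistaken for end of WAL
        if rec is None:
            break
        lsn, page, value = rec
        pages[page] = (lsn, value)
        end_lsn = lsn
        idx += 1
    return pages, end_lsn


wal = [(100, "A", 1), (200, "B", 2), (300, "A", 3)]

# Soft mode: replay ends silently before the record at LSN 300.
soft_pages, soft_end = replay(wal, oom_at=2, fail_hard=False)

# Hard mode: the same OOM is reported instead of being hidden.
try:
    replay(wal, oom_at=2, fail_hard=True)
except OutOfMemory as exc:
    hard_error = str(exc)
```

In the soft case nothing tells the caller that the record at LSN 300
was never applied, which is the kind of silent loss described above.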

> *A more detailed analysis would talk about sectors (page header is
> atomic), and consider whether we're only trying to defend ourselves
> against recycled pages written by PostgreSQL (yes), arbitrary random
> data (no, but it's probably still pretty good) or someone trying to
> trick us (no, and we don't stand a chance).

WAL would not be the only part of the system that would get borked if
arbitrary bytes can be inserted into what's read from disk, random or
not.
--
Michael
