Re: VM corruption on standby - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: VM corruption on standby
Date
Msg-id E9637363-7B73-43CD-AFBF-3DD651E5BD13@yandex-team.ru
In response to Re: VM corruption on standby  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

> On 20 Aug 2025, at 00:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andrey Borodin <x4mmm@yandex-team.ru> writes:
>> I believe there is a bug with PageIsAllVisible(page) && visibilitymap_clear(). But I cannot prove it with an
>> injection point test. Because injection points rely on CondVar, that per se creates corruption in critical section. So
>> I'm reading this discussion and wonder if CondVar will be fixed in some clever way or I'd better invent new injection
>> point wait mechanism.
>
> Yeah, I was coming to similar conclusions in the reply I just sent:
> we don't really want a policy that we can't put injection-point-based
> delays inside critical sections.  So that infrastructure is leaving
> something to be desired.
>
> Having said that, the test script is also doing something we tell
> people not to do, namely SIGKILL the postmaster.  Could we use
> SIGQUIT (immediate shutdown) instead?

I'm working backwards from corruptions I see on our production.
And almost always I see stormbringers like OOM, power outage or Debian scripts that (I think) do kill -9 when
`service postgresql stop` takes too long.


Best regards, Andrey Borodin.

