Re: broken reading on standby (PostgreSQL 16.2) - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: broken reading on standby (PostgreSQL 16.2)
Date
Msg-id CAFj8pRA6oEV2diydiuWyKGQ-jxnv46==hjbPjXiyMqMYHrwNag@mail.gmail.com
In response to Re: broken reading on standby (PostgreSQL 16.2)  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: broken reading on standby (PostgreSQL 16.2)
List pgsql-hackers


On Thu, 25 Apr 2024 at 8:52, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:


> On 25 Apr 2024, at 11:12, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> yesterday, I had to fix strange issue on standby server

It’s not just broken reading: if this standby is promoted in an HA cluster, this would lead to data loss.
Recently I’ve observed some lost heap updates after OOM-ing a cluster on 14.11. This is most probably unrelated, but I’ll post a link here, just in case [0]. In February and March we had 3 clusters with a similar problem, and this is an unusually big number for us in just 2 months.

Can you check the LSN of blocks with corrupted tuples with pageinspect on the primary and on the standby? I suspect they are frozen on the primary, but have a usual xmin on the standby.
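
For reference, the check suggested above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the relation name and block number are hypothetical placeholders, and the helper names are made up for this sketch. It only builds the SQL text; the resulting queries would be run by hand (for example in psql) on both the primary and the standby, with the pageinspect extension installed, and the outputs compared.

```python
from typing import List

# HEAP_XMIN_FROZEN is HEAP_XMIN_COMMITTED (0x0100) | HEAP_XMIN_INVALID (0x0200).
HEAP_XMIN_FROZEN = 0x0300


def sql_literal(value: str) -> str:
    """Quote a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def page_check_queries(relation: str, block: int) -> List[str]:
    """Build pageinspect queries for one heap block.

    The first query reports the page LSN, the second lists each tuple's
    xmin/xmax and whether its infomask marks xmin as frozen.
    """
    if block < 0:
        raise ValueError("block number must be non-negative")
    page = f"get_raw_page({sql_literal(relation)}, {int(block)})"
    lsn_query = f"SELECT lsn FROM page_header({page});"
    tuple_query = (
        "SELECT lp, t_xmin, t_xmax, "
        f"(t_infomask & {HEAP_XMIN_FROZEN}) = {HEAP_XMIN_FROZEN} AS xmin_frozen "
        f"FROM heap_page_items({page});"
    )
    return [lsn_query, tuple_query]


if __name__ == "__main__":
    # "public.my_table" and block 0 are placeholders, not values from the thread.
    for query in page_check_queries("public.my_table", 0):
        print(query)
```

If the suspicion is right, the same block would show a different LSN on the two servers, with xmin_frozen true on the primary and false on the standby.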

Unfortunately, I do not have direct access to the backup, so I am not able to test it. But VACUUM FREEZE DISABLE_PAGE_SKIPPING on the master didn't help.


 


Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/67EADE8F-AEA6-4B73-8E38-A69E5D48BAFE%40yandex-team.ru#1266dd8b898ba02686c2911e0a50ab47
