Re: [HACKERS] Broken hint bits (freeze) - Mailing list pgsql-hackers

From Vladimir Borodin
Subject Re: [HACKERS] Broken hint bits (freeze)
Date
Msg-id D7B95626-BF11-4E7E-AF10-0AB4B5BE9E79@simply.name
In response to Re: [HACKERS] Broken hint bits (freeze)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On 24 May 2017, at 15:44, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, May 24, 2017 at 7:27 AM, Dmitriy Sarafannikov
<dsarafannikov@yandex.ru> wrote:
It seems like the replica did not replay the corresponding WAL records.
Any thoughts?

heap_xlog_freeze_page() is a pretty simple function.  It's not
impossible that it could have a bug that causes it to incorrectly skip
records, but it's not clear why that wouldn't affect many other replay
routines equally, since the pattern of using the return value of
XLogReadBufferForRedo() to decide what to do is widespread.
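
For readers unfamiliar with the pattern referred to here, the following is a minimal standalone sketch of how a redo routine branches on the result of XLogReadBufferForRedo(). It is not PostgreSQL source code. The enum values mirror the real XLogRedoAction names, but Page, read_buffer_for_redo() and redo_freeze() are hypothetical stand-ins, and the LSN comparison is a simplified assumption about how an already-applied record is detected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Names mirror PostgreSQL's XLogRedoAction; everything else is a mock. */
typedef enum { BLK_NEEDS_REDO, BLK_DONE, BLK_RESTORED, BLK_NOTFOUND } RedoAction;

typedef struct {
    uint64_t lsn;     /* LSN of the last record applied to this page */
    bool     frozen;  /* stand-in for the tuple state a freeze record changes */
} Page;

/*
 * Hypothetical stand-in for XLogReadBufferForRedo(): decide whether the
 * record at record_lsn still has to be applied to the page.
 * The full-page-image case (BLK_RESTORED) is left out of this mock.
 */
static RedoAction read_buffer_for_redo(const Page *page, uint64_t record_lsn)
{
    if (page == NULL)
        return BLK_NOTFOUND;      /* block no longer exists */
    if (page->lsn >= record_lsn)
        return BLK_DONE;          /* page already reflects this record */
    return BLK_NEEDS_REDO;
}

/* Hypothetical redo routine: apply the change only when told to. */
static RedoAction redo_freeze(Page *page, uint64_t record_lsn)
{
    RedoAction action = read_buffer_for_redo(page, record_lsn);

    if (action == BLK_NEEDS_REDO) {
        page->frozen = true;
        page->lsn = record_lsn;   /* stamp the page so a second replay is a no-op */
    }
    return action;
}
```

In this model a record is skipped only when the page LSN is already at or past the record's LSN, or when the block is gone. Every redo routine written this way shares that skip logic, which is why a bug there would be expected to show up well beyond freeze records.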

Can you prove that other WAL records generated around the same time as
the freeze record *were* replayed on the standby?  If so, that proves
that this isn't just a case of the WAL never reaching the standby.
Can you look at the segment that contains the relevant freeze record
with pg_xlogdump?  Maybe that record is messed up somehow.

Not yet. Most such cases predate our recovery window, so the corresponding WAL segments have already been deleted. We have since adjusted our retention policy and are now waiting for a fresh case.


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
May the force be with you…
