Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple - Mailing list pgsql-hackers

From Wong, Yi Wen
Subject Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple
Date
Msg-id 1507324152711.78083@amazon.com
Whole thread Raw
In response to Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
List pgsql-hackers
>On Fri, Oct 6, 2017 at 11:34 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>>> I don't know if it's really the freeze map at fault or something else.
>>
>> Ideally, it would be possible to effectively disable the new freeze
>> map stuff in a minimal way, for testing purposes. Perhaps the authors
> of that patch, CC'd, can suggest a way to do that.

>Actually, the simplest thing might be to just use pg_visibility's
>pg_check_frozen() to check that the visibility/freeze map accurately
>summarizes the all-frozen status of tuples in the heap. If that
>doesn't indicate that there is corruption, we can be fairly confident
>that the problem is elsewhere. The metadata in the visibility/freeze
>map should be accurate when a bit is set to indicate that an entire
>heap page is all-frozen (or, separately, all-visible). We can hardly
>expect it to have better information that the authoritative source of
>truth, the heap itself.

>The more I think about it, the more I tend to doubt that the remaining
>problems are with the freeze map. If the freeze map was wrong, and
>incorrectly said that a page was all-frozen, then surely the outward
>symptoms would take a long time to show up, as they always do when we
>accidentally fail to freeze a tuple before a relfrozenxid cutoff. ISTM
>that that's the only meaningful way that the freeze map can be wrong
>-- it only promises to be accurate when it says that no further
>freezing is needed for a page/bit.

Yesterday, I've been spending time with pg_visibility on the pages when I reproduce the issue in 9.6.
None of the all-frozen or all-visible bits are necessarily set in problematic pages.

ERROR:  failed to find parent tuple for heap-only tuple at (0,182) in table "accounts"
blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------    0 | f           | f          | f
(1 row)

Even when the bits were set, I haven't found issues with the pg_check_xxx functions in the dozens of times I've run
them.

postgres=# select * from pg_visibility('accounts'::regclass);blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------    0 | f           | f          | f    1 | t           | t
|t 
(2 rows)

postgres=# select pg_check_visible('accounts'::regclass);pg_check_visible
------------------
(0 rows)

postgres=# select pg_check_frozen('accounts'::regclass);pg_check_frozen
-----------------
(0 rows)

Yi Wen

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Badrul Chowdhury
Date:
Subject: Re: [HACKERS] Re: protocol version negotiation (Re: LibpqPGRES_COPY_BOTH - version compatibility)
Next
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple