Re: [HACKERS] A design for amcheck heapam verification - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [HACKERS] A design for amcheck heapam verification
Date
Msg-id CANP8+jKDV3fxucH_GJqS_0=2cP0NVYy83ysn_epeGNScY4X2sA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] A design for amcheck heapam verification  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] A design for amcheck heapam verification  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On 29 March 2018 at 01:49, Peter Geoghegan <pg@bowt.ie> wrote:

>>> IndexBuildHeapRangeScan() doesn't mention anything about CIC's heap
>>> ShareUpdateExclusiveLock (it just says SharedLock), because that lock
>>> strength doesn't have anything to do with IndexBuildHeapRangeScan()
>>> when it operates with an MVCC snapshot. I think that this means that
>>> this patch doesn't need to update comments within
>>> IndexBuildHeapRangeScan(). Perhaps that's a good idea, but it seems
>>> independent.
>>
>>
>> Ok, I agree. But note that we are now invoking that code with
>> AccessShareLock() whereas the existing callers either use ShareLock or
>> ShareUpdateExclusiveLock. That's probably does not matter, but it's a change
>> worth noting.
>
> Fair point, even though the ShareUpdateExclusiveLock case isn't
> actually acknowledged by IndexBuildHeapRangeScan(). Can we leave this
> one up to the committer, too? I find it very hard to argue either for
> or against this, and I want to avoid "analysis paralysis" at this
> important time.

The above discussion doesn't make sense to me, hope someone will explain.

I understand we are adding a check to verify heap against index and
this will take longer than before. When it runs does it report
progress of the run via pg_stat_activity, so we can monitor how long
it will take?

Locking is also an important concern.

If we need a ShareLock to run the extended check and the check runs
for a long time, when would we decide to run that? This sounds like it
will cause a production outage, so what are the pre-conditions that
would lead us to say "we'd better run this". For example, if running
this is known to be signficantly faster than running CREATE INDEX,
that might be an argument for someone to run this first if index
corruption is suspected.

If it detects an issue, can it fix the issue for the index by
injecting correct entries? If not then we will have to run CREATE
INDEX afterwards anyway, which makes it more likely that people would
just run CREATE INDEX and not bother with the check.

So my initial questions are about when we would run this and making
sure that is documented.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Commit fest 2017-11
Next
From: Thomas Munro
Date:
Subject: Re: Parallel safety of binary_upgrade_create_empty_extension