Re: [HACKERS] A design for amcheck heapam verification - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: [HACKERS] A design for amcheck heapam verification
Msg-id 049AE496-791B-4C0E-8ACB-43832F9FA2B8@yandex-team.ru
In response to Re: [HACKERS] A design for amcheck heapam verification  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hello!

I like the heapam verification functionality and am already using it, so I'm planning to review this patch,
probably this week.

From my current use I have some thoughts on the interface. Here's what I get.

# select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f',true);
ERROR:  XX001: heap tuple (45,21) from table "messagefiltervalue" lacks matching index tuple within index "messagefiltervalue_group_id_59490523e6ee451f"
HINT:  Retrying verification using the function bt_index_parent_check() might provide a more specific error.
LOCATION:  bt_tuple_present_callback, verify_nbtree.c:1316
Time: 45.668 ms

# select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f');
bt_index_check
----------------

(1 row)
Time: 32.873 ms

# select bt_index_parent_check('messagefiltervalue_group_id_59490523e6ee451f');
ERROR:  XX002: down-link lower bound invariant violated for index "messagefiltervalue_group_id_59490523e6ee451f"
DETAIL:  Parent block=6259 child index tid=(1747,2) parent page lsn=4A0/728F5DA8.
LOCATION:  bt_downlink_check, verify_nbtree.c:1188
Time: 391194.113 ms


It seems the new check runs about 4 orders of magnitude faster than bt_index_parent_check() and still finds my specific
error, which bt_index_check() without the new option missed.
From this output I can see that there is corruption, but I cannot tell:
1. What the scale of the corruption is
2. Whether these corruptions are related or not

I think an interface to list all errors, or the top N errors, could be useful.
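
As a rough illustration of the idea only, here is a minimal sketch in Python (not C, and not amcheck's actual API). The names Finding, collect_findings and make_check are hypothetical, and the TIDs and messages in the demo are made-up toy values. It shows a checker that keeps going after the first problem and returns either every finding or only the first N.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Finding:
    """One reported inconsistency (hypothetical structure, not from amcheck)."""
    location: str  # e.g. a heap TID rendered as text
    message: str


def collect_findings(
    checks: Iterable[Callable[[], Optional[Finding]]],
    limit: Optional[int] = None,
) -> List[Finding]:
    """Run the checks in order, keeping up to `limit` findings.

    With limit=None every check runs and all findings are returned;
    otherwise the loop stops once `limit` findings have been collected.
    """
    findings: List[Finding] = []
    for check in checks:
        finding = check()
        if finding is None:
            continue  # this check passed
        findings.append(finding)
        if limit is not None and len(findings) >= limit:
            break
    return findings


def make_check(location: str, message: Optional[str] = None):
    """Build a toy check that fails only when a message is supplied."""
    def check() -> Optional[Finding]:
        return Finding(location, message) if message else None
    return check


# Toy data (made up for illustration): two failing checks and one passing.
demo_checks = [
    make_check("(45,21)", "lacks matching index tuple"),
    make_check("(45,22)"),
    make_check("(46,3)", "lacks matching index tuple"),
]

all_findings = collect_findings(demo_checks)
top_one = collect_findings(demo_checks, limit=1)

for f in all_findings:
    print(f.location, f.message)
```

With a list like all_findings in hand, one could count the entries to gauge the scale of the damage and compare locations to judge whether the problems are related.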

> On 14 Dec 2017, at 0:02, Peter Geoghegan <pg@bowt.ie> wrote:
>>
>> This could also test the reproducibility of the tests with a fixed
>> seed number and at least two rounds, a low number of elements could be
>> more appropriate to limit the run time.
>
> The runtime is already dominated by pg_regress overhead. As it says in
> the README, using a fixed seed in the test harness is pointless,
> because it won't behave in a fixed way across platforms. As long as we
> cannot ensure deterministic behavior, we may as well fully embrace
> non-determinism.
I think that determinism across platforms is not as important as determinism across runs.
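
A minimal sketch of what determinism across runs means here, written in Python rather than against the actual test harness. The function name generate_elements, the seed value and the element count are assumptions chosen for illustration. Two rounds with the same fixed seed produce the same elements, so a failure seen once can be replayed on the same machine. The sketch says nothing about how the real harness behaves across platforms.

```python
import random


def generate_elements(seed: int, count: int) -> list:
    """Return `count` pseudo-random 64-bit integers from a private generator."""
    rng = random.Random(seed)  # private generator, global state untouched
    return [rng.getrandbits(64) for _ in range(count)]


SEED = 12345   # hypothetical fixed seed
COUNT = 1000   # a small element count keeps the run time low

first_round = generate_elements(SEED, COUNT)
second_round = generate_elements(SEED, COUNT)
other_seed = generate_elements(SEED + 1, COUNT)

# Same seed, same sequence: the two rounds can be compared element by element.
print("rounds match:", first_round == second_round)
```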


Thanks for amcheck! It is very useful.

Best regards, Andrey Borodin.
