Re: [HACKERS] A design for amcheck heapam verification - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] A design for amcheck heapam verification
Date
Msg-id CAH2-WzmFxtPmepZgPrkXkHK3hc0juFOdf2ZtYxt05iDtTuetFQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] A design for amcheck heapam verification  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: [HACKERS] A design for amcheck heapam verification
List pgsql-hackers
Hi,

On Fri, Jan 12, 2018 at 1:41 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> I've looked into the code and here's my review.
>
> The new functionality works and is useful right now. I believe it should be shipped in the Postgres box (currently, I
installit with packet managers). 
> The code is nice and well documented.

Thanks.

> I'd be happy, if I could pass argument like ErrorLevel, which would help me to estimate scale of corruption. Or any
otherway to list more than one error. But, clearly, this functionality is not necessary for this patch. 

My previous remarks apply here, too. I don't know how to rate severity
among error messages.

> Also, I couldn't find where is check that ensures that there is a heap tuple for every B-tree leaf tuple. Is it
there?

No, there isn't, because that's inherently race-prone. This is not so
bad. If you don't end up discovering a problem the other way around
(going from the heap to the index/Bloom filter), then the only way
that an index tuple pointing to the wrong place can go undetected is
because there is a duplicate heap TID in the index, or the heap TID
doesn't exist in any form.

I actually prototyped a patch that makes bt_index_parent_check()
detect duplicate heap TIDs in an index, by collecting heap TIDs from
the index, sorting them, and scanning the sorted array. That seems
like material for another patch, though.

> I've found few neatpicks in bloom filter:
> 1. Choice of sdbmhash does not seems like a best option from my point of view.

I don't mind changing the second hash function, but I want to
emphasize that this shouldn't be expected to make anything more than a
very small difference. The bloom filter probes are slow because the
memory accesses are random.

There are many hash functions that would be suitable here, and not too
many reasons to prefer one over the other. My choice was based on a
desire for simplicity, and for something that seemed to be in
widespread usage for a long time.

> +       bitset_bytes = Max(1024L * 1024L, bitset_bytes);
> bloom_work_mem was supposed to be the limit? Why we do not honor this limit?

The minimum maintenance_work_mem setting is 1MB anyway.

> 3.
> +       filter = palloc0(offsetof(bloom_filter, bitset) +
> +                                        sizeof(unsigned char) * bitset_bytes);
> sizeof(unsigned char) == 1 by C standard

I know. That's just what I prefer to do as a matter of style.

> 4. function my_bloom_power() returns bit numers, then it's result is powered INT64CONST(1) << bloom_power; back. I'd
stikwith removing bits in a loop by while(target_bitset_bits & (target_bitset_bits-1)) {
target_bitset_bits&=target_bitset_bits-1;} or something like that. Or, if we use code like sdbm hash, fallback to
bithacks:) https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2 

my_bloom_power() is only called once per Bloom filter. So again, I
think that this is not very important. Thomas Munro talked about using
popcount() in another function, which is from the book Hacker's
Delight. While I'm sure that these techniques have their use, they
just don't seem to make sense when the alternative is simpler code
that is only going to be executed at most once per verification
operation anyway.

>     for (i = 0; i < filter->k_hash_funcs; i++)
>     {
>         /* Accumulate hash value for caller */
>         hashes[i] = (hasha + i * hashb + i) % filter->bitset_bits;
>     }
> }
> It produces almost same result (hashes 1..k-1 are +1'd), but uses a lot less % operations. Potential overflow is
governedby uint64 type. 

I prefer to be conservative, and to stick to what is described by
Dillinger & Manolios as much as possible.

> That's all what I've found. I do not know did the patch had all necessary reviewers attention. Please, feel free to
changestatus if you think that patch is ready. From my point of view, the patch is in a good shape. 

Michael said he'd do more review. I generally feel this is close, though.

Thanks for the review
--
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Next
From: Stephen Frost
Date:
Subject: Re: [HACKERS] INSERT .. ON CONFLICT DO SELECT [FOR ..]