Re: pg_amcheck contrib application - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pg_amcheck contrib application
Date
Msg-id CA+TgmobpeiMh7Y+vve7eRv8Rru=7FJ_Z2nM8EMxew+sWBr58Og@mail.gmail.com
Whole thread Raw
In response to Re: pg_amcheck contrib application  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_amcheck contrib application  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
On Thu, Apr 1, 2021 at 1:41 PM Robert Haas <robertmhaas@gmail.com> wrote:
> OK, committed. We still need to deal with what you had as 0003
> upthread, so I guess the next step is for me to spend some time
> reviewing that one.

I did this, and it was a bit depressing. It appears that we now have
duplicate checks for xmin and xmax being out of the valid range.
Somehow you have the removal of those duplicate checks in 0003, but
why in the world didn't you put them into one of the previous patches
and just make 0003 about fixing the
holding-buffer-lock-while-following-TOAST-pointers problem? (And, gah,
why did I not look carefully enough to notice that you hadn't done
that?)

Other than that, I notice a few other things:

- There are a total of two (2) calls in the current source code to
palloc0fast, and hundreds of calls to palloc0. So I think you should
forget about using the fast variant and just do what almost every
other caller does.

- If you want to make this code faster, a better idea would be to
avoid doing all of this allocate and free work and just allocate an
array that's guaranteed to be big enough, and then keep track of how
many elements of that array are actually in use.

- #ifdef DECOMPRESSION_CORRUPTION_CHECKING is not a useful way of
introducing such a feature. Either we do it for real and expose it via
SQL and pg_amcheck as an optional behavior, or we rip it out and
revisit it later. Given the nearness of feature freeze, my vote is for
the latter.

- I'd remove the USE_LZ4 bit, too. Let's not define the presence of
LZ4 data in a non-LZ4-enabled cluster as corruption. If we do, then
people will expect to be able to use this to find places where they
are dependent on LZ4 if they want to move away from it -- and if we
don't recurse into composite datums, that will not actually work.

- check_toast_tuple() has an odd and slightly unclear calling
convention for which there are no comments. I wonder if it would be
better to reverse things and make bool *error the return value and
what is now the return value into a pointer argument, but whatever we
do I think it needs a few words in the comments. We don't need to
slavishly explain every argument -- I think toasttup and ctx and tctx
are reasonably clear -- but this is not.

- To me it would be more logical to reverse the order of the
toast_pointer.va_toastrelid != ctx->rel->rd_rel->reltoastrelid and
VARATT_EXTERNAL_GET_EXTSIZE(toast_pointer) > toast_pointer.va_rawsize
- VARHDRSZ checks. Whether we're pointing at the correct relation
feels more fundamental.

- If we moved the toplevel foreach loop in check_toasted_attributes()
out to the caller, say renaming the function to just
check_toasted_attribute(), we'd save a level of indentation in that
whole function and probably add a tad of clarity, too. You wouldn't
feel the need to Assert(ctx.toasted_attributes == NIL) in the caller
if the caller had just done list_free(ctx->toasted_attributes);
ctx->toasted_attributes = NIL.

- Is there a reason we need a cross-check on both the number of chunks
and on the total size? It seems to me that we should check that each
individual chunk has the size we expect, and that the total number of
chunks is what we expect. The overall size is then necessarily
correct.

- Why are all the changes to the tests in this patch? What do they
have to do with getting the TOAST checks out from under the buffer
lock? I really need you to structure the patch series so that each
patch is about one topic and, equally, so that each topic is only
covered by one patch. Otherwise it's just way too confusing.

- I think some of these messages need a bit of word-smithing, too, but
we can leave that for when we're closer to being done with this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: New predefined roles- 'pg_read/write_all_data'
Next
From: Heikki Linnakangas
Date:
Subject: Re: Force lookahead in COPY FROM parsing