Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index. - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
Date
Msg-id CAPpHfdu4jF3TrML0rcNczYcf__xXRycKzn2Vi+kKgiYNNC0w-A@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.  (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.
List pgsql-hackers
On Sat, May 11, 2024 at 4:13 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> > On May 10, 2024, at 12:05 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > The only bt_target_page_check() caller is
> > bt_check_level_from_leftmost(), which overrides state->target in the
> > next iteration anyway.  I think the patch is just refactoring to
> > eliminate the confusion pointer by Peter Geoghegan upthread.
>
> I find your argument unconvincing.
>
> After bt_target_page_check() returns at line 919, and before bt_check_level_from_leftmost() overrides state->target
inthe next iteration, bt_check_level_from_leftmost() conditionally fetches an item from the page referenced by
state->target. See line 963. 
>
> I'm left with four possibilities:
>
>
> 1)  bt_target_page_check() never gets to the code that uses "rightpage" rather than "state->target" in the same
iterationwhere bt_check_level_from_leftmost() conditionally fetches an item from state->target, so the change you're
makingdoesn't matter. 
>
> 2)  The code prior to v2-0003 was wrong, having changed state->target in an inappropriate way, causing the wrong
thingto happen at what is now line 963.  The patch fixes the bug, because state->target no longer gets overwritten
whereyou are now using "rightpage" for the value. 
>
> 3)  The code used to work, having set up state->target correctly in the place where you are now using "rightpage",
butv2-0003 has broken that. 
>
> 4)  It's been broken all along and your patch just changes from wrong to wrong.
>
>
> If you believe (1) is true, then I'm complaining that you are relying far to much on action at a distance, and that
youare not documenting it.  Even with documentation of this interrelationship, I'd be unhappy with how brittle the code
is. I cannot easily discern that the two don't ever happen in the same iteration, and I'm not at all convinced one way
orthe other.  I tried to set up some Asserts about that, but none of the test cases actually reach the new code, so
addingAsserts doesn't help to investigate the question. 
>
> If (2) is true, then I'm complaining that the commit message doesn't mention the fact that this is a bug fix.  Bug
fixesshould be clearly documented as such, otherwise future work might assume the commit can be reverted with only
stylisticconsequences. 
>
> If (3) is true, then I'm complaining that the patch is flat busted.
>
> If (4) is true, then maybe we should revert the entire feature, or have a discussion of mitigation efforts that are
needed.
>
> Regardless of which of 1..4 you pick, I think it could all do with more regression test coverage.
>
>
> For reference, I said something similar earlier today in another email to this thread:
>
> This patch introduces a change that stores a new page into variable "rightpage" rather than overwriting
"state->target",which the old implementation most certainly did.  That means that after returning from
bt_target_page_check()into the calling function bt_check_level_from_leftmost() the value in state->target is not what
itwould have been prior to this patch.  Now, that'd be irrelevant if nobody goes on to consult that value, but just 44
linesfurther down in bt_check_level_from_leftmost() state->target is clearly used.  So the behavior at that point is
changingbetween the old and new versions of the code, and I think I'm within reason to ask if it was wrong before the
patch,wrong after the patch, or something else?  Is this a bug being introduced, being fixed, or ... ? 

Thank you for your analysis.  I'm inclined to believe in 2, but not
yet completely sure.  It's really pity that our tests don't cover
this.  I'm investigating this area.

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: 039_end_of_wal: error in "xl_tot_len zero" test
Next
From: Heikki Linnakangas
Date:
Subject: Re: Direct SSL connection with ALPN and HBA rules