Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger - Mailing list pgsql-bugs

From Alexander Lakhin
Subject Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date
Msg-id bc6d1314-4e7a-2bde-1f01-a14de4909efd@gmail.com
Whole thread Raw
In response to Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
List pgsql-bugs
30.04.2023 01:24, Tom Lane wrote:
> Alexander Lakhin <exclusion@gmail.com> writes:
>> Could you please confirm the issue, Richard's and my conclusions, and the
>> correctness of the patch, in light of the upcoming May releases?
>> Maybe I'm wrong, but I think that this bug can lead to data corruption in
>> databases where BRU triggers are used.
> Clearly it could, though the probability seems low since the just-released
> buffer would have to get recycled very quickly.

So I've made several mistakes when trying to reach the solution walking on
thin ice. I'd initially observed the issue with asan (and perhaps that had
happened under some specific conditions (memory pressure? (I can't say
now)), but when I produced and simplified the repro for the bug report,
I was focused on valgrind, so the final repro doesn't demonstrate a real
memory corruption and it's not clear now, which conditions needed to cause it.
So I really can't estimate the probability of the corruption.

> I am not entirely comfortable with blaming this on 86dc90056 though,
> even though "git bisect" seems to finger that.  The previous coding
> there with ExecFilterJunk() also produced a virtual tuple, as you
> can see by examining ExecFilterJunk, so why didn't the problem
> manifest before?  I think the answer may be that before 86dc90056,
> we were effectively relying on the underlying SeqScan node to keep
> the buffer pinned, but now we're re-fetching the tuple and no pin
> is held by lower plan nodes.  I'm not entirely sure about that though;
> ISTM a SeqScan ought to hold pin on its current tuple in any case.
> However, if the UPDATE plan were more complicated (involving a join)
> then it's quite believable that we'd get here with no relevant pin
> being held.

Yeah, also the environment changed since 86dc90056, so I couldn't just revert
that commit to test the ExecFilterJunk() behavior on master.

All the questions you raised require a more thorough investigation.

> The practical impact of that concern is that I'm not convinced that
> the proposed patch fixes all variants of the issue.  Maybe we need
> to materialize even if newslot != epqslot_clean.  Maybe we need to
> do it even if there wasn't a concurrent update.  Maybe we need to
> do it in pre-v14 branches, too.  It's certainly not obvious that this
> function ought to assume anything about which slots are pointing at
> what.
>
> Also, if we do materialize here, does this code a little further down
> become redundant?
>
>             if (should_free_trig && newtuple == trigtuple)
>                 ExecMaterializeSlot(newslot);
>
> A completely different way of looking at it is that we should not
> treat this as ExecBRUpdateTriggers's bug at all.  The slot mechanisms
> are supposed to protect the data referenced by a slot, so why is that
> failing to happen in this example?  The correct fix might involve
> newslot acquiring a buffer pin, for example.

Yes, it was tough to decide how to fix it (and the correct buffer pinning
seemed more plausible), but 75e03eabe gave me hope that it could be fixed
by the simple one-line change.

> Bottom line is that I'm afraid the present patch is band-aiding a
> specific problem without solving the general case.

I agree with your analysis and conclusions, so I'm going to turn the alarm
mode off and research the issue deeper.

Thank you for the detailed answer!

Best regards,
Alexander



pgsql-bugs by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array
Next
From: Tom Lane
Date:
Subject: Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array