Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger - Mailing list pgsql-bugs

From Alexander Lakhin
Subject Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date
Msg-id 94ca869c-472b-f1cf-3d7c-aa8530f5475d@gmail.com
Whole thread Raw
In response to Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-bugs
Hello Tom,

12.01.2024 00:43, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jan 10, 2024 at 10:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>>> I was discouraged by that vast distance and implicit buffer usage too, but
>>> I found no other feasible way to fix it. On the other hand, 75e03eabe and
>>> Andres's words upthread made me believe that it's an acceptable solution.
>> I agree that it's potentially acceptable. I just wonder if Tom or
>> someone else is going to want to propose a bigger change to avoid some
>> of this messiness. I don't know what that would look like, though.
> I'm still feeling itchy about it.  Maybe the problem could be fixed
> better if we didn't re-use slots with such abandon in this code,
> but I don't have a concrete proposal.  Also, that'd likely be a
> nontrivial rewrite bringing its own possibilities of adding bugs.

I'm not too excited by this approach either, but maybe it's the only
solution which can be implemented for now.
I have rechecked other ExecGetUpdateNewTuple() calls and confirmed that all
of these materialize newslot returned.
Namely,
1) ExecCrossPartitionUpdate() called by ExecUpdateAct(), where in case of a
  retry we have:
     /* ensure slot is independent, consider e.g. EPQ */
     ExecMaterializeSlot(slot);

2) ExecModifyTable() passes the slot returned to ExecUpdate(), which begins
  with a call to ExecUpdatePrologue(), which materializes the slot.

3) ExecUpdate() itself calls ExecGetUpdateNewTuple() in a loop (redo_act),
  where ExecUpdateAct() called, which also materializes the slot.

So this place in ExecBRUpdateTriggers() is the only one where we have no
  materialization after ExecGetUpdateNewTuple().

Maybe another option is to always materialize the slot inside
  ExecGetUpdateNewTuple(), but I'm afraid such a change is not suitable for
  back-patching...

> Some concrete thoughts:
>
> * Maybe better to Assert(newslot == epqslot_clean) instead of
> what you did here?  Not sure.

Me too. Other ExecGetUpdateNewTuple() calls don't have such Asserts nearby,
but I haven't studied yet whether they could have ones...

> * Why is the ExecMaterializeSlot step needed if we don't have
> an epqslot_candidate?

I had decided to move that step out of "if (epqslot_candidate != NULL)"
in v3 to play safe when removing ExecMaterializeSlot(newslot) brought by
75e03eabe (more details upthread [1]). I thought that if that commit stated
that the slot needs materialization (it was time before 86dc90056, which
raised the current issue), then the materialization must be preserved.
(The same commit can be found in zheap repository [2], but I found no other
explanation of it's purpose...)
Though looking at the current core code, I'm inclined to perform
ExecMaterializeSlot() only when we have epqslot_candidate, if we can afford
not to bother about that zheap-specific or similar scenario...

[1] https://www.postgresql.org/message-id/e989408c-1838-61cd-c968-1fcf47c6fbba%40gmail.com
[2] https://github.com/EnterpriseDB/zheap/commit/75e03eabe

Best regards,
Alexander



pgsql-bugs by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: "unexpected duplicate for tablespace" problem in logical replication
Next
From: PG Bug reporting form
Date:
Subject: BUG #18287: pg_restore with -C and -c options does not do what is said in the documentation