Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger - Mailing list pgsql-bugs

From Robert Haas
Subject Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date
Msg-id CA+TgmoYkOp=jDg15ugJDwxqoA6drAPBZmOJkhZJ11HZWhEYx6w@mail.gmail.com
Whole thread Raw
In response to Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger  (Alexander Lakhin <exclusion@gmail.com>)
Responses Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
List pgsql-bugs
On Wed, Jan 10, 2024 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> As far as I can see, oldslot and newslot hold different pointer values;
> e. g., with debug logging added (see attached (please forgive me dirty
> hacks inside)), I get:
> LOG:  ExecBRUpdateTriggers after ExecGetUpdateNewTuple()  1; oldslot: 0x758e3c8, slot->tts_tupleDescriptor:
0x7362fc8,
> slot->buffer: 1810, slot->buffer->refcount: 1, slot->tts_values: 0x758e438, i: 1, slot->tts_values[ii]: 0x9982c74
> STATEMENT:  UPDATE bruttest SET cnt = cnt + 1;
> LOG:  ExecBRUpdateTriggers after ExecGetUpdateNewTuple()  1; newslot: 0x747fb48, slot->tts_tupleDescriptor:
0x7362fc8,
> slot->buffer: 0, slot->buffer->refcount: 0, slot->tts_values: 0x747fbb8, i: 1, slot->tts_values[ii]: 0x9982c74
> STATEMENT:  UPDATE bruttest SET cnt = cnt + 1;
> ==00:00:00:05.052 3934530== Invalid read of size 1
> ==00:00:00:05.052 3934530==    at 0x1EECAC: heap_compute_data_size (heaptuple.c:244)
> ...
> ==00:00:00:05.052 3934530==  Address 0x9982c74 is in a rw- anonymous segment
> ==00:00:00:05.052 3934530==
> (when executing `make check` for the isolation test [1] under Valgrind)
>
> But as we can see, both slots use the same buffer (have pointers to the
> same attribute values) as a result of the following operations:
>                   EEOP_SCAN_FETCHSOME:
>                       slot_getsomeattrs(scanslot, op->d.fetch.last_var);
>                   ...
>                   EEOP_ASSIGN_SCAN_VAR:
>                       resultslot->tts_values[resultnum] = scanslot->tts_values[attnum]
> (More details upthread in [2])
>
> In the meantime, newslot doesn't hold a pin at all.

OK, thanks for the further explanation. I think I'm starting to
understand what's going on here. Maybe you already understand all of
this, but in my own words:

ExecGetUpdateNewTuple() calls ExecProject() which documents that the
results are short-lived and you should materialize the slot if that's
a problem. ExecModifyTable() does slot = ExecGetUpdateNewTuple(...),
does not materialize it, and then passes that slot as the penultimate
argument to ExecUpdate(), which calls ExecUpdatePrologue(), which
calls ExecBRUpdateTriggers(), and the value returned by
ExecModifyTable()'s call to ExecGetUpdateNewTuple() is now stored in
newslot, but it still hasn't been materialized. So any operation
that's performed on oldslot at this point invalidates newslot.

But ExecBRUpdateTriggers() now calls GetTupleForTrigger() to clobber
oldslot, so now newslot is also junk. That's technically OK, because
we're not going to access the contents of newslot any more. We're
instead going to refresh them:

        if (epqslot_candidate != NULL)
        {
            TupleTableSlot *epqslot_clean;

            epqslot_clean = ExecGetUpdateNewTuple(relinfo, epqslot_candidate,
                                                  oldslot);

            if (newslot != epqslot_clean)
                ExecCopySlot(newslot, epqslot_clean);
        }

Since the slot returned by this call to ExecGetUpdateNewTuple() is a
function of the same projection info as when it was called from
ExecModifyTable(), the slot returned here must be the same slot that
was returned there, namely newslot. So, as you say, newslot ==
epqslot_clean, and we've just clobbered it with updated values.
Therefore we're back in the situation where oldslot is fully valid and
newslot is only valid until somebody does something to oldslot. Thus,
the portion of your patch that simplifies this code doesn't fix any
bug, but rather just simplifies some unnecessarily complex logic. But
the next thing that happens in the existing code is:

        trigtuple = ExecFetchSlotHeapTuple(oldslot, true, &should_free_trig);

That is going to again invalidate newslot, because it can materialize
oldslot. That's a problem if we ever use newslot after this point, and
we do, unless we materialize newslot first, so your patch adds
ExecMaterializeSlot(newslot) right before this, fixing the problem.

What I don't really understand is why failure to do this causes an
invalid memory access in ExecFetchSlotHeapTuple(). According to the
above analysis, the invalid memory access ought to occur the first
time something touches newslot after this call to
ExecFetchSlotHeapTuple(), but if I understand your valgrind trace
correctly, it's showing that ExecFetchSlotHeapTuple() is touching
something it shouldn't. Do you understand what's happening here?

I think the question we should be asking ourselves here is whether
we're really OK with having such a vast distance between the point
where newslot is first populated and where materialization happens. It
seems like playing with fire to populate a slot with ephemeral data in
ExecModifyTable() and then press the materialize button four stack
frames down. There's a lot of opportunity for things to go wrong
there. On the other hand, this is very performance-sensitive code, so
maybe materializing here is the only workable solution. If that's the
case, though, we should probably add comments making the lifetime of
various slots clearer, because neither Tom nor I was able to
understand what the heck was happening here on first reading, which is
probably not a good sign.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #18282: repomd.xml signature could not be verified for pgdg-common
Next
From: Noah Misch
Date:
Subject: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()