Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
Date
Msg-id 13122.1574375845@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
List pgsql-bugs
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> FWIW my hunch is the bug is somewhere in this chunk of code from
> apply_heap_update:

>          oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>          ExecCopySlot(remoteslot, localslot);
>          slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
>          MemoryContextSwitchTo(oldctx);

My first reaction to looking at this was that ExecCopySlot() could
be dropped entirely, because the first thing that slot_modify_cstrings
does is

    slot_getallattrs(slot);
    ExecClearTuple(slot);

That slot_getallattrs call seems 100% useless as well, because
surely ExecClearTuple is just going to drop all the data.

On closer inspection, though, it looks like the author of this
code is relying on ExecClearTuple to *not* destroy the data,
because the subsequent loop might overwrite only some of the
slot's columns, before it does

    ExecStoreVirtualTuple(slot);

which supposes that all the columns are valid.

But of course this is 100% broken, because whatever ExecClearTuple
may have done or not done to the slot's datum/isnull arrays, it
certainly pfree'd the slot's physical tuple.  So any pass-by-ref
datums are now dangling pointers.

I imagine the only reason this code has gotten past the valgrind
animals is that we're not testing cases where non-replaced columns
in the subscriber table are of pass-by-ref types.

I draw a direct line between the existence of this bug and the lack
of commentary in slot_modify_cstrings about what it's doing.  If the
author had troubled to write a comment explaining these machinations,
he might've noticed the bug, or at least reviewers might have.

            regards, tom lane



pgsql-bugs by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
Next
From: Tom Lane
Date:
Subject: Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker