Thread: [BUG]: segfault during update

[BUG]: segfault during update

From: "Drouvot, Bertrand"

Hi hackers,

Here is a scenario that produces a segfault during update (on versions 12 and 13):

create table bdttrig (
    created_at timestamp without time zone not null default now(),
    col1 bool not null default false,
    col2 text not null default 'def',
    col3 char(6) not null default 'bdtbdt'
);
CREATE TABLE

CREATE or replace FUNCTION trigger_function()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
BEGIN
    IF (NEW != OLD) THEN
        NEW.created_at = OLD.created_at;
        RETURN NEW;
    END IF;
    RETURN OLD;
END;
$$;

CREATE FUNCTION

create trigger bdt_trigger before update on bdttrig for each row EXECUTE function trigger_function();

CREATE TRIGGER

insert into bdttrig(col1) values (true);

INSERT 0 1

alter table bdttrig add column col4 text  not null default 'default';

ALTER TABLE

alter table bdttrig drop column col2;

ALTER TABLE

update bdttrig  set col1 = true;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
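
For context, the attribute-level state left behind by the two ALTER TABLEs above can be inspected with a query like this (illustrative only, not part of the crash recipe):

select attnum, attname, attisdropped, atthasmissing
from pg_attribute
where attrelid = 'bdttrig'::regclass and attnum > 0
order by attnum;

col2 stays in the tuple descriptor as a dropped placeholder, and col4's default is stored as a "missing" (fast) default rather than being written into the pre-existing row, so the row inserted above physically lacks col4.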

I did some digging and found (and verified by testing) that the issue is fixed by this commit: https://github.com/postgres/postgres/commit/20d3fe9009ddbbbb3da3a2da298f922054b43f8c

So, we would need to back-port this commit to 12 and 13.

Thanks

Bertrand

Re: [BUG]: segfault during update

From: Tom Lane

"Drouvot, Bertrand" <bdrouvot@amazon.com> writes:
> Here is a scenario that produces a segfault during update (on versions 12
> and 13):

Hm.  So the point about failing to reproduce dropped columns is more
critical than I thought.  I wonder how we got away with that before
v12?

> So, we would need to back-port this commit to 12 and 13.

Yeah, this is sufficient reason why we must use the more invasive
patch on those branches.  What I'm wondering now is if there's a
way to break even-older branches based on failure to handle dropped
columns here.
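
(For instance, a probe along these lines — the report's recipe with only the DROP COLUMN and no fast-default column.  Just a sketch, with made-up table/function/trigger names, and no claim that it actually crashes anywhere:)

create table droponly (created_at timestamp not null default now(),
                       col1 bool not null default false,
                       col2 text not null default 'def');

create function droponly_fn() returns trigger language plpgsql as
$$
begin
    new.created_at = old.created_at;
    return new;
end;
$$;

create trigger droponly_trig before update on droponly
    for each row execute procedure droponly_fn();

insert into droponly(col1) values (true);
alter table droponly drop column col2;
update droponly set col1 = true;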

            regards, tom lane



Re: [BUG]: segfault during update

From: Tom Lane

I wrote:
> Yeah, this is sufficient reason why we must use the more invasive
> patch on those branches.  What I'm wondering now is if there's a
> way to break even-older branches based on failure to handle dropped
> columns here.

After tracing through the example in v11, I see why those branches
are not broken: when ExecBRUpdateTriggers decides to return the
trigger-returned tuple, it sticks it into a target slot like this:

        /*
         * Return the modified tuple using the es_trig_tuple_slot.  We assume
         * the tuple was allocated in per-tuple memory context, and therefore
         * will go away by itself. The tuple table slot should not try to
         * clear it.
         */
        TupleTableSlot *newslot = estate->es_trig_tuple_slot;
        TupleDesc    tupdesc = RelationGetDescr(relinfo->ri_RelationDesc);

        if (newslot->tts_tupleDescriptor != tupdesc)
            ExecSetSlotDescriptor(newslot, tupdesc);
        ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);

So the slot that ExecConstraints et al will be working with contains
the relation's actual tuple descriptor, not the approximate descr
obtained by looking at the plan tlist.

This logic is entirely gone in v12, which confirms my instinct that
there was something about Andres' slot-manipulation changes that
broke this scenario.  In v12 we end up using the junkfilter's output
slot, which does not have a sufficiently accurate tupdesc to deal with
an on-disk tuple rather than one constructed by the executor.

So I'll go see about back-patching 20d3fe900.

            regards, tom lane



Re: [BUG]: segfault during update

From: "Drouvot, Bertrand"

Hi,

On 11/8/20 6:46 PM, Tom Lane wrote:
> I wrote:
>> Yeah, this is sufficient reason why we must use the more invasive
>> patch on those branches.  What I'm wondering now is if there's a
>> way to break even-older branches based on failure to handle dropped
>> columns here.
> After tracing through the example in v11, I see why those branches
> are not broken: when ExecBRUpdateTriggers decides to return the
> trigger-returned tuple, it sticks it into a target slot like this:
>
>          /*
>           * Return the modified tuple using the es_trig_tuple_slot.  We assume
>           * the tuple was allocated in per-tuple memory context, and therefore
>           * will go away by itself. The tuple table slot should not try to
>           * clear it.
>           */
>          TupleTableSlot *newslot = estate->es_trig_tuple_slot;
>          TupleDesc    tupdesc = RelationGetDescr(relinfo->ri_RelationDesc);
>
>          if (newslot->tts_tupleDescriptor != tupdesc)
>              ExecSetSlotDescriptor(newslot, tupdesc);
>          ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
>
> So the slot that ExecConstraints et al will be working with contains
> the relation's actual tuple descriptor, not the approximate descr
> obtained by looking at the plan tlist.
>
> This logic is entirely gone in v12, which confirms my instinct that
> there was something about Andres' slot-manipulation changes that
> broke this scenario.  In v12 we end up using the junkfilter's output
> slot, which does not have a sufficiently accurate tupdesc to deal with
> an on-disk tuple rather than one constructed by the executor.
>
> So I'll go see about back-patching 20d3fe900.

Thanks for the back-patching!

Bertrand




Re: [BUG]: segfault during update

From: Andres Freund

Hi,

On 2020-11-08 12:46:44 -0500, Tom Lane wrote:
> This logic is entirely gone in v12, which confirms my instinct that
> there was something about Andres' slot-manipulation changes that
> broke this scenario.

Entirely possible :(. In my defense, it wasn't exactly obvious or
documented that the fast default code relied on this
ExecSetSlotDescriptor()...


> In v12 we end up using the junkfilter's output
> slot, which does not have a sufficiently accurate tupdesc to deal with
> an on-disk tuple rather than one constructed by the executor.

I really wonder if we ought to redesign the column default logic to
really be a property of the Attr, instead of the constraint stuff it is
now.

> So I'll go see about back-patching 20d3fe900.

Thanks.

Is it worth adding Bertrand's testcase to the regression suite in some
form?

Greetings,

Andres Freund



Re: [BUG]: segfault during update

From: Tom Lane

Andres Freund <andres@anarazel.de> writes:
> On 2020-11-08 12:46:44 -0500, Tom Lane wrote:
>> In v12 we end up using the junkfilter's output
>> slot, which does not have a sufficiently accurate tupdesc to deal with
>> an on-disk tuple rather than one constructed by the executor.

> I really wonder if we ought to redesign the column default logic to
> really be a property of the Attr, instead of the constraint stuff it is
> now.

Yeah, I don't much like treating it as a constraint either.  Not quite
sure that it's worth the work to move it somewhere else, though.

> Is it worth adding Bertrand's testcase to the regression suite in some
> form?

I did add an equivalent test case.
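
(Roughly along these lines, condensed from the report above — a sketch with made-up object names; the committed test may differ in details:)

create table trigtest (created_at timestamp not null default now(),
                       col1 bool not null default false,
                       col2 text not null default 'def');

create function trigtest_fn() returns trigger language plpgsql as
$$
begin
    new.created_at = old.created_at;
    return new;
end;
$$;

create trigger trigtest_trig before update on trigtest
    for each row execute procedure trigtest_fn();

insert into trigtest(col1) values (true);
alter table trigtest add column col4 text not null default 'd4';
alter table trigtest drop column col2;
update trigtest set col1 = true;  -- the statement that used to crash
select col1, col4 from trigtest;

drop table trigtest;
drop function trigtest_fn();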

            regards, tom lane