Thread: [BUG]: segfault during update
Hi hackers,
Here is a scenario that produces a segfault during update (on versions 12 and 13):
create table bdttrig (
    created_at timestamp without time zone not null default now(),
    col1 bool not null default false,
    col2 text not null default 'def',
    col3 char(6) not null default 'bdtbdt'
);
CREATE TABLE
CREATE OR REPLACE FUNCTION trigger_function()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    IF (NEW != OLD) THEN
        NEW.created_at = OLD.created_at;
        RETURN NEW;
    END IF;
    RETURN OLD;
END;
$$;
CREATE FUNCTION
create trigger bdt_trigger before update on bdttrig for each row EXECUTE function trigger_function();
CREATE TRIGGER
insert into bdttrig(col1) values (true);
INSERT 0 1
alter table bdttrig add column col4 text not null default 'default';
ALTER TABLE
alter table bdttrig drop column col2;
ALTER TABLE
update bdttrig set col1 = true;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
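For convenience, here are the same steps collected into one self-contained script (same object names as in the transcript above; the final UPDATE is the statement that crashes on the affected versions):

```sql
-- Reproduction script for the BEFORE UPDATE trigger segfault,
-- consolidating the transcript above.
CREATE TABLE bdttrig (
    created_at timestamp without time zone NOT NULL DEFAULT now(),
    col1 bool NOT NULL DEFAULT false,
    col2 text NOT NULL DEFAULT 'def',
    col3 char(6) NOT NULL DEFAULT 'bdtbdt'
);

CREATE OR REPLACE FUNCTION trigger_function()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    IF (NEW != OLD) THEN
        NEW.created_at = OLD.created_at;
        RETURN NEW;
    END IF;
    RETURN OLD;
END;
$$;

CREATE TRIGGER bdt_trigger BEFORE UPDATE ON bdttrig
    FOR EACH ROW EXECUTE FUNCTION trigger_function();

INSERT INTO bdttrig (col1) VALUES (true);

-- Adding a column with a non-NULL default, then dropping a column,
-- makes the on-disk tuple layout diverge from the plan's descriptor.
ALTER TABLE bdttrig ADD COLUMN col4 text NOT NULL DEFAULT 'default';
ALTER TABLE bdttrig DROP COLUMN col2;

-- On unpatched 12/13 this UPDATE crashes the backend.
UPDATE bdttrig SET col1 = true;
```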
I did some digging, and found and verified that the issue is fixed by this commit: https://github.com/postgres/postgres/commit/20d3fe9009ddbbbb3da3a2da298f922054b43f8c

So, we would need to back-port this commit to 12 and 13.
Thanks
Bertrand
"Drouvot, Bertrand" <bdrouvot@amazon.com> writes:
> Here is a scenario that produces segfault during update (on version 12
> and 13):

Hm.  So the point about failing to reproduce dropped columns is more
critical than I thought.  I wonder how come we get away with that
before v12?

> So, we would need to back port this commit on 12 and 13.

Yeah, this is sufficient reason why we must use the more invasive
patch on those branches.  What I'm wondering now is if there's a
way to break even-older branches based on failure to handle dropped
columns here.

			regards, tom lane
I wrote:
> Yeah, this is sufficient reason why we must use the more invasive
> patch on those branches.  What I'm wondering now is if there's a
> way to break even-older branches based on failure to handle dropped
> columns here.

After tracing through the example in v11, I see why those branches
are not broken: when ExecBRUpdateTriggers decides to return the
trigger-returned tuple, it sticks it into a target slot like this:

    /*
     * Return the modified tuple using the es_trig_tuple_slot.  We assume
     * the tuple was allocated in per-tuple memory context, and therefore
     * will go away by itself.  The tuple table slot should not try to
     * clear it.
     */
    TupleTableSlot *newslot = estate->es_trig_tuple_slot;
    TupleDesc tupdesc = RelationGetDescr(relinfo->ri_RelationDesc);

    if (newslot->tts_tupleDescriptor != tupdesc)
        ExecSetSlotDescriptor(newslot, tupdesc);
    ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);

So the slot that ExecConstraints et al will be working with contains
the relation's actual tuple descriptor, not the approximate descr
obtained by looking at the plan tlist.

This logic is entirely gone in v12, which confirms my instinct that
there was something about Andres' slot-manipulation changes that
broke this scenario.  In v12 we end up using the junkfilter's output
slot, which does not have a sufficiently accurate tupdesc to deal with
an on-disk tuple rather than one constructed by the executor.

So I'll go see about back-patching 20d3fe900.

			regards, tom lane
Hi,

On 11/8/20 6:46 PM, Tom Lane wrote:
> I wrote:
>> Yeah, this is sufficient reason why we must use the more invasive
>> patch on those branches.  What I'm wondering now is if there's a
>> way to break even-older branches based on failure to handle dropped
>> columns here.
>
> After tracing through the example in v11, I see why those branches
> are not broken: when ExecBRUpdateTriggers decides to return the
> trigger-returned tuple, it sticks it into a target slot like this:
>
>     /*
>      * Return the modified tuple using the es_trig_tuple_slot.  We assume
>      * the tuple was allocated in per-tuple memory context, and therefore
>      * will go away by itself.  The tuple table slot should not try to
>      * clear it.
>      */
>     TupleTableSlot *newslot = estate->es_trig_tuple_slot;
>     TupleDesc tupdesc = RelationGetDescr(relinfo->ri_RelationDesc);
>
>     if (newslot->tts_tupleDescriptor != tupdesc)
>         ExecSetSlotDescriptor(newslot, tupdesc);
>     ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
>
> So the slot that ExecConstraints et al will be working with contains
> the relation's actual tuple descriptor, not the approximate descr
> obtained by looking at the plan tlist.
>
> This logic is entirely gone in v12, which confirms my instinct that
> there was something about Andres' slot-manipulation changes that
> broke this scenario.  In v12 we end up using the junkfilter's output
> slot, which does not have a sufficiently accurate tupdesc to deal with
> an on-disk tuple rather than one constructed by the executor.
>
> So I'll go see about back-patching 20d3fe900.

Thanks for the back-patching!

Bertrand
Hi,

On 2020-11-08 12:46:44 -0500, Tom Lane wrote:
> This logic is entirely gone in v12, which confirms my instinct that
> there was something about Andres' slot-manipulation changes that
> broke this scenario.

Entirely possible :(.  In my defense, it wasn't exactly obvious or
documented that the fast default code relied on this
ExecSetSlotDescriptor()...

> In v12 we end up using the junkfilter's output
> slot, which does not have a sufficiently accurate tupdesc to deal with
> an on-disk tuple rather than one constructed by the executor.

I really wonder if we ought to redesign the column default logic to
really be a property of the Attr, instead of the constraint stuff it is
now.

> So I'll go see about back-patching 20d3fe900.

Thanks.  Is it worth adding Bertrand's testcase to the regression suite
in some form?

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2020-11-08 12:46:44 -0500, Tom Lane wrote:
>> In v12 we end up using the junkfilter's output
>> slot, which does not have a sufficiently accurate tupdesc to deal with
>> an on-disk tuple rather than one constructed by the executor.

> I really wonder if we ought to redesign the column default logic to
> really be a property of the Attr, instead of the constraint stuff it is
> now.

Yeah, I don't much like treating it as a constraint either.  Not quite
sure that it's worth the work to move it somewhere else, though.

> Is it worth adding Bertrand's testcase to the regression suite in some
> form?

I did add an equivalent test case.

			regards, tom lane
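[Editor's note: the test case Tom committed is in the back-patched fix; for readers following along, a hypothetical sketch of an equivalent regression test (object names invented here, not the actual committed test) might look like:]

```sql
-- Hypothetical sketch of a regression test for this scenario:
-- a BEFORE UPDATE row trigger on a table that has both a
-- fast-default column and a dropped column.
CREATE TABLE trigger_ddl_test (
    created_at timestamp NOT NULL DEFAULT now(),
    col1 bool NOT NULL DEFAULT false,
    col2 text NOT NULL DEFAULT 'def'
);

CREATE FUNCTION keep_created_at() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Returning a modified NEW forces the executor to use the
    -- trigger-returned tuple, which is what exposed the bad tupdesc.
    NEW.created_at = OLD.created_at;
    RETURN NEW;
END;
$$;

CREATE TRIGGER keep_created_at_trg BEFORE UPDATE ON trigger_ddl_test
    FOR EACH ROW EXECUTE FUNCTION keep_created_at();

INSERT INTO trigger_ddl_test (col1) VALUES (true);

-- Fast default plus dropped column: the combination from the report.
ALTER TABLE trigger_ddl_test ADD COLUMN col4 text NOT NULL DEFAULT 'default';
ALTER TABLE trigger_ddl_test DROP COLUMN col2;

-- Before the fix, this UPDATE crashed the server on v12/v13;
-- after it, the row should simply be updated.
UPDATE trigger_ddl_test SET col1 = true;
SELECT col1, col4 FROM trigger_ddl_test;

DROP TABLE trigger_ddl_test;
DROP FUNCTION keep_created_at();
```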