Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006 - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006
Msg-id 20200323185330.hbcizqtmtrappvoj@alap3.anarazel.de
In response to Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006  (Andres Freund <andres@anarazel.de>)
Responses Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006
List pgsql-bugs
Hi,

On 2020-03-21 23:52:17 -0700, Andres Freund wrote:
> On 2020-03-17 19:49:43 +0900, Amit Langote wrote:
> > On Sun, Mar 15, 2020 at 9:36 AM Andres Freund <andres@anarazel.de> wrote:
> > > I don't think it's ok for ExecConstraints() to overwrite the tuple
> > > descriptor of a slot "owned" by nodeResult.c. Am I missing something, or
> > > is that broken?
> > 
> > Looking into it, that partitioning code in ExecConstraints() would not
> > "normally" overwrite a slot that is not managed by partitioning.
> > Normally, there would be a dedicated "partition" slot that would be
> > used, but only if tuple conversion from root parent to leaf partition
> > is necessary before routing the tuple.
> > 
> > What is not "normal" in this case is that tuple conversion is deemed
> > necessary when converting from leaf partition to root parent whereas
> > not in the other direction.  It's because one of the partition's
> > attributes has atthasmissing set to true, which triggers this code in
> > convert_tuples_by_name():
> 
> Hm. I don't think we generally reach this path only with a "partition"
> slot? I'm looking at 11, for the purpose of this bug. The INSERT case
> "normally" is only reached with the slot returned by
> ExecPrepareTupleRouting(), true. But ExecConstraints() is also
> e.g. called from ExecUpdate() - and as far as I can tell that will be
> either the slot from the plan, or the slot from the junkfilter.

Ah - it looks like we'll usually not (never?) have to do the conversion
for ExecUpdate(), because it'll execute ExecConstraints() with the leaf
partition's ResultRelInfo.

I'm a bit confused as to what this whole thing is supposed to be doing,
tbh. The explanatory comment says:
                /*
                 * If the tuple has been routed, it's been converted to the
                 * partition's rowtype, which might differ from the root
                 * table's.  We must convert it back to the root table's
                 * rowtype so that val_desc shown error message matches the
                 * input tuple.
                 */

but it's not clear at all why this is a good thing to be doing. For one,
the violation very well might be dependent on the child table's
definition. For another, we actually display the child table in the
error message:
SCHEMA NAME:  public
TABLE NAME:  other_partition
CONSTRAINT NAME:  dummy_check
so it's not clear why we'd want to show the tuple in the "root" format.

And lastly, given that we're not actually showing the input tuple
(i.e. planSlot), but rather the tuple already modified by default
values, triggers, etc, I fail to understand why this is something we
should do at all.


> > > It seems fairly insane how many places have this approximate code,
> > > btw. In 11 we have a copy of nearly the same ~40 lines each in at least
> > > ExecPartitionCheckEmitError(), ExecConstraints (twice) and
> > > ExecWithCheckOptions().
> > 
> > That is certainly bad.  I think we should get that code in one place
> > and inside ExecBuildSlotValueDescription() seems like a good place,
> > because that's what needs a converted tuple to build a string from it.
> > I have tried that in the attached -- need different patches for
> > different branches as there have been a lot of small changes in this
> > code over releases.
> 
> That certainly looks better.

One thing that concerns me with this change is that now
ExecBuildSlotValueDescription() creates a slot each time it's
called. That's fine and dandy for routines that immediately throw an
error, but it's not too hard to imagine ExecBuildSlotValueDescription()
being used for other things too.


Btw, I don't think this is something we can apply to the back branches
without very good reasons - changing ExecBuildSlotValueDescription()
could break extensions. I'm only mentioning this because your cleanup
patch doesn't seem to be based on HEAD.


If we actually want to keep this conversion, I wonder if the right
answer would be to change those routines to pass in
mtstate->mt_root_tuple_slot instead (and ensure it's created when
needed). One way to achieve that would be to move
ModifyTableState->mt_root_tuple_slot to ResultRelInfo - and potentially
create it on demand?
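
To illustrate the "create it on demand" idea, here is a minimal
standalone sketch. It does not use PostgreSQL's real types or
functions: ToySlot, ToyResultRelInfo, toy_make_slot() and
toy_get_root_slot() are hypothetical stand-ins invented for this
example, and the counter exists only to make the behaviour observable.
The point is just that the slot is allocated on first use and reused
afterwards, rather than created on every call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tuple slot; not PostgreSQL's TupleTableSlot. */
typedef struct ToySlot
{
    int         natts;          /* number of attributes in the row type */
} ToySlot;

/* Hypothetical stand-in for ResultRelInfo, holding a cached root slot. */
typedef struct ToyResultRelInfo
{
    int         root_natts;         /* attribute count of the root row type */
    ToySlot    *root_tuple_slot;    /* NULL until first needed */
} ToyResultRelInfo;

/* Counts allocations so the caching behaviour can be checked. */
static int  toy_slots_created = 0;

/* Allocate a new slot for a row type with the given attribute count. */
static ToySlot *
toy_make_slot(int natts)
{
    ToySlot    *slot = malloc(sizeof(ToySlot));

    assert(slot != NULL);
    slot->natts = natts;
    toy_slots_created++;
    return slot;
}

/*
 * Return the root-format slot cached in the relation info, creating it
 * on first use.  Later calls return the same slot without allocating.
 */
static ToySlot *
toy_get_root_slot(ToyResultRelInfo *rri)
{
    if (rri->root_tuple_slot == NULL)
        rri->root_tuple_slot = toy_make_slot(rri->root_natts);
    return rri->root_tuple_slot;
}
```

With this shape, a caller that builds an error description would ask
for the cached slot instead of making its own, so repeated calls for
the same relation cost one allocation in total. Whether the real slot
should live in ResultRelInfo or stay in ModifyTableState is exactly the
open question above; the sketch takes no position on that.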

Greetings,

Andres Freund


