Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take) - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)
Date
Msg-id CAEepm=0U2K4GcWShj0hi5NqF0T4y1g2Gf1k1r7H8Hb=FvCh2gQ@mail.gmail.com
In response to Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, May 4, 2017 at 4:02 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> I suspect that most users would find it more useful to capture all of
>> the rows that the statement actually touched, regardless of whether
>> they hit the named table or an inheritance child.
>
> Yes, agreed.  For the plain inheritance cases each row would need to
> have an indicator of which relation it comes from (tableoid); I'm not
> sure if such a thing would be useful in the partitioning case.

On Thu, May 4, 2017 at 4:26 AM, David Fetter <david@fetter.org> wrote:
> +1 on the not-duct-tape view of partitioned tables.

Hmm.  Ok.  Are we talking about PG10 or PG11 here?  Does this approach
make sense?

1.  Remove the prohibition on creating transition-capturing triggers
on a partitioned table.

2.  Make sure that the ExecAR* functions call AfterTriggerSaveEvent
when modifying partitions if the explicitly named parent partitioned
table has after triggers with transition tables.  Not sure exactly
how, but it doesn't seem difficult.

3.  Convert tuples to the TupleDesc of the relation that owns the
statement trigger (i.e. the partitioned table) when inserting them into
the tuplestore.  One way to do that might be to build an array of
TupleConversionMap objects that does the opposite of the conversions
done by tup_conv_maps.  While tup_conv_maps is for converting tuples
to the layout needed for a partition, tup_unconv_maps (or better name)
would be for converting the old and new tuples to the TupleDesc of the
partitioned table.  Then the appropriate TupleConversionMap could be
passed into the ExecAR* functions as a new argument 'transition_map'.
AfterTriggerSaveEvent would use 'oldtup' and 'newtup' directly for ROW
triggers, but convert them using the passed-in map if it needs to
insert them into the transition tuplestores.
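
To make step 3 concrete, here is a minimal standalone sketch of the
"reverse" map idea.  It does not use the real PostgreSQL structures:
ToyDesc, ToyMap and the toy_* functions are hypothetical stand-ins for
TupleDesc, TupleConversionMap and the conversion done before a tuple
goes into a transition tuplestore, and columns are matched by name
only, with plain int values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ATTRS 8

/* Hypothetical stand-in for a TupleDesc: an ordered list of column names. */
typedef struct ToyDesc
{
    int         natts;
    const char *names[TOY_MAX_ATTRS];
} ToyDesc;

/*
 * Hypothetical stand-in for a TupleConversionMap: src[i] is the index of
 * the input column that feeds output column i.
 */
typedef struct ToyMap
{
    int natts;
    int src[TOY_MAX_ATTRS];
} ToyMap;

/*
 * Build a map from the 'in' layout (a partition) to the 'out' layout (the
 * partitioned table), matching columns by name.  Returns 0 on success, -1
 * if a column of 'out' has no counterpart in 'in' or a layout is too wide.
 */
int
toy_build_map(const ToyDesc *in, const ToyDesc *out, ToyMap *map)
{
    if (in->natts > TOY_MAX_ATTRS || out->natts > TOY_MAX_ATTRS)
        return -1;

    map->natts = out->natts;
    for (int i = 0; i < out->natts; i++)
    {
        int found = -1;

        for (int j = 0; j < in->natts; j++)
        {
            if (strcmp(out->names[i], in->names[j]) == 0)
            {
                found = j;
                break;
            }
        }
        if (found < 0)
            return -1;
        map->src[i] = found;
    }
    return 0;
}

/* Rearrange a tuple's values from the input layout into the output layout. */
void
toy_convert(const ToyMap *map, const int *in_vals, int *out_vals)
{
    assert(map->natts <= TOY_MAX_ATTRS);
    for (int i = 0; i < map->natts; i++)
        out_vals[i] = in_vals[map->src[i]];
}

/*
 * Models the proposed 'transition_map' argument: with a map, the copy that
 * goes into the transition store is converted to the parent's layout;
 * without one (NULL), the tuple is stored as is.
 */
void
toy_save_for_transition(const ToyMap *transition_map, const int *tuple,
                        int natts, int *store_row)
{
    if (transition_map != NULL)
        toy_convert(transition_map, tuple, store_row);
    else
        memcpy(store_row, tuple, (size_t) natts * sizeof(int));
}
```

In this model the ROW trigger path would keep using the original tuple
untouched; only the copy destined for the transition store goes through
the map.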

The same thing could work for inheritance, if tupconvert.c had a new
kind of conversion that allows slicing of tuples (converting a wider
child table's tuples to the parent's subset of columns) rather than
just conversion between logically equivalent TupleDescs.
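
As a rough illustration of what "slicing" could mean, here is a small
standalone sketch.  It is not tupconvert.c code: SketchDesc, SketchMap,
sketch_build_map and the 'allow_slice' flag are hypothetical names made
up for this example, and tuples are modelled as arrays of int.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_ATTRS 8

/* Hypothetical stand-in for a TupleDesc: an ordered list of column names. */
typedef struct SketchDesc
{
    int         natts;
    const char *names[SKETCH_MAX_ATTRS];
} SketchDesc;

/* src[i] is the child column that supplies parent column i. */
typedef struct SketchMap
{
    int natts;
    int src[SKETCH_MAX_ATTRS];
} SketchMap;

/*
 * Build a child-to-parent map by column name.  With allow_slice == false
 * the two layouts must have the same number of columns (conversion between
 * logically equivalent descriptors).  With allow_slice == true the child
 * may carry extra columns, which are simply dropped.
 */
bool
sketch_build_map(const SketchDesc *child, const SketchDesc *parent,
                 bool allow_slice, SketchMap *map)
{
    if (child->natts > SKETCH_MAX_ATTRS || parent->natts > SKETCH_MAX_ATTRS)
        return false;
    if (!allow_slice && child->natts != parent->natts)
        return false;

    map->natts = parent->natts;
    for (int i = 0; i < parent->natts; i++)
    {
        int found = -1;

        for (int j = 0; j < child->natts; j++)
        {
            if (strcmp(parent->names[i], child->names[j]) == 0)
            {
                found = j;
                break;
            }
        }
        if (found < 0)
            return false;       /* parent column missing in child */
        map->src[i] = found;
    }
    return true;
}

/* Project a child tuple onto the parent's columns. */
void
sketch_slice(const SketchMap *map, const int *child_vals, int *parent_vals)
{
    for (int i = 0; i < map->natts; i++)
        parent_vals[i] = child_vals[map->src[i]];
}
```

With a parent of (id, name) and a child of (id, extra, name), the strict
mode refuses to build a map, while the slicing mode yields a map that
keeps id and name and discards extra.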

To avoid the whiff of duct tape, we'd probably also want to make ROW
triggers created on the partitioned table(s) containing the partition
fire too, with appropriate TupleConversionMap treatment.  Not sure what
exactly is involved there.

On the other hand, doing that with inheritance hierarchies would be an
incompatible behavioural change, which I guess people don't want -- am
I right?

-- 
Thomas Munro
http://www.enterprisedb.com


