Thread: Another way to fix inherited UPDATE/DELETE

Another way to fix inherited UPDATE/DELETE

From
Tom Lane
Date:
While contemplating the wreckage of 
https://commitfest.postgresql.org/22/1778/
I had the beginnings of an idea of another way to fix that problem.

The issue largely arises from the fact that for UPDATE, we expect
the plan tree to emit a tuple that's ready to be stored back into
the target rel ... well, almost, because it also has a CTID or some
other row-identity column, so we have to do some work on it anyway.
But the point is this means we potentially need a different
targetlist for each child table in an inherited UPDATE.

What if we dropped that idea, and instead defined the plan tree as
returning only the columns that are updated by SET, plus the row
identity?  It would then be the ModifyTable node's job to fetch the
original tuple using the row identity (which it must do anyway) and
form the new tuple by combining the updated columns from the plan
output with the non-updated columns from the original tuple.

DELETE would be even simpler, since it only needs the row identity
and nothing else.

Having done that, we could toss inheritance_planner into the oblivion
it so richly deserves, and just treat all types of inheritance or
partitioning queries as expand-at-the-bottom, as SELECT has always
done it.

Arguably, this would be more efficient even for non-inheritance join
situations, as less data (typically) would need to propagate through the
join tree.  I'm not sure exactly how it'd shake out for trivial updates;
we might be paying for two tuple deconstructions not one, though perhaps
there's a way to finesse that.  (One easy way would be to stick to the
old approach when there is no inheritance going on.)

In the case of a standard inheritance or partition tree, this seems to
go through really easily, since all the children could share the same
returned CTID column (I guess you'd also need a TABLEOID column so you
could figure out which table to direct the update back into).  It gets
a bit harder if the tree contains some foreign tables, because they might
have different concepts of row identity, but I'd think in most cases you
could still combine those into a small number of output columns.

I have no idea how this might play with the pluggable-storage work.

Obviously this'd be a major rewrite with no chance of making it into v12,
but it doesn't sound too big to get done during v13.

Thoughts?

            regards, tom lane


Re: Another way to fix inherited UPDATE/DELETE

From
Andres Freund
Date:
Hi,

On 2019-02-19 16:48:55 -0500, Tom Lane wrote:
> I have no idea how this might play with the pluggable-storage work.

I don't think it'd have a meaningful impact, except for needing changes
to an overlapping set of lines. But given the different timeframes, I'd
not expect a problem with that.

Greetings,

Andres Freund


Re: Another way to fix inherited UPDATE/DELETE

From
David Rowley
Date:
On Wed, 20 Feb 2019 at 10:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What if we dropped that idea, and instead defined the plan tree as
> returning only the columns that are updated by SET, plus the row
> identity?  It would then be the ModifyTable node's job to fetch the
> original tuple using the row identity (which it must do anyway) and
> form the new tuple by combining the updated columns from the plan
> output with the non-updated columns from the original tuple.
>
> DELETE would be even simpler, since it only needs the row identity
> and nothing else.

While I didn't look at the patch in great detail, I think this is how
Pavan must have made MERGE work for partitioned targets. I recall
seeing the tableoid being added to the target list and a lookup of the
ResultRelInfo by tableoid.

Maybe Pavan can provide more useful details than I can.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Another way to fix inherited UPDATE/DELETE

From
Amit Langote
Date:
Hi,

On 2019/02/20 6:48, Tom Lane wrote:
> While contemplating the wreckage of 
> https://commitfest.postgresql.org/22/1778/
> I had the beginnings of an idea of another way to fix that problem.
>
> The issue largely arises from the fact that for UPDATE, we expect
> the plan tree to emit a tuple that's ready to be stored back into
> the target rel ... well, almost, because it also has a CTID or some
> other row-identity column, so we have to do some work on it anyway.
> But the point is this means we potentially need a different
> targetlist for each child table in an inherited UPDATE.
> 
> What if we dropped that idea, and instead defined the plan tree as
> returning only the columns that are updated by SET, plus the row
> identity?  It would then be the ModifyTable node's job to fetch the
> original tuple using the row identity (which it must do anyway) and
> form the new tuple by combining the updated columns from the plan
> output with the non-updated columns from the original tuple.
> 
> DELETE would be even simpler, since it only needs the row identity
> and nothing else.

I had bookmarked link to an archived email of yours from about 5 years
ago, in which you described a similar attack plan for UPDATE planning:

https://www.postgresql.org/message-id/1598.1399826841%40sss.pgh.pa.us

It's been kind of in the back of my mind for a while, even considered
implementing it based on your sketch back then, but didn't have solutions
for some issues surrounding optimization of updates of foreign partitions
(see below).  Maybe I should've mentioned that on this thread at some point.

> Having done that, we could toss inheritance_planner into the oblivion
> it so richly deserves, and just treat all types of inheritance or
> partitioning queries as expand-at-the-bottom, as SELECT has always
> done it.
> 
> Arguably, this would be more efficient even for non-inheritance join
> situations, as less data (typically) would need to propagate through the
> join tree.  I'm not sure exactly how it'd shake out for trivial updates;
> we might be paying for two tuple deconstructions not one, though perhaps
> there's a way to finesse that.  (One easy way would be to stick to the
> old approach when there is no inheritance going on.)
> 
> In the case of a standard inheritance or partition tree, this seems to
> go through really easily, since all the children could share the same
> returned CTID column (I guess you'd also need a TABLEOID column so you
> could figure out which table to direct the update back into).  It gets
> a bit harder if the tree contains some foreign tables, because they might
> have different concepts of row identity, but I'd think in most cases you
> could still combine those into a small number of output columns.

Regarding child target relations that are foreign tables, the
expand-target-inheritance-at-the-bottom approach perhaps leaves no way to
allow pushing the update (possibly with joins) to remote side?

-- no inheritance
explain (costs off, verbose) update ffoo f set a = f.a + 1 from fbar b
where f.a = b.a;
                                              QUERY PLAN

──────────────────────────────────────────────────────────────────────────────────────────────────────
 Update on public.ffoo f
   ->  Foreign Update
         Remote SQL: UPDATE public.foo r1 SET a = (r1.a + 1) FROM
public.bar r2 WHERE ((r1.a = r2.a))
(3 rows)

-- inheritance
explain (costs off, verbose) update p set aa = aa + 1 from ffoo f where
p.aa = f.a;
                                                QUERY PLAN

───────────────────────────────────────────────────────────────────────────────────────────────────────────
 Update on public.p
   Update on public.p1
   Update on public.p2
   Foreign Update on public.p3
   ->  Nested Loop
         Output: (p1.aa + 1), p1.ctid, f.*
         ->  Seq Scan on public.p1
               Output: p1.aa, p1.ctid
         ->  Foreign Scan on public.ffoo f
               Output: f.*, f.a
               Remote SQL: SELECT a FROM public.foo WHERE (($1::integer = a))
   ->  Nested Loop
         Output: (p2.aa + 1), p2.ctid, f.*
         ->  Seq Scan on public.p2
               Output: p2.aa, p2.ctid
         ->  Foreign Scan on public.ffoo f
               Output: f.*, f.a
               Remote SQL: SELECT a FROM public.foo WHERE (($1::integer = a))
   ->  Foreign Update
         Remote SQL: UPDATE public.base3 r5 SET aa = (r5.aa + 1) FROM
public.foo r2 WHERE ((r5.aa = r2.a))
(20 rows)

Does that seem salvageable?

Thanks,
Amit



Re: Another way to fix inherited UPDATE/DELETE

From
Amit Langote
Date:
On 2019/02/20 10:55, Amit Langote wrote:
> Maybe I should've mentioned that on this thread at some point.

I meant the other thread where we're discussing my patches.

Thanks,
Amit



Re: Another way to fix inherited UPDATE/DELETE

From
Tom Lane
Date:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2019/02/20 6:48, Tom Lane wrote:
>> What if we dropped that idea, and instead defined the plan tree as
>> returning only the columns that are updated by SET, plus the row
>> identity?  It would then be the ModifyTable node's job to fetch the
>> original tuple using the row identity (which it must do anyway) and
>> form the new tuple by combining the updated columns from the plan
>> output with the non-updated columns from the original tuple.

> Regarding child target relations that are foreign tables, the
> expand-target-inheritance-at-the-bottom approach perhaps leaves no way to
> allow pushing the update (possibly with joins) to remote side?

That's something we'd need to think about.  Obviously, anything
along this line breaks the existing FDW update APIs, but let's assume
that's acceptable.  Is it impossible, or even hard, for an FDW to
support this definition of UPDATE rather than the existing one?
I don't think so --- it seems like it's just different --- but
I might well be missing something.

            regards, tom lane


Re: Another way to fix inherited UPDATE/DELETE

From
Pavan Deolasee
Date:


On Wed, Feb 20, 2019 at 4:23 AM David Rowley <david.rowley@2ndquadrant.com> wrote:
On Wed, 20 Feb 2019 at 10:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What if we dropped that idea, and instead defined the plan tree as
> returning only the columns that are updated by SET, plus the row
> identity?  It would then be the ModifyTable node's job to fetch the
> original tuple using the row identity (which it must do anyway) and
> form the new tuple by combining the updated columns from the plan
> output with the non-updated columns from the original tuple.
>
> DELETE would be even simpler, since it only needs the row identity
> and nothing else.

While I didn't look at the patch in great detail, I think this is how
Pavan must have made MERGE work for partitioned targets. I recall
seeing the tableoid being added to the target list and a lookup of the
ResultRelInfo by tableoid.

Maybe Pavan can provide more useful details than I can.

Yes, that's the approach I took in MERGE, primarily because of the hurdles I faced in handling partitioned tables, which take entirely different route for UPDATE/DELETE vs INSERT and in MERGE we had to do all three together. But the approach also showed significant performance improvements. UPDATE/DELETE via MERGE is far quicker as compared to regular UPDATE/DELETE when there are non-trivial number of partitions. That's also a reason why I recommended doing the same for regular UPDATE/DELETE, but that got lost in the MERGE discussions. So +1 for the approach.

We will need to consider how this affects EvalPlanQual which currently doesn't have to do anything special for partitioned tables. I solved that via tracking the expanded-at-the-bottom child in a separate mergeTargetRelation, but that approach has been criticised. May be Tom's idea doesn't have the same problem or most likely he will have a far better approach to address that. 

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Another way to fix inherited UPDATE/DELETE

From
Tom Lane
Date:
Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> We will need to consider how this affects EvalPlanQual which currently
> doesn't have to do anything special for partitioned tables. I solved that
> via tracking the expanded-at-the-bottom child in a separate
> mergeTargetRelation, but that approach has been criticised. May be Tom's
> idea doesn't have the same problem or most likely he will have a far better
> approach to address that.

I did spend a few seconds thinking about that, and my gut says that
this wouldn't change anything interesting for EPQ.  But the devil
is in the details as always, so maybe working out the patch would
find problems ...

            regards, tom lane


Re: Another way to fix inherited UPDATE/DELETE

From
Amit Langote
Date:
On 2019/02/20 13:54, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2019/02/20 6:48, Tom Lane wrote:
>>> What if we dropped that idea, and instead defined the plan tree as
>>> returning only the columns that are updated by SET, plus the row
>>> identity?  It would then be the ModifyTable node's job to fetch the
>>> original tuple using the row identity (which it must do anyway) and
>>> form the new tuple by combining the updated columns from the plan
>>> output with the non-updated columns from the original tuple.
> 
>> Regarding child target relations that are foreign tables, the
>> expand-target-inheritance-at-the-bottom approach perhaps leaves no way to
>> allow pushing the update (possibly with joins) to remote side?
> 
> That's something we'd need to think about.  Obviously, anything
> along this line breaks the existing FDW update APIs, but let's assume
> that's acceptable.  Is it impossible, or even hard, for an FDW to
> support this definition of UPDATE rather than the existing one?
> I don't think so --- it seems like it's just different --- but
> I might well be missing something.

IIUC, in the new approach, only the root of the inheritance tree (target
table specified in the query) will appear in the query's join tree, not
the child target tables, so pushing updates with joins to the remote side
seems a bit hard, because we're not going to consider child joins.  Maybe
I'm missing something though.

Thanks,
Amit



Re: Another way to fix inherited UPDATE/DELETE

From
Etsuro Fujita
Date:
(2019/02/20 6:48), Tom Lane wrote:
> In the case of a standard inheritance or partition tree, this seems to
> go through really easily, since all the children could share the same
> returned CTID column (I guess you'd also need a TABLEOID column so you
> could figure out which table to direct the update back into).  It gets
> a bit harder if the tree contains some foreign tables, because they might
> have different concepts of row identity, but I'd think in most cases you
> could still combine those into a small number of output columns.

If this is the direction we go in, we should work on the row ID issue 
[1] before this?

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/1590.1542393315%40sss.pgh.pa.us



Re: Another way to fix inherited UPDATE/DELETE

From
Tom Lane
Date:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2019/02/20 13:54, Tom Lane wrote:
>> That's something we'd need to think about.  Obviously, anything
>> along this line breaks the existing FDW update APIs, but let's assume
>> that's acceptable.  Is it impossible, or even hard, for an FDW to
>> support this definition of UPDATE rather than the existing one?
>> I don't think so --- it seems like it's just different --- but
>> I might well be missing something.

> IIUC, in the new approach, only the root of the inheritance tree (target
> table specified in the query) will appear in the query's join tree, not
> the child target tables, so pushing updates with joins to the remote side
> seems a bit hard, because we're not going to consider child joins.  Maybe
> I'm missing something though.

Hm.  Even if that's true (I'm not convinced), I don't think it's such a
significant use-case as to be considered a blocker.

            regards, tom lane


Re: Another way to fix inherited UPDATE/DELETE

From
Tom Lane
Date:
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> writes:
> (2019/02/20 6:48), Tom Lane wrote:
>> In the case of a standard inheritance or partition tree, this seems to
>> go through really easily, since all the children could share the same
>> returned CTID column (I guess you'd also need a TABLEOID column so you
>> could figure out which table to direct the update back into).  It gets
>> a bit harder if the tree contains some foreign tables, because they might
>> have different concepts of row identity, but I'd think in most cases you
>> could still combine those into a small number of output columns.

> If this is the direction we go in, we should work on the row ID issue 
> [1] before this?

Certainly, the more foreign tables can use a standardized concept of row
identity, the better this would work.  What I'm loosely envisioning is
that we have one junk row-identity column for each distinct row-identity
datatype needed --- so, for instance, all ordinary tables could share
a TID column.  Different FDWs might need different things, though one
would hope for no more than one datatype per FDW-type involved in the
inheritance tree.  Where things could break down is if we have a lot
of tables that need a whole-row-variable for row identity, and they're
all different rowtypes --- eventually we'd run out of available columns.
So we'd definitely wish to encourage FDWs to have some more efficient
row-identity scheme than that one.

I don't see that as being something that constrains those two projects
to be done in a particular order; it'd just be a nice-to-have improvement.

            regards, tom lane


Re: Another way to fix inherited UPDATE/DELETE

From
Etsuro Fujita
Date:
(2019/02/21 0:14), Tom Lane wrote:
> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp>  writes:
>> (2019/02/20 6:48), Tom Lane wrote:
>>> In the case of a standard inheritance or partition tree, this seems to
>>> go through really easily, since all the children could share the same
>>> returned CTID column (I guess you'd also need a TABLEOID column so you
>>> could figure out which table to direct the update back into).  It gets
>>> a bit harder if the tree contains some foreign tables, because they might
>>> have different concepts of row identity, but I'd think in most cases you
>>> could still combine those into a small number of output columns.
>
>> If this is the direction we go in, we should work on the row ID issue
>> [1] before this?
>
> Certainly, the more foreign tables can use a standardized concept of row
> identity, the better this would work.  What I'm loosely envisioning is
> that we have one junk row-identity column for each distinct row-identity
> datatype needed --- so, for instance, all ordinary tables could share
> a TID column.  Different FDWs might need different things, though one
> would hope for no more than one datatype per FDW-type involved in the
> inheritance tree.  Where things could break down is if we have a lot
> of tables that need a whole-row-variable for row identity, and they're
> all different rowtypes --- eventually we'd run out of available columns.
> So we'd definitely wish to encourage FDWs to have some more efficient
> row-identity scheme than that one.

Seems reasonable.  I guess that that can address not only the issue [1] 
but our restriction on FDW row locking that APIs for that currently only 
allow TID for row identity, in a uniform way.

> I don't see that as being something that constrains those two projects
> to be done in a particular order; it'd just be a nice-to-have improvement.

OK, thanks for the explanation!

Best regards,
Etsuro Fujita



Re: Another way to fix inherited UPDATE/DELETE

From
Amit Langote
Date:
Hi Tom,

On Wed, Feb 20, 2019 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Obviously this'd be a major rewrite with no chance of making it into v12,
> but it doesn't sound too big to get done during v13.

Are you planning to work on this?

Thanks,
Amit



Re: Another way to fix inherited UPDATE/DELETE

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Wed, Feb 20, 2019 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Obviously this'd be a major rewrite with no chance of making it into v12,
>> but it doesn't sound too big to get done during v13.

> Are you planning to work on this?

It's on my list, but so are a lot of other things.  If you'd like to
work on it, feel free.

            regards, tom lane



Re: Another way to fix inherited UPDATE/DELETE

From
Amit Langote
Date:
On Wed, Jul 3, 2019 at 10:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Wed, Feb 20, 2019 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Obviously this'd be a major rewrite with no chance of making it into v12,
> >> but it doesn't sound too big to get done during v13.
>
> > Are you planning to work on this?
>
> It's on my list, but so are a lot of other things.  If you'd like to
> work on it, feel free.

Thanks for the reply.  Let me see if I can get something done for the
September CF.

Regards,
Amit