Re: [HACKERS] Partitioned tables and relfilenode - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partitioned tables and relfilenode
Date
Msg-id CA+Tgmobxn+oJisQTMKK9hcO5NorSPw=WxAqPhoGBrCg8HcMpKg@mail.gmail.com
In response to Re: [HACKERS] Partitioned tables and relfilenode  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Tue, Mar 21, 2017 at 1:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The decision not to require the attribute numbers to match doesn't
>> necessarily mean we can't get rid of the Append node, though.  First
>> of all, in a lot of practical cases the attribute numbers will all
>> match.  Second, if they don't, the most that would be required is a
>> projection step, which could usually be done without a separate node
>> because most nodes are projection-capable.  And maybe not even that
>> much is needed; I'd have to go back and look at what Tom was worried
>> about the last time this came up.  (Hmm, maybe the problem had to do
>> with varnos matching, rather than attribute numbers?)
>
> There used to be some code there to fix them up, not sure where that went.

Me neither.  To be clear in case I haven't been already, I'm totally
fine with somebody doing the work to get rid of the Append node; I
just think it'll take some investigation and work that hasn't been
done yet.

(I'm also a little skeptical about the value of the work.  The Append
node doesn't cost much; what's expensive is that the planner isn't
smart about planning queries that involve Append nodes and so getting
rid of one can improve the whole plan shape.  But I think the answer
to that problem is optimizations like partition-wise join and
partition-wise aggregate, which can handle cases where an Append has
any number of surviving children.  Eliminating the Append only helps
when the number of surviving children is exactly one.  Now, that's not
to say I'm going to fight a patch if somebody writes one, but I think
to some extent it's just a band-aid.)
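
A toy sketch of the "exactly one surviving child" condition, in Python. This is not PostgreSQL planner code; `Scan`, `Append`, and `build_scan_path` are hypothetical names invented for illustration, and the model ignores everything else the planner does.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Scan:
    """Hypothetical stand-in for a scan of one partition."""
    partition: str


@dataclass
class Append:
    """Hypothetical stand-in for an Append over several child scans."""
    children: List[Scan]


def build_scan_path(surviving: List[str]) -> Union[Scan, Append]:
    """Elide the Append only when exactly one partition survives pruning."""
    scans = [Scan(name) for name in surviving]
    if len(scans) == 1:
        # The only case where dropping the Append is possible.
        return scans[0]
    # Any other number of survivors: this toy model keeps the Append.
    return Append(scans)
```

With two or more surviving partitions the Append stays in place, which is the case that partition-wise join and partition-wise aggregate are meant to address.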

>> Another and independent problem with eliding the Append node is that,
>> if we did that, we'd still have to guarantee that the parent relation
>> corresponding to the Append node got locked somehow.  Otherwise, we'll
>> be accessing the tuple routing information for a table on which we
>> don't have a lock.  That's probably a solvable problem, too, but it
>> hasn't been solved yet.
>
> Hmm, why would we need to access tuple routing information?

I didn't state that very well.  It's not so much that we need access
to the tuple routing information as that we need to replan if it
changes, because if the tuple routing information changes then we
might need to include partitions that were previously being pruned.
If we haven't got some kind of a lock on the parent, I'm pretty sure
that's not going to work reliably.  Synchronization of invalidation
traffic relies on DDL statements holding a lock that conflicts with
the lock held by the process using the table; if there is no such
lock, we might fail to notice that we need to replan.
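
A rough illustration of why a change in the routing information forces a replan, again as a toy Python model rather than anything taken from PostgreSQL. The range bounds, the version counter, and the names `prune` and `CachedPlan` are all assumptions made up for this sketch; the version check merely stands in for noticing an invalidation, and the locking that makes that reliable is not modeled.

```python
from typing import Dict, List, Tuple

# Hypothetical partition bounds: name -> [low, high) range on the partition key.
Bounds = Dict[str, Tuple[int, int]]


def prune(bounds: Bounds, key: int) -> List[str]:
    """Return the partitions whose range could contain `key`."""
    return sorted(name for name, (lo, hi) in bounds.items() if lo <= key < hi)


class CachedPlan:
    """Toy cached plan that remembers which bounds version it was built from."""

    def __init__(self, bounds: Bounds, version: int, key: int) -> None:
        self.version = version
        self.partitions = prune(bounds, key)

    def is_stale(self, current_version: int) -> bool:
        # Stands in for noticing an invalidation message; the real mechanism
        # relies on lock conflicts with DDL, which this sketch does not model.
        return self.version != current_version
```

For example, with bounds `{"p1": (0, 10), "p2": (10, 20)}` a plan for key 15 scans only `p2`. If the bounds later become `{"p1": (0, 20), "p2": (20, 30)}`, the same key now lives in `p1`, a partition the old plan had pruned, so a cached plan that never notices the change would return wrong results.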

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


