Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CA+TgmoaiysOESNWQQAseUP3sBijrmfEd_ZKABBCHH4XrV7P-vA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Rafia Sabih <rafia.sabih@enterprisedb.com>)
List pgsql-hackers
On Fri, Mar 17, 2017 at 8:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> While I was studying what you did with reparameterize_path_by_child(),
> I started to wonder whether reparameterize_path() doesn't need to
> start handling join paths.  I think it only handles scan paths right
> now because that's the only thing that can appear under an appendrel
> created by inheritance expansion, but you're changing that.  Maybe
> it's not critical -- I think the worst consequences of missing some
> handling there is that we won't consider a parameterized path in some
> case where it would be advantageous to do so.  Still, you might want
> to investigate a bit.

I spent a fair amount of time this weekend musing over
reparameterize_path_by_child().  I think a key question for this patch
- as you already pointed out - is whether we're happy with that
approach.  When we discover that we want to perform a partitionwise
parameterized nestloop, and therefore that we need the paths for each
inner appendrel to get their input values from the corresponding outer
appendrel members rather than from the outer parent, we've got two
choices.  The first is to do what the patch actually does, which is to
build a new path tree for the nestloop inner path parameterized by the
appropriate childrel.  The second is to use the existing paths, which
are parameterized by the parent rel, and then somehow allow make that
work.  For example, you can imagine that create_plan_recurse() could
pass down a list of parameterized nestloops above the current point in
the path tree, and a parent-child mapping for each, and then we could
try to substitute everything while actually generating the plan
instead of creating paths sooner.  Which is better?

It would be nice to hear opinions from anyone else who cares, but
after some thought I think the approach you've picked is probably
better, because it's more like what we do already.  We have existing
precedent for reparameterizing a path, but none for allowing a Var for
one relation (the parent) to in effect refer to another relation (the
child).

That having been said, having try_nestloop_path() perform the
reparameterization at the very top of the function seems quite
undesirable.  You're creating a new path there before you know whether
it's going to be rejected by the invalid-parameterization test and
also before you know whether initial_cost_nestloop is going to reject
it.  It would be much better if you could find a way to postpone the
reparameterization until after those steps, and only do it if you're
going to try add_path().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Next
From: Jim Nasby
Date:
Subject: Re: [HACKERS] Re: Improve OR conditions on joined columns (commonstar schema problem)