Re: Declarative partitioning - another take - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Declarative partitioning - another take
Msg-id CAFjFpReiJS2gw6z4mS6K5Aero3fFpCZHXcypExYouLPidQsgcw@mail.gmail.com
In response to Re: Declarative partitioning - another take  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers



>
> I don't think you need to do anything in the path creation code for this.
> As is it flattens all AppendPath hierarchies whether for partitioning or
> inheritance or subqueries. We should leave it as it is.

I thought it would be convenient for pairwise join code to work with the
hierarchy intact even within the AppendPath tree.  If it turns out to be
so, maybe that patch can take care of it.

Partition-wise join works with RelOptInfos, so it's fine if the AppendPath hierarchy is flattened out. We need the RelOptInfo hierarchy, though.
 

>> I think I can manage to squeeze in (a) in the next version patch and will
>> also start working on (b), mainly the part about RelOptInfo getting some
>> partitioning info.
>
> I am fine with b, where you would include some partitioning information in
> RelOptInfo. But you don't need to do what you said in (b) above.
>
> In a private conversation Robert Haas suggested a way slightly different
> than what my patch for partition-wise join does. He suggested that the
> partitioning schemes, i.e. strategy, number of partitions and bounds of the
> partitioned relations involved in the query, should be stored in PlannerInfo
> in the form of a list. Each partitioning scheme is annotated with the
> relids of the partitioned relations. RelOptInfo of the partitioned relation
> will point to the partitioning scheme in PlannerInfo. Along with that, each
> RelOptInfo will need to store partition keys for corresponding relation.
> This simplifies matching the partitioning schemes of the joining relations.
> Also it reduces the number of copies of partition bounds floating around as
> we expect that a query will involve multiple partitioned tables following
> similar partitioning schemes. Maybe you want to consider this idea while
> working on (b).

So IIUC, a partitioned relation's (baserel or joinrel) RelOptInfo has only
the information about partition keys.  They will be matched with query
restriction quals pruning away any unneeded partitions which happens
individually for each such parent baserel (within set_append_rel_size() I
suppose).  Further, two joining relations are eligible to be considered
for pairwise joining if they have identical partition keys and query
equi-join quals match the same.  The resulting joinrel will have the same
partition key (as either joining relation) and will have as many
partitions as there are in the intersection of sets of partitions of
joining rels (intersection proceeds by matching partition bounds).

"Partition scheme" structs go into a PlannerInfo list member, one
corresponding to each partitioned relation - baserel or joinrel, right?

Multiple relations (base or join) can share a partition scheme if they are partitioned the same way. Each partition scheme also stores the relids of the base relations partitioned by that scheme.
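To make the sharing concrete, here is a rough sketch of a find-or-add lookup over the list of partition schemes. It is only an illustration, not code from any patch: PartSchemeSketch, PlannerSketch, find_or_add_scheme() and the integer bounds are all hypothetical stand-ins for whatever PlannerInfo and RelOptInfo would actually carry, and relids are modelled as a plain bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SCHEMES 16          /* arbitrary cap for this sketch */

/* Hypothetical partition scheme: strategy, partition count, bounds, users. */
typedef struct PartSchemeSketch
{
    char        strategy;       /* e.g. 'r' for range, 'l' for list */
    int         nparts;         /* number of partitions */
    const int  *bounds;         /* nparts bounds, simplified to ints; the
                                 * caller keeps this array alive */
    unsigned    relids;         /* bitmask of base relids using this scheme */
} PartSchemeSketch;

/* Hypothetical stand-in for the list kept in PlannerInfo. */
typedef struct PlannerSketch
{
    PartSchemeSketch schemes[MAX_SCHEMES];
    int         nschemes;
} PlannerSketch;

/* Two schemes match when strategy, partition count and bounds all agree. */
static bool
scheme_matches(const PartSchemeSketch *s, char strategy, int nparts,
               const int *bounds)
{
    return s->strategy == strategy &&
           s->nparts == nparts &&
           memcmp(s->bounds, bounds, sizeof(int) * (size_t) nparts) == 0;
}

/*
 * Return the scheme shared by every relation partitioned this way, adding a
 * new entry only when no existing one matches. Returns NULL if the table is
 * full. The relation's RelOptInfo would keep the returned pointer.
 */
static PartSchemeSketch *
find_or_add_scheme(PlannerSketch *root, unsigned relid, char strategy,
                   int nparts, const int *bounds)
{
    assert(relid < 32 && nparts > 0 && bounds != NULL);

    for (int i = 0; i < root->nschemes; i++)
    {
        if (scheme_matches(&root->schemes[i], strategy, nparts, bounds))
        {
            root->schemes[i].relids |= 1u << relid;     /* annotate user */
            return &root->schemes[i];
        }
    }

    if (root->nschemes >= MAX_SCHEMES)
        return NULL;

    PartSchemeSketch *s = &root->schemes[root->nschemes++];

    s->strategy = strategy;
    s->nparts = nparts;
    s->bounds = bounds;
    s->relids = 1u << relid;
    return s;
}
```

With this shape, checking whether two relations are partitioned the same way reduces to comparing the scheme pointers, and only one copy of the bounds exists per distinct scheme.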
 
As you say, each such struct has the following pieces of information:
strategy, num_partitions, bounds (and other auxiliary info).  Before
make_one_rel() starts, the list has one for each partitioned baserel.
After make_one_rel() has formed baserel pathlists and before
make_rel_from_joinlist() is called, are the partition scheme structs of
processed baserels marked with some information about the pruning activity
that occurred so far?

Right now pruned partitions are labelled as dummy rels (empty append paths). That's enough to detect a pruned partition. I haven't found a need to label the partitioning scheme with pruned partitions for partition-wise join.
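As a rough illustration of why the dummy-rel marking suffices, the sketch below decides, partition by partition, which child joins an inner join still needs. Everything here is hypothetical: ChildRelSketch stands in for a child RelOptInfo, an empty append path is modelled as nsubpaths == 0, and only the inner-join case is covered (outer joins would need different handling).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a child relation's RelOptInfo. */
typedef struct ChildRelSketch
{
    int         nsubpaths;      /* 0 models an empty append path (dummy rel) */
} ChildRelSketch;

/* A partition counts as pruned when its rel is dummy. */
static bool
child_is_pruned(const ChildRelSketch *rel)
{
    return rel->nsubpaths == 0;
}

/*
 * For an inner join between two relations with matching partitions, flag
 * the partition pairs that still need a child join and return how many
 * there are. A pair is skipped when either side was pruned, because an
 * inner join with an empty input is itself empty.
 */
static size_t
plan_child_inner_joins(const ChildRelSketch *outer,
                       const ChildRelSketch *inner,
                       size_t nparts, bool *needed)
{
    size_t      count = 0;

    assert(outer != NULL && inner != NULL && needed != NULL);

    for (size_t i = 0; i < nparts; i++)
    {
        needed[i] = !child_is_pruned(&outer[i]) &&
                    !child_is_pruned(&inner[i]);
        if (needed[i])
            count++;
    }
    return count;
}
```

No per-scheme bookkeeping is consulted here; the state of each child rel alone tells the join code what was pruned.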
 
Then as we build successively higher levels of
joinrels, new entries will be made for those joinrels for which we added
pairwise join paths, with relids matching the corresponding joinrels.
Does that make sense?


I don't think we will make any new partition scheme entries in PlannerInfo after all the base relations have been considered. Partition-wise join will pick the one suitable for the given join. But in case partition-wise join needs to make new entries, I will take care of that in my patch.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
