Re: Declarative partitioning - another take - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Declarative partitioning - another take
Msg-id eba002a2-7bda-c40d-75b2-811e571d456c@lab.ntt.co.jp
In response to Re: Declarative partitioning - another take  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Declarative partitioning - another take
List pgsql-hackers
On 2016/08/25 16:15, Ashutosh Bapat wrote:
> On Thu, Aug 25, 2016 at 12:22 PM, Amit Langote wrote:
>> b)
>> when accumulating append subpaths, do not flatten a subpath that is itself
>> an append when ((AppendPath *) subpath)->path.parent is a RelOptInfo with
>> non-NULL partitioning info.  Is the latter somehow necessary for
>> pairwise-join considerations?
> 
> I don't think you need to do anything in the path creation code for this.
> As is, it flattens all AppendPath hierarchies, whether for partitioning or
> inheritance or subqueries. We should leave it as it is.

I thought it would be convenient for pairwise join code to work with the
hierarchy intact even within the AppendPath tree.  If it turns out to be
so, maybe that patch can take care of it.
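
To make the idea in (b) concrete, here is a minimal, self-contained sketch.
It is not planner code: SketchRel, SketchPath and
sketch_accumulate_subpath() are hypothetical stand-ins for RelOptInfo,
AppendPath and the subpath accumulation step, and has_part_info stands for
"non-NULL partitioning info".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RelOptInfo; not PostgreSQL's definition. */
typedef struct SketchRel
{
    bool        has_part_info;  /* assumption: partitioning info is set */
} SketchRel;

/* Hypothetical stand-in for a Path that may be an AppendPath. */
typedef struct SketchPath
{
    bool        is_append;      /* true if this path is an append */
    SketchRel  *parent;         /* relation this path belongs to */
    struct SketchPath **subpaths;   /* children, used only for appends */
    int         nsubpaths;
} SketchPath;

/*
 * Add 'path' to out[] and return the new element count.  An append is
 * flattened into its children unless its parent relation carries
 * partitioning info, in which case it is kept as a single entry.
 */
int
sketch_accumulate_subpath(SketchPath *path, SketchPath **out,
                          int nout, int maxout)
{
    bool        partitioned = (path->parent != NULL &&
                               path->parent->has_part_info);

    if (path->is_append && !partitioned)
    {
        for (int i = 0; i < path->nsubpaths; i++)
            nout = sketch_accumulate_subpath(path->subpaths[i], out,
                                             nout, maxout);
        return nout;
    }

    assert(nout < maxout);
    out[nout++] = path;
    return nout;
}
```

With this rule, an inheritance or subquery append still collapses into its
children, while an append over a partitioned parent survives as one node
that pairwise-join code could later inspect.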

>> I think I can manage to squeeze (a) into the next version of the patch and will
>> also start working on (b), mainly the part about RelOptInfo getting some
>> partitioning info.
> 
> I am fine with b, where you would include some partitioning information in
> RelOptInfo. But you don't need to do what you said in (b) above.
> 
> In a private conversation Robert Haas suggested a way slightly different
> than what my patch for partition-wise join does. He suggested that the
> partitioning schemes, i.e. strategy, number of partitions and bounds of the
> partitioned relations involved in the query, should be stored in PlannerInfo
> in the form of a list. Each partitioning scheme is annotated with the
> relids of the partitioned relations. RelOptInfo of the partitioned relation
> will point to the partitioning scheme in PlannerInfo. Along with that, each
> RelOptInfo will need to store partition keys for corresponding relation.
> This simplifies matching the partitioning schemes of the joining relations.
> Also it reduces the number of copies of partition bounds floating around as
> we expect that a query will involve multiple partitioned tables following
> similar partitioning schemes. Maybe you want to consider this idea while
> working on (b).

So IIUC, a partitioned relation's (baserel or joinrel) RelOptInfo has only
the information about partition keys.  They will be matched against the
query's restriction quals to prune away any unneeded partitions, which
happens individually for each such parent baserel (within
set_append_rel_size(), I suppose).  Further, two joining relations are
eligible to be considered
for pairwise joining if they have identical partition keys and query
equi-join quals match the same.  The resulting joinrel will have the same
partition key (as either joining relation) and will have as many
partitions as there are in the intersection of sets of partitions of
joining rels (intersection proceeds by matching partition bounds).
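
As a rough illustration of the matching I have in mind, the sketch below
checks key compatibility and then intersects two partition sets by bounds.
Everything in it is an assumption made for the example: SketchBound,
SketchPartRel, the integer [lower, upper) range bounds, the key_types
array, and the equijoin_on_key flags that a caller would derive from the
join quals.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_KEYS 4

/* Hypothetical range bound covering [lower, upper). */
typedef struct SketchBound
{
    int         lower;
    int         upper;
} SketchBound;

/* Hypothetical per-relation partitioning summary. */
typedef struct SketchPartRel
{
    int         nkeys;                      /* number of key columns */
    int         key_types[SKETCH_MAX_KEYS]; /* type id per key column */
    int         nparts;                     /* number of partitions */
    const SketchBound *bounds;              /* sorted by (lower, upper) */
} SketchPartRel;

/*
 * Two rels qualify for a pairwise join when their keys have the same shape
 * and every key column is covered by an equi-join qual.
 */
bool
sketch_keys_match(const SketchPartRel *a, const SketchPartRel *b,
                  const bool *equijoin_on_key)
{
    if (a->nkeys != b->nkeys)
        return false;
    for (int i = 0; i < a->nkeys; i++)
    {
        if (a->key_types[i] != b->key_types[i] || !equijoin_on_key[i])
            return false;
    }
    return true;
}

/*
 * Merge-walk both sorted bound arrays and emit the bounds present in both.
 * Returns the number of partitions of the would-be joinrel.
 */
int
sketch_intersect_bounds(const SketchPartRel *a, const SketchPartRel *b,
                        SketchBound *out, int maxout)
{
    int         i = 0;
    int         j = 0;
    int         n = 0;

    while (i < a->nparts && j < b->nparts)
    {
        const SketchBound *x = &a->bounds[i];
        const SketchBound *y = &b->bounds[j];

        if (x->lower == y->lower && x->upper == y->upper)
        {
            assert(n < maxout);
            out[n++] = *x;
            i++;
            j++;
        }
        else if (x->lower < y->lower ||
                 (x->lower == y->lower && x->upper < y->upper))
            i++;                /* x sorts first, so it has no match in b */
        else
            j++;                /* y sorts first, so it has no match in a */
    }
    return n;
}
```

Only exactly matching bounds count as a match here; partitions that merely
overlap are dropped, which is one possible reading of "matching partition
bounds" and not necessarily the right one.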

"Partition scheme" structs go into a PlannerInfo list member, one
corresponding to each partitioned relation - baserel or joinrel, right?
As you say, each such struct has the following pieces of information:
strategy, num_partitions, bounds (and other auxiliary info).  Before
make_one_rel() starts, the list has one for each partitioned baserel.
After make_one_rel() has formed baserel pathlists and before
make_rel_from_joinlist() is called, are the partition scheme structs of
processed baserels marked with some information about the pruning activity
that occurred so far?  Then as we build successively higher levels of
joinrels, new entries will be made for those joinrels for which we added
pairwise join paths, with relids matching the corresponding joinrels.
Does that make sense?
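
For reference, here is how I picture the shared-scheme list from the quoted
suggestion, as a small standalone sketch.  The names SketchScheme,
SketchPlannerInfo and sketch_get_scheme() are made up for illustration, the
fixed-size arrays and the 32-bit relids bitmask are simplifications, and
bounds are reduced to one integer upper bound per partition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_SCHEMES 8
#define SKETCH_MAX_PARTS   16

/* Hypothetical partitioning scheme: strategy, partition count, bounds. */
typedef struct SketchScheme
{
    char        strategy;       /* e.g. 'r' for range, 'l' for list */
    int         nparts;
    int         bounds[SKETCH_MAX_PARTS];   /* one upper bound each */
    uint32_t    relids;         /* bitmask of rels using this scheme */
} SketchScheme;

/* Hypothetical stand-in for the PlannerInfo list member. */
typedef struct SketchPlannerInfo
{
    int         nschemes;
    SketchScheme schemes[SKETCH_MAX_SCHEMES];
} SketchPlannerInfo;

/*
 * Return the scheme matching (strategy, nparts, bounds), creating it if no
 * equal scheme exists yet, and record that 'relid' uses it.
 */
SketchScheme *
sketch_get_scheme(SketchPlannerInfo *root, char strategy, int nparts,
                  const int *bounds, int relid)
{
    SketchScheme *s;

    assert(nparts >= 0 && nparts <= SKETCH_MAX_PARTS);
    assert(relid >= 0 && relid < 32);

    for (int i = 0; i < root->nschemes; i++)
    {
        s = &root->schemes[i];
        if (s->strategy == strategy && s->nparts == nparts &&
            memcmp(s->bounds, bounds, (size_t) nparts * sizeof(int)) == 0)
        {
            s->relids |= UINT32_C(1) << relid;
            return s;
        }
    }

    assert(root->nschemes < SKETCH_MAX_SCHEMES);
    s = &root->schemes[root->nschemes++];
    s->strategy = strategy;
    s->nparts = nparts;
    memcpy(s->bounds, bounds, (size_t) nparts * sizeof(int));
    s->relids = UINT32_C(1) << relid;
    return s;
}
```

In this form two rels have matching schemes exactly when they hold the same
pointer, and only one copy of the bounds is kept per distinct scheme.  A
joinrel would be registered the same way once pairwise join paths have been
added for it.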

Thanks,
Amit




