Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
Date
Msg-id 31895.1504269332@localhost
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
List pgsql-hackers
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

> I originally thought to provide it along-with the changes to
> expand_inherited_rtentry(), but that thread is taking longer. Jeevan
> Chalke needs rebased patches for his work on aggregate pushdown and
> Thomas might need them for further review. So, here they are.

Since I have related patch in the current commitfest
(https://commitfest.postgresql.org/14/1247/), I spent some time reviewing your
patch:

* generate_partition_wise_join_paths()

Right parenthesis is missing in the prologue.


* get_partitioned_child_rels_for_join()

I think the Assert() statement is easier to understand inside the loop, see
the assert.diff attachment.


* have_partkey_equi_join()

As the function handles generic join, this comment doesn't seem to me
relevant:

    /*
     * The equi-join between partition keys is strict if equi-join between
     * at least one partition key is using a strict operator. See
     * explanation about outer join reordering identity 3 in
     * optimizer/README
     */
    strict_op = op_strict(opexpr->opno);

And I think the function can return true even if strict_op is false for all
the operators evaluated in the loop.


* match_expr_to_partition_keys()

I'm not sure this comment is clear enough:

    /*
     * If it's a strict equi-join a NULL partition key on one side will
     * not join a NULL partition key on the other side. So, rows with NULL
     * partition key from a partition on one side can not join with those
     * from a non-matching partition on the other side. So, search the
     * nullable partition keys as well.
     */
    if (!strict_op)
        continue;

My understanding of the problem of NULL values generated by outer join is:
these NULL values --- if evaluated by non-strict expression --- can make row
of N-th partition on one side of the join match row(s) of *other than* N-th
partition(s) on the other side. Thus the nullable input expressions may only
be evaluated by strict operators. I think it'd be clearer if you stressed that
(undesired) *match* of partition keys can be a problem, rather than mismatch.

If you insist on your wording, then I think you should at least move the
comment below to the part that only deals with strict operators.


* There are several places where lfirst_node() macro should be used. For
  example

rel = lfirst_node(RelOptInfo, lc);

instead of

rel = (RelOptInfo *) lfirst(lc);


* map_and_merge_partitions()

Besides a few changes proposed in map_and_merge_partitions.diff (a few of them
to suppress compiler warnings) I think that this part needs more thought:

    {
            Assert(mergemap1[index1] != mergemap2[index2] &&
                       mergemap1[index1] >= 0 && mergemap2[index2] >= 0);

            /*
             * Both the partitions map to different merged partitions. This
             * means that multiple partitions from one relation matches to one
             * partition from the other relation. Partition-wise join does not
             * handle this case right now, since it requires ganging multiple
             * partitions together (into one RelOptInfo).
             */
            merged_index = -1;
    }

I could hit this path with the following test:

CREATE TABLE a(i int) PARTITION BY LIST(i);
CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2);
CREATE TABLE b(j int) PARTITION BY LIST(j);
CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2);

SET enable_partition_wise_join TO on;

SELECT  *
FROM    a
        FULL JOIN
        b ON i = j;

I don't think there's a reason not to join a_0 partition to b_0, is there?

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Aleksander Alekseev
Date:
Subject: Re: [HACKERS] [PATCH] Off-by-one error in logical slot resource retention
Next
From: Aleksander Alekseev
Date:
Subject: Re: [HACKERS] [PATCH] Improve geometric types