Thread: [HACKERS] PoC: full merge join on comparison clause

[HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

12 May 2017, 14:09:32

Hi hackers,

As you know, at this time Postgres cannot perform a full join on a 
comparison clause. For example, if we have two tables with numeric 
columns and run a query like 'select * from t1 full join t2 on t1.a > 
t2.a', we get an error: "FULL JOIN is only supported with merge-joinable 
or hash-joinable join conditions". Such queries are legitimate SQL and 
sometimes arise when porting applications from different DBMS, so it 
would be good to support them in Postgres. They can be rewritten as 
union of right and left joins, but it requires manual work and runs 
somewhat slower (see the benchmark at the end of the letter). This 
proof-of-concept patch explores the possibility of performing such 
queries as merge joins.

Consider the following example where outer and inner relations are in 
ascending order, and we are asked to return outer tuples that are 
greater than inner.
                 outer > inner
    outer tuple -  6       4  - marked tuple
                   7       5
                   8       6  - inner tuple
                   8       7

The main difference from normal merge join is that we do not need to 
advance the marked tuple. This behavior can be implemented with some 
simple changes to the function that compares inner and outer tuples. 
However, for the join clause 'outer < inner' we would have to advance 
the marked tuple, which would require adding a new state to the merge 
join executor node. We do not do this. Instead, at the path creation 
stage, we make sure that the particular combination of sorting order and 
join clause allows us to perform merge join the simple way.

The optimizer requires some other changes to support these joins. 
Currently, it uses the same clauses to generate equivalence classes and 
to perform merge joins. This patch has to separate these two uses. 
Clauses that correspond to a btree equality operator are used to 
construct equivalence classes; the operator families for these clauses 
are recorded in the 'equivopfamilies' field of RestrictInfo struct. 
Clauses that correspond to btree equality or comparison are used to 
perform merge joins, and have their operator families recorded in the 
'mergeopfamilies'.

The optimizer also has to check whether the particular join clause list 
can be used for merge join, and ensure that it is compatible with 
inner/outer path ordering. These checks are performed by 
'can_sort_for_mergejoin()' and 'outer_sort_suitable_for_mergejoin()'.

There is an important unsolved problem in this patch. When generating 
index paths for base relations, the optimizer tries to use only one scan 
direction to limit the number of paths. This direction might not be 
suitable for a given join clause, and the input path will have to be 
sorted. We could generate paths for both directions, but this was 
specifically removed for optimization (SHA 834ddc62 by Tom Lane, 
10/27/2007 09:45 AM).

For inner joins one would expect the merge join to be slower than the 
nested loop, because it has more complex code paths, and indeed this can 
be seen on simple benchmarks (see the end of the letter). Costs should 
be revised further to reflect this difference.

I would be glad to hear your opinion on this approach.

Some benchmarks:

===== Full join vs union of left and right joins 
========================================

test1=# explain analyze select * from t4 right join t1 on t4.a < t1.a 
union all select * from t4 left join t1 on t4.a < t1.a where t1.a is null;
                                                                QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
  Append  (cost=809.69..70703.19 rows=3340000 width=8) (actual 
time=8.336..1195.534 rows=5007546 loops=1)
    ->  Merge Left Join  (cost=809.69..34230.49 rows=3333333 width=8) 
(actual time=8.335..920.442 rows=5007537 loops=1)
          Merge Cond: (t1.a > t4.a)
          ->  Index Only Scan using idx_t1_a on t1 (cost=0.28..35.27 
rows=1000 width=4) (actual time=0.027..0.395 rows=1001 loops=1)
                Heap Fetches: 97
          ->  Sort  (cost=809.39..834.39 rows=10000 width=4) (actual 
time=8.300..356.821 rows=5007538 loops=1)
                Sort Key: t4.a
                Sort Method: quicksort  Memory: 931kB
                ->  Seq Scan on t4  (cost=0.00..145.00 rows=10000 
width=4) (actual time=0.019..2.533 rows=10000 loops=1)
    ->  Nested Loop Anti Join  (cost=0.28..3072.71 rows=6667 width=8) 
(actual time=4.685..35.421 rows=9 loops=1)
          ->  Seq Scan on t4 t4_1  (cost=0.00..145.00 rows=10000 
width=4) (actual time=0.010..0.656 rows=10000 loops=1)
          ->  Index Only Scan using idx_t1_a on t1 t1_1 (cost=0.28..6.10 
rows=333 width=4) (actual time=0.003..0.003 rows=1 loops=10000)
                Index Cond: (a > t4_1.a)
                Heap Fetches: 971
  Planning time: 1.414 ms
  Execution time: 1324.985 ms
(16 rows)

test1=# explain analyze select * from t4 full join t1 on t4.a < t1.a;
                                                           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
  Merge Full Join  (cost=809.66..34230.49 rows=3333333 width=8) (actual 
time=8.351..914.590 rows=5007546 loops=1)
    Merge Cond: (t1.a > t4.a)
    ->  Index Only Scan using idx_t1_a on t1  (cost=0.28..35.27 
rows=1000 width=4) (actual time=0.035..0.368 rows=1001 loops=1)
          Heap Fetches: 97
    ->  Sort  (cost=809.39..834.39 rows=10000 width=4) (actual 
time=8.309..347.851 rows=5007546 loops=1)
          Sort Key: t4.a
          Sort Method: quicksort  Memory: 931kB
          ->  Seq Scan on t4  (cost=0.00..145.00 rows=10000 width=4) 
(actual time=0.020..2.563 rows=10000 loops=1)
  Planning time: 1.083 ms
  Execution time: 1044.869 ms
(10 rows)


=== Merge vs nested loop ===========================================

test1=# explain analyze select * from t5 join t1 on t5.a <= t1.a;
                                                            QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.28..944713.00 rows=33333333 width=8) (actual 
time=0.055..8718.840 rows=50014145 loops=1)
    ->  Seq Scan on t5  (cost=0.00..1443.00 rows=100000 width=4) (actual 
time=0.019..6.428 rows=100000 loops=1)
    ->  Index Only Scan using idx_t1_a on t1  (cost=0.28..6.10 rows=333 
width=4) (actual time=0.003..0.050 rows=500 loops=100000)
          Index Cond: (a >= t5.a)
          Heap Fetches: 9147995
  Planning time: 2.209 ms
  Execution time: 9942.176 ms
(7 rows)

test1=# set enable_mergejoin TO on;
SET
test1=# explain analyze select * from t5 join t1 on t5.a <= t1.a;
                                                           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=9748.54..343618.88 rows=33333333 width=8) (actual 
time=35.491..9281.482 rows=50014145 loops=1)
    Merge Cond: (t1.a >= t5.a)
    ->  Index Only Scan using idx_t1_a on t1  (cost=0.28..35.27 
rows=1000 width=4) (actual time=0.027..0.769 rows=1001 loops=1)
          Heap Fetches: 97
    ->  Sort  (cost=9747.82..9997.82 rows=100000 width=4) (actual 
time=35.458..3906.652 rows=50014145 loops=1)
          Sort Key: t5.a
          Sort Method: quicksort  Memory: 8541kB
          ->  Seq Scan on t5  (cost=0.00..1443.00 rows=100000 width=4) 
(actual time=0.013..8.570 rows=100000 loops=1)
  Planning time: 2.368 ms
  Execution time: 10530.356 ms
(10 rows)

-- 
Alexander Kuzmenkov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

full-merge-join-v1.patch

Re: [HACKERS] PoC: full merge join on comparison clause

From

Robert Haas

Date:

16 May 2017, 18:57:14

On Fri, May 12, 2017 at 7:09 AM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> As you know, at this time Postgres cannot perform a full join on a
> comparison clause. For example, if we have two tables with numeric columns
> and run a query like 'select * from t1 full join t2 on t1.a > t2.a', we get
> an error: "FULL JOIN is only supported with merge-joinable or hash-joinable
> join conditions". Such queries are legitimate SQL and sometimes arise when
> porting applications from different DBMS, so it would be good to support
> them in Postgres. They can be rewritten as union of right and left joins,
> but it requires manual work and runs somewhat slower (see the benchmark at
> the end of the letter). This proof-of-concept patch explores the possibility
> of performing such queries as merge joins.

Interesting.  I suggest adding this to the next CommitFest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

16 May 2017, 19:18:56

On 16.05.2017 18:57, Robert Haas wrote:
> Interesting. I suggest adding this to the next CommitFest.
Thank you, added: https://commitfest.postgresql.org/14/1141/

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

25 August 2017, 19:41:25

Here is a new version of the patch, rebased to 749c7c41 and with some 
cosmetic changes.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

full-merge-join-v2.patch

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

19 September 2017, 16:18:04

Hi Alexander,

On Fri, Aug 25, 2017 at 10:11 PM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> Here is a new version of the patch, rebased to 749c7c41 and with some
> cosmetic changes.
>

I looked at this patch briefly. This is a useful feature. This isn't a
design level review of the patch. I may get back to that later. But
here are some assorted comments

The patch applies cleanly, but there are some whitespace errors.
src/backend/executor/nodeMergejoin.c:231: trailing whitespace.
+               /*
src/backend/executor/nodeMergejoin.c:232: trailing whitespace.
+                * Determine whether we accept lesser and/or equal tuples
src/backend/optimizer/path/joinpath.c:499: trailing whitespace.
+ * a merge join. A merge join executor can only choose inner values that are
src/backend/optimizer/path/joinpath.c:501: trailing whitespace.
+ * have at most one non-equality clause.

The implementation may change, so fixing the white space errors may
not be priority now. The patch compiles cleanly.

You have renamed RestrictInfo member mergeopfamilies as
equivopfamilies. I don't think that's a good name; it doesn't convey
that these are opfamilies containing merge operators. The changes in
check_mergejoinable() suggest that an operator may act as equality
operator in few operator families and comparison operator in others.
That looks odd. Actually an operator family contains operators other
than equality operators, so you may want to retain this member and add
a new member to specify whether the clause is an equality clause or
not.

In mergejoinscansel() you have just removed Assert(op_strategy ==
BTEqualStrategyNumber); Probably this function is written considering
on equality operators. But now that we are using this for all other
operators, we will need more changes to this function. That may be the
reason why INNER join in your earlier example doesn't choose right
costing.

The comment change in final_cost_mergejoin() needs more work. n1, n2,
n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 +
n2 + n3 + ... = size of inner relation is correct. In that context I
am not able to understand your change
+    * If the merge clauses contain inequality, (n1 + n2 + ...) ~=
+    * (size of inner relation)^2.

Some stylistic comments
+       switch (join_op_strategy)
+       {
+           case BTEqualStrategyNumber:
+               parent->mj_UseEqual[iClause] = true;
+               break;
+           case BTLessEqualStrategyNumber:
+               parent->mj_UseEqual[iClause] = true;
+               /* fall through */
+           case BTLessStrategyNumber:
+               parent->mj_UseLesser[iClause] = true;
+               break;
+           case BTGreaterEqualStrategyNumber:
+               parent->mj_UseEqual[iClause] = true;
+               /* fall through */
+           case BTGreaterStrategyNumber:
+               parent->mj_UseLesser[iClause] = true;
+               break;
+           default:
+               Assert(false);

Add blank lines between different cases and you may want to replace
Assert in default case into an elog(). See for example exprType() or
get_jointype_name().

+       if (sort_result < 0)
+       {
+           result = MJCR_NextOuter;
+       }

We usually do not add {} around a single statement block.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PoC: full merge join on comparison clause

From

Daniel Gustafsson

Date:

21 September 2017, 01:46:30

> On 19 Sep 2017, at 15:18, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
>
> On Fri, Aug 25, 2017 at 10:11 PM, Alexander Kuzmenkov
> <a.kuzmenkov@postgrespro.ru> wrote:
>> Here is a new version of the patch, rebased to 749c7c41 and with some
>> cosmetic changes.
>>
>
> I looked at this patch briefly. This is a useful feature. This isn't a
> design level review of the patch. I may get back to that later. But
> here are some assorted comments

> ..

Looking forward to further review on this patch, but based on this feedback I’m
moving this to Waiting for author.

cheers ./daniel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

28 September 2017, 18:27:09

Hi Ashutosh,

Thanks for the review.

Jeff, I'm copying you because this is relevant to our discussion about what to do with mergeopfamilies when adding new merge join types.

You have renamed RestrictInfo member mergeopfamilies as
equivopfamilies. I don't think that's a good name; it doesn't convey
that these are opfamilies containing merge operators. The changes in
check_mergejoinable() suggest that an operator may act as equality
operator in few operator families and comparison operator in others.
That looks odd. Actually an operator family contains operators other
than equality operators, so you may want to retain this member and add
a new member to specify whether the clause is an equality clause or
not.

For mergeopfamilies, I'm not sure what is the best thing to do. I'll try to explain my understanding of the situation, please correct me if I'm wrong.

Before the patch, mergeopfamilies was used for two things: creating equivalence classes and performing merge joins.

For equivalence classes: we look at the restriction clauses, and if they have mergeopfamilies set, it means that these clause are based on an equality operator, and the left and right variables must be equal. To record this fact, we create an equivalence class. The variables might be equal for one equality operator and not equal for another, so we record the particular operator families to which our equality operator belongs.

For merge join: we look at the join clauses, and if they have mergeopfamilies set, it means that these clauses are based on an equality operator, and we can try performing this particular join as merge join. These opfamilies are also used beforehand to create the equivalence classes for left and right variables. The equivalence classes are used to match the join clauses to pathkeys describing the ordering of join inputs.

So, if we want to start doing merge joins for operators other than equality, we still need to record their opfamilies, but only use them for the second case and not the first. I chose to put these opfamilies to different variables, and
name the one used for equivalence classes 'equivopfamilies' and the one used for merge joins 'mergeopfamilies'. The equality operators are used for both cases, so we put their opfamilies into both of these variables.

I agree this might look confusing. Indeed, we could keep a single variable for opfamilies, and add separate flags that show how they can be used, be that for equivalence classes, merge joins, range joins or some combination of them. This is similar to what Jeff did in his range merge join patch [1]. I will think more about this and try to produce an updated patch.

In mergejoinscansel() you have just removed Assert(op_strategy ==
BTEqualStrategyNumber); Probably this function is written considering
on equality operators. But now that we are using this for all other
operators, we will need more changes to this function. That may be the
reason why INNER join in your earlier example doesn't choose right
costing.

I changed mergejoinscansel() slightly to reflect the fact that the inner relation is scanned from the beginning if we have an inequality merge clause.

The comment change in final_cost_mergejoin() needs more work. n1, n2,
n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 +
n2 + n3 + ... = size of inner relation is correct. In that context I
am not able to understand your change
+    * If the merge clauses contain inequality, (n1 + n2 + ...) ~=
+    * (size of inner relation)^2.

I extended the comment in final_cost_mergejoin(). Not sure if that approximation makes any sense, but this is the best I could think of.

Style problems are fixed.

Attached please find the new version of the patch that addresses all the review comments except mergeopfamilies.

The current commitfest is ending, but I'd like to continue working on this patch, so I am moving it to the next one.

[1] https://www.postgresql.org/message-id/flat/CAMp0ubfwAFFW3O_NgKqpRPmm56M4weTEXjprb2gP_NrDaEC4Eg%40mail.gmail.com#CAMp0ubfwAFFW3O_NgKqpRPmm56M4weTEXjprb2gP_NrDaEC4Eg@mail.gmail.com

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

full-merge-join-v3.patch

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

03 October 2017, 08:04:28

On Thu, Sep 28, 2017 at 8:57 PM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> Hi Ashutosh,
>
> Thanks for the review.
>
> Jeff, I'm copying you because this is relevant to our discussion about what
> to do with mergeopfamilies when adding new merge join types.
>
> You have renamed RestrictInfo member mergeopfamilies as
> equivopfamilies. I don't think that's a good name; it doesn't convey
> that these are opfamilies containing merge operators. The changes in
> check_mergejoinable() suggest that an operator may act as equality
> operator in few operator families and comparison operator in others.
> That looks odd. Actually an operator family contains operators other
> than equality operators, so you may want to retain this member and add
> a new member to specify whether the clause is an equality clause or
> not.
>
>
> For mergeopfamilies, I'm not sure what is the best thing to do. I'll try to
> explain my understanding of the situation, please correct me if I'm wrong.
>
> Before the patch, mergeopfamilies was used for two things: creating
> equivalence classes and performing merge joins.
>
> For equivalence classes: we look at the restriction clauses, and if they
> have mergeopfamilies set, it means that these clause are based on an
> equality operator, and the left and right variables must be equal. To record
> this fact, we create an equivalence class. The variables might be equal for
> one equality operator and not equal for another, so we record the particular
> operator families to which our equality operator belongs.
>
> For merge join: we look at the join clauses, and if they have
> mergeopfamilies set, it means that these clauses are based on an equality
> operator, and we can try performing this particular join as merge join.
> These opfamilies are also used beforehand to create the equivalence classes
> for left and right variables. The equivalence classes are used to match the
> join clauses to pathkeys describing the ordering of join inputs.
>
> So, if we want to start doing merge joins for operators other than equality,
> we still need to record their opfamilies, but only use them for the second
> case and not the first. I chose to put these opfamilies to different
> variables, and
> name the one used for equivalence classes 'equivopfamilies' and the one used
> for merge joins 'mergeopfamilies'. The equality operators are used for both
> cases, so we put their opfamilies into both of these variables.
>
> I agree this might look confusing. Indeed, we could keep a single variable
> for opfamilies, and add separate flags that show how they can be used, be
> that for equivalence classes, merge joins, range joins or some combination
> of them. This is similar to what Jeff did in his range merge join patch [1].
> I will think more about this and try to produce an updated patch.
>

I think we have (ab?)used mergeopfamilies to indicate equality
condition, which needs some changes. May be these two patches are
where we can do those changes.

>
> In mergejoinscansel() you have just removed Assert(op_strategy ==
> BTEqualStrategyNumber); Probably this function is written considering
> on equality operators. But now that we are using this for all other
> operators, we will need more changes to this function. That may be the
> reason why INNER join in your earlier example doesn't choose right
> costing.
>
>
> I changed mergejoinscansel() slightly to reflect the fact that the inner
> relation is scanned from the beginning if we have an inequality merge
> clause.
>
>
> The comment change in final_cost_mergejoin() needs more work. n1, n2,
> n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 +
> n2 + n3 + ... = size of inner relation is correct. In that context I
> am not able to understand your change
> +    * If the merge clauses contain inequality, (n1 + n2 + ...) ~=
> +    * (size of inner relation)^2.
>
>
> I extended the comment in final_cost_mergejoin(). Not sure if that
> approximation makes any sense, but this is the best I could think of.
>
> Style problems are fixed.
>
> Attached please find the new version of the patch that addresses all the
> review comments except mergeopfamilies.
>
> The current commitfest is ending, but I'd like to continue working on this
> patch, so I am moving it to the next one.
>
>

Thanks for working on the comments. I am interested to continue
reviewing it in the next commitfest.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

04 October 2017, 16:08:41

As discussed earlier, I changed the way we work with mergeopfamilies. I 
use the "is_equality" flag to indicate whether the clause is an equality 
one, and fill mergeopfamilies for both equality and inequality operators.
The updated patch is attached (rebased to 20b6552242).

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

full-merge-join-v4.patch

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

09 October 2017, 13:47:52

Hi Alexander,
Commit c7a9fa399 has added another test on mergeopfamilies. I think
your patch will need to take care of that test.

On Wed, Oct 4, 2017 at 6:38 PM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> As discussed earlier, I changed the way we work with mergeopfamilies. I use
> the "is_equality" flag to indicate whether the clause is an equality one,
> and fill mergeopfamilies for both equality and inequality operators.
> The updated patch is attached (rebased to 20b6552242).
>
>
> --
> Alexander Kuzmenkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

30 October 2017, 15:25:50

Hi,

I am attaching the updated patch, rebased to 820c03.

On 09.10.2017 13:47, Ashutosh Bapat wrote:
> Hi Alexander,
> Commit c7a9fa399 has added another test on mergeopfamilies. I think
> your patch will need to take care of that test.
>
> On Wed, Oct 4, 2017 at 6:38 PM, Alexander Kuzmenkov
> <a.kuzmenkov@postgrespro.ru> wrote:
>> As discussed earlier, I changed the way we work with mergeopfamilies. I use
>> the "is_equality" flag to indicate whether the clause is an equality one,
>> and fill mergeopfamilies for both equality and inequality operators.
>> The updated patch is attached (rebased to 20b6552242).
>>
>>
>> --
>> Alexander Kuzmenkov
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company
>>
>
>

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

full-merge-join-v5.patch

Re: [HACKERS] PoC: full merge join on comparison clause

From

Michael Paquier

Date:

30 November 2017, 04:54:53

On Mon, Oct 30, 2017 at 9:25 PM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> I am attaching the updated patch, rebased to 820c03.

(Please avoid top-posting)
This patch has rotten and conflicts with recent changes in joinrels.c.
This did not get any reviews, so I am moving it to next CF with
"waiting on author" as status.
-- 
Michael

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

04 December 2017, 19:03:12

Here is the patch rebased to a852cfe9.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

full-merge-join-v6.patch

Re: [HACKERS] PoC: full merge join on comparison clause

From

Stephen Frost

Date:

23 January 2018, 19:31:26

Greetings Alexander,

* Alexander Kuzmenkov (a.kuzmenkov@postgrespro.ru) wrote:
> Here is the patch rebased to a852cfe9.

Thanks for updating it.  This would definitely be nice to have.
Ashutosh, thanks for your previous review, would you have a chance to
look at it again?  Would be great to at least get this to ready for
committer before the end of this CF.

Thanks!

Stephen

Attachment

signature.asc

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

29 January 2018, 08:40:14

On Tue, Jan 23, 2018 at 10:01 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings Alexander,
>
> * Alexander Kuzmenkov (a.kuzmenkov@postgrespro.ru) wrote:
>> Here is the patch rebased to a852cfe9.
>
> Thanks for updating it.  This would definitely be nice to have.
> Ashutosh, thanks for your previous review, would you have a chance to
> look at it again?  Would be great to at least get this to ready for
> committer before the end of this CF.

The patch contains new code and also refactors some existing code. May
be it's better to separate these two into separate patches so that
it's easy to review patches. There's lot of executor code, which I
don't understand myself. So, I won't be able to complete the review in
this CF. Sorry.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

30 January 2018, 12:31:56

On 29.01.2018 08:40, Ashutosh Bapat wrote:
>   Maybe it's better to separate these two into separate patches so that
> it's easy to review patches.
OK, I'll try doing this. For now, moving the patch entry to the next 
commitfest.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

22 February 2018, 21:42:08

Here are some updates on this patch.

I split it into two parts. The preparatory part contains some mechanical
changes to prepare for the main part. Most importantly, a new field is
added, `RestrictInfo.is_mj_equality`. It is a marker of mergejoinable
equality clauses, and `RestrictInfo.mergeopfamilies` is a more general
marker of clauses that are mergejoinable but not necessarily equality.
The usages are changed accordingly.

The main part consists of executor and planner changes required to
support inequality merge joins.

The executor changes are as described in the original post.

The planner part has changed significantly since the last version. It
used to apply some shady hacks to ensure we have the required sort
orders of inner and outer paths. Now I think I found a reasonable way to
generate the pathkeys we need. When we sort outer relation in
`sort_inner_and_outer()`, the pathkeys are generated by
`select_outer_pathkeys_for_merge()`. When we use the pathkeys we already
have for the outer relation in `match_unsorted_outer()`, mergeclauses
are selected by `find_mergeclauses_for_pathkeys()`. I changed these
functions to select the right pathkey direction for merge clauses, and
also ensure that we only have a single inequality merge clause and it is
the last one. Also, to use the index paths, I changed
`pathkeys_useful_for_merging()` to keep both pathkey directions for
inequality merge clauses.

Some basic joins work, but I couldn't properly test all the corner cases
with different orderings, because they depend on a bug in vanilla merge
joins [1].

To sum up, the preparatory and executor changes are stable, and the
planner part is WIP.

1.
https://www.postgresql.org/message-id/flat/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

02 March 2018, 17:32:32

On 22.02.2018 21:42, Alexander Kuzmenkov wrote:
>
> Some basic joins work, but I couldn't properly test all the corner 
> cases with different orderings, because they depend on a bug in 
> vanilla merge joins [1].
>

The bug was fixed, so here is the rebased patch. The planner part of the 
patch is stable now and can be reviewed, too.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

05 March 2018, 08:30:33

On Fri, Mar 2, 2018 at 8:02 PM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> On 22.02.2018 21:42, Alexander Kuzmenkov wrote:
>>
>>
>> Some basic joins work, but I couldn't properly test all the corner cases
>> with different orderings, because they depend on a bug in vanilla merge
>> joins [1].
>>
>
> The bug was fixed, so here is the rebased patch. The planner part of the
> patch is stable now and can be reviewed, too.
>

Both the patches are named 01. Their names tell the order in which
they need to be applied, so it's ok for these patches. But creating
such patches using git format-patch (with -v as some suggest) really
helps in general. All you need to do is prepare commits in your
repository, one per patch, including changes in each patch in separate
commits and then run git format-patch on that repository. I use git
format-patch @{upstream}, but there are other ways also. Then you can
use git rebase to rebase your patches periodically. If you are already
doing that, sorry for the noise.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

06 March 2018, 12:27:44

On 05.03.2018 08:30, Ashutosh Bapat wrote:
> But creating such patches using git format-patch (with -v as some suggest) really
> helps in general.
>

Thanks for the advice. I heard about this workflow, but never used it 
myself. Perhaps it's time to try it.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

06 July 2018, 16:01:08

Hi,
I have started reviewing these patches. I haven't grasped the design
yet. But here are some comments on the first patch.


-    clauses = (MergeJoinClause) palloc0(nClauses *
sizeof(MergeJoinClauseData));
+    parent->mj_Clauses = (MergeJoinClause) palloc0(nClauses *
sizeof(MergeJoinClauseData));

crosses 80 characters.

-        StrategyNumber opstrategy = mergestrategies[iClause];
+        StrategyNumber sort_strategy = mergestrategies[iClause];
-        int            op_strategy;
+        int            join_strategy;
I don't see a reason why should we change the name of variable here. These are
operator strategies and there's no need to change their names. The name change
is introducing unnecessary diffs.

+        clause->lexpr = ExecInitExpr((Expr *) linitial(qual->args),
(PlanState *) parent);
+        clause->rexpr = ExecInitExpr((Expr *) lsecond(qual->args),
(PlanState *) parent);

cross 80 characters.

 /*
@@ -378,20 +375,29 @@ MJEvalInnerValues(MergeJoinState *mergestate,
TupleTableSlot *innerslot)
     return result;
 }

+/* Tuple comparison result */
+typedef enum
+{
+    MJCR_NextInner = 1,
+    MJCR_NextOuter = -1,
+    MJCR_Join = 0
+} MJCompareResult;
+
 /*
  * MJCompare
  *
- * Compare the mergejoinable values of the current two input tuples
- * and return 0 if they are equal (ie, the mergejoin equalities all
- * succeed), >0 if outer > inner, <0 if outer < inner.
+ * Compare the mergejoinable values of the current two input tuples.
+ * If they are equal, i.e., the mergejoin equalities all succeed,
+ * return MJCR_Join, if outer > inner, MJCR_NextInner, and else
+ * MJCR_NextOuter.
  *
  * MJEvalOuterValues and MJEvalInnerValues must already have been called
  * for the current outer and inner tuples, respectively.
  */
-static int
+static MJCompareResult
 MJCompare(MergeJoinState *mergestate)
 {

I am not sure about this change as well. MJCompare()'s job is to compare given
keys in the two tuples and return the comparison result. The result was used as
it is to decide which side to advance in an equality based merge join. But for
inequality based merge join the result needs to be interpreted further. I think
we should write a wrapper around MJCompare which interprets the result rather
than changing MJCompare itself. OR at least change the name of MJCompare. The
first option is better in case we use MJCompare for purposes other than merge
join in future. I am not sure what those could be, but say a merge based union
or something like that.

     /*
      * Sweep through the existing EquivalenceClasses looking for matches to
@@ -273,7 +274,7 @@ process_equivalence(PlannerInfo *root,
         /*
          * A "match" requires matching sets of btree opfamilies.  Use of
          * equal() for this test has implications discussed in the comments
-         * for get_mergejoin_opfamilies().
+         * for get_equality_opfamilies().

I think we should leave mergejoin word in there or at least indicate that these
are btree opfamilies so that we don't confuse it with hash equality operator
families.

It will be good if you can write something about why these changes are
required in the file. If you are using git format-patch, you could
write a commit message that gets added to the patch. That way, it
leaves there for anybody to review.

I am having a difficult time reading the next patch. There are various
changes in the second patch, which I don't understand the reason
behind. I think some comments will help, in as commit message or in
the code.

I will continue reviewing the patches.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

09 July 2018, 16:12:36

On Fri, Jul 6, 2018 at 6:31 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>
> I will continue reviewing the patches.
>

Here are some more review comments

- * sort ordering for each merge key.  The mergejoinable operator is an
- * equality operator in the opfamily, and the two inputs are guaranteed to be
+ * sort ordering for each merge key.  The mergejoinable operator is a
+ * comparison operator in the opfamily, and the two inputs are guaranteed to be

I think this prologue has to change substantially. At the beginning of the
prologue it explicitly mentions clauses like leftexpr = rightexpr. That
needs to be changed.

  * ordered in either increasing or decreasing (respectively) order according

It looks like the order of inputs is constrained by the in-equality operator.
That too needs to be specified here.

  * This allows us to obtain the needed comparison function from the opfamily.
@@ -200,6 +200,9 @@ MJExamineQuals(List *mergeclauses,
         Oid            op_righttype;
         Oid            sortfunc;

+        if (parent->mj_Ineq_Present)
+            elog(ERROR, "inequality mergejoin clause must be the last one");
+

IIUC, this never happens. If it really happens, we have created a path which
can not be used practically. That should never happen. It will help to add a
comment here clarifying that situation.

+    bool        have_inequality = have_inequality_mergeclause(mergeclauses);

There will be many paths created with different ordering of pathkeys. So,
instead of calling have_inequality_mergeclause() for each of those paths, it's
better to save this status in the path itself when creating the path.

         /* Make a pathkey list with this guy first */
         if (l != list_head(all_pathkeys))
+        {
+            if (have_inequality && l == list_tail(all_pathkeys))
+                /* Inequality merge clause must be the last, we can't
move it */
+                break;
+

I am kind of baffled by this change. IIUC the way we create different orderings
of pathkeys here, we are just rotating the pathkeys in circular order. This
means there is exactly one ordering of pathkeys where the pathkey corresponding
to the inequality clause is the last one. It's only that ordering which will be
retained and all other ordering will be discarded. Instead of that, I think we
should keep the pathkey corresponding to the inequality clause at the end (or
track in separately) and create different orderings of pathkeys by rotating
other pathkeys. This will allow us to cost the orderings as intended by this
fucntion.

     /* Might have no mergeclauses */
     if (nClauses == 0)
         return NIL;

+    {
+        List *ineq_clauses = find_inequality_clauses(mergeclauses);
+
+        if (list_length(ineq_clauses) > 1)
+            return NIL;

Without this patch, when there is an inequality clause with FULL JOIN, we will
not create a merge join path because select_mergejoin_clauses() will set
mergejoin_allowed to false. This means that we won't call
sort_inner_and_outer(). I think this patch also has to do the same i.e. when
there are more than one inequality clauses, select_mergejoin_clauses() should
set mergejoin_allowed to false in case of a FULL JOIN since merge join
machinary won't be able to handle that case.

If we do that, we could arrange extra.mergeclause_list such that the inequality
clause is always at the end thus finding inequality clause would be easy.

Again, this is not full review, but I am diving deeper into the
patch-set and understanding it better. Sorry.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

09 July 2018, 21:35:00

On 07/09/2018 04:12 PM, Ashutosh Bapat wrote:
> On Fri, Jul 6, 2018 at 6:31 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> I will continue reviewing the patches.
>>
> Here are some more review comments

Ashutosh,

Many thanks for the review, I'm glad that we are continuing with this 
patch. I'm working on your comments now, will post the updated version 
this week.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

10 July 2018, 16:54:44

On Tue, Jul 10, 2018 at 12:05 AM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> On 07/09/2018 04:12 PM, Ashutosh Bapat wrote:
>>
>> On Fri, Jul 6, 2018 at 6:31 PM, Ashutosh Bapat
>> <ashutosh.bapat@enterprisedb.com> wrote:
>>>
>>> I will continue reviewing the patches.
>>>
>> Here are some more review comments
>
>
>
> Ashutosh,
>
> Many thanks for the review, I'm glad that we are continuing with this patch.
> I'm working on your comments now, will post the updated version this week.

While updating the patches, please consider adding some comments as to
why only single inequality clause supported. I didn't see comments in
the patch explaining that.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

10 July 2018, 19:36:56

I tried to fix the things you mentioned and improve the comments. Among 
other changes, there is now a description of how merge join works with 
inequalities at the top of nodeMergejoin.c. It also explains why we only 
support one inequality clause.

Some particular points:

On 07/06/2018 04:01 PM, Ashutosh Bapat wrote:
> -        StrategyNumber opstrategy = mergestrategies[iClause];
> +        StrategyNumber sort_strategy = mergestrategies[iClause];
> -        int            op_strategy;
> +        int            join_strategy;
> I don't see a reason why should we change the name of variable here. These are
> operator strategies and there's no need to change their names. The name change
> is introducing unnecessary diffs.

These variables have different meaning but their names differ only with 
an underscore. When I had to change this function, I made mistakes 
because of this. I'd keep the descriptive names to avoid further 
confusion. Should this be a separate patch?

> I think we should write a wrapper around MJCompare which interprets the result rather
> than changing MJCompare itself. OR at least change the name of MJCompare.

Renamed the function to MJTestTuples to reflect that it decides whether 
we join tuples or advance either side.

> -         * for get_mergejoin_opfamilies().
> +         * for get_equality_opfamilies().
>
> I think we should leave mergejoin word in there or at least indicate that these
> are btree opfamilies so that we don't confuse it with hash equality operator
> families.

Renamed these to get_btree_equality_opfamilies() and 
get_btree_comparison_opfamilies().

> +        if (parent->mj_Ineq_Present)
> +            elog(ERROR, "inequality mergejoin clause must be the last one");
> +
>
> IIUC, this never happens. If it really happens, we have created a path which
> can not be used practically. That should never happen. It will help to add a
> comment here clarifying that situation.

This is just a cross-check for the planner. Added a comment. We should 
probably use a separate error code for internal errors as opposed to 
user errors, but I'm not sure if we have one, I see just elog(ERROR) 
being used everywhere.

>
> +    bool        have_inequality = have_inequality_mergeclause(mergeclauses);
>
> There will be many paths created with different ordering of pathkeys. So,
> instead of calling have_inequality_mergeclause() for each of those paths, it's
> better to save this status in the path itself when creating the path.

I removed this function altogether, because we can just check the last 
merge clause. When we cost the path, we already have a proper 
mergejoinable list of clauses, so if there is an inequality clause, it's 
the last one.

>           /* Make a pathkey list with this guy first */
>           if (l != list_head(all_pathkeys))
> +        {
> +            if (have_inequality && l == list_tail(all_pathkeys))
> +                /* Inequality merge clause must be the last, we can't
> move it */
> +                break;
> +
>
> I am kind of baffled by this change. IIUC the way we create different orderings
> of pathkeys here, we are just rotating the pathkeys in circular order. This
> means there is exactly one ordering of pathkeys where the pathkey corresponding
> to the inequality clause is the last one.

This code does not rotate the pathkeys circularly, but puts each of them 
in the first position, and keeps the rest in the original order.
Say, if we have three equality pathkeys, and one inequality pathkey at 
the end (let's denote them as E1, E2, E3, IE), the permutations it tries 
will be like this:
E1 E2 E3 IE
E2 E1 E3 IE
E3 E1 E2 IE
Does this sound right?

>       /* Might have no mergeclauses */
>       if (nClauses == 0)
>           return NIL;
>
> +    {
> +        List *ineq_clauses = find_inequality_clauses(mergeclauses);
> +
> +        if (list_length(ineq_clauses) > 1)
> +            return NIL;
>
> Without this patch, when there is an inequality clause with FULL JOIN, we will
> not create a merge join path because select_mergejoin_clauses() will set
> mergejoin_allowed to false. This means that we won't call
> sort_inner_and_outer(). I think this patch also has to do the same i.e. when
> there are more than one inequality clauses, select_mergejoin_clauses() should
> set mergejoin_allowed to false in case of a FULL JOIN since merge join
> machinary won't be able to handle that case.
>
> If we do that, we could arrange extra.mergeclause_list such that the inequality
> clause is always at the end thus finding inequality clause would be easy.

I changed select_mergejoin_clauses() to filter multiple inequality 
clauses and disable join if needed. Now we can use extra inequalities as 
join filter, if it's not full join. I didn't reorder 
extra.mergeclause_list there, because this order is ignored later. 
select_outer_pathkeys_for_merge() chooses the order of pathkeys using 
some heuristics, and then find_mergeclauses_for_outer_pathkeys() 
reorders the clauses accordingly.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [HACKERS] PoC: full merge join on comparison clause

From

Ashutosh Bapat

Date:

18 July 2018, 16:58:58

On Tue, Jul 10, 2018 at 10:06 PM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> I tried to fix the things you mentioned and improve the comments. Among
> other changes, there is now a description of how merge join works with
> inequalities at the top of nodeMergejoin.c. It also explains why we only
> support one inequality clause.

Thanks for the commit messages. I would use word "in-equality" instead
of "comparison" since equality is also a comparison.

>
> Some particular points:
>
> On 07/06/2018 04:01 PM, Ashutosh Bapat wrote:
>>
>> -        StrategyNumber opstrategy = mergestrategies[iClause];
>> +        StrategyNumber sort_strategy = mergestrategies[iClause];
>> -        int            op_strategy;
>> +        int            join_strategy;
>> I don't see a reason why should we change the name of variable here. These
>> are
>> operator strategies and there's no need to change their names. The name
>> change
>> is introducing unnecessary diffs.
>
>
> These variables have different meaning but their names differ only with an
> underscore. When I had to change this function, I made mistakes because of
> this. I'd keep the descriptive names to avoid further confusion. Should this
> be a separate patch?

No, 0001 suffice. But I am still not sure that the variable name
change is worth the trouble. Anyway, will leave this for a committer
to judge.

>
> This is just a cross-check for the planner. Added a comment. We should
> probably use a separate error code for internal errors as opposed to user
> errors, but I'm not sure if we have one, I see just elog(ERROR) being used
> everywhere.

elog(ERROR) is fine. Thanks for the comments.

-    if (op_mergejoinable(opno, exprType(leftarg)) &&
-        !contain_volatile_functions((Node *) clause))
-        restrictinfo->mergeopfamilies = get_mergejoin_opfamilies(opno);
+    if (!contain_volatile_functions((Node *) clause))
+    {
+        if (op_mergejoinable_equality(opno, exprType(leftarg)))

Why is this condition split. Also why is the change in the order of conditions?

+        {
+            restrictinfo->mergeopfamilies =
get_btree_equality_opfamilies(opno);
+            restrictinfo->is_mj_equality = true;

Comparing this with the original code, I think, is_mj_equality should be true
if restrictinfo->mergeopfamilies is not NIL. There is no way that a clause can
act as an equality clause when there are no families in which the operator is
an equality operator. If restrictinfo->mergeopfamilies can not be NIL here,
probably we should add an Assert and a bit of explanation as to why
is_mj_equality is true.

With this work the meaning of oprcanmerge (See pg_operator catalog and also
CREATE OPERATOR syntax) changes. Every btree operator can now be used to
perform a merge join. oprcanmerge however only indicates whether an operator is
an equality or not. Have you thought about that? Do we require to re-define
oprcanmerge?

+ *
+ *         If the inequality clause is not the last one, or if there
are several
+ *         of them, this algorithm doesn't work, because it is not possible to
+ *         sort the inputs in such a way that given an outer tuple,
the matching
+ *         inner tuples form a contiguous interval.

I think, it should be possible to use this technique with more than one
inequality clauses as long as all the operators require the input to be ordered
in the same direction and the clauses are ANDed. In that case the for a given
outer tuple the matching inner tuples form a contiguous interval.

I think it's better to straighten out these things before diving
further into the 2nd patch.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

30 July 2018, 20:13:21

El 18/07/18 a las 16:58, Ashutosh Bapat escribió:
>
> Thanks for the commit messages. I would use word "in-equality" instead
> of "comparison" since equality is also a comparison.

Fixed.

> Comparing this with the original code, I think, is_mj_equality should be true
> if restrictinfo->mergeopfamilies is not NIL.

My mistake, fixed.

> With this work the meaning of oprcanmerge (See pg_operator catalog and also
> CREATE OPERATOR syntax) changes. Every btree operator can now be used to
> perform a merge join. oprcanmerge however only indicates whether an operator is
> an equality or not. Have you thought about that? Do we require to re-define
> oprcanmerge?

For now we can test with old oprcanmerge meaning, not to bump the 
catalog version. Merge join needs only BTORDER_PROC function, which is 
required for btree opfamilies. This means that it should be always 
possible to merge join on operators that correspond to standard btree 
strategies. We could set oprcanmerge to true for all built-in btree 
comparison operators, and leave the possibility to disable it for custom 
operators.

> I think, it should be possible to use this technique with more than one
> inequality clauses as long as all the operators require the input to be ordered
> in the same direction and the clauses are ANDed. In that case the for a given
> outer tuple the matching inner tuples form a contiguous interval.

Consider a table "t(a int, b int)", the value of each column can be 1, 
2, 3, 4 and the table contains all possible combinations. If merge 
condition is "a < 2 and b < 2", for each of the four possible sorting 
directions, the result set won't be contiguous. Generally speaking, this 
happens when we have several groups with the same value of first column, 
and the first column matches the join condition. But inside each group, 
for some rows the second column doesn't match.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [HACKERS] PoC: full merge join on comparison clause

From

Tom Lane

Date:

19 November 2018, 01:46:40

Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
> [ Inequality-merge-join-v10.patch ]

Just thinking about this patch a bit ... I wonder why you were so quick to
reject the UNION approach at the outset.  This patch is pretty messy, and
it complicates a lot of stuff that is quite fundamental to the planner,
and you still end up that the only functionality gain is now we can handle
full joins whose conditions include a single btree inequality clause.
Nor are we doing that remarkably efficiently ... it's pretty much
impossible to do it efficiently, in fact, since if the inputs have M and N
rows respectively then the output will have something like (M*N)/2 rows.

So it seems to me that if we're going to put sweat into this area at all,
our ambition ought to be "we'll successfully perform a FULL JOIN with any
join clause whatsoever, though it might take O(M*N) time".

Now as far as I can tell, the UNION substitution you proposed is
completely valid, although it'd be better to phrase the second step
as an antijoin.  That is, I believe

select * from t1 full join t2 on (anything)

is exactly equal to

select t1.*, t2.* from t1 left join t2 on (anything)
union all
select t1.*, t2.* from t2 anti join t1 on (anything)

There is one fly in the ointment, which is that we will have to run the
join clause twice, so it can't contain volatile functions --- but the
merge join approach wouldn't handle that case either.

Having to read the inputs twice is not good, but we could put them
into CTEs, which fixes any problems with volatility below the join
and at least alleviates the performance problem.  Since we can't
currently do any meaningful qual pushdown through full joins, the
optimization-fence aspect of a CTE doesn't seem like an issue either.

In short, proceeding like the above when we can't find another plan
type for a full join seems like it fixes a far wider variety of cases.
The possibility that maybe we could do some of those cases a bit faster
isn't sufficiently attractive to me to justify also putting in a
mechanism like this patch proposes.  We only rarely see complaints at
all about can't-do-a-full-join problems, and I do not think this patch
would fix enough of those complaints to be worthwhile.

            regards, tom lane

Re: [HACKERS] PoC: full merge join on comparison clause

From

Alexander Kuzmenkov

Date:

21 November 2018, 13:02:47

On 11/19/18 04:46, Tom Lane wrote:
> In short, proceeding like the above when we can't find another plan
> type for a full join seems like it fixes a far wider variety of cases.
> The possibility that maybe we could do some of those cases a bit faster
> isn't sufficiently attractive to me to justify also putting in a
> mechanism like this patch proposes.  We only rarely see complaints at
> all about can't-do-a-full-join problems, and I do not think this patch
> would fix enough of those complaints to be worthwhile.


I agree, the automated UNION substitutions seems to be a better 
approach. I'll mark this patch as rejected then.


-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company