Re: [HACKERS] Range Merge Join v1 - Mailing list pgsql-hackers

From Andrew Borodin
Subject Re: [HACKERS] Range Merge Join v1
Msg-id CAJEAwVE3oEFDMUnoz_=+ut8QYVb4UtPdfPv0P2o3eauyUua8zw@mail.gmail.com
In response to Re: [HACKERS] Range Merge Join v1  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
2017-06-02 19:42 GMT+05:00 Jeff Davis <pgsql@j-davis.com>:
> On Tue, May 30, 2017 at 11:44 PM, Andrew Borodin <borodin@octonica.com> wrote:
>> 1. Are there any types, which could benefit from Range Merge and are
>> not covered by this patch?
>
> I thought about this for a while, and the only thing I can think of
> are range joins that don't explicitly use range types.

Let me try to write && in plain SQL:

select * from a join b on (a.min<=b.max and a.min>=b.min) or
(a.max<=b.max and a.max>=b.min) or
(b.min<=a.max and b.min>=a.min);

Quite complicated. Here the user knows that min <= max, but the DB
doesn't. If we write it precisely, we get a hell of a lot of permutations.
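
To illustrate the point about min <= max, here is a minimal Python sketch (not taken from the patch; the function names are made up for illustration) of the closed-interval overlap test. When each side is known to satisfy min <= max, two comparisons suffice. When that is not known, the bounds have to be normalized first, which is where the extra cases come from.

```python
def overlaps_ordered(a_min, a_max, b_min, b_max):
    """Closed-interval overlap, assuming a_min <= a_max and b_min <= b_max."""
    return a_min <= b_max and b_min <= a_max


def overlaps_unordered(a1, a2, b1, b2):
    """Same test when the order of each pair of bounds is not known."""
    # Normalize each pair so the ordered test can be reused.
    a_min, a_max = min(a1, a2), max(a1, a2)
    b_min, b_max = min(b1, b2), max(b1, b2)
    return overlaps_ordered(a_min, a_max, b_min, b_max)
```

Note that this uses closed bounds on both ends, matching the <= and >= comparisons in the query above; range types may use other bound conventions.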
Here's what I think:
1. For me, this feature seems hard to implement.
2. This feature will also be hard to commit on its own, since its use
case will be rare.
3. If this could yield an unexpected performance gain for queries like
select * from t where x<y and b>c or a!=z or [other conditions]
with the optimizer thinking: "Aha! I'll sort it and do it fast", it'd be cool.

I do not think range joins that don't explicitly use range types are
possible right now...
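
For reference, the "sort it and do it fast" idea can be sketched on plain (min, max) pairs. The following is a generic sort-and-sweep overlap join in Python, written only as an illustration: it is not the algorithm from the patch, and `overlap_join` is a hypothetical name. It assumes closed intervals with min <= max on both sides.

```python
def overlap_join(left, right):
    """Return all (l, r) pairs of closed intervals (lo, hi) that overlap.

    Both inputs are sorted by lower bound and swept once. `active` holds
    right-side intervals that have started and may still match later left rows.
    """
    left = sorted(left)
    right = sorted(right)
    result = []
    active = []
    j = 0
    for l_lo, l_hi in left:
        # Pull in every right interval that starts no later than this left one ends.
        while j < len(right) and right[j][0] <= l_hi:
            active.append(right[j])
            j += 1
        # Drop right intervals that end before this left interval starts.
        # Left rows are sorted by lower bound, so they cannot match later rows either.
        active = [r for r in active if r[1] >= l_lo]
        for r in active:
            # Upper bounds of left rows are not sorted, so re-check the start.
            if r[0] <= l_hi:
                result.append(((l_lo, l_hi), r))
    return result
```

The sketch only works because min <= max is assumed for every row, which is exactly the knowledge the planner lacks when the bounds are two unrelated columns.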

>> 2. Can Range Merge handle merge of different ranges? Like int4range()
>> && int8range() ?
>
> Right now, there aren't even casts between range types. I think the
> best way to handle that at this point would be to add casts among the
> numeric ranges. There may be advantages to supporting any two ranges
> where the contained types are part of the same opfamily, but it seems
> a little early to add that complication.
Agreed.


> I think there are a couple more things that could be done if we want
> to. Let me know if you think these things should be done now, or if
> they should be a separate patch later when the need arises:
>
> * Support for r1 @> r2 joins (join on "contains" rather than "overlaps").
> * Better integration with the catalog so that users could add their
> own types that support range merge join.

1. I believe these changes can be incremental. If they provide value
for the end user, then they are committable. This patch is not overly
complicated, but it's easier to do this stuff in small pieces.
2. The commitfest is 3 months away. If you post new versions of
the patch, I'll review again. Maybe I'll spot some new things :)

Best regards, Andrey Borodin, Octonica.


