Re: [HACKERS] PoC: full merge join on comparison clause - Mailing list pgsql-hackers

From Alexander Kuzmenkov
Subject Re: [HACKERS] PoC: full merge join on comparison clause
Date
Msg-id 9de15ac8-10b2-8569-4683-002db9131771@postgrespro.ru
In response to Re: [HACKERS] PoC: full merge join on comparison clause  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] PoC: full merge join on comparison clause  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 18/07/18 at 16:58, Ashutosh Bapat wrote:
>
> Thanks for the commit messages. I would use the word "in-equality" instead
> of "comparison" since equality is also a comparison.

Fixed.

> Comparing this with the original code, I think, is_mj_equality should be true
> if restrictinfo->mergeopfamilies is not NIL.

My mistake, fixed.

> With this work the meaning of oprcanmerge (See pg_operator catalog and also
> CREATE OPERATOR syntax) changes. Every btree operator can now be used to
> perform a merge join. oprcanmerge however only indicates whether an operator is
> an equality or not. Have you thought about that? Do we need to redefine
> oprcanmerge?

For now we can test with the old meaning of oprcanmerge, so that we
don't have to bump the catalog version. Merge join needs only the
BTORDER_PROC support function, which is required for btree opfamilies.
This means that it should always be possible to merge join on operators
that correspond to the standard btree strategies. We could set
oprcanmerge to true for all built-in btree comparison operators, and
leave the possibility of disabling it for custom operators.
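
To illustrate why a three-way ordering function is enough, here is a
minimal Python sketch of a merge join on "outer < inner". It is not the
code from the patch: cmp3 and merge_join_less_than are hypothetical
names, and cmp3 merely stands in for a btree ordering support function.
Both inputs are sorted ascending, so for each outer value the matching
inner values form a suffix of the inner input, and the start of that
suffix only ever moves forward.

```python
import functools


def cmp3(x, y):
    """Three-way comparison: negative, zero or positive (stand-in for an ordering proc)."""
    return (x > y) - (x < y)


def merge_join_less_than(outer, inner, cmp=cmp3):
    """Return all pairs (o, i) with o < i, calling nothing but the comparator."""
    key = functools.cmp_to_key(cmp)
    outer = sorted(outer, key=key)
    inner = sorted(inner, key=key)

    result = []
    start = 0  # first inner position that is greater than the current outer value
    for o in outer:
        # Skip inner values that are <= o. Since outer is ascending, values
        # skipped for this o cannot match any later outer value either.
        while start < len(inner) and cmp(o, inner[start]) >= 0:
            start += 1
        # Everything from start to the end of inner matches o.
        result.extend((o, i) for i in inner[start:])
    return result
```

The inequality operator itself is never invoked; the sign of the
comparator result decides whether the inner pointer advances.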

> I think, it should be possible to use this technique with more than one
> inequality clause as long as all the operators require the input to be ordered
> in the same direction and the clauses are ANDed. In that case, for a given
> outer tuple, the matching inner tuples form a contiguous interval.

Consider a table "t(a int, b int)" where each column can take the values
1, 2, 3, 4 and the table contains all possible combinations. If the merge
condition is "a < 2 and b < 2", then for each of the four possible sorting
directions the result set won't be contiguous. Generally speaking, this
happens when we have several groups with the same value of the first
column, and the first column matches the join condition, but inside each
group the second column doesn't match for some rows.
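
The effect can be checked by brute force. The following is a minimal
Python sketch; the function names and the "bound" parameter are made up
for illustration. It sorts the 16 rows in each of the four direction
combinations and records the positions of the rows that satisfy
"a < bound and b < bound". The bound of 3 used in the check is an
assumption: with a bound of 2 only the row (1, 1) qualifies, and the
effect needs at least two groups of "a" to pass the condition.

```python
from itertools import product


def matching_positions(bound, a_desc, b_desc):
    """Positions of rows with a < bound and b < bound in the sorted table."""
    rows = list(product(range(1, 5), repeat=2))  # all 16 (a, b) combinations
    # Sort by a, then b, each either ascending or descending.
    rows.sort(key=lambda r: (-r[0] if a_desc else r[0],
                             -r[1] if b_desc else r[1]))
    return [pos for pos, (a, b) in enumerate(rows) if a < bound and b < bound]


def is_contiguous(positions):
    """True if the positions form one unbroken run."""
    return all(q - p == 1 for p, q in zip(positions, positions[1:]))


def check_all_directions(bound):
    """Map each (a_desc, b_desc) direction pair to whether the matches are contiguous."""
    return {
        (a_desc, b_desc): is_contiguous(matching_positions(bound, a_desc, b_desc))
        for a_desc, b_desc in product([False, True], repeat=2)
    }
```

With a bound of 3 and both columns ascending, the matching rows sit at
positions 0, 1, 4 and 5: the rows (1, 3) and (1, 4) separate the two
matching groups. The other three direction combinations are split in
the same way.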

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


