Re: [HACKERS] Performance improvement for joins where outer side is unique - Mailing list pgsql-hackers

From David Rowley
Subject Re: [HACKERS] Performance improvement for joins where outer side is unique
Date
Msg-id CAKJS1f8wzzsv4m6dK1rGzi97ob7_icM7YD==EAJQRiyoRgeYcQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Performance improvement for joins where outer side is unique  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Performance improvement for joins where outer side is unique  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers


On 14 March 2017 at 07:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ getting back to this patch finally... ]

David Rowley <david.rowley@2ndquadrant.com> writes:
> I've attached a patch which implements this, though only for
> MergeJoin, else I'd imagine we'd also need to ensure all proofs used
> for testing the uniqueness were also hash-able too. I added some XXX
> comments in analyzejoin.c around the mergeopfamilies == NIL tests to
> mention that Merge Join depends on all the unique proof quals having
> mergeopfamilies.  This also assumes we'll never use some subset of
> mergejoin-able quals for a merge join, which could be an interesting
> area in the future, as we might have some btree index on a subset of
> those columns to provide pre-sorted input. In short, it's a great
> optimisation, but I'm a little scared we might break it one day.

Umm ... it's broken already isn't it?  See the comment for
generate_mergejoin_paths:

 * We generate mergejoins if mergejoin clauses are available.  We have
 * two ways to generate the inner path for a mergejoin: sort the cheapest
 * inner path, or use an inner path that is already suitably ordered for the
 * merge.  If we have several mergeclauses, it could be that there is no inner
 * path (or only a very expensive one) for the full list of mergeclauses, but
 * better paths exist if we truncate the mergeclause list (thereby discarding
 * some sort key requirements).  So, we consider truncations of the
 * mergeclause list as well as the full list.  (Ideally we'd consider all
 * subsets of the mergeclause list, but that seems way too expensive.)

There's another, more subtle, way in which it could fail in
sort_inner_and_outer():

 * Each possible ordering of the available mergejoin clauses will generate
 * a differently-sorted result path at essentially the same cost.  We have
 * no basis for choosing one over another at this level of joining, but
 * some sort orders may be more useful than others for higher-level
 * mergejoins, so it's worth considering multiple orderings.
 *
 * Actually, it's not quite true that every mergeclause ordering will
 * generate a different path order, because some of the clauses may be
 * partially redundant (refer to the same EquivalenceClasses).  Therefore,
 * what we do is convert the mergeclause list to a list of canonical
 * pathkeys, and then consider different orderings of the pathkeys.

I'm fairly sure that it's possible to end up with fewer pathkeys than
there are mergeclauses in this code path.  Now, you might be all right
anyway given that the mergeclauses must refer to the same ECs in such a
case --- maybe they're fully redundant and we can take testing the
included clause as proving the omitted one(s) too.  I'm not certain
right now what I meant by "partially redundant" in this comment.
But in any case, it's moot for the present purpose because
generate_mergejoin_paths certainly breaks your assumption.

Thanks for looking at this again. 

Yeah confirmed. It's broken. I guess I just need to remember in the Path if we got all the join quals, although I've not looked in detail what the best fix is. I imagine it'll require storing something else in the JoinPath.

Here's my test case:

select 'create tablespace slow_disk LOCATION ''' || current_setting('data_directory') || ''' WITH (random_page_cost=1000);';
\gexec
create table ab (a int not null, b int not null);
create unique index ab_pkey on ab (a,b) tablespace slow_disk;
alter table ab add constraint ab_pkey primary key using index ab_pkey;

create index ab_a_idx on ab (a);

insert into ab values(1,1);
insert into ab values(1,2);

set enable_nestloop to 0;
set enable_hashjoin to 0;
set enable_sort to 0;

explain verbose select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Merge Join  (cost=0.26..24.35 rows=1 width=16)
   Output: ab1.a, ab1.b, ab2.a, ab2.b
   Inner Unique: Yes
   Merge Cond: (ab1.a = ab2.a)
   Join Filter: (ab1.b = ab2.b)
   ->  Index Scan using ab_a_idx on public.ab ab1  (cost=0.13..12.16 rows=2 width=8)
         Output: ab1.a, ab1.b
   ->  Index Scan using ab_a_idx on public.ab ab2  (cost=0.13..12.16 rows=2 width=8)
         Output: ab2.a, ab2.b
(9 rows)

select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b; -- wrong results
 a | b | a | b 
---+---+---+---
 1 | 2 | 1 | 2
(1 row)

drop index ab_a_idx;

-- same query again. This time it'll use the PK index on the slow_disk tablespace
select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b;
 a | b | a | b 
---+---+---+---
 1 | 1 | 1 | 1
 1 | 2 | 1 | 2
(2 rows)
 
I had to fix up some conflicts before testing this again on a recent master, so I've attached my rebased version. No fix yet for the above issue


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] bytea_output vs make installcheck