Re: Poor performance on 9.4.4 when matching multiple columns from left side of LATERAL join - Mailing list pgsql-general

From Steven Grimm
Subject Re: Poor performance on 9.4.4 when matching multiple columns from left side of LATERAL join
Date
Msg-id 56476632.1020106@thesegovia.com
Whole thread Raw
In response to Re: Poor performance on 9.4.4 when matching multiple columns from left side of LATERAL join  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Poor performance on 9.4.4 when matching multiple columns from left side of LATERAL join
List pgsql-general

November 14, 2015 at 12:32 AM
The problem is that the optimizer is unable to use hash join or merge joins when you have the IN() condition as the join condition, the reason for this is that you're effectively saying to join on any of 3 conditions: settings.owner_id = mid.id1 OR settings.owner_id = mid.id2 OR settings.owner_id = mid.id3. If you think how a hash join works then 1 hash table is no good here, as you effectively have 3 possible keys for the hash table, the executor would have to build 3 tables to make that possible, but we only ever build 1 in PostgreSQL. As you may know, a hash table is a very efficient data structure for key value lookups, but there can only be a single key.  Merge join has the same problem because it's only possible to have a single sort order.

Thanks, that's the key thing I was missing. I was expecting it to see that there were exactly three conditions (as opposed to a variable number) and evaluate "build 3 hash tables" as a possible execution plan. Knowing that it doesn't do that completely explains the behavior I'm seeing.

It is puzzling that if, as suggested by someone else in the thread, I expand IN(a,b,c) to (x = a OR x = b OR x = c) it gets substantially faster, though still obviously falls afoul of the problem you describe above (~4 seconds instead of ~6 seconds). Should those two be equivalent?

-Steve

pgsql-general by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: fast refresh materialized view
Next
From: Tom Lane
Date:
Subject: Re: Poor performance on 9.4.4 when matching multiple columns from left side of LATERAL join