Re: Ambigous Plan - Larger Table on Hash Side - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Ambigous Plan - Larger Table on Hash Side
Date	March 12, 2018 20:43:14
Msg-id	20180312174314.fehtgox5qr4lfqp6@alap3.anarazel.de
In response to	Re: Ambigous Plan - Larger Table on Hash Side (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Ambigous Plan - Larger Table on Hash Side
List	pgsql-hackers

On 2018-03-12 12:52:00 -0400, Tom Lane wrote:
> Narendra Pradeep U U <narendra.pradeep@zohocorp.com> writes:
> > Recently I came across a case where the planner chose the larger table
> > for the hash side. I am not sure whether it is an intended behavior or
> > we are missing something.
> 
> Probably the reason is that the smaller table has a less uniform
> distribution of the hash key.  You don't want to hash with a nonuniform
> distribution of the hashtable key; if many keys go into the same bucket
> then performance degrades drastically.

Not sure I follow. Unless the values are equivalent (i.e. duplicate key
values), why should non-uniformity in key space translate to hash space?
And if there are duplicates it shouldn't hurt much either, unless doing
a semi/anti-join? All rows are going to be returned and IIRC we quite
cheaply continue a bucket scan?
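
To make the question concrete, here is a rough sketch (not PostgreSQL code)
of the distinction being drawn: distinct keys that are badly skewed in key
space still spread evenly over buckets once hashed, whereas duplicate keys
necessarily land in one bucket. Everything in it is an assumption for
illustration: blake2b stands in for the real hash function, and the bucket
count, key sets, and the helper names bucket_of and longest_chain are made up.

```python
import hashlib
from collections import Counter

NBUCKETS = 1024  # hypothetical bucket count, not a PostgreSQL default


def bucket_of(key: int, nbuckets: int = NBUCKETS) -> int:
    """Map an integer key to a bucket using a well-mixing hash (a stand-in)."""
    raw = key.to_bytes(8, "little", signed=True)
    digest = hashlib.blake2b(raw, digest_size=8).digest()
    return int.from_bytes(digest, "little") % nbuckets


def longest_chain(keys) -> int:
    """Return the number of entries in the most heavily loaded bucket."""
    counts = Counter(bucket_of(k) for k in keys)
    return max(counts.values())


# Distinct keys, very non-uniform in key space (squares get sparser as they grow).
skewed_distinct = [i * i for i in range(10_000)]

# Same row count, but half of the rows carry one duplicated key value.
skewed_duplicates = [42] * 5_000 + list(range(5_000))

if __name__ == "__main__":
    print("distinct but skewed keys, longest chain:", longest_chain(skewed_distinct))
    print("duplicate-heavy keys, longest chain:   ", longest_chain(skewed_duplicates))
```

With about ten rows per bucket on average, the distinct case should stay
close to that average, while the duplicate-heavy case has one chain holding
every copy of the repeated key. Whether a long chain of true duplicates is
actually expensive for a given join type is the open question above; the
sketch only shows where long chains can come from.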

Greetings,

Andres Freund
