Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why? - Mailing list pgsql-performance

From Tom Lane
Subject Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?
Date
Msg-id 1532.1287459825@sss.pgh.pa.us
In response to HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?  (Scott Carey <scott@richrelevance.com>)
Responses Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?
List pgsql-performance
Scott Carey <scott@richrelevance.com> writes:
> I consistently see HashJoin plans that hash the large table, and scan
> the small table.

Could we see a self-contained test case?  And what cost parameters are
you using, especially work_mem?

> This is especially puzzling in some cases where I have 30M rows in the big
> table and ~100 in the small... shouldn't it hash the small table and scan
> the big one?

Well, size of the table isn't the only factor; in particular, a highly
nonuniform distribution of the key value will inflate the cost estimate
for using a table on the inner side of the hash.  But the example you
show here seems a bit extreme.
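
To illustrate why key skew matters, here is a rough sketch in Python.  It
is a toy model, not the planner's actual cost formula: the function names
and the sample data are made up for illustration, and it pessimistically
assumes that every probe lands in the fullest hash bucket.

```python
from collections import Counter


def largest_bucket_rows(inner_keys):
    """Rows sharing the most common key, i.e. the fullest hash bucket.

    Toy assumption: one bucket per distinct key value.
    """
    return max(Counter(inner_keys).values())


def toy_probe_cost(outer_rows, inner_keys):
    """Pessimistic comparison count: every outer row is assumed to
    walk the fullest bucket of the hashed (inner) table."""
    return outer_rows * largest_bucket_rows(inner_keys)


# Hypothetical inner tables, both with 1000 rows and 100 distinct keys.
uniform_inner = [i % 100 for i in range(1000)]   # 10 rows per key
skewed_inner = [0] * 901 + list(range(1, 100))   # one key holds 901 rows

print(toy_probe_cost(100, uniform_inner))  # 1000
print(toy_probe_cost(100, skewed_inner))   # 90100
```

Same row count and same number of distinct keys, but the skewed table
looks about 90 times more expensive to use as the hashed side in this
model.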

            regards, tom lane
