Re: query plan question, nested loop vs hash join - Mailing list pgsql-performance

From Andrey Lizenko
Subject Re: query plan question, nested loop vs hash join
Date
Msg-id CADKuZZD=6HdHUe7TmPT-5KbuvKM1xd+3f1QAafy_9kLH6H09mQ@mail.gmail.com
Whole thread Raw
In response to Re: query plan question, nested loop vs hash join  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Thanks a lot, Tom,
reducing 'random_page_cost' from 2 to 1 and increasing 'effective_cache_size' from 70% to 80% of RAM solved this at least on my virtual sandbox.
By the way, why increasing of cache only (with the same random_page_cost=2) can lead to mergejoin selection? 


On 5 October 2014 23:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrey Lizenko <lizenko79@gmail.com> writes:
> What is the reason of "Seq Scan on activities_example" in the first case?
> Is it possible to force optimizer choose the second plan without doing
>  "set enable_hashjoin = off;" ?

Disabling hashjoins altogether would be a pretty dangerous "fix".

I think the real issue here is that you have an entirely cached-in-memory
database and therefore you ought to reduce random_page_cost.  The
planner's estimates for the first query seem to more or less match reality
(on the assumption that 1 msec equals about 100 cost units on your
machine).  The cost estimates for the second one are way off though,
mainly in that the repeated indexscans are far cheaper than the planner
thinks.  Getting that cost estimate down requires reducing random_page_cost
or increasing effective_cache_size or some combination.

You can find the conventional wisdow about this sort of thing at
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

                        regards, tom lane



--
С уважением, Андрей Лизенко

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: query plan question, nested loop vs hash join
Next
From: Andrey Lizenko
Date:
Subject: Re: query plan question, nested loop vs hash join