Thread: Re: [BUGS] BUG #8130: Hashjoin still gives issues

Re: [BUGS] BUG #8130: Hashjoin still gives issues

From
Stefan de Konink
Date:
Dear Tom,


On Wed, 1 May 2013, Tom Lane wrote:

>> What can we do to provide a bit more of information?
>
> https://wiki.postgresql.org/wiki/Slow_Query_Questions
>
> There is no particularly good reason to think this is a bug; please
> take it up on pgsql-performance if you have more questions.

I beg to disagree, the performance of a select * query and the select b.id
query are both "hot". The result in a fundamentally different query plan
(and performance). Combined with the recent bugfix regarding hash
estimation, it gives me a good indication that there might be a bug.

I am not deep into the query optimiser of PostgreSQL but given the above
same were different selections can change an entire query plan (and * is
in fact out of the box 30 times faster than b.id) it does. When hash is
disabled the entire query is -depending on the system checked- 2 to
30x faster.


The original query:

select * from ambit_privateevent_calendars as a, ambit_privateevent as b,
ambit_calendarsubscription as c, ambit_calendar as d where c.calendar_id =
d.id and a.privateevent_id = b.id and c.user_id = 1270 and  c.calendar_id
= a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, 12, 20) and not
b.main_recurrence = true;

select b.id from ambit_privateevent_calendars as a, ambit_privateevent as
b, ambit_calendarsubscription as c, ambit_calendar as d where c.calendar_id =
d.id and a.privateevent_id = b.id and c.user_id = 1270 and  c.calendar_id
= a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, 12, 20) and not
b.main_recurrence = true;

(select * => select b.id, the star query is *fastest*)

We compare:
http://explain.depesz.com/s/jRx
http://explain.depesz.com/s/eKE


By setting "set enable_hashjoin = off;" performance in our entire
application increased 30 fold in throughput, which was a bit unexpected
but highly appreciated. The result of the last query switch the mergejoin:

http://explain.depesz.com/s/AWB

It is also visible that after hashjoin is off, the b.id query is faster
than the * query (what would be expected).


Our test machine is overbudgetted, 4x the memory of the entire database
~4GB, and uses the PostgreSQL stock settings.


Stefan


Re: [BUGS] BUG #8130: Hashjoin still gives issues

From
Igor Neyman
Date:

> -----Original Message-----
>

>
> The original query:
>
> select * from ambit_privateevent_calendars as a, ambit_privateevent as
> b, ambit_calendarsubscription as c, ambit_calendar as d where
> c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270
> and  c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4,
> 12, 20) and not b.main_recurrence = true;
>
> select b.id from ambit_privateevent_calendars as a, ambit_privateevent
> as b, ambit_calendarsubscription as c, ambit_calendar as d where
> c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270
> and  c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4,
> 12, 20) and not b.main_recurrence = true;
>
> (select * => select b.id, the star query is *fastest*)
>
> We compare:
> http://explain.depesz.com/s/jRx
> http://explain.depesz.com/s/eKE
>
>
> By setting "set enable_hashjoin = off;" performance in our entire
> application increased 30 fold in throughput, which was a bit unexpected
> but highly appreciated. The result of the last query switch the
> mergejoin:
>
> http://explain.depesz.com/s/AWB
>
> It is also visible that after hashjoin is off, the b.id query is faster
> than the * query (what would be expected).
>
>
> Our test machine is overbudgetted, 4x the memory of the entire database
> ~4GB, and uses the PostgreSQL stock settings.
>
>
> Stefan
>

I'd suggest that you adjust Postgres configuration, specifically memory settings (buffer_cache, work_mem,
effective_cache_size),to reflect your hardware config, and see how it affects your query. 

Regards,
Igor Neyman


Re: [BUGS] BUG #8130: Hashjoin still gives issues

From
Jeff Davis
Date:
On Wed, 2013-05-01 at 17:44 +0200, Stefan de Konink wrote:
> Combined with the recent bugfix regarding hash
> estimation, it gives me a good indication that there might be a bug.

To which recent bugfix are you referring?

The best venue for fixing an issue like this is pgsql-performance -- it
doesn't make too much difference whether it's a "bug" or not.
Performance problems sometimes end up as bugs and sometimes end up being
treated more like an enhancement; but most of the progress is made on
pgsql-performance regardless.

Regards,
    Jeff Davis