Thread: Re: [BUGS] BUG #8130: Hashjoin still gives issues
Dear Tom, On Wed, 1 May 2013, Tom Lane wrote: >> What can we do to provide a bit more of information? > > https://wiki.postgresql.org/wiki/Slow_Query_Questions > > There is no particularly good reason to think this is a bug; please > take it up on pgsql-performance if you have more questions. I beg to disagree, the performance of a select * query and the select b.id query are both "hot". The result in a fundamentally different query plan (and performance). Combined with the recent bugfix regarding hash estimation, it gives me a good indication that there might be a bug. I am not deep into the query optimiser of PostgreSQL but given the above same were different selections can change an entire query plan (and * is in fact out of the box 30 times faster than b.id) it does. When hash is disabled the entire query is -depending on the system checked- 2 to 30x faster. The original query: select * from ambit_privateevent_calendars as a, ambit_privateevent as b, ambit_calendarsubscription as c, ambit_calendar as d where c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270 and c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, 12, 20) and not b.main_recurrence = true; select b.id from ambit_privateevent_calendars as a, ambit_privateevent as b, ambit_calendarsubscription as c, ambit_calendar as d where c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270 and c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, 12, 20) and not b.main_recurrence = true; (select * => select b.id, the star query is *fastest*) We compare: http://explain.depesz.com/s/jRx http://explain.depesz.com/s/eKE By setting "set enable_hashjoin = off;" performance in our entire application increased 30 fold in throughput, which was a bit unexpected but highly appreciated. The result of the last query switch the mergejoin: http://explain.depesz.com/s/AWB It is also visible that after hashjoin is off, the b.id query is faster than the * query (what would be expected). Our test machine is overbudgetted, 4x the memory of the entire database ~4GB, and uses the PostgreSQL stock settings. Stefan
> -----Original Message----- > > > The original query: > > select * from ambit_privateevent_calendars as a, ambit_privateevent as > b, ambit_calendarsubscription as c, ambit_calendar as d where > c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270 > and c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, > 12, 20) and not b.main_recurrence = true; > > select b.id from ambit_privateevent_calendars as a, ambit_privateevent > as b, ambit_calendarsubscription as c, ambit_calendar as d where > c.calendar_id = d.id and a.privateevent_id = b.id and c.user_id = 1270 > and c.calendar_id = a.calendar_id and c.STATUS IN (1, 8, 2, 15, 18, 4, > 12, 20) and not b.main_recurrence = true; > > (select * => select b.id, the star query is *fastest*) > > We compare: > http://explain.depesz.com/s/jRx > http://explain.depesz.com/s/eKE > > > By setting "set enable_hashjoin = off;" performance in our entire > application increased 30 fold in throughput, which was a bit unexpected > but highly appreciated. The result of the last query switch the > mergejoin: > > http://explain.depesz.com/s/AWB > > It is also visible that after hashjoin is off, the b.id query is faster > than the * query (what would be expected). > > > Our test machine is overbudgetted, 4x the memory of the entire database > ~4GB, and uses the PostgreSQL stock settings. > > > Stefan > I'd suggest that you adjust Postgres configuration, specifically memory settings (buffer_cache, work_mem, effective_cache_size),to reflect your hardware config, and see how it affects your query. Regards, Igor Neyman
On Wed, 2013-05-01 at 17:44 +0200, Stefan de Konink wrote: > Combined with the recent bugfix regarding hash > estimation, it gives me a good indication that there might be a bug. To which recent bugfix are you referring? The best venue for fixing an issue like this is pgsql-performance -- it doesn't make too much difference whether it's a "bug" or not. Performance problems sometimes end up as bugs and sometimes end up being treated more like an enhancement; but most of the progress is made on pgsql-performance regardless. Regards, Jeff Davis