Re: Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off) - Mailing list pgsql-hackers
From | Mario Weilguni |
---|---|
Subject | Re: Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off) |
Date | |
Msg-id | FA095C015271B64E99B197937712FD026679DB@freedom.grz.icomedias.com Whole thread Raw |
In response to | Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off) (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
<p><font size="2">If the query runs slow it will be not such a problem, but I was very concerned about other queries havingthis problem too - without knowing it. I've already rewritten the query to use IN instead of exists.<br /><br /> I'llcompile again and try it again.<br /><br /> Thanks alot!<br /><br /> Best regards,<br /> Mario Weilguni<br /><br /><br/><br /> -----Ursprüngliche Nachricht-----<br /> Von: Tom Lane [<a href="mailto:tgl@sss.pgh.pa.us">mailto:tgl@sss.pgh.pa.us</a>]<br/> Gesendet: Mo 28.11.2005 19:39<br /> An: Mario Weilguni<br/> Cc: pgsql-hackers@postgresql.org<br /> Betreff: Hashjoin startup strategy (was Re: [HACKERS] Getting differentnumber of results when using hashjoin on/off)<br /><br /> "Mario Weilguni" <mario.weilguni@icomedias.com>writes:<br /> > Thanks for the quick response, I've tried the patch, but it did notwork<br /> > as expected. When I set enable_hashjoin to off, everything works as<br /> > expected, but with hashjoinon I do not even get results anymore, CPU is<br /> > going up to 100% and after 3 minutes I cancelled the query(it normale<br /> > would take ~100-500 milliseconds).<br /><br /> Try letting it run longer. I think your expectationis tuned for the<br /> broken implementation (which runs the subqueries only once instead of<br /> 26k times...)<br/><br /> The test case I developed for this failure in the regression database is<br /><br /> select count(*)from tenk1 a<br /> where exists (select 1 from tenk1 b, tenk1 c<br /> where b.unique1=c.unique2 and<br/> b.hundred in (4,5) and c.hundred=a.hundred+99);<br /><br /> 8.0 prefers a nestloop for the subquery,and that plan runs in about<br /> 600 ms on my machine. If forced to a hash join, it takes about 2450 ms.<br />8.1 prefers the hash join to start with, but takes 11300 ms to run it :-(<br /> (after the patch that is).<br /><br />The reason for the differential is that 8.1 guesses wrong about which<br /> subplan to cycle first: most of the time, theinner plan is empty and<br /> so there's no need to pull any rows from the outer plan, but 8.1 pulls<br /> the first rowfrom the outer plan anyway, and doing that 10000 times is<br /> what's eating the extra runtime. It looks from your previousmessage<br /> that similar things are happening with your data distribution, allowing<br /> 8.0 to run faster foryou than 8.1 does.<br /><br /> Not sure if there's much we can do about this. The presence of the<br /> upper-query parameterin the subplan makes it difficult to derive any<br /> stats at all, let alone guess how often the subplan will becompletely<br /> empty, so I'm not sure the planner can help.<br /><br /> For a query like this, where the hash join isbeing done repeatedly,<br /> it might be useful for the executor itself to track how often each<br /> subplan has beenseen to be empty. In particular, the executor knows<br /> that the outer subplan is parameterless and therefore shoulddeliver<br /> the same results each time (modulo volatile functions of course), so<br /> after the first cycle it couldknow that there's no point in trying<br /> the early fetch on that side. Dunno if this will be of wide enough<br />use to be worth implementing though --- in simple cases the join<br /> won't be rescanned and so the executor can't help.<br/><br /> Anyone have any other ideas?<br /><br /> regards, tom lane<br /><br /></font>
pgsql-hackers by date: