Thread: not exists slow compared to not in. (nested loops killing me)
Hi all,

I am trying to speed up a query on a DB I inherited and I am falling flat
on my face. I changed a query from NOT IN to NOT EXISTS and my query time
went from 19000 ms to several hours (~50,000,000 ms). This shocked me so
much I pretty much had to post. This seems like a corner case of the
planner not knowing that the nested loops are going to turn out badly in
this case. The planner choosing a 13-hour nested loop here is basically
the reason I am posting.

I have played around with rewriting this query using some CTEs and a LEFT
JOIN, but so far my results are not encouraging. Given what little I know,
it seems like a LEFT JOIN where right_table.col IS NULL gets the same
performance and estimates as a NOT EXISTS (and still picks a nested loop
in this case).

I can see where it all goes to hell time-wise. Turning off nested loops
keeps this query from running for hours, but that is not something I am
looking to do globally, and the time is not really much better than just
leaving it alone with a NOT IN.

The two queries are at http://pgsql.privatepaste.com/a0b672bab0#

The "pretty" explain versions:

NOT IN (with large work_mem - 1GB)
http://explain.depesz.com/s/ukj

NOT IN (with only 64MB for work_mem)
http://explain.depesz.com/s/wT0

NOT EXISTS (with 64MB of work_mem)
http://explain.depesz.com/s/EuX

NOT EXISTS (with nested loop off, and 64MB of work_mem)
http://explain.depesz.com/s/UXG

LEFT JOIN/CTE (with nested loop off and 1GB of work_mem)
http://explain.depesz.com/s/Hwm

Table defs, with estimated row counts (which all 100% match the exact row
counts): http://pgsql.privatepaste.com/c2ff39b653

I tried running an ANALYZE across the whole database; no effect. I haven't
gotten creative with explicit join orders yet.

PostgreSQL 9.0.2.

Willing to try stuff for people: I can run things on a VM for days and it
is no big deal, but I can't do that on production machines.

Thoughts? Ideas?

-Mark
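(For readers without access to the expired paste links, the three
formulations under discussion look roughly like this. This is a sketch
only: big_table, ref_table, and id are invented stand-ins, not the
original schema.)

    -- Formulation 1: NOT IN (the original, ~19 s in this thread)
    SELECT b.*
    FROM   big_table b
    WHERE  b.id NOT IN (SELECT r.id FROM ref_table r);

    -- Formulation 2: NOT EXISTS (the rewrite that fell over, ~13 h)
    SELECT b.*
    FROM   big_table b
    WHERE  NOT EXISTS (SELECT 1 FROM ref_table r WHERE r.id = b.id);

    -- Formulation 3: LEFT JOIN ... IS NULL (planned like NOT EXISTS)
    SELECT b.*
    FROM   big_table b
    LEFT JOIN ref_table r ON r.id = b.id
    WHERE  r.id IS NULL;

    -- Nested loops can be disabled for a single session rather than
    -- globally, which is how the "nested loop off" plans were produced:
    SET enable_nestloop = off;
    -- ... run the query ...
    RESET enable_nestloop;

One relevant semantic difference: NOT IN and NOT EXISTS are only
interchangeable when the subquery's column can never be NULL. If the
subquery produces even one NULL, the NOT IN form must return no rows at
all, which is also why the planner cannot always turn NOT IN into an
anti-join the way it can with NOT EXISTS.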
On 06/07/2011 04:38 AM, mark wrote:

> NOT EXISTS (with 64MB of work_mem)
> http://explain.depesz.com/s/EuX

Hash Anti Join  (cost=443572.19..790776.84 rows=1 width=1560)
                (actual time=16337.711..50358.487 rows=2196299 loops=1)

Note the estimated vs actual rows: the planner expected 1 row and got
2,196,299. Either your stats are completely ridiculous, or the planner is
confused.

What are your stats target levels? Have you tried increasing the stats
levels on the table(s), or at least the column(s) affected? Or tweaking
default_statistics_target if you want to use a bigger hammer?

Is autovacuum being allowed to do its work and regularly ANALYZE the
database? Does an explicit ANALYZE help?

--
Craig Ringer
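(Concretely, the knobs Craig mentions can be applied like this;
big_table and id are the same placeholder names as above, not the real
schema.)

    -- Raise the stats target for just the affected column(s),
    -- then rebuild statistics for that table:
    ALTER TABLE big_table ALTER COLUMN id SET STATISTICS 500;
    ANALYZE big_table;

    -- The bigger hammer: raise the default target. Set per-session
    -- for testing (it affects any ANALYZE run in that session), or
    -- in postgresql.conf to make it permanent:
    SET default_statistics_target = 500;
    ANALYZE;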
Craig Ringer <craig@postnewspapers.com.au> writes:
> On 06/07/2011 04:38 AM, mark wrote:
> Hash Anti Join  (cost=443572.19..790776.84 rows=1 width=1560)
>                 (actual time=16337.711..50358.487 rows=2196299 loops=1)
> Note the estimated vs actual rows. Either your stats are completely
> ridiculous, or the planner is confused.

The latter ... I think the OP is hurting for lack of this 9.0.4 fix:
http://git.postgresql.org/gitweb?p=postgresql.git&a=commitdiff&h=159c47dc7170110a39f8a16b1d0b7811f5556f87

			regards, tom lane
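(For anyone checking whether a given server already carries that fix,
the minor version is visible from any session:)

    SELECT version();
    -- or
    SHOW server_version;   -- needs to report 9.0.4 or later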
> -----Original Message-----
> From: Craig Ringer [mailto:craig@postnewspapers.com.au]
> Sent: Monday, June 06, 2011 5:08 PM
> To: mark
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] not exists slow compared to not in. (nested loops
> killing me)
>
> On 06/07/2011 04:38 AM, mark wrote:
>
> > NOT EXISTS (with 64MB of work_mem)
> > http://explain.depesz.com/s/EuX
>
> Hash Anti Join  (cost=443572.19..790776.84 rows=1 width=1560)
>                 (actual time=16337.711..50358.487 rows=2196299 loops=1)
>
> Note the estimated vs actual rows. Either your stats are completely
> ridiculous, or the planner is confused.

I am starting to think the planner might be confused in 9.0.2. I got a
reasonable query time, given resource constraints, on a very small VM on
my laptop running 9.0.4. I am going to work on getting the VM I was using
to test this up to 9.0.4 and test again.

There is a note in the 9.0.4 release notes: "Improve planner's handling of
semi-join and anti-join cases (Tom Lane)". I am not sure that is the
reason I got a much better outcome on a much smaller VM, but once I do
some more testing I will report back.

> What are your stats target levels? Have you tried increasing the stats
> levels on the table(s) or at least column(s) affected? Or tweaking
> default_statistics_target if you want to use a bigger hammer?

Will try that as well. Currently the default stats target is 100; I will
try 250 and 500 and report back.

> Is autovacuum being allowed to do its work and regularly ANALYZE the
> database? Does an explicit 'ANALYZE' help?

Autovacuum is running, and I have explicitly VACUUMed and ANALYZEd the
whole db. That didn't change anything.

> --
> Craig Ringer
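(A session-local way to run the experiments Mark describes, without
touching postgresql.conf; this reuses the placeholder query from above,
not the real one.)

    -- All of these settings last only for the current session.
    SET default_statistics_target = 250;   -- second pass: 500
    ANALYZE;                               -- rebuild stats at the new target

    SET work_mem = '64MB';                 -- match the earlier plans
    EXPLAIN ANALYZE
    SELECT b.*
    FROM   big_table b
    WHERE  NOT EXISTS (SELECT 1 FROM ref_table r WHERE r.id = b.id);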