Thread: not exits slow compared to not in. (nested loops killing me)

not exits slow compared to not in. (nested loops killing me)

From
mark
Date:
Hi all,

I am trying to speed up a query on a DB I inherited and I am falling
flat on my face .

I changed a query from NOT IN to use NOT EXISTS and my query time went
from 19000ms to several hours (~50000000  ms). this shocked me so much
I pretty much had to post. This seems like a corner case of the
planner not knowing that the nested-loops are going to turn out badly
in this case. The planner choosing a 13hr nested loop here is
basically the reason I am posting.

I have played around with rewriting this query using some CTEs and a
left join but thus far my results are not encouraging.   Given what
little I know , it seems like a LEFT JOIN where right_table.col is
null gets the same performance and estimates as a NOT EXISTS. (and
still picks a nested loop in this case)

I can see where it all goes to hell time wise, turning off nested
loops seems to keep it from running for hours for this query, but not
something I am looking to do globally. The time is not really that
much better than just leaving it alone with a NOT IN.

two queries are at http://pgsql.privatepaste.com/a0b672bab0#

the "pretty" explain versions :

NOT IN (with large work mem - 1GB)
http://explain.depesz.com/s/ukj

NOT IN (with only 64MB for work_mem)
http://explain.depesz.com/s/wT0

NOT EXISTS (with 64MB of work_mem)
http://explain.depesz.com/s/EuX

NOT EXISTS (with nested loop off. and 64MB of work_mem)
http://explain.depesz.com/s/UXG

LEFT JOIN/CTE (with nested loop off and 1GB of work_mem)
http://explain.depesz.com/s/Hwm

table defs, with estimated row counts (which all 100% match exact row count)
http://pgsql.privatepaste.com/c2ff39b653

tried running an analyze across the whole database, no affect.

I haven't gotten creative with explicit join orders yet .

postgresql 9.0.2.

willing to try stuff for people as I can run things on a VM for days
and it is no big deal. I can't do that on production machines.

thoughts ? ideas ?


-Mark

Re: not exits slow compared to not in. (nested loops killing me)

From
Craig Ringer
Date:
On 06/07/2011 04:38 AM, mark wrote:

> NOT EXISTS (with 64MB of work_mem)
> http://explain.depesz.com/s/EuX

Hash Anti Join (cost=443572.19..790776.84 rows=1 width=1560)
(actual time=16337.711..50358.487 rows=2196299 loops=1)

Note the estimated vs actual rows. Either your stats are completely
ridiculous, or the planner is confused.

What are your stats target levels? Have you tried increasing the stats
levels on the table(s) or at least column(s) affected? Or tweaking
default_statistics_target if you want to use a bigger hammer?

Is autovacuum being allowed to do its work and regularly ANALYZE the
database? Does an explicit 'ANALYZE' help?

--
Craig Ringer

Re: not exits slow compared to not in. (nested loops killing me)

From
Tom Lane
Date:
Craig Ringer <craig@postnewspapers.com.au> writes:
> On 06/07/2011 04:38 AM, mark wrote:
> Hash Anti Join (cost=443572.19..790776.84 rows=1 width=1560)
> (actual time=16337.711..50358.487 rows=2196299 loops=1)

> Note the estimated vs actual rows. Either your stats are completely
> ridiculous, or the planner is confused.

The latter ... I think the OP is hurting for lack of this 9.0.4 fix:
http://git.postgresql.org/gitweb?p=postgresql.git&a=commitdiff&h=159c47dc7170110a39f8a16b1d0b7811f5556f87

            regards, tom lane

Re: not exits slow compared to not in. (nested loops killing me)

From
"mark"
Date:
> -----Original Message-----
> From: Craig Ringer [mailto:craig@postnewspapers.com.au]
> Sent: Monday, June 06, 2011 5:08 PM
> To: mark
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] not exits slow compared to not in. (nested loops
> killing me)
>
> On 06/07/2011 04:38 AM, mark wrote:
>
> > NOT EXISTS (with 64MB of work_mem)
> > http://explain.depesz.com/s/EuX
>
> Hash Anti Join (cost=443572.19..790776.84 rows=1 width=1560)
> (actual time=16337.711..50358.487 rows=2196299 loops=1)
>
> Note the estimated vs actual rows. Either your stats are completely
> ridiculous, or the planner is confused.


I am starting to think the planner might be confused in 9.0.2. I got a
reasonable query time, given resource constraints, on a very small VM on my
laptop running 9.0.4.

I am going to work on getting the vm I was using to test this with up to
9.0.4 and test again.

There is a note in the 9.0.4 release notes
" Improve planner's handling of semi-join and anti-join cases (Tom Lane)"

Not sure that is the reason I got a much better outcome with a much smaller
vm. But once I do some more testing I will report back.


>
> What are your stats target levels? Have you tried increasing the stats
> levels on the table(s) or at least column(s) affected? Or tweaking
> default_statistics_target if you want to use a bigger hammer?

Will try that as well. Currently the default stat target is 100. Will try at
250, and 500 and report back.

>
> Is autovacuum being allowed to do its work and regularly ANALYZE the
> database? Does an explicit 'ANALYZE' help?

Auto vac is running, I have explicitly vacuum & analyzed the whole db. That
didn't change anything.





>
> --
> Craig Ringer