Re: order of nested loop - Mailing list pgsql-general

From Tom Lane
Subject Re: order of nested loop
Date
Msg-id 14184.1055873509@sss.pgh.pa.us
Whole thread Raw
In response to Re: order of nested loop  (Joseph Shraibman <jks@selectacast.net>)
Responses Re: order of nested loop  (Joseph Shraibman <jks@selectacast.net>)
List pgsql-general
Joseph Shraibman <jks@selectacast.net> writes:
> The planner shows this for the scan on d:
> (cost=0.00..2380577.42 rows=525568 width=49)
> Maybe it thinks it will reach the limit of 25 before it actually does,
> which is why it is willing to try something so expensive?

Yeah, exactly, it's extrapolating that it will only actually have to
process a relatively small part of that scan.  Which would be true if
it were getting 4492 rows out of the join per estimate, and not just 1
per reality.  This is the same estimation failure as in the other plan,
I think, but it's a lot simpler to see in the other plan.

> ... Thus it would make sense to first get the entries in u,
> filter them, then filter by their status in d.

Right, but the problem is to know how many u entries will get through
the filter.  When that estimate is off by a factor of ~5000, it's no
surprise that the planner is likely to choose the wrong plan.  If you
could cut that estimation error by even a factor of 2, it would have
made the right choices here.

So we're back to my previous question: why is that estimate so far off?
You might try comparing
    explain select * from usertable where podkey = 20;
    select count(*) from usertable where podkey = 20;
to see whether the estimate is failing on the basic podkey=20 part.
If that doesn't seem too far off, add in the status = 2 and/or
(NOT banned) parts to see what confuses it.  I'd like to see the
pg_stats rows for these three columns, too.

BTW, you have done an ANALYZE recently on usertable, I hope.

            regards, tom lane

pgsql-general by date:

Previous
From: Guillaume LELARGE
Date:
Subject: Re: Bad link on techdocs
Next
From: Ron Johnson
Date:
Subject: Re: postgreSQL on NAS/SAN?