Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK? - Mailing list pgsql-performance

From Miernik
Subject Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
Date
Msg-id 20080809215643.GA7153@tarnica
Whole thread Raw
In response to Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Sat, Aug 09, 2008 at 05:37:29PM -0400, Tom Lane wrote:
> > miernik=> explain select * from cnts, alog where alog.uid = cnts.uid;
> >                                        QUERY PLAN
> > -----------------------------------------------------------------------------------------
> >  Nested Loop  (cost=4.95..573640.43 rows=159220 width=76)
> >    ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=39)
> >    ->  Bitmap Heap Scan on alog  (cost=4.95..285.80 rows=80 width=37)
> >          Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer)
> >          ->  Bitmap Index Scan on alog_uid_idx  (cost=0.00..4.93 rows=80 width=0)
> >                Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
> > (6 rows)
>
> > Trying EXPLAIN ANALZYE now on this makes it run forever...
>
> It couldn't run very long if those rowcounts were accurate.

Well, count "over 5 minutes" as "forever".

> How many
> rows in "cnts" really?  How big is "alog", and how many of its rows join
> to "cnts"?

cnts is exactly 1000 rows.
alog as a whole is now 3041833 rows

"SELECT uid FROM alog WHERE pid = 3452654 AND o = 1" gives 870 rows (202
of them NULL), but if I would first try to JOIN alog to cnts, that would
be really huge, like roughly 100000 rows, so to have this work
reasonably well, it MUST first filter alog on pid and o, and then JOIN
the result to cnts, not the other way around. Trying to fist JOIN alog to
cnts, and then filter the whole thing on pid and o is excessively stupid
in this situation, isn't it?

>
> While I'm looking at this, what's the real datatypes of the uid columns?
> Those explicit coercions seem a bit fishy.

uid is of DOMAIN uid which is defined as:

CREATE DOMAIN uid AS integer CHECK (VALUE > 0);

But I don't think its a problem. It was working perfectly for serveral
months until yesterday I decided to mess with autovacuum and manual
ANALYZE.

--
Miernik
http://miernik.name/

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
Next
From: Miernik
Date:
Subject: Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?