Thread: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Miernik
Date:
Something goes wrong that this query plan thinks there is only gonna be 1 row from (SELECT uid FROM alog ... ) so chooses such query plan, and thus it runs forever (at least so long that I didn't bother to wait, like 10 minutes): miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); QUERY PLAN ----------------------------------------------------------------------------------------------- Nested Loop IN Join (cost=0.00..3317.34 rows=1 width=44) -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=44) -> Index Scan using alog_uid_idx on alog (cost=0.00..296.95 rows=1 width=4) Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) (5 rows) But if I give him only the inner part, it makes reasonable assumptions and runs OK: miernik=> EXPLAIN SELECT uid FROM alog WHERE pid = 3452654 AND o = 1; QUERY PLAN ----------------------------------------------------------------------------------------- Bitmap Heap Scan on alog (cost=100.21..9559.64 rows=3457 width=4) Recheck Cond: ((pid = 3452654::numeric) AND (o = 1::numeric)) -> Bitmap Index Scan on alog_pid_o_idx (cost=0.00..99.35 rows=3457 width=0) Index Cond: ((pid = 3452654::numeric) AND (o = 1::numeric)) (4 rows) Can't show you EXPLAIN ANALYZE for the first one, as it also runds forver. For the second one, its consistent with the EXPLAIN. Before it was running OK, but I recently disabled autovacuum and now run VACUUM manually serveal times a day, and run ANALYZE manually on alog and cnts tables before runnign the above. How may I fix this to work? shared_buffers = 5MB work_mem = 1MB Machine is a 48 MB RAM Xen. But not the disabling autovacuum broke it, but running ANALYZE manually on the tables broke it, and I don't know why, I thougt ANALYZE would improve the guesses? -- Miernik http://miernik.name/
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Tom Lane
Date:
Miernik <public@public.miernik.name> writes: > miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); > QUERY PLAN > ----------------------------------------------------------------------------------------------- > Nested Loop IN Join (cost=0.00..3317.34 rows=1 width=44) > -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=44) > -> Index Scan using alog_uid_idx on alog (cost=0.00..296.95 rows=1 width=4) > Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) > Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) > (5 rows) > But if I give him only the inner part, it makes reasonable assumptions > and runs OK: What's the results for explain select * from cnts, alog where alog.uid = cnts.uid ? If necessary, turn off enable_hashjoin and enable_mergejoin so we can see a comparable plan. I'm suspecting it thinks the condition on uid is more selective than the one on the other index. regards, tom lane
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Miernik
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Miernik <public@public.miernik.name> writes: >> miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); >> QUERY PLAN >> ----------------------------------------------------------------------------------------------- >> Nested Loop IN Join (cost=0.00..3317.34 rows=1 width=44) >> -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=44) >> -> Index Scan using alog_uid_idx on alog (cost=0.00..296.95 rows=1 width=4) >> Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) >> Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) >> (5 rows) > >> But if I give him only the inner part, it makes reasonable assumptions >> and runs OK: > > What's the results for > > explain select * from cnts, alog where alog.uid = cnts.uid miernik=> explain select * from cnts, alog where alog.uid = cnts.uid; QUERY PLAN ---------------------------------------------------------------------------- Hash Join (cost=61.00..71810.41 rows=159220 width=76) Hash Cond: ((alog.uid)::integer = (cnts.uid)::integer) -> Seq Scan on alog (cost=0.00..54951.81 rows=3041081 width=37) -> Hash (cost=36.00..36.00 rows=2000 width=39) -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=39) (5 rows) > If necessary, turn off enable_hashjoin and enable_mergejoin so we can > see a comparable plan. After doing that it thinks like this: miernik=> explain select * from cnts, alog where alog.uid = cnts.uid; QUERY PLAN ----------------------------------------------------------------------------------------- Nested Loop (cost=4.95..573640.43 rows=159220 width=76) -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=39) -> Bitmap Heap Scan on alog (cost=4.95..285.80 rows=80 width=37) Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer) -> Bitmap Index Scan on alog_uid_idx (cost=0.00..4.93 rows=80 width=0) Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) (6 rows) Trying EXPLAIN ANALZYE now on this makes it run forever... How can I bring it back to working? Like un-run ANALYZE on that table or something? All was running reasonably well before I changed from autovacuum to running ANALYZE manually, and I thought I would improve performance... ;( -- Miernik http://miernik.name/
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Miernik
Date:
Miernik <public@public.miernik.name> wrote: > How can I bring it back to working? Like un-run ANALYZE on that table or > something? All was running reasonably well before I changed from > autovacuum to running ANALYZE manually, and I thought I would improve > performance... ;( I now removed all manual ANALYZE commands from the scripts, set enable_hashjoin = on enable_mergejoin = on and set on autovacuum, but it didn't bring back the performance of that query :( -- Miernik http://miernik.name/
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Tom Lane
Date:
Miernik <public@public.miernik.name> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If necessary, turn off enable_hashjoin and enable_mergejoin so we can >> see a comparable plan. > After doing that it thinks like this: > miernik=> explain select * from cnts, alog where alog.uid = cnts.uid; > QUERY PLAN > ----------------------------------------------------------------------------------------- > Nested Loop (cost=4.95..573640.43 rows=159220 width=76) > -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=39) > -> Bitmap Heap Scan on alog (cost=4.95..285.80 rows=80 width=37) > Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer) > -> Bitmap Index Scan on alog_uid_idx (cost=0.00..4.93 rows=80 width=0) > Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) > (6 rows) > Trying EXPLAIN ANALZYE now on this makes it run forever... It couldn't run very long if those rowcounts were accurate. How many rows in "cnts" really? How big is "alog", and how many of its rows join to "cnts"? While I'm looking at this, what's the real datatypes of the uid columns? Those explicit coercions seem a bit fishy. > How can I bring it back to working? It's premature to ask for a solution when we don't understand the problem. regards, tom lane
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Miernik
Date:
On Sat, Aug 09, 2008 at 05:37:29PM -0400, Tom Lane wrote: > > miernik=> explain select * from cnts, alog where alog.uid = cnts.uid; > > QUERY PLAN > > ----------------------------------------------------------------------------------------- > > Nested Loop (cost=4.95..573640.43 rows=159220 width=76) > > -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=39) > > -> Bitmap Heap Scan on alog (cost=4.95..285.80 rows=80 width=37) > > Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer) > > -> Bitmap Index Scan on alog_uid_idx (cost=0.00..4.93 rows=80 width=0) > > Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) > > (6 rows) > > > Trying EXPLAIN ANALZYE now on this makes it run forever... > > It couldn't run very long if those rowcounts were accurate. Well, count "over 5 minutes" as "forever". > How many > rows in "cnts" really? How big is "alog", and how many of its rows join > to "cnts"? cnts is exactly 1000 rows. alog as a whole is now 3041833 rows "SELECT uid FROM alog WHERE pid = 3452654 AND o = 1" gives 870 rows (202 of them NULL), but if I would first try to JOIN alog to cnts, that would be really huge, like roughly 100000 rows, so to have this work reasonably well, it MUST first filter alog on pid and o, and then JOIN the result to cnts, not the other way around. Trying to fist JOIN alog to cnts, and then filter the whole thing on pid and o is excessively stupid in this situation, isn't it? > > While I'm looking at this, what's the real datatypes of the uid columns? > Those explicit coercions seem a bit fishy. uid is of DOMAIN uid which is defined as: CREATE DOMAIN uid AS integer CHECK (VALUE > 0); But I don't think its a problem. It was working perfectly for serveral months until yesterday I decided to mess with autovacuum and manual ANALYZE. -- Miernik http://miernik.name/
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Miernik
Date:
Miernik <public@public.miernik.name> wrote: > Something goes wrong that this query plan thinks there is only gonna be > 1 row from (SELECT uid FROM alog ... ) so chooses such query plan, and > thus it runs forever (at least so long that I didn't bother to wait, > like 10 minutes): > > > miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); > QUERY PLAN > ----------------------------------------------------------------------------------------------- > Nested Loop IN Join (cost=0.00..3317.34 rows=1 width=44) > -> Seq Scan on cnts (cost=0.00..36.00 rows=2000 width=44) > -> Index Scan using alog_uid_idx on alog (cost=0.00..296.95 rows=1 width=4) > Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) > Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) > (5 rows) Well, in fact its not only the autovacuum/manual VACUUM ANALYZE that changed, its a new copy of the cnts table with only 1000 rows, and before it was a 61729 row table. The new, smaller, 1000 row table is recreated, but I have a copy of the old 61729 row table, and guess what? It runs correctly! And the query plan of the exactly the same query, on a table of the exactly same structure and indexes, differing only by having 61729 rows instead of 1000 rows, is like this: I've done a SELECT uid plan, instead of an UPDATE plan, but it should be no difference. This is a plan that is quick: miernik=> EXPLAIN SELECT uid FROM cnts_old WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); QUERY PLAN ----------------------------------------------------------------------------------------------------- Nested Loop (cost=9077.07..9238.61 rows=12 width=4) -> HashAggregate (cost=9077.07..9077.29 rows=22 width=4) -> Bitmap Heap Scan on alog (cost=93.88..9069.00 rows=3229 width=4) Recheck Cond: ((pid = 3452654::numeric) AND (o = 1::numeric)) -> Bitmap Index Scan on alog_pid_o (cost=0.00..93.07 rows=3229 width=0) Index Cond: ((pid = 3452654::numeric) AND (o = 1::numeric)) -> Index Scan using cnts_old_pkey on cnts_old (cost=0.00..7.32 rows=1 width=4) Index Cond: ((cnts_old.uid)::integer = (alog.uid)::integer) (8 rows) I present a SELECT uid plan with the 1000 table also below, just to be sure, this is the "bad" plan, that takes forever: miernik=> EXPLAIN SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); QUERY PLAN ----------------------------------------------------------------------------------------------- Nested Loop IN Join (cost=0.00..3532.70 rows=1 width=4) -> Seq Scan on cnts (cost=0.00..26.26 rows=1026 width=4) -> Index Scan using alog_uid_idx on alog (cost=0.00..297.32 rows=1 width=4) Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) (5 rows) I've also got a version of the cnts table with only 14 rows, called cnts_small, and the query plan on that one is below: miernik=> EXPLAIN SELECT uid FROM cnts_small WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop IN Join (cost=99.05..1444.29 rows=1 width=4) -> Seq Scan on cnts_small (cost=0.00..1.14 rows=14 width=4) -> Bitmap Heap Scan on alog (cost=99.05..103.07 rows=1 width=4) Recheck Cond: (((alog.uid)::integer = (cnts_small.uid)::integer) AND (alog.pid = 3452654::numeric) AND (alog.o =1::numeric)) -> BitmapAnd (cost=99.05..99.05 rows=1 width=0) -> Bitmap Index Scan on alog_uid_idx (cost=0.00..5.21 rows=80 width=0) Index Cond: ((alog.uid)::integer = (cnts_small.uid)::integer) -> Bitmap Index Scan on alog_pid_o (cost=0.00..92.78 rows=3229 width=0) Index Cond: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) (9 rows) That one is fast too. And the structure and indexes of cnts_small is exactly the same as of cnts and cnts_old. So it works OK if I use a 14 row table and if I use a 61729 row table, but breaks when I use a 1000 row table. Any ideas? -- Miernik http://miernik.name/
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
Miernik
Date:
Miernik <public@public.miernik.name> wrote: > I present a SELECT uid plan with the 1000 table also below, just to be > sure, this is the "bad" plan, that takes forever: > > miernik=> EXPLAIN SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1); > QUERY PLAN > ----------------------------------------------------------------------------------------------- > Nested Loop IN Join (cost=0.00..3532.70 rows=1 width=4) > -> Seq Scan on cnts (cost=0.00..26.26 rows=1026 width=4) > -> Index Scan using alog_uid_idx on alog (cost=0.00..297.32 rows=1 width=4) > Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) > Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric)) > (5 rows) If I reduce the number of rows in cnts to 100, I can actually make an EXPLAIN ANALYZE with this query plan finish in reasonable time: miernik=> EXPLAIN ANALYZE SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 555949 AND odp = 1); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop IN Join (cost=0.00..3585.54 rows=1 width=4) (actual time=51831.430..267844.815 rows=7 loops=1) -> Seq Scan on cnts (cost=0.00..14.00 rows=700 width=4) (actual time=0.005..148.464 rows=100 loops=1) -> Index Scan using alog_uid_idx on alog (cost=0.00..301.02 rows=1 width=4) (actual time=2676.959..2676.959 rows=0 loops=100) Index Cond: ((alog.uid)::integer = (cnts.uid)::integer) Filter: ((alog.pid = 555949::numeric) AND (alog.odp = 1::numeric)) Total runtime: 267844.942 ms (6 rows) The real running times are about 10 times more than the estimates. Is that normal? -- Miernik http://miernik.name/
Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?
From
"Scott Carey"
Date:
It is hardware dependent. The estimates are not time estimates, but on an arbitrary scale.
On the server I work with, the estimates are almost always 10x larger than the run times, and sometimes more than 50x.
(many GBs RAM, 8 CPU cores, more than 10 disks, standard optimizer settings other than statistics sample sizes and increased common values for columns).
-Scott
On the server I work with, the estimates are almost always 10x larger than the run times, and sometimes more than 50x.
(many GBs RAM, 8 CPU cores, more than 10 disks, standard optimizer settings other than statistics sample sizes and increased common values for columns).
-Scott
On Sat, Aug 9, 2008 at 7:20 PM, Miernik <public@public.miernik.name> wrote:
Miernik <public@public.miernik.name> wrote:> I present a SELECT uid plan with the 1000 table also below, just to beIf I reduce the number of rows in cnts to 100, I can actually make an
> sure, this is the "bad" plan, that takes forever:
>
> miernik=> EXPLAIN SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
> QUERY PLAN
> -----------------------------------------------------------------------------------------------
> Nested Loop IN Join (cost=0.00..3532.70 rows=1 width=4)
> -> Seq Scan on cnts (cost=0.00..26.26 rows=1026 width=4)
> -> Index Scan using alog_uid_idx on alog (cost=0.00..297.32 rows=1 width=4)
> Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
> Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
> (5 rows)
EXPLAIN ANALYZE with this query plan finish in reasonable time:
miernik=> EXPLAIN ANALYZE SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 555949 AND odp = 1);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop IN Join (cost=0.00..3585.54 rows=1 width=4) (actual time=51831.430..267844.815 rows=7 loops=1)
-> Seq Scan on cnts (cost=0.00..14.00 rows=700 width=4) (actual time=0.005..148.464 rows=100 loops=1)
-> Index Scan using alog_uid_idx on alog (cost=0.00..301.02 rows=1 width=4) (actual time=2676.959..2676.959 rows=0 loops=100)Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)Filter: ((alog.pid = 555949::numeric) AND (alog.odp = 1::numeric))
Total runtime: 267844.942 ms
(6 rows)
The real running times are about 10 times more than the estimates. Is
that normal?
--
Miernik
http://miernik.name/
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance