Thread: 7.4.7: strange planner decision
Why does the planner want to crawl the table that has 5M rows instead of the one with 176k rows? Both tables are freshly vacuum-full-analyzed. 7.4.7 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.3 20041212 (Red Hat 3.4.3-9.EL4) callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); QUERY PLAN ---------------------------------------------------------------------------- Hash Join (cost=5340.00..292675.06 rows=176161 width=44) Hash Cond: (("outer".base)::text = ("inner".base)::text) -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 width=41) -> Hash (cost=3436.60..3436.60 rows=176160 width=44) -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 width=44) (5 rows) callrec32=# \d fix.files Table "fix.files" Column | Type | Modifiers --------+------------------------+----------- dir | character varying(255) | base | character varying(255) | Indexes: "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text))) "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text))) callrec32=# \d fix.dups Table "fix.dups" Column | Type | Modifiers --------+------------------------+----------- base | character varying(255) | Indexes: "dups_base_key" unique, btree (base) callrec32=# select count(*) from fix.files; select count(*) from fix.dups; count --------- 5278458 (1 row) count -------- 176160 (1 row) -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
Roman Neuhauser wrote: > Why does the planner want to crawl the table that has 5M rows instead of the one > with 176k rows? Both tables are freshly vacuum-full-analyzed. Because you don't have an index on "base" for the files table. > callrec32=# \d fix.files > Table "fix.files" > Column | Type | Modifiers > --------+------------------------+----------- > dir | character varying(255) | > base | character varying(255) | > Indexes: > "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text))) > "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text))) A couple of indexes, but none simple on "base", so it can't be used for the join. -- Richard Huxton Archonet Ltd
How does one pass an array as a parameter to a plpgsql function? I have tried this the following. I can't seem to get a select statement to work without syntax problems and no examples in Postgres book to help with this :( This is just a test so please ignore the fact it is a simple function. CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS ' DECLARE test_array ALIAS FOR $1; -- alias for input array BEGIN return array_upper(test_array,1) END; ' LANGUAGE 'plpgsql'; SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test two']]) AS output;
# dev@archonet.com / 2005-07-13 12:57:31 +0100: > Roman Neuhauser wrote: > >Why does the planner want to crawl the table that has 5M rows instead of > >the one > >with 176k rows? Both tables are freshly vacuum-full-analyzed. > > Because you don't have an index on "base" for the files table. I added one, ran vacuum full analyze fix.files, and: callrec32=# \d fix.files Table "fix.files" Column | Type | Modifiers --------+------------------------+----------- dir | character varying(255) | base | character varying(255) | Indexes: "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text))) "ff_baseonly_idx" btree (base) "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text))) callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); QUERY PLAN ---------------------------------------------------------------------------- Hash Join (cost=5340.00..292675.06 rows=176161 width=44) Hash Cond: (("outer".base)::text = ("inner".base)::text) -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 width=41) -> Hash (cost=3436.60..3436.60 rows=176160 width=44) -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 width=44) (5 rows) Which is exactly what I expected. Using left prefix of a multicolumn index normally works just fine, thank you. http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html: The query planner can use a multicolumn index for queries that involve ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the leftmost column in the index definition plus any number of columns ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ listed to the right of it, without a gap. For example, an index on (a, b, c) can be used in queries involving all of a, b, and c, or in queries involving both a and b, or in queries involving only a -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
Roman Neuhauser wrote: >>Because you don't have an index on "base" for the files table. > > > I added one, ran vacuum full analyze fix.files, and: > > callrec32=# \d fix.files > Table "fix.files" > Column | Type | Modifiers > --------+------------------------+----------- > dir | character varying(255) | > base | character varying(255) | > Indexes: > "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || (base)::text))) > "ff_baseonly_idx" btree (base) > "ff_storename_idx" btree (((((dir)::text || '/'::text) || (base)::text))) > > callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); > QUERY PLAN > ---------------------------------------------------------------------------- > Hash Join (cost=5340.00..292675.06 rows=176161 width=44) > Hash Cond: (("outer".base)::text = ("inner".base)::text) > -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 width=41) > -> Hash (cost=3436.60..3436.60 rows=176160 width=44) > -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 width=44) > (5 rows) > > Which is exactly what I expected. Using left prefix of a multicolumn > index normally works just fine, thank you. Couldn't figure out what you meant here - had to go back and re-read your index definitions. Sorry - missed the (base, ...) on the front of base_storename_idx. What happens to the plan if you SET enable_seqscan=false; first? It's presumably getting the row-estimate right, so unless there's terrible correlation on "base" in the files table I can only assume it's getting the cost estimates horribly wrong. -- Richard Huxton Archonet Ltd
# dev@archonet.com / 2005-07-13 14:09:34 +0100: > Roman Neuhauser wrote: > > callrec32=# \d fix.files > > Table "fix.files" > > Column | Type | Modifiers > > --------+------------------------+----------- > > dir | character varying(255) | > > base | character varying(255) | > > Indexes: > > "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || > > (base)::text))) > > "ff_baseonly_idx" btree (base) > > "ff_storename_idx" btree (((((dir)::text || '/'::text) || > > (base)::text))) > > > > callrec32=# explain select fd.base from fix.dups fd join fix.files ff > > using (base); > > QUERY PLAN > > ---------------------------------------------------------------------------- > > Hash Join (cost=5340.00..292675.06 rows=176161 width=44) > > Hash Cond: (("outer".base)::text = ("inner".base)::text) > > -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 > > width=41) > > -> Hash (cost=3436.60..3436.60 rows=176160 width=44) > > -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 > > width=44) > > (5 rows) > What happens to the plan if you SET enable_seqscan=false; first? It's > presumably getting the row-estimate right, so unless there's terrible > correlation on "base" in the files table I can only assume it's getting > the cost estimates horribly wrong. callrec32=# SET enable_seqscan=false; SET callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); QUERY PLAN -------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..1066990.93 rows=176161 width=44) -> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44) -> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41) Index Cond: (("outer".base)::text = (ff.base)::text) (4 rows) -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
David Pratt <fairwinds@eastlink.ca> writes: > CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS ' > DECLARE > test_array ALIAS FOR $1; -- alias for input array > BEGIN > return array_upper(test_array,1) > END; > ' LANGUAGE 'plpgsql'; > SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test > two']]) AS output; Works fine for me, once I add the semicolon you forgot: return array_upper(test_array,1); regards, tom lane
# neuhauser@sigpipe.cz / 2005-07-13 15:58:09 +0200: > # dev@archonet.com / 2005-07-13 14:09:34 +0100: > > Roman Neuhauser wrote: > > > callrec32=# \d fix.files > > > Table "fix.files" > > > Column | Type | Modifiers > > > --------+------------------------+----------- > > > dir | character varying(255) | > > > base | character varying(255) | > > > Indexes: > > > "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || > > > (base)::text))) > > > "ff_baseonly_idx" btree (base) > > > "ff_storename_idx" btree (((((dir)::text || '/'::text) || > > > (base)::text))) > > > > > > callrec32=# explain select fd.base from fix.dups fd join fix.files ff > > > using (base); > > > QUERY PLAN > > > ---------------------------------------------------------------------------- > > > Hash Join (cost=5340.00..292675.06 rows=176161 width=44) > > > Hash Cond: (("outer".base)::text = ("inner".base)::text) > > > -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 > > > width=41) > > > -> Hash (cost=3436.60..3436.60 rows=176160 width=44) > > > -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 > > > width=44) > > > (5 rows) > > > What happens to the plan if you SET enable_seqscan=false; first? It's > > presumably getting the row-estimate right, so unless there's terrible > > correlation on "base" in the files table I can only assume it's getting > > the cost estimates horribly wrong. > > callrec32=# SET enable_seqscan=false; > SET > callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); > QUERY PLAN > -------------------------------------------------------------------------------------------- > Nested Loop (cost=0.00..1066990.93 rows=176161 width=44) > -> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44) > -> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41) > Index Cond: (("outer".base)::text = (ff.base)::text) > (4 rows) BTW, this query or its equivalent is what I'm really after: callrec32=# explain select c1.storename from fix.dups fd join calls c1 on (fd.base = basename(c1.storename) and c1.iscouple = '1') where not exists (select 1 from fix.files ff where c1.storename = ff.dir || '/' || ff.base); QUERY PLAN -------------------------------------------------------------------------------------------------------- Hash Join (cost=7474.26..23127970.91 rows=2354719 width=60) Hash Cond: ((basename("outer".storename))::text = ("inner".base)::text) -> Index Scan using calls2_iscouple_idx on calls c1 (cost=0.00..22982439.69 rows=2354719 width=60) Filter: ((iscouple = 1::smallint) AND (NOT (subplan))) SubPlan -> Index Scan using ff_storename_idx on files ff (cost=0.00..88570.16 rows=26393 width=0) Index Cond: (($0)::text = (((dir)::text || '/'::text) || (base)::text)) -> Hash (cost=5570.86..5570.86 rows=176160 width=44) -> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44) (9 rows) calls has (among others): "calls2_basename_storename_idx" btree (basename(storename), storename) WHERE (iscouple = (1)::smallint) "calls2_storename_idx" btree (storename) WHERE (iscouple = (1)::smallint) "calls2_iscouple_idx" btree (id) WHERE (iscouple = (1)::smallint) WHy does it use the calls2_iscouple_idx index when calls.id isn't used anywhere in the query? I would guess that calls2_storename_idx would actually be more useful. -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
Richard Huxton <dev@archonet.com> writes: > What happens to the plan if you SET enable_seqscan=false; first? It's > presumably getting the row-estimate right, so unless there's terrible > correlation on "base" in the files table I can only assume it's getting > the cost estimates horribly wrong. I think you'll find that the results suck ;-). It looks to me that the planner is making exactly the right choice here. The only plausible alternative is a nestloop with inner indexscan on "files", which would imply 176160 separate index probes into "files", which is unlikely to win compared to one seqscan. (In the aggregate the index probes are likely to end up touching every page of "files" anyway --- you would need a much larger files table before this stopped being true.) If you want to compare the nestloop plan to test this theory, turning off enable_hashjoin and (if necessary) enable_mergejoin would be the better way to get it. But let's see EXPLAIN ANALYZE results for both cases, not just EXPLAIN. regards, tom lane
Roman Neuhauser wrote: > # dev@archonet.com / 2005-07-13 14:09:34 +0100: > >>Roman Neuhauser wrote: >> >>> callrec32=# \d fix.files >>> Table "fix.files" >>> Column | Type | Modifiers >>> --------+------------------------+----------- >>> dir | character varying(255) | >>> base | character varying(255) | >>> Indexes: >>> "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || >>> (base)::text))) >>> "ff_baseonly_idx" btree (base) >>> "ff_storename_idx" btree (((((dir)::text || '/'::text) || >>> (base)::text))) >>> >>> callrec32=# explain select fd.base from fix.dups fd join fix.files ff >>> using (base); >>> QUERY PLAN >>> ---------------------------------------------------------------------------- >>> Hash Join (cost=5340.00..292675.06 rows=176161 width=44) >>> Hash Cond: (("outer".base)::text = ("inner".base)::text) >>> -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 >>> width=41) >>> -> Hash (cost=3436.60..3436.60 rows=176160 width=44) >>> -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 >>> width=44) >>> (5 rows) > > >>What happens to the plan if you SET enable_seqscan=false; first? It's >>presumably getting the row-estimate right, so unless there's terrible >>correlation on "base" in the files table I can only assume it's getting >>the cost estimates horribly wrong. > > > callrec32=# SET enable_seqscan=false; > SET > callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); > QUERY PLAN > -------------------------------------------------------------------------------------------- > Nested Loop (cost=0.00..1066990.93 rows=176161 width=44) > -> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44) > -> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41) > Index Cond: (("outer".base)::text = (ff.base)::text) OK - so it thinks the cost of this plan will be about 1 million, whereas the old plan was 290 thousand. The question is - why? What are your planner settings? Ch 16.4.4.2 here http://www.postgresql.org/docs/7.4/static/runtime-config.html#RUNTIME-CONFIG-QUERY I'm guessing something to do with cpu_index_tuple_cost or random_page_cost. -- Richard Huxton Archonet Ltd
argh!!! It was telling me I had an error in select statement. Thanks Tom! Regards David On Wednesday, July 13, 2005, at 11:08 AM, Tom Lane wrote: > David Pratt <fairwinds@eastlink.ca> writes: >> CREATE FUNCTION create_record_test(text[][]) RETURNS int4 AS ' >> DECLARE >> test_array ALIAS FOR $1; -- alias for input array >> BEGIN >> return array_upper(test_array,1) >> END; >> ' LANGUAGE 'plpgsql'; > >> SELECT create_record_test(ARRAY[ARRAY['A','test one'],ARRAY['B','test >> two']]) AS output; > > Works fine for me, once I add the semicolon you forgot: > > return array_upper(test_array,1); > > regards, tom lane > > ---------------------------(end of > broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that > your > message can get through to the mailing list cleanly >