Thread: seq scan over 3.3 million rows instead of single key index access
There are indexes on rid(dokumnr) and dok(dokumnr) and dokumnr is int. Instead of using single key index, 8.1.4 scans over whole rid table. Sometimes idtelluued can contain more than single row so replacing join with equality is not possible. How to fix ? Andrus. CREATE TEMP TABLE idtellUued(dokumnr INT) ON COMMIT DROP; INSERT INTO idtellUued VALUES(1249228); explain analyze select 1 from dok JOIN rid USING(dokumnr) JOIN idtellUued USING(dokumnr) "Hash Join (cost=7483.22..222259.77 rows=5706 width=0) (actual time=14905.981..27065.903 rows=8 loops=1)" " Hash Cond: ("outer".dokumnr = "inner".dokumnr)" " -> Seq Scan on rid (cost=0.00..198240.33 rows=3295833 width=4) (actual time=0.036..15021.641 rows=3280576 loops=1)" " -> Hash (cost=7477.87..7477.87 rows=2140 width=8) (actual time=0.114..0.114 rows=1 loops=1)" " -> Nested Loop (cost=0.00..7477.87 rows=2140 width=8) (actual time=0.076..0.099 rows=1 loops=1)" " -> Seq Scan on idtelluued (cost=0.00..31.40 rows=2140 width=4) (actual time=0.006..0.011 rows=1 loops=1)" " -> Index Scan using dok_dokumnr_idx on dok (cost=0.00..3.47 rows=1 width=4) (actual time=0.051..0.058 rows=1 loops=1)" " Index Cond: (dok.dokumnr = "outer".dokumnr)" "Total runtime: 27066.080 ms"
"Andrus" <kobruleht2@hot.ee> writes: > There are indexes on rid(dokumnr) and dok(dokumnr) and dokumnr is int. > Instead of using single key index, 8.1.4 scans over whole rid table. > Sometimes idtelluued can contain more than single row so replacing join with > equality is not possible. > > How to fix ? Firstly the current 8.1 release is 8.1.15. Any of the bugs fixed in those 11 releases might be related to this. Secondly: > CREATE TEMP TABLE idtellUued(dokumnr INT) ON COMMIT DROP; > INSERT INTO idtellUued VALUES(1249228); > explain analyze select 1 > from dok JOIN rid USING(dokumnr) > JOIN idtellUued USING(dokumnr) > > " -> Seq Scan on idtelluued (cost=0.00..31.40 rows=2140 width=4) > (actual time=0.006..0.011 rows=1 loops=1)" The planner thinks there are 2,140 rows in that temporary table so I don't believe this is from the example posted. I would suggest running ANALYZE idtellUued at some point before the problematic query. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Gregory Stark <stark@enterprisedb.com> writes: > "Andrus" <kobruleht2@hot.ee> writes: >> There are indexes on rid(dokumnr) and dok(dokumnr) and dokumnr is int. >> Instead of using single key index, 8.1.4 scans over whole rid table. >> Sometimes idtelluued can contain more than single row so replacing join with >> equality is not possible. >> >> How to fix ? > Firstly the current 8.1 release is 8.1.15. Any of the bugs fixed in those 11 > releases might be related to this. If this can still be reproduced in 8.1.15 it would be worth looking into. My first guess is that there are multiple relevant indexes on the big table and the old bugs in choose_bitmap_and() are making it mess up. > The planner thinks there are 2,140 rows in that temporary table so I don't > believe this is from the example posted. I would suggest running ANALYZE > idtellUued at some point before the problematic query. No, that's a pretty likely default assumption for a never-vacuumed, never-analyzed table. Your advice is correct though. regards, tom lane
Gregory, > I would suggest running ANALYZE > idtellUued at some point before the problematic query. Thank you. After adding analyze all is OK. Is analyze command required in 8.3 also ? Or is it better better to specify some hint at create temp table time since I know the number of rows before running query ? Andrus. set search_path to firma2,public; CREATE TEMP TABLE idtellUued(dokumnr INT) ON COMMIT DROP; INSERT INTO idtellUued VALUES(1249228); analyze idtelluued; explain analyze select 1 from dok JOIN rid USING(dokumnr) JOIN idtellUued USING(dokumnr) "Nested Loop (cost=0.00..275.18 rows=3 width=0) (actual time=87.266..87.388 rows=8 loops=1)" " -> Nested Loop (cost=0.00..6.95 rows=1 width=8) (actual time=36.613..36.636 rows=1 loops=1)" " -> Seq Scan on idtelluued (cost=0.00..1.01 rows=1 width=4) (actual time=0.009..0.015 rows=1 loops=1)" " -> Index Scan using dok_dokumnr_idx on dok (cost=0.00..5.93 rows=1 width=4) (actual time=36.585..36.590 rows=1 loops=1)" " Index Cond: (dok.dokumnr = "outer".dokumnr)" " -> Index Scan using rid_dokumnr_idx on rid (cost=0.00..267.23 rows=80 width=4) (actual time=50.635..50.672 rows=8 loops=1)" " Index Cond: ("outer".dokumnr = rid.dokumnr)" "Total runtime: 87.586 ms"
Andrus <kobruleht2@hot.ee> schrieb: > There are indexes on rid(dokumnr) and dok(dokumnr) and dokumnr is int. > Instead of using single key index, 8.1.4 scans over whole rid table. > Sometimes idtelluued can contain more than single row so replacing join > with equality is not possible. > > How to fix ? > > Andrus. > > CREATE TEMP TABLE idtellUued(dokumnr INT) ON COMMIT DROP; > INSERT INTO idtellUued VALUES(1249228); > explain analyze select 1 > from dok JOIN rid USING(dokumnr) > JOIN idtellUued USING(dokumnr) > > "Hash Join (cost=7483.22..222259.77 rows=5706 width=0) (actual > time=14905.981..27065.903 rows=8 loops=1)" > " Hash Cond: ("outer".dokumnr = "inner".dokumnr)" > " -> Seq Scan on rid (cost=0.00..198240.33 rows=3295833 width=4) > (actual time=0.036..15021.641 rows=3280576 loops=1)" How many rows contains rid? The estimation are okay, rows=3295833 and actual rows=3280576 are nearly identical. An index-scan makes only sense if rid contains considerable more than 3000000 rows. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
Andrus <kobruleht2@hot.ee> schrieb: > There are indexes on rid(dokumnr) and dok(dokumnr) and dokumnr is int. > Instead of using single key index, 8.1.4 scans over whole rid table. > Sometimes idtelluued can contain more than single row so replacing join > with equality is not possible. > > How to fix ? > > Andrus. > > CREATE TEMP TABLE idtellUued(dokumnr INT) ON COMMIT DROP; > INSERT INTO idtellUued VALUES(1249228); > explain analyze select 1 > from dok JOIN rid USING(dokumnr) > JOIN idtellUued USING(dokumnr) Try to analyse the idtellUued-table after the insert. The planner has no knowledge that this table contains only one or e few rows, the planner assume 1000 (iirc) in this table. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
am Sun, dem 23.11.2008, um 6:20:08 +0200 mailte Andrus folgendes: > Gregory, > > > I would suggest running ANALYZE > >idtellUued at some point before the problematic query. > > Thank you. > After adding analyze all is OK. > Is analyze command required in 8.3 also ? Yes. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
> An index-scan makes only sense if rid contains considerable more than > 3000000 rows. I'm sorry, I meant using index to get the row. Andrus.