Thread: seq-scan or index-scan
Dear list, i have a table and i'm selecting all records without a where-condition, and i don't need a ORDER BY: production=*# explain analyse select * from boxes; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Seq Scan on boxes (cost=0.00..990783.99 rows=6499 width=581) (actual time=6.514..4588.136 rows=3060 loops=1) Total runtime: 4588.729 ms (2 rows) It's slow, so i tried with an ORDER BY: production=# explain analyse select * from boxes order by id; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Index Scan using boxes_pkey on boxes (cost=0.00..162437.00 rows=6499 width=581) (actual time=0.065..55.730 rows=3060 loops=1) Total runtime: 56.169 ms (2 rows) Why not using the index (it's a primary key) for the first query? PG 9.1.3 Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On 3 Červenec 2012, 17:58, Andreas Kretschmer wrote: > Dear list, > > i have a table and i'm selecting all records without a where-condition, > and i don't need a ORDER BY: > > > > production=*# explain analyse select * from boxes; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------- > Seq Scan on boxes (cost=0.00..990783.99 rows=6499 width=581) (actual > time=6.514..4588.136 rows=3060 loops=1) > Total runtime: 4588.729 ms > (2 rows) > > > It's slow, so i tried with an ORDER BY: > > > > production=# explain analyse select * from boxes order by id; > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------- > Index Scan using boxes_pkey on boxes (cost=0.00..162437.00 rows=6499 > width=581) (actual time=0.065..55.730 rows=3060 loops=1) > Total runtime: 56.169 ms > (2 rows) > > > Why not using the index (it's a primary key) for the first query? I suppose the second run was cached, i.e. most of the data was in the page cache. And it's not that PostgreSQL can read "just" the index - it needs to check tuple visibility which is stored in the table. Try to do "sync && echo 3 > /proc/sys/vm/drop_caches" and rerun the second query. How did that work? How much data are we talking about anyway? How much RAM is in the server? Tomas
Andreas Kretschmer <akretschmer@spamfence.net> writes: > production=*# explain analyse select * from boxes; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------- > Seq Scan on boxes (cost=0.00..990783.99 rows=6499 width=581) (actual time=6.514..4588.136 rows=3060 loops=1) > Total runtime: 4588.729 ms > (2 rows) That cost estimate seems pretty dang large for a table with only 6500 rows. I suspect this table is horribly bloated, and the indexscan manages to win because it's not visiting pages that contain only dead rows. Try VACUUM FULL, and if that makes things saner, re-examine your autovacuum settings. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andreas Kretschmer <akretschmer@spamfence.net> writes: > > production=*# explain analyse select * from boxes; > > QUERY PLAN > > --------------------------------------------------------------------------------------------------------------- > > Seq Scan on boxes (cost=0.00..990783.99 rows=6499 width=581) (actual time=6.514..4588.136 rows=3060 loops=1) > > Total runtime: 4588.729 ms > > (2 rows) > > That cost estimate seems pretty dang large for a table with only 6500 > rows. I suspect this table is horribly bloated, and the indexscan > manages to win because it's not visiting pages that contain only dead You're right as always ;-) A LOT of updates and dead rows and the table is bloated. (some rows contains more than 1MByte of TEXT and some rows updated once per second or so) > rows. Try VACUUM FULL, and if that makes things saner, re-examine > your autovacuum settings. I can't run a VACUUM FULL because of the workload, but i have decrease the fillfactor. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°