Thread: Completely wrong row estimates
Subject: Completely wrong row estimates

Hello everybody,

Here is the EXPLAIN ANALYZE output for a simple query in my database
running on postgres 8.3.9:

EXPLAIN ANALYZE
SELECT * FROM word w JOIN video_words vw ON w.id = vw.word_id
WHERE w.word = 'tagtext';

                                   QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=18.89..2711.16 rows=95 width=24) (actual time=19.266..131.255 rows=43374 loops=1)
   ->  Index Scan using word_word_key on word w  (cost=0.00..8.28 rows=1 width=12) (actual time=0.029..0.034 rows=1 loops=1)
         Index Cond: ((word)::text = 'tagtext'::text)
   ->  Bitmap Heap Scan on video_words vw  (cost=18.89..2693.31 rows=766 width=12) (actual time=19.227..77.809 rows=43374 loops=1)
         Recheck Cond: (vw.word_id = w.id)
         ->  Bitmap Index Scan on video_words_word_id_key  (cost=0.00..18.70 rows=766 width=0) (actual time=12.662..12.662 rows=43374 loops=1)
               Index Cond: (vw.word_id = w.id)
 Total runtime: 154.215 ms

Note how the planner estimates that there are 766 rows in the table
that match the word 'tagtext'. In reality there are 43374. I've tried to
get postgres to refresh the statistics by running with
enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
ANALYZE etc, but nothing works. Postgres seems stuck with its bad
statistics and unwilling to change them. There are many other strings
that also match tens of thousands of rows in the table but which
postgres thinks match only 766.

Is this a bug in postgres?

--
mvh Björn
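For reference, the per-column statistics the planner bases the 766-row figure on can be inspected directly. A minimal diagnostic sketch, assuming only the table and column names visible in the plan above; the estimate for a join like this is likely a generic per-value average, roughly reltuples divided by the planner's n_distinct guess for word_id, since the concrete word_id is not known until the outer index scan runs:

  -- How many rows does the planner think video_words has in total?
  SELECT reltuples, relpages
  FROM pg_class
  WHERE relname = 'video_words';

  -- What does it think about the distribution of word_id?
  -- n_distinct, most_common_vals and most_common_freqs drive the estimate.
  SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
  WHERE tablename = 'video_words'
    AND attname  = 'word_id';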
2010/4/4 Björn Lindqvist <bjourne@gmail.com>:
> Note how the planner estimates that there are 766 rows in the table
> that match the word 'tagtext'. In reality there are 43374. [...]
>
> Is this a bug in postgres?

You probably want to run "analyze" or "vacuum analyze" to update
statistics. Do you have autovacuum set up?

--
Rob Wultsch
wultsch@gmail.com
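A quick way to act on Rob's question, sketched with the table names from the original query, is to check whether autovacuum is enabled and when the tables were last vacuumed and analyzed:

  -- Is autovacuum enabled at all?
  SHOW autovacuum;

  -- When were the tables last vacuumed/analyzed, manually or by autovacuum?
  SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
  WHERE relname IN ('word', 'video_words');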
On 04/05/10 01:44, Björn Lindqvist wrote:
> I've tried to get postgres to refresh the statistics by running with
> enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
> ANALYZE etc, but nothing works. [...]
>
> Is this a bug in postgres?

It seems like regular maintenance tasks haven't been carried out. VACUUM, etc.

--
Nilesh Govindarajan
Site & Server Administrator
www.itech7.com
My India is great! May my India be the greatest!
2010/4/4 Björn Lindqvist <bjourne@gmail.com>:
> Note how the planner estimates that there are 766 rows in the table
> that match the word 'tagtext'. In reality there are 43374. I've tried to
> get postgres to refresh the statistics by running with
> enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
> ANALYZE etc, but nothing works. Postgres seems stuck with its bad
> statistics and unwilling to change them. There are many other strings
> that also match tens of thousands of rows in the table but which
> postgres thinks match only 766.

I assume you mean default_statistics_target, not enable_statistics_target.

You should try setting it higher - but obviously just for these
columns. Use something like

ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000

Then you need to run ANALYZE on it. Not VACUUM, certainly not VACUUM
FULL, just ANALYZE.

Oh, and if what you're doing is actually full text search, which is
what it looks like, you should really look at using the native full
text indexing support rather than just stuffing your words in a table.
You'll get better and much faster results.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
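For the native full text indexing Magnus refers to (built in as of 8.3), the shape of the approach looks roughly like this; the video table and its description column are hypothetical placeholders standing in for wherever the raw text actually lives:

  -- Store a precomputed tsvector and index it with GIN.
  ALTER TABLE video ADD COLUMN description_tsv tsvector;
  UPDATE video SET description_tsv = to_tsvector('english', description);
  CREATE INDEX video_description_tsv_idx ON video USING gin (description_tsv);

  -- Query it; the planner keeps its own statistics for tsvector columns.
  SELECT *
  FROM video
  WHERE description_tsv @@ to_tsquery('english', 'tagtext');

In 8.3 the tsvector column can be kept current with the bundled tsvector_update_trigger, or populated by the application.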
On 5 April 2010 11:57, Magnus Hagander <magnus@hagander.net> wrote:
>> Note how the planner estimates that there are 766 rows in the table
>> that match the word 'tagtext'. In reality there are 43374. I've tried to
>> get postgres to refresh the statistics by running with
>> enable_statistics_target=100, running VACUUM, VACUUM FULL, VACUUM FULL
>> ANALYZE etc, but nothing works. Postgres seems stuck with its bad
>> statistics and unwilling to change them. There are many other strings
>> that also match tens of thousands of rows in the table but which
>> postgres thinks match only 766.
>
> I assume you mean default_statistics_target, not enable_statistics_target.

Yes, sorry.

> You should try setting it higher - but obviously just for these
> columns. Use something like
>
> ALTER TABLE word ALTER COLUMN word SET STATISTICS 1000
>
> Then you need to run ANALYZE on it. Not VACUUM, certainly not VACUUM
> FULL, just ANALYZE.

Done that and it doesn't help. The estimates are always off for
queries of the type I specified.

> Oh, and if what you're doing is actually full text search, which is
> what it looks like, you should really look at using the native full
> text indexing support rather than just stuffing your words in a table.
> You'll get better and much faster results.

It is more "full tag search" because I'm not using any word stemming,
phrase matching or OR-ing of query terms. It was, when I measured it,
significantly faster than using the native text searching feature.

--
mvh Björn
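One way to narrow down where the 766 comes from, sketched here with a placeholder id (look up the real id of 'tagtext' first), is to compare the planner's estimate for a literal word_id against the one made inside the join. For a literal it can consult most_common_vals on video_words.word_id; inside the nested loop it only has a generic per-value average to work with, so raising the statistics target on word.word alone may not move the 766 figure at all:

  -- Find the id behind 'tagtext'.
  SELECT id FROM word WHERE word = 'tagtext';

  -- Estimate for an explicit constant (suppose the id came back as 12345).
  EXPLAIN
  SELECT * FROM video_words WHERE word_id = 12345;

  -- Raise the statistics target on the join column as well, then re-check.
  ALTER TABLE video_words ALTER COLUMN word_id SET STATISTICS 1000;
  ANALYZE video_words;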
2010/4/6 Björn Lindqvist <bjourne@gmail.com>:
> On 5 April 2010 11:57, Magnus Hagander <magnus@hagander.net> wrote:
>>> Note how the planner estimates that there are 766 rows in the table
>>> that match the word 'tagtext'. In reality there are 43374. [...]

Have you tried running:

   EXPLAIN ANALYZE <your query>

? This will show you the estimates and the actuals (for each operation)
side by side.

--Scott
On 6 April 2010 14:22, Scott Mead <scott.lists@enterprisedb.com> wrote:
> Have you tried running:
>
>    EXPLAIN ANALYZE <your query>
>
> ? This will show you the estimates and the actuals (for each operation)
> side by side.

Yes, see my first message, where I post the EXPLAIN ANALYZE output for
the query.

--
mvh Björn