Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk? - Mailing list pgsql-performance

From Tom Lane
Subject Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Msg-id 28370.1265860369@sss.pgh.pa.us
In response to Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?  (Bryce Nesbitt <bryce2@obviously.com>)
Responses Re: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?  (Bryce Nesbitt <bryce2@obviously.com>)
List pgsql-performance
Bryce Nesbitt <bryce2@obviously.com> writes:
> The query plans are now attached (sorry I did not start there: many
> lists reject attachments). Or you can click on "text" at the query
> planner analysis site http://explain.depesz.com/s/qYq

At least some of the problem is the terrible quality of the rowcount
estimates in the IN subquery, as you extracted here:

>  Nested Loop  (cost=0.00..23393.15 rows=23 width=4) (actual time=0.077..15.637 rows=4003 loops=1)
>    ->  Index Scan using words_word on words  (cost=0.00..5.47 rows=1 width=4) (actual time=0.049..0.051 rows=1 loops=1)
>          Index Cond: ((word)::text = 'insider'::text)
>    ->  Index Scan using article_words_wc on article_words (cost=0.00..23234.38 rows=12264 width=8) (actual time=0.020..7.237 rows=4003 loops=1)
>          Index Cond: (article_words.word_key = words.word_key)
>  Total runtime: 19.776 ms

Given that it estimated 1 row out of "words" (quite correctly) and 12264
rows out of each scan on article_words, you'd think that the join size
estimate would be 12264, which would be off by "only" a factor of 3 from
the true result.  Instead it's 23, off by a factor of 200 :-(.
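
As a quick check of that arithmetic, here is a minimal Python sketch using only the row counts from the plan quoted above. The helper name misestimate_factor is purely illustrative and is not part of PostgreSQL. The "naive" join estimate simply multiplies the outer and inner row estimates, which is the expectation described above and not necessarily how the planner computes it.

```python
# Row counts copied from the EXPLAIN ANALYZE output quoted above.
outer_estimate = 1             # estimated rows from the scan on words
inner_scan_estimate = 12264    # estimated rows per scan on article_words
planner_join_estimate = 23     # rows=23 on the Nested Loop node
actual_join_rows = 4003        # actual rows returned by the Nested Loop


def misestimate_factor(estimated, actual):
    """Return how many times larger the bigger value is than the smaller."""
    return max(estimated, actual) / min(estimated, actual)


# What one would expect: one outer row times the inner scan estimate.
naive_join_estimate = outer_estimate * inner_scan_estimate

print(round(misestimate_factor(naive_join_estimate, actual_join_rows), 1))   # 3.1
print(round(misestimate_factor(planner_join_estimate, actual_join_rows)))    # 174
```

So the naive product would overestimate by about 3x, while the estimate of 23 underestimates by roughly 174x, which is the "factor of 200" in round numbers.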

Running a roughly similar test case here, I see that 8.4 gives
significantly saner estimates, which I think is because of this patch:
http://archives.postgresql.org/pgsql-committers/2008-10/msg00191.php

At the time I didn't want to risk back-patching it, because there
were a lot of other changes in the same general area in 8.4.  But
it would be interesting to see what happens with your example if
you patch 8.3 similarly.  (Note: I think only the first diff hunk
is relevant to 8.3.)

            regards, tom lane
