Re: postgres 8.2 seems to prefer Seq Scan - Mailing list pgsql-performance
From | Alex Deucher |
---|---|
Subject | Re: postgres 8.2 seems to prefer Seq Scan |
Date | |
Msg-id | a728f9f90704091143t58141a32mb0de1713499da4f8@mail.gmail.com Whole thread Raw |
In response to | Re: postgres 8.2 seems to prefer Seq Scan (Michael Fuhr <mike@fuhr.org>) |
Responses |
Re: postgres 8.2 seems to prefer Seq Scan
|
List | pgsql-performance |
On 4/6/07, Michael Fuhr <mike@fuhr.org> wrote: > On Fri, Apr 06, 2007 at 04:38:33PM -0400, Alex Deucher wrote: > > One more anomaly between 7.4 and 8.2. DB dumped from 7.4 and loaded > > onto 8.2, both have locale set to C. 8.2 seems to prefer Seq Scans > > for the first query while the ordering in the second query seems to > > perform worse on 8.2. I ran analyze. I've tried with the encoding > > set to UTF-8 and SQL_ASCII; same numbers and plans. Any ideas how to > > improve this? > > Are you sure the data sets are identical? The 7.4 query returned > 0 rows; the 8.2 query returned 1 row. If you're running the same > query against the same data in both versions then at least one of > them appears to be returning the wrong result. Exactly which > versions of 7.4 and 8.2 are you running? > > Have you analyzed all tables in both versions? The row count > estimate in 7.4 is much closer to reality than in 8.2: > > 7.4 > > -> Index Scan using pnum_idx on event (cost=0.00..3.37 rows=19 > > width=172) (actual time=0.063..0.063 rows=0 loops=1) > > Index Cond: ((pnum)::text = 'AB5819188'::text) > > 8.2 > > -> Index Scan using pnum_idx on event (cost=0.00..3147.63 > > rows=1779 width=171) (actual time=0.030..0.033 rows=1 loops=1) > > Index Cond: ((pnum)::text = 'AB5819188'::text) > > If analyzing the event table doesn't improve the row count estimate > then try increasing the statistics target for event.pnum and analyzing > again. Example: > > ALTER TABLE event ALTER pnum SET STATISTICS 100; > ANALYZE event; > > You can set the statistics target as high as 1000 to get more > accurate results at the cost of longer ANALYZE times. > Setting statistics to 400 seems to be the sweet spot. Values above that seem to only marginally improve performance. However, I have to disable seqscan in order for the query to be fast. Why does the query planner insist on doing a seq scan? Is there anyway to make it prefer the index scan? Thanks, Alex postgres 8.2 db=# EXPLAIN ANALYZE select pnum, event_pid, code_name, code_description, code_mcam, event_date, effective_date, ref_country, ref_country_legal_code, corresponding_pnum, withdrawal_date, payment_date, extension_date, fee_payment_year, requester, free_form from code inner join event on code_pid = code_pid_fk where pnum ='US5819188'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=106.91..210.37 rows=54 width=229) (actual time=11.245..11.253 rows=1 loops=1) Hash Cond: (event.code_pid_fk = code.code_pid) -> Index Scan using pnum_idx on event (cost=0.00..102.58 rows=54 width=170) (actual time=0.108..0.112 rows=1 loops=1) Index Cond: ((pnum)::text = 'US5819188'::text) -> Hash (cost=70.85..70.85 rows=2885 width=67) (actual time=11.006..11.006 rows=2885 loops=1) -> Seq Scan on code (cost=0.00..70.85 rows=2885 width=67) (actual time=0.025..5.392 rows=2885 loops=1) Total runtime: 11.429 ms (7 rows) db=# set enable_seqscan=0; SET db=# EXPLAIN ANALYZE select pnum, event_pid, code_name, code_description, code_mcam, event_date, effective_date, ref_country, ref_country_legal_code, corresponding_pnum, withdrawal_date, payment_date, extension_date, fee_payment_year, requester, free_form from code inner join event on code_pid = code_pid_fk where pnum ='US5819188'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..289.72 rows=54 width=229) (actual time=0.068..0.076 rows=1 loops=1) -> Index Scan using pnum_idx on event (cost=0.00..102.58 rows=54 width=170) (actual time=0.019..0.020 rows=1 loops=1) Index Cond: ((pnum)::text = 'US5819188'::text) -> Index Scan using code_pkey on code (cost=0.00..3.45 rows=1 width=67) (actual time=0.041..0.043 rows=1 loops=1) Index Cond: (code.code_pid = event.code_pid_fk) Total runtime: 0.126 ms (6 rows)
pgsql-performance by date: