PG8.2.1 choosing slow seqscan over idx scan - Mailing list pgsql-performance

From Jeremy Haile
Subject PG8.2.1 choosing slow seqscan over idx scan
Date
Msg-id 1168982580.8275.1169588211@webmail.messagingengine.com
Whole thread Raw
Responses Re: PG8.2.1 choosing slow seqscan over idx scan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Running PostgreSQL 8.2.1 on Win32.   The query planner is choosing a seq
scan over index scan even though index scan is faster (as shown by
disabling seqscan).  Table is recently analyzed and row count estimates
seem to be in the ballpark.

Another tidbit - I haven't done a "vacuum full" ever, although I "vacuum
analyze" regularly (and autovacuum).  I recently noticed that the PG
data drive is 40% fragmented (NTFS).  Could that be making the seqscan
slower than it should be?  Regardless of the fragmentations affect on
performance, is the query planner making a good decision here?


SOME CONFIGURATION PARAMS
effective_cache_size=1000MB
random_page_cost=3
default_statistics_target=50
shared_buffers=400MB
temp_buffers=10MB
work_mem=10MB
checkpoint_segments=12


QUERY
select merchant_dim_id,
       dcms_dim_id,
       sum(success) as num_success,
       sum(failed) as num_failed,
       count(*) as total_transactions,
       (sum(success) * 1.0 / count(*)) as success_rate
from transaction_facts
where transaction_date >= '2007-1-16'
and transaction_date < '2007-1-16 15:20'
group by merchant_dim_id, dcms_dim_id;


EXPLAIN ANALYZE (enable_seqscan=true)
HashAggregate  (cost=339573.01..340089.89 rows=15904 width=16) (actual
time=140606.593..140650.573 rows=10549 loops=1)
  ->  Seq Scan on transaction_facts  (cost=0.00..333928.25 rows=322558
  width=16) (actual time=19917.957..140036.910 rows=347434 loops=1)
        Filter: ((transaction_date >= '2007-01-16 00:00:00'::timestamp
        without time zone) AND (transaction_date < '2007-01-16
        15:20:00'::timestamp without time zone))
Total runtime: 140654.813 ms


EXPLAIN ANALYZE (enable_seqscan=false)
HashAggregate  (cost=379141.53..379658.41 rows=15904 width=16) (actual
time=3720.838..3803.748 rows=10549 loops=1)
  ->  Bitmap Heap Scan on transaction_facts  (cost=84481.80..373496.76
  rows=322558 width=16) (actual time=244.568..3133.741 rows=347434
  loops=1)
        Recheck Cond: ((transaction_date >= '2007-01-16
        00:00:00'::timestamp without time zone) AND (transaction_date <
        '2007-01-16 15:20:00'::timestamp without time zone))
        ->  Bitmap Index Scan on transaction_facts_transaction_date_idx
        (cost=0.00..84401.16 rows=322558 width=0) (actual
        time=241.994..241.994 rows=347434 loops=1)
              Index Cond: ((transaction_date >= '2007-01-16
              00:00:00'::timestamp without time zone) AND
              (transaction_date < '2007-01-16 15:20:00'::timestamp
              without time zone))
Total runtime: 3810.795 ms

pgsql-performance by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Table Size
Next
From: Tom Lane
Date:
Subject: Re: PG8.2.1 choosing slow seqscan over idx scan