Inconsistant use of index. - Mailing list pgsql-bugs
From | Ron Mayer |
---|---|
Subject | Inconsistant use of index. |
Date | |
Msg-id | Pine.LNX.4.33.0203260845110.16667-100000@ron Whole thread Raw |
In response to | Re: Bug #624: case sensitivity in pg_dumpall (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Inconsistant use of index.
|
List | pgsql-bugs |
In porting a pretty large (10s of millions of records) data warehouse from Oracle to PostgreSQL, Once some of my tables started getting pretty large, PostgreSQL suddenly stopped using indexes when I use expressions like "col = value" decreasing performance by 20X. This meant that my daily reports started taking two days instead of 2 hours to run!!!!). Interestingly when I re-write the queries using >= and <= to produce identical results, the index works fine. Example queries in question include: select count(*) from fact where dat='2002-03-01'; select count(*) from fact where dat<='2002-03-01' and dat>='2002-03-01'; The distribution of values in "dat" are roughly evenly spaced from '2002-01-01' through '2002-03-25'. Attached below are A: Information about the table, including "\d" and "vacuum verbose analyze" B: Output of "explain analyze" from the above queries (showing the 20X slowdown) C: Version and configuration information. Any suggestions on what I should look at next would be appreciated. Thanks much, Ron PS: As a quite perverse workaround, I rewrote all my queries to have "col<=val and col>=val" everywhere I used to have "col=val" and everything is running fine again... but that's just wierd. ============================================================================ == A: Information about the table ============================================================================ logs2=# \d fact Table "fact" Column | Type | Modifiers --------+------------------------+----------- dat | date | tim | time without time zone | ip_id | integer | bid_id | integer | req_id | integer | ref_id | integer | Indexes: i_fact__bid_id, i_fact__dat, i_fact__ref_id, i_fact__req_id, i_fact__tim logs2=# select count(*) from fact; count ---------- 18410778 (1 row) logs2=# vacuum verbose analyze fact; NOTICE: --Relation fact-- NOTICE: Pages 144967: Changed 0, Empty 0; Tup 18410778: Vac 0, Keep 0, UnUsed 0. Total CPU 11.56s/2.97u sec elapsed 71.91 sec. NOTICE: Analyzing fact VACUUM ============================================================================ == B: Explain Analyze for the two queries. == Note that the <=, >= one was over 20X faster. ============================================================================ logs2=# explain analyze select count(*) from fact where dat='2002-03-01'; NOTICE: QUERY PLAN: Aggregate (cost=375631.14..375631.14 rows=1 width=0) (actual time=76689.42..76689.42 rows=1 loops=1) -> Seq Scan on fact (cost=0.00..375101.72 rows=211765 width=0) (actual time=20330.96..76391.94 rows=180\ 295 loops=1) Total runtime: 76707.92 msec EXPLAIN logs2=# explain analyze select count(*) from fact where dat<='2002-03-01' and dat >='2002-03-01'; NOTICE: QUERY PLAN: Aggregate (cost=5.98..5.98 rows=1 width=0) (actual time=2921.39..2921.39 rows=1 loops=1) -> Index Scan using i_fact__dat on fact (cost=0.00..5.98 rows=1 width=0) (actual time=73.55..2583.53 ro\ ws=180295 loops=1) Total runtime: 2921.55 msec EXPLAIN logs2=# ============================================================================ == C: Version and configuration information. ============================================================================ [17]localhost:~/apps/pgsql% psql -V psql (PostgreSQL) 7.2 contains support for: readline, history Portions Copyright (c) 1996-2001, PostgreSQL Global Development Group Portions Copyright (c) 1996, Regents of the University of California Read the file COPYRIGHT or use the command \copyright to see the usage and distribution terms. [17]localhost:/scratch/pgsql/data% diff -wiu postgresql.conf postgresql.conf.bak --- postgresql.conf Sat Mar 23 15:39:34 2002 +++ postgresql.conf.bak Tue Mar 5 19:33:54 2002 @@ -50,7 +50,7 @@ #shared_buffers = 10000 # 2*max_connections, min 16 ## goes to about 84 meg with 4000. #shared_buffers = 4000 # 2*max_connections, min 16 -shared_buffers = 10000 # 2*max_connections, min 16 +shared_buffers = 8000 # 2*max_connections, min 16 #max_fsm_relations = 100 # min 10, fsm is free space map #max_fsm_pages = 10000 # min 1000, fsm is free space map -- Ronald Mayer Director of Web Business InterVideo, Inc.
pgsql-bugs by date: