Thread: Planner doesn't chose Index - (slow select)
Hi all, I've been struggling with some performance issues with certain SQL queries. I was prepping a long-ish overview of my problem to submit, but I think I'll start out with a simple case of the problem first, hopefully answers I receive will help me solve my initial issue. Consider the following two queries which yield drastically different run-time: db=# select count(*) from pk_c2 b0 where b0.offer_id=7141; count ------- 1 (1 row) Time: 5139.004 ms db=# select count(*) from pk_c2 b0 where b0.pending=true and b0.offer_id=7141; count ------- 1 (1 row) Time: 1.828 ms That's 2811 times faster! Just to give you an idea of size of pk_c2 table: db=# select count(*) from pk_c2 ; count --------- 2158094 (1 row) Time: 5275.782 ms db=# select count(*) from pk_c2 where pending=true; count ------- 51 (1 row) Time: 5073.699 ms db=# explain select count(*) from pk_c2 b0 where b0.offer_id=7141; QUERY PLAN --------------------------------------------------------------------------- Aggregate (cost=44992.78..44992.78 rows=1 width=0) -> Seq Scan on pk_c2 b0 (cost=0.00..44962.50 rows=12109 width=0) Filter: (offer_id = 7141) (3 rows) Time: 1.350 ms db=# explain select count(*) from pk_c2 b0 where b0.pending=true and b0.offer_id=7141; QUERY PLAN ---------------------------------------------------------------------------------------- Aggregate (cost=45973.10..45973.10 rows=1 width=0) -> Index Scan using pk_boidx on pk_c2 b0 (cost=0.00..45973.09 rows=1 width=0) Index Cond: (offer_id = 7141) Filter: (pending = true) (4 rows) Time: 1.784 ms The table has indexes for both 'offer_id' and '(pending=true)': Indexes: "pk_boidx" btree (offer_id) "pk_bpidx" btree (((pending = true))) So, why would the planner chose to use the index on the second query and not on the first? Note that I am able to fool the planner into using an Index scan on offer_id by adding a silly new condition in the where clause of the first form of the query: db=# explain select count(*) from pk_c2 b0 where b0.offer_id=7141 and oid > 1; QUERY PLAN ------------------------------------------------------------------------------------------- Aggregate (cost=45983.19..45983.19 rows=1 width=0) -> Index Scan using pk_boidx on pk_c2 b0 (cost=0.00..45973.09 rows=4037 width=0) Index Cond: (offer_id = 7141) Filter: (oid > 1::oid) (4 rows) Time: 27.301 ms db=# select count(*) from pk_c2 b0 where b0.offer_id=7141 and oid > 1; count ------- 1 (1 row) Time: 1.900 ms What gives? This seems just too hokey for my taste. --patrick db=# select version(); version ------------------------------------------------------------------------- PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6
"patrick keshishian" <pkeshish@gmail.com> writes: > I've been struggling with some performance issues with certain > SQL queries. I was prepping a long-ish overview of my problem > to submit, but I think I'll start out with a simple case of the > problem first, hopefully answers I receive will help me solve > my initial issue. Have you ANALYZEd this table lately? > db=# select count(*) from pk_c2 b0 where b0.offer_id=7141; > count > ------- > 1 > (1 row) The planner is evidently estimating that there are 12109 such rows, not 1, which is the reason for its reluctance to use an indexscan. Generally the only reason for it to be off that far on such a simple statistical issue is if you haven't updated the stats in a long time. (If you've got a really skewed data distribution for offer_id, you might need to raise the statistics target for it.) > The table has indexes for both 'offer_id' and '(pending=true)': > Indexes: > "pk_boidx" btree (offer_id) > "pk_bpidx" btree (((pending = true))) The expression index on (pending = true) won't do you any good, unless you spell your query in a weird way like ... WHERE (pending = true) = true I'd suggest a plain index on "pending" instead. > db=# select version(); > PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6 You might want to think about an update, too. 7.4 is pretty long in the tooth. regards, tom lane
Tom, You are absolutely correct about not having run ANALYZE on the particular table. In my attempt to create a simple "test case" I created that table (pk_c2) from the original and had not run ANALYZE on it, even though, ANALYZE had been run prior to building that table. The problem on the test table and the simple select count(*) is no longer there (after ANALYZE). The original issue, however, is still there. I'm stumped as how to formulate my question without having to write a lengthy essay. As to upgrading from 7.4, I hear you, but I'm trying to support a deployed product. Thanks again for your input, --patrick On 4/18/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "patrick keshishian" <pkeshish@gmail.com> writes: > > I've been struggling with some performance issues with certain > > SQL queries. I was prepping a long-ish overview of my problem > > to submit, but I think I'll start out with a simple case of the > > problem first, hopefully answers I receive will help me solve > > my initial issue. > > Have you ANALYZEd this table lately? > > > db=# select count(*) from pk_c2 b0 where b0.offer_id=7141; > > count > > ------- > > 1 > > (1 row) > > The planner is evidently estimating that there are 12109 such rows, > not 1, which is the reason for its reluctance to use an indexscan. > Generally the only reason for it to be off that far on such a simple > statistical issue is if you haven't updated the stats in a long time. > (If you've got a really skewed data distribution for offer_id, you > might need to raise the statistics target for it.) > > > The table has indexes for both 'offer_id' and '(pending=true)': > > > Indexes: > > "pk_boidx" btree (offer_id) > > "pk_bpidx" btree (((pending = true))) > > The expression index on (pending = true) won't do you any good, > unless you spell your query in a weird way like > ... WHERE (pending = true) = true > I'd suggest a plain index on "pending" instead. > > > db=# select version(); > > PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6 > > You might want to think about an update, too. 7.4 is pretty long in the > tooth. > > regards, tom lane