Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX - Mailing list pgsql-performance

From Tom Lane
Subject Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX
Date
Msg-id 25978.1137783074@sss.pgh.pa.us
Whole thread Raw
In response to Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: Stored procedures  (Rikard Pavelic <rikard.pavelic@zg.htnet.hr>)
Re: SELECT MIN, MAX took longer time than SELECT  (K C Lau <kclau60@netvigator.com>)
List pgsql-performance
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Fri, Jan 20, 2006 at 12:35:36PM +0800, K C Lau wrote:
> Here's the problem... the estimate for the backwards index scan is *way*
> off:

>> ->  Limit  (cost=0.00..1.26 rows=1 width=4) (actual
>> time=200032.928..200032.931 rows=1 loops=1)
>> ->  Index Scan Backward using pk_log on
>> log  (cost=0.00..108047.11 rows=86089 width=4) (actual
>> time=200032.920..200032.920 rows=1 loops=1)
>> Filter: (((create_time)::text < '2005/10/19'::text) AND
>> (logsn IS NOT NULL))
>> Total runtime: 200051.701 ms

It's more subtle than you think.  The estimated rowcount is the
estimated number of rows fetched if the indexscan were run to
completion, which it isn't because the LIMIT cuts it off after the
first returned row.  That estimate is not bad (we can see from the
aggregate plan that the true value would have been 106708, assuming
that the "logsn IS NOT NULL" condition isn't filtering anything).

The real problem is that it's taking quite a long time for the scan
to reach the first row with create_time < 2005/10/19, which is not
too surprising if logsn is strongly correlated with create_time ...
but in the absence of any cross-column statistics the planner has
no very good way to know that.  (Hm ... but both of them probably
also show a strong correlation to physical order ... we could look
at that maybe ...)  The default assumption is that the two columns
aren't correlated and so it should not take long to hit the first such
row, which is why the planner likes the indexscan/limit plan.

            regards, tom lane

pgsql-performance by date:

Previous
From: Rikard Pavelic
Date:
Subject: Re: [PERFORMANCE] Stored Procedures
Next
From: Rikard Pavelic
Date:
Subject: Re: Stored procedures