Re: Delaying the planning of unnamed statements until Bind - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Delaying the planning of unnamed statements until Bind
Msg-id 87d64xqwfb.fsf@stark.xeocode.com
In response to Re: Delaying the planning of unnamed statements until Bind  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Delaying the planning of unnamed statements until Bind  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Delaying the planning of unnamed statements until Bind  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

>     select * from mytable where entry_time >= $1;
>
> The planner will take a seqscan when it sees this because it is worried
> about the downside if a large fraction of the table is being selected.

I wonder if it would make sense for the planner to be smarter about this. In a
sense the cost is a probability distribution, and representing it with a
single number, the expected value, is just not enough information.

If the planner had the expected value as well as the variance of the cost
distribution, then it might realize that in this case, for instance, the
penalty for guessing wrong with an index scan is only going to be a small
slowdown factor, perhaps 2-4x slower. Whereas the penalty for guessing wrong
with a sequential scan could be a factor in the thousands or more.
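
To make the asymmetry concrete, here is a rough sketch in Python. Everything
in it is an assumption for illustration: the cost constants, the function
names, and the list of candidate selectivities are made up and are not the
planner's actual cost model. It only shows how the expected value alone can
favor the seqscan while the worst-case penalty for each wrong guess differs
by orders of magnitude.

```python
import statistics

# Hypothetical cost constants, NOT PostgreSQL's real cost model.
TABLE_ROWS = 1_000_000
SEQ_COST_PER_ROW = 1.0     # a sequential scan reads every row once
INDEX_COST_PER_ROW = 4.0   # assumed: random access makes each fetched row pricier
INDEX_STARTUP = 10.0       # assumed fixed cost of descending the index

# Assumed plausible selectivities for "entry_time >= $1", each equally likely.
SELECTIVITIES = [0.00001, 0.001, 0.1, 1.0]

def seq_scan_cost(selectivity):
    # Cost is independent of how many rows actually match.
    return TABLE_ROWS * SEQ_COST_PER_ROW

def index_scan_cost(selectivity):
    # Cost grows with the number of matching rows.
    return INDEX_STARTUP + selectivity * TABLE_ROWS * INDEX_COST_PER_ROW

def summarize(cost_fn, selectivities):
    """Return (expected cost, variance of cost) over the selectivities."""
    costs = [cost_fn(s) for s in selectivities]
    return statistics.mean(costs), statistics.pvariance(costs)

def worst_slowdown(chosen, alternative, selectivities):
    """Largest factor by which the chosen plan loses to the alternative."""
    return max(chosen(s) / alternative(s) for s in selectivities)

seq_mean, seq_var = summarize(seq_scan_cost, SELECTIVITIES)
idx_mean, idx_var = summarize(index_scan_cost, SELECTIVITIES)
idx_penalty = worst_slowdown(index_scan_cost, seq_scan_cost, SELECTIVITIES)
seq_penalty = worst_slowdown(seq_scan_cost, index_scan_cost, SELECTIVITIES)

print(f"seqscan:   mean={seq_mean:.0f} variance={seq_var:.3g}")
print(f"indexscan: mean={idx_mean:.0f} variance={idx_var:.3g}")
print(f"worst slowdown if we pick the index scan: {idx_penalty:.1f}x")
print(f"worst slowdown if we pick the seqscan:    {seq_penalty:.1f}x")
```

With these made-up numbers the seqscan has the lower expected cost, so a
planner looking only at that one number picks it. But picking the index scan
wrongly costs about 4x at worst, while picking the seqscan wrongly costs a
factor in the tens of thousands.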


In this particular case I think guessing that a sequential scan is faster
is just plain wrong. There's a bit of hidden information here that the planner
isn't using, which is that the DBA chose to put an index on this column. That
would mean some substantial number of queries are expected by a human to be
selective enough to warrant using an index.



-- 
greg


