Re: BETWEEN optimizer problems with single-value

From: Tom Lane
Subject: Re: BETWEEN optimizer problems with single-value
Date: ,
Msg-id: 11428.1142474724@sss.pgh.pa.us
(view: Whole thread, Raw)
In response to: Re: BETWEEN optimizer problems with single-value  (Simon Riggs)
Responses: Re: BETWEEN optimizer problems with single-value  (Simon Riggs)
List: pgsql-performance

Tree view

BETWEEN optimizer problems with single-value range  ("Kevin Grittner", )
 Re: BETWEEN optimizer problems with single-value range  (Andreas Kretschmer, )
  Re: BETWEEN optimizer problems with single-value  ("Kevin Grittner", )
   Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
    Re: BETWEEN optimizer problems with single-value  ("Kevin Grittner", )
    Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
     Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
      Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
       Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
        Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
         Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
          Re: BETWEEN optimizer problems with single-value  (Alvaro Herrera, )
           Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
          Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
           Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
 Re: BETWEEN optimizer problems with single-value range  ("Merlin Moncure", )
  Re: BETWEEN optimizer problems with single-value range  (Andreas Kretschmer, )
   Re: BETWEEN optimizer problems with single-value range  ("Merlin Moncure", )
 Re: [HACKERS] BETWEEN optimizer problems with single-value range  (Simon Riggs, )
  Re: [HACKERS] BETWEEN optimizer problems with single-value  ("Kevin Grittner", )

Simon Riggs <> writes:
>> ISTM that when the BETWEEN constants match we end up in this part of
>> clauselist_selectivity()...

Yeah, I think you are right.

> so that the planner underestimates the cost of using "Cal_CalDate" so
> that it ends up the same as "Cal_CtofcNo", and then we pick
> "Cal_CalDate" because it was created first.

No, it doesn't end up the same --- but the difference is small enough to
be in the roundoff-error regime.  The real issue here is that we're
effectively assuming that one row will be fetched from the index in both
cases, and this is clearly not the case for the Cal_CalDate index.  So
we need a more accurate estimate for the boundary case.

> Using 1.0e-10 isn't very useful... the selectivity for a range should
> never be less than the selectivity for an equality, so we should simply
> put in a test against one of the pseudo constants and use that as the
> minimal value.

That's easier said than done, because you'd first have to find the
appropriate equality operator to use (ie, one having semantics that
agree with the inequality operators).  Another point is that the above
statement is simply wrong, consider
    calDate BETWEEN '2006-03-15' AND '2006-03-14'
for which an estimate of zero really is correct.

Possibly we could drop this code's reliance on seeing
SCALARLTSEL/SCALARGTSEL as the estimators, and instead try to locate a
common btree opclass for the operators --- which would then let us
identify the right equality operator to use, and also let us distinguish
> from >= etc.  If we're trying to get the boundary cases right I
suspect we have to account for that.  I could see such an approach being
tremendously slow though :-(, because we'd go looking for btree
opclasses even for operators that have nothing to do with < or >.

            regards, tom lane


pgsql-performance by date:

From: Tom Lane
Date:
Subject: Re: BETWEEN optimizer problems with single-value
From: "Guillaume Smet"
Date:
Subject: PostgreSQL and Xeon MP