BETWEEN optimizer problems with single-value range

From: Kevin Grittner
Subject: BETWEEN optimizer problems with single-value range
Date: ,
Msg-id: 44180105.EE98.0025.0@wicourts.gov
(view: Whole thread, Raw)
Responses: Re: BETWEEN optimizer problems with single-value range  (Andreas Kretschmer)
Re: BETWEEN optimizer problems with single-value range  ("Merlin Moncure")
Re: [HACKERS] BETWEEN optimizer problems with single-value range  (Simon Riggs)
List: pgsql-performance

Tree view

BETWEEN optimizer problems with single-value range  ("Kevin Grittner", )
 Re: BETWEEN optimizer problems with single-value range  (Andreas Kretschmer, )
  Re: BETWEEN optimizer problems with single-value  ("Kevin Grittner", )
   Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
    Re: BETWEEN optimizer problems with single-value  ("Kevin Grittner", )
    Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
     Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
      Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
       Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
        Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
         Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
          Re: BETWEEN optimizer problems with single-value  (Alvaro Herrera, )
           Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
          Re: BETWEEN optimizer problems with single-value  (Tom Lane, )
           Re: BETWEEN optimizer problems with single-value  (Simon Riggs, )
 Re: BETWEEN optimizer problems with single-value range  ("Merlin Moncure", )
  Re: BETWEEN optimizer problems with single-value range  (Andreas Kretschmer, )
   Re: BETWEEN optimizer problems with single-value range  ("Merlin Moncure", )
 Re: [HACKERS] BETWEEN optimizer problems with single-value range  (Simon Riggs, )
  Re: [HACKERS] BETWEEN optimizer problems with single-value  ("Kevin Grittner", )

Attached is a simplified example of a performance problem we have seen,
with a workaround and a suggestion for enhancement (hence both the
performance and hackers lists).

Our software is allowing users to specify the start and end dates for a
query.  When they enter the same date for both, the optimizer makes a
very bad choice.  We can work around it in application code by using an
equality test if both dates match.  I think the planner should be able
to make a better choice here.  (One obvious way to fix it would be to
rewrite "BETWEEN a AND b" as "= a" when a is equal to b, but it seems
like there is some underlying problem which should be fixed instead (or
in addition to) this.

The first query uses BETWEEN with the same date for both min and max
values.  The second query uses an equality test for the same date.  The
third query uses BETWEEN with a two-day range.  In all queries, there
are less than 4,600 rows for the specified cotfcNo value out of over 18
million rows in the table.  We tried boosting the statistics samples for
the columns in the selection, which made the estimates of rows more
accurate, but didn't change the choice of plans.

-Kevin



Attachment

pgsql-performance by date:

From: "Kevin Grittner"
Date:
Subject: Re: [HACKERS] BETWEEN optimizer problems with single-value
From: Simon Riggs
Date:
Subject: Re: BETWEEN optimizer problems with single-value