Query planner problem - Mailing list pgsql-performance

From Ryan VanMiddlesworth
Subject Query planner problem
Date
Msg-id 200410011706.54231.ryan@vanmiddlesworth.org
Whole thread Raw
Responses Re: Query planner problem  (Russell Smith <mr-russ@pws.com.au>)
Re: Query planner problem  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Okay, I've got two queries that I think the planner should reduce to be
logically equivalent but it's not happening. The example queries below have
been simplified as much as possible while still producing the problem.

What I'm trying to do is create a single prepared statement that can handle
null parameters rather than have to dynamically generate the statement in my
app code based on supplied parameters.

Basically the date constants below would be substituted with parameters
supplied on a web search form (or nulls).


Here is the query and EXPLAIN that runs quickly:
  SELECT case_id FROM case_data
  WHERE case_filed_date > '2004-09-16'
    AND case_filed_date < '2004-09-20'

                       QUERY PLAN
-------------------------------------------------------------
Index Scan using case_data_case_filed_date on case_data
(cost=0.00..13790.52 rows=3614 width=18)
  Index Cond: ((case_filed_date > '2004-09-16'::date)
           AND (case_filed_date < '2004-09-20'::date))


And here is the query and EXPLAIN from the version that I believe the planner
should reduce to be logically equivalent:
 SELECT case_id FROM case_data
 WHERE (('2004-09-16' IS NULL) OR (case_filed_date > '2004-09-16'))
   AND (('2004-09-20' IS NULL) OR (case_filed_date < '2004-09-20'))

                       QUERY PLAN
-------------------------------------------------------------
Seq Scan on case_data  (cost=0.00..107422.02 rows=27509 width=18)
  Filter: ((('2004-09-16' IS NULL) OR (case_filed_date > '2004-09-16'::date))
       AND (('2004-09-20' IS NULL) OR (case_filed_date < '2004-09-20'::date)))


I was hoping that the null comparisons would get folded out by the planner
relatively cheaply. But as you can see, the first query uses indexes and the
second one uses sequence scans, thereby taking much longer.  I guess my
question is - is there a better way to accomplish what I'm doing in SQL or am
I going to have to dynamically generate the statement based on supplied
parameters?

Thanks,
Ryan

pgsql-performance by date:

Previous
From: William Yu
Date:
Subject: Re: Caching of Queries
Next
From: Josh Berkus
Date:
Subject: Re: Caching of Queries