Re: Confirmation of bad query plan generated by 7.4 - Mailing list pgsql-performance

From Tom Lane
Subject Re: Confirmation of bad query plan generated by 7.4
Date
Msg-id 10918.1150236282@sss.pgh.pa.us
Whole thread Raw
In response to Re: Confirmation of bad query plan generated by 7.4  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: Confirmation of bad query plan generated by 7.4
List pgsql-performance
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Also, I'm not sure that the behavior is entirely changed, either. On a
> 8.1.4 database I'm still seeing a difference between now() - interval
> and a hard-coded date.

It'd depend on the context, possibly, but it's easy to show that the
current planner does fold "now() - interval_constant" when making
estimates.  Simple example:

-- create and populate 1000-row table:

regression=# create table t1 (f1 timestamptz);
CREATE TABLE
regression=# insert into t1 select now() - x * interval '1 day' from generate_series(1,1000) x;
INSERT 0 1000

-- default estimate is pretty awful:

regression=# explain select * from t1 where f1 > now();
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on t1  (cost=0.00..39.10 rows=647 width=8)
   Filter: (f1 > now())
(2 rows)

regression=# vacuum t1;
VACUUM

-- now the planner at least knows how many rows in the table with some
-- accuracy, but with no stats it's still falling back on a default
-- selectivity estimate:

regression=# explain select * from t1 where f1 > now();
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on t1  (cost=0.00..21.00 rows=333 width=8)
   Filter: (f1 > now())
(2 rows)

-- and the default doesn't really care what the comparison value is:

regression=# explain select * from t1 where f1 > now() - interval '10 days';
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on t1  (cost=0.00..23.50 rows=333 width=8)
   Filter: (f1 > (now() - '10 days'::interval))
(2 rows)

-- but let's give it some stats:

regression=# vacuum analyze t1;
VACUUM

-- and things get better:

regression=# explain select * from t1 where f1 > now() - interval '10 days';
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on t1  (cost=0.00..23.50 rows=9 width=8)
   Filter: (f1 > (now() - '10 days'::interval))
(2 rows)

7.4 would still be saying "rows=333" in the last case, because it's
falling back on DEFAULT_INEQ_SEL whenever the comparison value isn't
strictly constant.

            regards, tom lane

pgsql-performance by date:

Previous
From: Zydoon
Date:
Subject: Re: scaling up postgres
Next
From: "Jim C. Nasby"
Date:
Subject: Re: scaling up postgres