Thread: constant scalar subselect no longer equivalent to constant?

constant scalar subselect no longer equivalent to constant?

From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
I upgraded from 7.2.1 to 7.3.2 over the past weekend.  One of my
favorite queries went from an indexed scan to a sequential scan.  Ugh.

Here's the details...  200,000+ records, indexed on "stamp" (a timestamp).
My query was:

    select count(*), otherthing from requests
    where stamp > (select now() - '1 hour' :: interval)
    group by 2
    order by 1 desc
    limit 10;

In 7.2.1, the subselect was treated as a constant, and so the btree
index was used to reduce the scan significantly.  I would get the
result within a few seconds.

In 7.3.2, I get a sequential scan!  Ugh!  It takes about 15 seconds.

However, if I replace the subselect with a constant timestamp
('2003-03-10'), I get an index scan, so it isn't that the analyze
hasn't recognized the number of records.

Did I break something during the upgrade?

Is there anything I can do to get the index scan back?

Is this a known change between 7.2.1 and 7.3.2?

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Re: constant scalar subselect no longer equivalent to constant?

From
Joshua Moore-Oliva
Date:
Taking a complete shot in the dark here, but I did read that in the upgrade
from 7.2.x -- 7.3 the type timestamp was changed to default from

timestamp with time zone

to

timestamp without time zone

to be more adhesive to SQL standards....

and now() is of type timestamp with time zone, and if you only declared your
field of type timestamp it will be timestamp without time zone, therefore the
database may be converting the value each time therefore causing a speed
reduction?

The fast that constant timestamp works seems to further prove my theory since
then you have timestamp without time zone.

I'd be interested to know if this is the solution :)

Josh.

On March 14, 2003 02:47 am, Randal L. Schwartz wrote:
> I upgraded from 7.2.1 to 7.3.2 over the past weekend.  One of my
> favorite queries went from an indexed scan to a sequential scan.  Ugh.
>
> Here's the details...  200,000+ records, indexed on "stamp" (a timestamp).
> My query was:
>
>     select count(*), otherthing from requests
>     where stamp > (select now() - '1 hour' :: interval)
>     group by 2
>     order by 1 desc
>     limit 10;
>
> In 7.2.1, the subselect was treated as a constant, and so the btree
> index was used to reduce the scan significantly.  I would get the
> result within a few seconds.
>
> In 7.3.2, I get a sequential scan!  Ugh!  It takes about 15 seconds.
>
> However, if I replace the subselect with a constant timestamp
> ('2003-03-10'), I get an index scan, so it isn't that the analyze
> hasn't recognized the number of records.
>
> Did I break something during the upgrade?
>
> Is there anything I can do to get the index scan back?
>
> Is this a known change between 7.2.1 and 7.3.2?


Re: constant scalar subselect no longer equivalent to constant?

From
Tom Lane
Date:
merlyn@stonehenge.com (Randal L. Schwartz) writes:
> I upgraded from 7.2.1 to 7.3.2 over the past weekend.  One of my
> favorite queries went from an indexed scan to a sequential scan.  Ugh.

> Here's the details...  200,000+ records, indexed on "stamp" (a timestamp).
> My query was:

>     select count(*), otherthing from requests
>     where stamp > (select now() - '1 hour' :: interval)
>     group by 2
>     order by 1 desc
>     limit 10;

FWIW, you don't need a subselect here anymore; it should work as well
(or as poorly) to do "where stamp > (now() - '1 hour' :: interval)".

> In 7.3.2, I get a sequential scan!  Ugh!  It takes about 15 seconds.

Can you force an indexscan by doing "set enable_seqscan to off"?
If not, the problem is likely a datatype mismatch as Josh suggested.
If you can, then the problem is that the planner doesn't think this is
selective enough to justify an indexscan.  (I'm kinda surprised that
7.2 wouldn't have thought the same.)  The difficulty is that since
now() isn't a constant, the planner doesn't know what value the stamp
column will get compared to, and so it has to fall back on a default
selectivity estimate that will not be favorable to an indexscan.

If that's your problem, the answer is to add a dummy condition to turn
the query into a range scan.  This should work:
    where stamp > (now() - '1 hour'::interval)
      and stamp < (now() + '1 hour'::interval);
The planner still doesn't know the selectivity of the now() conditions,
but its default estimate for a range query is lots tighter than for
a one-sided inequality.  It should be willing to indexscan this way.

            regards, tom lane

Re: constant scalar subselect no longer equivalent to constant?

From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> FWIW, you don't need a subselect here anymore; it should work as well
Tom> (or as poorly) to do "where stamp > (now() - '1 hour' :: interval)".

Ahh.  In 7.2, that was seen as a "calculate a lot" expression, so
I ended up with a seq scan.  The subselect kludge was enough for it
to compute it once, and figure out that it could use the index.

>> In 7.3.2, I get a sequential scan!  Ugh!  It takes about 15 seconds.

Tom> Can you force an indexscan by doing "set enable_seqscan to off"?

Yes.  That does it, but of course, I'd rather not change global status
thingies that way. :)

Tom> If not, the problem is likely a datatype mismatch as Josh suggested.

Tom> If you can, then the problem is that the planner doesn't think this is
Tom> selective enough to justify an indexscan.  (I'm kinda surprised that
Tom> 7.2 wouldn't have thought the same.)  The difficulty is that since
Tom> now() isn't a constant, the planner doesn't know what value the stamp
Tom> column will get compared to, and so it has to fall back on a default
Tom> selectivity estimate that will not be favorable to an indexscan.

So this changed between 7.2 and 7.3?

What's odd is that even writing a function didn't help:

    add function ago(interval) returns timestamp with time zone
    stable
    language 'sql'
    as 'select now() - $1';

I thought the addition of the "stable" keyword would make the return
value be the same as a constant.  It's not enough, apparently.  Is
that a bug, that a stable function and a constant are planned
differently?

Tom> If that's your problem, the answer is to add a dummy condition to turn
Tom> the query into a range scan.  This should work:
Tom>     where stamp > (now() - '1 hour'::interval)
Tom>       and stamp < (now() + '1 hour'::interval);
Tom> The planner still doesn't know the selectivity of the now() conditions,
Tom> but its default estimate for a range query is lots tighter than for
Tom> a one-sided inequality.  It should be willing to indexscan this way.

Ahh!  It does!

  explain select stamp from requests where stamp between now() - '1 hour'::interval and now();

does in fact give me an index scan by default.

Thanks for the workaround.  This planner stuff always seems like black
magic. :)

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Re: constant scalar subselect no longer equivalent to constant?

From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
>>>>> "Joshua" == Joshua Moore-Oliva <josh@chatgris.com> writes:

Joshua> The fast that constant timestamp works seems to further prove
Joshua> my theory since then you have timestamp without time zone.

Joshua> I'd be interested to know if this is the solution :)

I misspoke if that was implied.  It is indeed "timestamp with timestone".

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Re: constant scalar subselect no longer equivalent to constant?

From
Tom Lane
Date:
merlyn@stonehenge.com (Randal L. Schwartz) writes:
> So this changed between 7.2 and 7.3?

Well, no, it didn't change.  AFAIR the only relevant difference is that
7.3 classifies now() as stable whereas 7.2 didn't have that concept and
so had to classify it as noncachable ("volatile" in 7.3 terminology).

What your sub-select was doing for you in 7.2 was hiding the volatile
function --- the planner's decision about whether an expression is safe
to compare to an index doesn't look inside subselects.  (Which is
arguably a bug, but so far I haven't dared change it because I knew
people were using exactly this hack...)  In 7.3 that's not necessary
because now() isn't considered volatile anyway.

But in either version, what the planner is seeing is
    WHERE indexedcolumn > unknownvalue
which is going to force it to fall back on a default estimate for the
number of rows retrieved.  If you were actually scanning the whole
table, you'd not have gotten an indexscan in either version.  But the
presence of the LIMIT causes the planner to prefer a fast-start plan,
rather than the seqscan-and-sort you'd have gotten without any LIMIT.

I'm guessing that the situation with your table is right on the edge
of where the planner will flip between indexscan and seqscan for this
query.  How much difference is there in the estimated cost when you
force indexscan or seqscan?  It could be that the change in plan is
just due to some small tweak in the cost models between 7.2 and 7.3.

> What's odd is that even writing a function didn't help:
>     add function ago(interval) returns timestamp with time zone
>     stable
>     language 'sql'
>     as 'select now() - $1';
> I thought the addition of the "stable" keyword would make the return
> value be the same as a constant.

No, you'd have to claim it to be immutable to make the planner treat it
as a constant.  Stable means that the function value will hold still
long enough to be safe to use as an indexscan qualifier (ie, that
actually evaluating it at each row of a seqscan would give the same
semantic result as evaluating it only once for use in an indexscan).
It does not make it okay to pre-evaluate such a function during
planning.  now() is pretty much the prototypical example of why we
invented the "stable" category: it doesn't change within a query,
but that doesn't make it safe for the planner to pre-evaluate it.
Plans can live across transactions, so the value of now() at runtime
isn't necessarily what the planner would have gotten.

            regards, tom lane