Re: Cost of indexscan - Mailing list pgsql-performance

From Tom Lane
Subject Re: Cost of indexscan
Date
Msg-id 13308.1075475254@sss.pgh.pa.us
Whole thread Raw
In response to Cost of indexscan  (Kari Lavikka <tuner@bdb.fi>)
List pgsql-performance
Kari Lavikka <tuner@bdb.fi> writes:
> Postgres seems to estimate the cost of indexscan to be a bit too high.
> The table has something like 500000 rows and I have run reindex and vacuum
> analyze recently. Is there something to tune?

I think the real problem here is that the row estimate is off by a
factor of thirty:

>  Seq Scan on admin_event  (cost=0.00..19844.37 rows=154361 width=109) (actual time=479.173..2760.186 rows=4705
loops=1)

With a correct estimate the indexscan would have been chosen.

> galleria=> explain analyze SELECT * FROM admin_event WHERE stamp > (current_timestamp - '1
days'::INTERVAL)::TIMESTAMPWITHOUT TIME ZONE; 

It's not possible for the planner to make a good guess here since it
doesn't know what the comparison value for the stamp column is.
(current_timestamp isn't a constant and so the comparison expression
can't be reduced to a constant at plan time.)

The traditional solution for this is to cheat:

create function ago(interval) returns timestamp without time zone as
'select localtimestamp - $1' language sql strict immutable;

select * from admin_event where stamp > ago('1 days');

This works because the function is mislabeled as immutable, encouraging
the planner to fold the result to a constant on sight.  It also has the
pleasant property of making your query more readable.  The downside is
that you are in fact lying to the system about the behavior of the ago()
function, and so you can get in trouble.  This only really works for
queries executed interactively --- you can't use this method inside
plpgsql functions, for instance.

> Distribution of stamp looks like the following:

Hm, you might also find that increasing the statistics target for stamp
would be a good idea, since its distribution is so skewed.  But unless
you do something like the above, the statistics won't get used anyway...

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: query optimization question
Next
From: Jack Coates
Date:
Subject: Re: query optimization question