Hmm, interesting as I have that table clustered starting with the
rep_component, so 'ps_probe' will definitely appear later in a sequential
scan. So why does the <order by> force the use of the index?
Regards!
Ed
On Thu, 29 Apr 2004, Tom Lane wrote:
> Manfred Koizar <mkoi-pg@aon.at> writes:
> > The planner thinks that the seq scan has a startup cost of 0.00, i.e.
> > that it can return the first tuple immediately, which is obviously not
> > true in the presence of a filter condition.
>
> Not really --- the startup cost is really defined as "cost expended
> before we can start scanning for results". The estimated cost to select
> N tuples is actually "startup_cost + N*(total_cost-startup_cost)/M",
> where M is the estimated total rows returned. This is why the LIMIT
> shows a nonzero estimate for the cost to fetch 1 row.
>
> > Unfortunately there's no
> > easy way to fix this, because the statistics information does not have
> > information about the physical position of tuples with certain vaules.
>
> Yeah, I think the real problem is that the desired rows are not
> uniformly distributed, and in fact there are none near the start of the
> table. We do not keep stats detailed enough to let the planner discover
> this, so it has to estimate on the assumption of uniform distribution.
> On that assumption, it looks like a seqscan will hit a suitable tuple
> quickly enough to be faster than using the index.
>
> regards, tom lane
>