Re: bad estimates - Mailing list pgsql-performance

From Ken Geis
Subject Re: bad estimates
Date
Msg-id 3F4ECC78.7090502@speakeasy.org
Whole thread Raw
In response to Re: bad estimates / non-scanning aggregates  (Bruno Wolff III <bruno@wolff.to>)
Responses Re: bad estimates
List pgsql-performance
Bruno Wolff III wrote:
> On Thu, Aug 28, 2003 at 20:00:32 -0700,
>   Ken Geis <kgeis@speakeasy.org> wrote:
>
>>Bruno Wolff III wrote:
>>
>>>>Not according to the optimizer!  Plus, this is not guaranteed to return
>>>>the correct results.
>>>
>>>For it to be fast you need an index on (stock_id, price_date) so that
>>>you can use an index scan.
>>
>>I already said that such an index existed.  In fact, it is the primary
>>key of the table.  And yes, I *am* analyzed!
>
>
> Your original example didn't actually match that of the table you are showing
> examples from. In that example the second half of the primary key was the
> date not the end of the day price. If this is the case for the real table,
> then that is the reason the distinct on doesn't help.

I had obfuscated the table in the example and forgot to do the same with
the query.  Serves me right for thinking I care about that.

A big problem is that the values I am working with are *only* the
primary key and the optimizer is choosing a table scan over an index
scan.  That is why I titled the email "bad estimates."  The table has
(stock_id, price_date) as the primary key, and a bunch of other columns.
  What I *really* want to do efficiently is

select stock_id, min(price_date), max(price_date)
   from day_ends
  group by stock_id;

It is not the table or the query that is wrong.  It is either the db
parameters or the optimizer itself.


Ken


pgsql-performance by date:

Previous
From: Eko Pranoto
Date:
Subject: PostgreSQL HDD Grow capacity
Next
From: Bruno Wolff III
Date:
Subject: Re: bad estimates