Home > mailing lists

Re: bad estimates - Mailing list pgsql-performance

From	Ken Geis
Subject	Re: bad estimates
Date	August 29, 2003 00:46:37
Msg-id	3F4ECC78.7090502@speakeasy.org Whole thread Raw
In response to	Re: bad estimates / non-scanning aggregates (Bruno Wolff III <bruno@wolff.to>)
Responses	Re: bad estimates
List	pgsql-performance

Tree view

Bruno Wolff III wrote:
> On Thu, Aug 28, 2003 at 20:00:32 -0700,
>   Ken Geis <kgeis@speakeasy.org> wrote:
>
>>Bruno Wolff III wrote:
>>
>>>>Not according to the optimizer!  Plus, this is not guaranteed to return
>>>>the correct results.
>>>
>>>For it to be fast you need an index on (stock_id, price_date) so that
>>>you can use an index scan.
>>
>>I already said that such an index existed.  In fact, it is the primary
>>key of the table.  And yes, I *am* analyzed!
>
>
> Your original example didn't actually match that of the table you are showing
> examples from. In that example the second half of the primary key was the
> date not the end of the day price. If this is the case for the real table,
> then that is the reason the distinct on doesn't help.

I had obfuscated the table in the example and forgot to do the same with
the query.  Serves me right for thinking I care about that.

A big problem is that the values I am working with are *only* the
primary key and the optimizer is choosing a table scan over an index
scan.  That is why I titled the email "bad estimates."  The table has
(stock_id, price_date) as the primary key, and a bunch of other columns.
  What I *really* want to do efficiently is

select stock_id, min(price_date), max(price_date)
   from day_ends
  group by stock_id;

It is not the table or the query that is wrong.  It is either the db
parameters or the optimizer itself.

Ken

pgsql-performance by date:

From: Eko Pranoto
Date: 29 August 2003, 00:40:49
Subject: PostgreSQL HDD Grow capacity

From: Bruno Wolff III
Date: 29 August 2003, 00:59:39
Subject: Re: bad estimates

Re: bad estimates - Mailing list pgsql-performance

Previous

Next