Re: BUG #2658: Query not using index - Mailing list pgsql-performance

From Mark Lewis
Subject Re: BUG #2658: Query not using index
Msg-id 1159911267.18640.83.camel@archimedes
In response to Re: BUG #2658: Query not using index  (Graham Davis <gdavis@refractions.net>)
List pgsql-performance
Hmmm.  How many distinct assetids are there?
-- Mark Lewis
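
[A quick way to answer that question, assuming the main table is named positions (a hypothetical name); the summary-table idea only pays off when the first count is much smaller than the second:]

```sql
-- Compare distinct assetids to total rows; a large gap means a
-- per-asset summary table stays small and cheap to maintain.
SELECT count(DISTINCT assetid) AS distinct_assets,
       count(*)                AS total_rows
FROM positions;
```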

On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
> The "summary table" approach maintained by triggers is something we are
> considering, but it becomes a bit more complicated to implement.
> Currently we have groups of new positions coming in every few seconds or
> less.  They are not guaranteed to be in order.  So for instance, a group
> of positions from today could come in and be inserted, then a group of
> positions that got lost from yesterday could come in and be inserted
> afterwards.
>
> This means the triggers would have to do some sort of logic to figure
> out if the newly inserted position is actually the most recent by
> timestamp.  If positions are ever deleted or updated, the same sort of
> query that is currently running slow will need to be executed in order
> to get the new most recent position.  So there is the possibility that
> new positions can be inserted faster than the triggers can calculate
> and maintain the summary table.  There are some other complications
> with maintaining such a summary table in our system too, but I won't get
> into those.
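
[A minimal sketch of the ordering logic such a trigger needs, assuming hypothetical names positions(assetid, ts) and latest_position. Note that ON CONFLICT requires PostgreSQL 9.5 or later, well after this thread; on older versions the trigger would need a separate UPDATE-then-INSERT. Deletes and updates would still need the slow recomputation described above:]

```sql
-- Hypothetical summary table: one row per asset, holding its newest timestamp.
CREATE TABLE latest_position (
    assetid integer PRIMARY KEY,
    max_ts  timestamp NOT NULL
);

CREATE FUNCTION update_latest() RETURNS trigger AS $$
BEGIN
    -- Insert a summary row, or advance max_ts only when the incoming
    -- position is newer; late-arriving (out-of-order) rows are ignored.
    INSERT INTO latest_position (assetid, max_ts)
    VALUES (NEW.assetid, NEW.ts)
    ON CONFLICT (assetid)
    DO UPDATE SET max_ts = EXCLUDED.max_ts
    WHERE latest_position.max_ts < EXCLUDED.max_ts;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER positions_latest
    AFTER INSERT ON positions
    FOR EACH ROW EXECUTE PROCEDURE update_latest();
```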
>
> Right now I'm just trying to see if I can get the query itself running
> faster, which would be the easiest solution for now.
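
[One rewrite worth trying, assuming the table is positions(assetid, ts): DISTINCT ON paired with a composite index lets the planner emit each asset's newest row from an ordered index scan instead of sorting or hashing for the aggregate. It still visits every row, so the win depends on how many positions each asset has. Per-column DESC in an index definition requires PostgreSQL 8.3 or later:]

```sql
CREATE INDEX positions_assetid_ts ON positions (assetid, ts DESC);

-- Equivalent to: SELECT assetid, max(ts) FROM positions GROUP BY assetid
SELECT DISTINCT ON (assetid) assetid, ts
FROM positions
ORDER BY assetid, ts DESC;
```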
>
> Graham.
>
>
> Mark Lewis wrote:
>
> >Have you looked into a materialized view sort of approach?  You could
> >create a table which had assetid as a primary key, and max_ts as a
> >column.  Then use triggers to keep that table up to date as rows are
> >added/updated/removed from the main table.
> >
> >This approach would only make sense if there were far fewer distinct
> >assetid values than rows in the main table, and would get slow if you
> >commonly delete rows from the main table or decrease the value for ts in
> >the row with the highest ts for a given assetid.
> >
> >-- Mark Lewis
> >
> >On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
> >
> >
> >>Thanks Tom, that explains it and makes sense.  I guess I will have to
> >>accept this query taking 40 seconds, unless I can figure out another way
> >>to write it so it can use indexes.  If there are any more syntax
> >>suggestions, please pass them on.  Thanks for the help everyone.
> >>
> >>Graham.
> >>
> >>
> >>Tom Lane wrote:
> >>
> >>
> >>
> >>>Graham Davis <gdavis@refractions.net> writes:
> >>>
> >>>>How come an aggregate like that has to use a sequential scan?  I know
> >>>>that PostgreSQL used to have to do a sequential scan for all aggregates,
> >>>>but there was support added to version 8 so that aggregates would take
> >>>>advantage of indexes.
> >>>>
> >>>Not in a GROUP BY context, only for the simple case.  Per the comment in
> >>>planagg.c:
> >>>
> >>>     * We don't handle GROUP BY, because our current implementations of
> >>>     * grouping require looking at all the rows anyway, and so there's not
> >>>     * much point in optimizing MIN/MAX.
> >>>
> >>>The problem is that using an index to obtain the maximum value of ts for
> >>>a given value of assetid is not the same thing as finding out what all
> >>>the distinct values of assetid are.
> >>>
> >>>This could possibly be improved but it would take a considerable amount
> >>>more work.  It's definitely not in the category of "bug fix".
> >>>
> >>>            regards, tom lane
> >>>
> >>
> >>
>
>
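
[Tom's point is that an index can find max(ts) for one given assetid, but cannot by itself enumerate the distinct assetids. On PostgreSQL 8.4 and later (newer than this thread), that enumeration can be hand-rolled as a recursive "loose index scan" that hops between distinct values of an index on (assetid, ts); a sketch, again assuming a hypothetical positions(assetid, ts) table:]

```sql
WITH RECURSIVE assets AS (
    -- Seed with the smallest assetid, then repeatedly jump to the
    -- next-larger one; each step is a cheap index probe.
    SELECT min(assetid) AS assetid FROM positions
    UNION ALL
    SELECT (SELECT min(assetid) FROM positions p
            WHERE p.assetid > a.assetid)
    FROM assets a
    WHERE a.assetid IS NOT NULL
)
SELECT assetid,
       (SELECT max(ts) FROM positions p
        WHERE p.assetid = assets.assetid) AS max_ts
FROM assets
WHERE assetid IS NOT NULL;
```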
