Re: BUG #2658: Query not using index - Mailing list pgsql-performance

From: Graham Davis
Subject: Re: BUG #2658: Query not using index
Msg-id: 4522D7B1.3040008@refractions.net
In response to: Re: BUG #2658: Query not using index (Mark Lewis <mark.lewis@mir3.com>)
Responses: Re: BUG #2658: Query not using index (Mark Lewis <mark.lewis@mir3.com>)
List: pgsql-performance
Not many.  It fluctuates, but there are usually only ever a few hundred at
most.  Each assetid has multi-millions of positions though.

Mark Lewis wrote:

> Hmmm.  How many distinct assetids are there?
> -- Mark Lewis
>
> On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
>
>> The "summary table" approach maintained by triggers is something we are
>> considering, but it becomes a bit more complicated to implement.
>> Currently we have groups of new positions coming in every few seconds or
>> less.  They are not guaranteed to be in order.  So for instance, a group
>> of positions from today could come in and be inserted, then a group of
>> positions that got lost from yesterday could come in and be inserted
>> afterwards.
>>
>> This means the triggers would have to do some sort of logic to figure
>> out if the newly inserted position is actually the most recent by
>> timestamp.  If positions are ever deleted or updated, the same sort of
>> query that is currently running slow will need to be executed in order
>> to get the new most recent position.  So there is the possibility that
>> new positions can be inserted faster than the triggers can calculate
>> and maintain the summary table.  There are some other complications
>> with maintaining such a summary table in our system too, but I won't get
>> into those.
>>
>> Right now I'm just trying to see if I can get the query itself running
>> faster, which would be the easiest solution for now.
>>
>> Graham.
>>
>> Mark Lewis wrote:
>>
>>> Have you looked into a materialized view sort of approach?  You could
>>> create a table which had assetid as a primary key, and max_ts as a
>>> column.  Then use triggers to keep that table up to date as rows are
>>> added/updated/removed from the main table.
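[Editor's note: the "newer-timestamp-only" trigger logic described above can be sketched roughly as below.  The table and column names (positions, latest_positions, assetid, ts) are assumed from the thread, not taken from the actual schema, and the ON CONFLICT upsert requires PostgreSQL 9.5+ — on the 8.x releases current when this thread was written, the same effect needs an UPDATE-then-INSERT loop.  A sketch, not the poster's implementation:]

```sql
-- Hypothetical summary table: one row per asset, holding its most recent ts.
CREATE TABLE latest_positions (
    assetid integer PRIMARY KEY,
    ts      timestamp NOT NULL
);

CREATE OR REPLACE FUNCTION maintain_latest_position() RETURNS trigger AS $$
BEGIN
    -- Insert the asset's row, or keep whichever timestamp is newer, so a
    -- late-arriving (out-of-order) position cannot overwrite a newer one.
    INSERT INTO latest_positions (assetid, ts)
    VALUES (NEW.assetid, NEW.ts)
    ON CONFLICT (assetid)
    DO UPDATE SET ts = EXCLUDED.ts
    WHERE latest_positions.ts < EXCLUDED.ts;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER positions_latest
    AFTER INSERT ON positions
    FOR EACH ROW EXECUTE FUNCTION maintain_latest_position();
```

[As the thread notes, this only covers INSERT; DELETEs or UPDATEs that lower the maximum ts would still require re-running the slow aggregate for that asset.]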
>>> This approach would only make sense if there were far fewer distinct
>>> assetid values than rows in the main table, and would get slow if you
>>> commonly delete rows from the main table or decrease the value for ts in
>>> the row with the highest ts for a given assetid.
>>>
>>> -- Mark Lewis
>>>
>>> On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
>>>
>>>> Thanks Tom, that explains it and makes sense.  I guess I will have to
>>>> accept this query taking 40 seconds, unless I can figure out another way
>>>> to write it so it can use indexes.  If there are any more syntax
>>>> suggestions, please pass them on.  Thanks for the help everyone.
>>>>
>>>> Graham.
>>>>
>>>> Tom Lane wrote:
>>>>
>>>>> Graham Davis <gdavis@refractions.net> writes:
>>>>>
>>>>>> How come an aggregate like that has to use a sequential scan?  I know
>>>>>> that PostgreSQL used to have to do a sequential scan for all aggregates,
>>>>>> but there was support added to version 8 so that aggregates would take
>>>>>> advantage of indexes.
>>>>>
>>>>> Not in a GROUP BY context, only for the simple case.  Per the comment in
>>>>> planagg.c:
>>>>>
>>>>>  * We don't handle GROUP BY, because our current implementations of
>>>>>  * grouping require looking at all the rows anyway, and so there's not
>>>>>  * much point in optimizing MIN/MAX.
>>>>>
>>>>> The problem is that using an index to obtain the maximum value of ts for
>>>>> a given value of assetid is not the same thing as finding out what all
>>>>> the distinct values of assetid are.
>>>>>
>>>>> This could possibly be improved but it would take a considerable amount
>>>>> more work.  It's definitely not in the category of "bug fix".
>>>>>
>>>>> 			regards, tom lane

--
Graham Davis
Refractions Research Inc.
gdavis@refractions.net
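[Editor's note: for the "get the query itself running faster" angle, a rewrite commonly suggested for this greatest-row-per-group pattern (not proposed in this thread itself, so treat it as an assumption) is PostgreSQL's DISTINCT ON with a matching composite index.  Table and column names are assumed as above:]

```sql
-- Composite index whose order matches the DISTINCT ON / ORDER BY below.
CREATE INDEX positions_assetid_ts_idx ON positions (assetid, ts DESC);

-- One row per assetid: the row with the greatest ts for that asset.
SELECT DISTINCT ON (assetid) assetid, ts
FROM positions
ORDER BY assetid, ts DESC;
```

[Whether this beats the GROUP BY plan depends on the planner and version: PostgreSQL has no native skip scan, so with a few hundred assetids and millions of rows each, the reliably fast shape on the versions of that era was a per-asset lookup (a correlated LIMIT 1 subquery against the composite index) driven by the small list of distinct assetids.]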