Re: BUG #2658: Query not using index - Mailing list pgsql-performance

From: Mark Lewis
Subject: Re: BUG #2658: Query not using index
Msg-id: 1159911267.18640.83.camel@archimedes
In response to: Re: BUG #2658: Query not using index (Graham Davis <gdavis@refractions.net>)
Responses: Re: BUG #2658: Query not using index
List: pgsql-performance
Hmmm.  How many distinct assetids are there?

-- Mark Lewis

On Tue, 2006-10-03 at 14:23 -0700, Graham Davis wrote:
> The "summary table" approach maintained by triggers is something we are
> considering, but it becomes a bit more complicated to implement.
> Currently we have groups of new positions coming in every few seconds or
> less.  They are not guaranteed to be in order.  So for instance, a group
> of positions from today could come in and be inserted, then a group of
> positions that got lost from yesterday could come in and be inserted
> afterwards.
>
> This means the triggers would have to do some sort of logic to figure
> out if the newly inserted position is actually the most recent by
> timestamp.  If positions are ever deleted or updated, the same sort of
> query that is currently running slow would need to be executed in order
> to get the new most recent position.  So there is the possibility that
> new positions can be inserted faster than the triggers can calculate
> and maintain the summary table.  There are some other complications
> with maintaining such a summary table in our system too, but I won't get
> into those.
>
> Right now I'm just trying to see if I can get the query itself running
> faster, which would be the easiest solution for now.
>
> Graham.
>
>
> Mark Lewis wrote:
>
> >Have you looked into a materialized view sort of approach?  You could
> >create a table which had assetid as a primary key, and max_ts as a
> >column.  Then use triggers to keep that table up to date as rows are
> >added/updated/removed from the main table.
> >
> >This approach would only make sense if there were far fewer distinct
> >assetid values than rows in the main table, and would get slow if you
> >commonly delete rows from the main table or decrease the value for ts in
> >the row with the highest ts for a given assetid.
> >
> >-- Mark Lewis
> >
> >On Tue, 2006-10-03 at 13:52 -0700, Graham Davis wrote:
> >
> >>Thanks Tom, that explains it and makes sense.  I guess I will have to
> >>accept this query taking 40 seconds, unless I can figure out another way
> >>to write it so it can use indexes.  If there are any more syntax
> >>suggestions, please pass them on.  Thanks for the help everyone.
> >>
> >>Graham.
> >>
> >>
> >>Tom Lane wrote:
> >>
> >>>Graham Davis <gdavis@refractions.net> writes:
> >>>
> >>>>How come an aggregate like that has to use a sequential scan?  I know
> >>>>that PostgreSQL used to have to do a sequential scan for all aggregates,
> >>>>but there was support added to version 8 so that aggregates would take
> >>>>advantage of indexes.
> >>>
> >>>Not in a GROUP BY context, only for the simple case.  Per the comment in
> >>>planagg.c:
> >>>
> >>> * We don't handle GROUP BY, because our current implementations of
> >>> * grouping require looking at all the rows anyway, and so there's not
> >>> * much point in optimizing MIN/MAX.
> >>>
> >>>The problem is that using an index to obtain the maximum value of ts for
> >>>a given value of assetid is not the same thing as finding out what all
> >>>the distinct values of assetid are.
> >>>
> >>>This could possibly be improved but it would take a considerable amount
> >>>more work.  It's definitely not in the category of "bug fix".
> >>>
> >>>			regards, tom lane
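[The rewrite Graham is asking about — a form of this query that can use an index even though GROUP BY MIN/MAX optimization is unavailable — is commonly expressed with DISTINCT ON, or with a per-assetid max() subquery that lets the planner use the simple-case min/max optimization for each asset. This is a sketch, not from the thread itself; the table name `positions` is an assumption, and only the `assetid` and `ts` columns come from the discussion above.]

```sql
-- Assumed schema: positions(assetid, ts, ...); index is the key ingredient.
CREATE INDEX positions_assetid_ts_idx ON positions (assetid, ts DESC);

-- One row per asset with its latest timestamp, walking the index in order
-- instead of aggregating with GROUP BY:
SELECT DISTINCT ON (assetid) assetid, ts AS max_ts
FROM positions
ORDER BY assetid, ts DESC;

-- Alternative when distinct assetids are few: the inner max(ts) is the
-- "simple case" Tom mentions, so each subquery can use the index directly.
SELECT a.assetid,
       (SELECT max(ts) FROM positions p WHERE p.assetid = a.assetid) AS max_ts
FROM (SELECT DISTINCT assetid FROM positions) a;
```

Whether either form beats the 40-second plan depends on the data: DISTINCT ON still reads the whole index, while the subquery form does one index probe per distinct assetid, so it wins when assets are few and rows per asset are many.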
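[For the summary-table approach discussed above, the out-of-order-arrival concern can be handled by having the trigger only ever advance the stored timestamp. A hypothetical sketch in PL/pgSQL follows; the names `positions`, `asset_latest`, `assetid`, and `ts` are assumptions, and it covers INSERT only — DELETE/UPDATE would still need the slow recompute, exactly as the thread notes.]

```sql
-- Assumed summary table: one row per asset, holding its latest timestamp.
CREATE TABLE asset_latest (
    assetid integer PRIMARY KEY,
    max_ts  timestamptz NOT NULL
);

CREATE OR REPLACE FUNCTION asset_latest_upd() RETURNS trigger AS $$
BEGIN
    -- Only advance the summary row; a late-arriving older position
    -- (yesterday's batch inserted after today's) changes nothing.
    UPDATE asset_latest
       SET max_ts = NEW.ts
     WHERE assetid = NEW.assetid
       AND max_ts < NEW.ts;
    IF NOT FOUND THEN
        BEGIN
            -- First position seen for this asset.
            INSERT INTO asset_latest (assetid, max_ts)
            VALUES (NEW.assetid, NEW.ts);
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- row already exists with a newer max_ts; nothing to do
        END;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER positions_latest
    AFTER INSERT ON positions
    FOR EACH ROW EXECUTE PROCEDURE asset_latest_upd();
```

With this in place the slow aggregate becomes `SELECT assetid, max_ts FROM asset_latest`, at the cost of one extra UPDATE per insert — which is the throughput trade-off Graham raises above.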