Thread: Can this be indexed?
Is there a way to create an index that would make this query be efficient and not perform a sequential scan? SELECT count(*) AS count,id FROM sometable GROUP BY id; .. I've considered creating a rule on this table which would put the results of this into another table anytime it is updated, but I thought there might be an easier way. Thanks! - Greg
On Saturday November 6 2004 7:34, Net Virtual Mailing Lists wrote: > Is there a way to create an index that would make this query be efficient > and not perform a sequential scan? > > SELECT count(*) AS count,id FROM sometable GROUP BY id; > > .. I've considered creating a rule on this table which would put the > results of this into another table anytime it is updated, but I thought > there might be an easier way. Since you have no "WHERE" clause and you want to group by id, I believe pgsql has to scan all id values. Those id values are only fully stored in the table, so I don't think so. Ed
I am not clear how to use a trigger for this, I will need to look into that.... It is my understanding that Postgres does not have materialized views though (which I believe would solve this problem nicely) - am I mistaken?... - Greg >Net Virtual Mailing Lists wrote: >> Is there a way to create an index that would make this query be efficient >> and not perform a sequential scan? >> >> SELECT count(*) AS count,id FROM sometable GROUP BY id; > >Indexes cannot be used for retrieving results... > > >> .. I've considered creating a rule on this table which would put the >> results of this into another table anytime it is updated, but I thought >> there might be an easier way. > >I don't think that a rule could come useful in this case. IMHO you could >use triggers or a materialized view to store the results and speed up >things. > > >Best regards >-- >Matteo Beccati >http://phpadsnew.com/ >http://phppgads.com/ >
> > >Net Virtual Mailing Lists wrote: > >> Is there a way to create an index that would make this query be efficient > >> and not perform a sequential scan? > >> > >> SELECT count(*) AS count,id FROM sometable GROUP BY id; > > > >Indexes cannot be used for retrieving results... I'm not sure if it would improve performance at all, given the entire table needs to be scanned anyway, but add a where clause "where id > 0" should allow an index on id to be used. Possibly a bit like speeding up "select max(id) from ..." Someone who knows more about the internals of Postgis can prob comment on the validity/idiocy of this suggestion :-) Brent Wood
> > >Net Virtual Mailing Lists wrote: > >> Is there a way to create an index that would make this query be efficient > >> and not perform a sequential scan? > >> > >> SELECT count(*) AS count,id FROM sometable GROUP BY id; > > > >Indexes cannot be used for retrieving results... I'm not sure if it would improve performance at all, given the entire table needs to be scanned anyway, but add a where clause "where id > 0" should allow an index on id to be used. Possibly a bit like speeding up "select max(id) from ..." Someone who knows more about the internals of Postgis can prob comment on the validity/idiocy of this suggestion :-) Brent Wood
But if you do build an index over "id" then pgsql would only have to do a sequential scan on that index, which might be a lot faster if your table contains a lot of other data, won't it? Jerry ""Ed L."" <pgsql@bluepolka.net> wrote in message news:200411060930.30859.pgsql@bluepolka.net... > On Saturday November 6 2004 7:34, Net Virtual Mailing Lists wrote: >> Is there a way to create an index that would make this query be efficient >> and not perform a sequential scan? >> >> SELECT count(*) AS count,id FROM sometable GROUP BY id; >> >> .. I've considered creating a rule on this table which would put the >> results of this into another table anytime it is updated, but I thought >> there might be an easier way. > > Since you have no "WHERE" clause and you want to group by id, I believe > pgsql has to scan all id values. Those id values are only fully stored in > the table, so I don't think so. > > Ed > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >
> Is there a way to create an index that would make this query be efficient > and not perform a sequential scan? > > SELECT count(*) AS count,id FROM sometable GROUP BY id; > > .. I've considered creating a rule on this table which would put the > results of this into another table anytime it is updated, but I thought > there might be an easier way. Sure. Try to create an index on id. Another way to improve this query is to use HashAggregate (this is new in 7.4). Sometimes it is much faster than group-by-using-index-scan. To enable HashAggregate you might want to increase sort_mem. -- Tatsuo Ishii
PostgreSQL doesn't provide pre-configured support for materialized views as such, but using some PL/pgSQL and triggers, onecan easily implement any kind of materialized view as seen fit for the specific intended purpose (Snapshot, Eager, Lazy,Very Lazy). You may find an excellent tutorial on materialized views with PostgreSQL here: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html > -----Ursprüngliche Nachricht----- > Von: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] Im Auftrag von > Net Virtual Mailing Lists > Gesendet: Samstag, 6. November 2004 16:49 > An: Matteo Beccati > Betreff: Re: [GENERAL] Can this be indexed? > > I am not clear how to use a trigger for this, I will need to > look into that.... > > It is my understanding that Postgres does not have > materialized views though (which I believe would solve this > problem nicely) - am I mistaken?... > > > - Greg >
Markus, Thank you for your hint! I spent the better part of last night working on this and finally was able to get it to work the way I wanted. The short version: I am continually amazed by the flexibility in Postgres, this isn't the sort of thing I'd want to go back years from now and digest what I did but this really has given me the best of both worlds: *very* quick query times (4631ms down to 2 ms!) and when the data changes the trigger which does the update is very quick, which beats my previous plan which involved rerunning the query again and then caching the result for subsequent queries.... Really this is great stuff! I simply cannot thank you (and all the other folks on this list who have helped me) enough! - Greg >PostgreSQL doesn't provide pre-configured support for materialized views >as such, but using some PL/pgSQL and triggers, one can easily implement >any kind of materialized view as seen fit for the specific intended >purpose (Snapshot, Eager, Lazy, Very Lazy). > >You may find an excellent tutorial on materialized views with PostgreSQL >here: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html > >> -----Ursprüngliche Nachricht----- >> Von: pgsql-general-owner@postgresql.org >> [mailto:pgsql-general-owner@postgresql.org] Im Auftrag von >> Net Virtual Mailing Lists >> Gesendet: Samstag, 6. November 2004 16:49 >> An: Matteo Beccati >> Betreff: Re: [GENERAL] Can this be indexed? >> >> I am not clear how to use a trigger for this, I will need to >> look into that.... >> >> It is my understanding that Postgres does not have >> materialized views though (which I believe would solve this >> problem nicely) - am I mistaken?... >> >> >> - Greg >>
On Sun, Nov 07, 2004 at 09:29:30 +0000, Jerry III <jerryiii@hotmail.com> wrote: > But if you do build an index over "id" then pgsql would only have to do a > sequential scan on that index, which might be a lot faster if your table > contains a lot of other data, won't it? A full table index scan will be slower than a sequential scan; typically by a lot. In the old days a sort step would have been needed and that would have slowed things down. Now a method using hashing is available that will work unless there is an extremely large number of unique values for "id".