Thread: Can this be indexed?

Can this be indexed?

From
"Net Virtual Mailing Lists"
Date:
Is there a way to create an index that would make this query be efficient
and not perform a sequential scan?

SELECT count(*) AS count,id FROM sometable GROUP BY id;

.. I've considered creating a rule on this table which would put the
results of this into another table anytime it is updated, but I thought
there might be an easier way.

Thanks!

- Greg


Re: Can this be indexed?

From
"Ed L."
Date:
On Saturday November 6 2004 7:34, Net Virtual Mailing Lists wrote:
> Is there a way to create an index that would make this query be efficient
> and not perform a sequential scan?
>
> SELECT count(*) AS count,id FROM sometable GROUP BY id;
>
> .. I've considered creating a rule on this table which would put the
> results of this into another table anytime it is updated, but I thought
> there might be an easier way.

Since you have no "WHERE" clause and you want to group by id, I believe
pgsql has to scan all id values.  Those id values are only fully stored in
the table, so I don't think so.

Ed


Re: Can this be indexed?

From
"Net Virtual Mailing Lists"
Date:
I am not clear how to use a trigger for this, I will need to look into
that....

It is my understanding that Postgres does not have materialized views
though (which I believe would solve this problem nicely) - am I mistaken?...


- Greg


>Net Virtual Mailing Lists wrote:
>> Is there a way to create an index that would make this query be efficient
>> and not perform a sequential scan?
>>
>> SELECT count(*) AS count,id FROM sometable GROUP BY id;
>
>Indexes cannot be used for retrieving results...
>
>
>> .. I've considered creating a rule on this table which would put the
>> results of this into another table anytime it is updated, but I thought
>> there might be an easier way.
>
>I don't think that a rule could come useful in this case. IMHO you could
>use triggers or a materialized view to store the results and speed up
>things.
>
>
>Best regards
>--
>Matteo Beccati
>http://phpadsnew.com/
>http://phppgads.com/
>



Re: Can this be indexed?

From
Brent Wood
Date:

>
> >Net Virtual Mailing Lists wrote:
> >> Is there a way to create an index that would make this query be efficient
> >> and not perform a sequential scan?
> >>
> >> SELECT count(*) AS count,id FROM sometable GROUP BY id;
> >
> >Indexes cannot be used for retrieving results...

I'm not sure if it would improve performance at all, given the entire
table needs to be scanned anyway, but add a where clause "where id > 0"
should allow an index on id to be used.

Possibly a bit like speeding up "select max(id) from ..."

Someone who knows more about the internals of Postgis can prob comment on
the validity/idiocy of this suggestion :-)

Brent Wood

Re: Can this be indexed?

From
Brent Wood
Date:

>
> >Net Virtual Mailing Lists wrote:
> >> Is there a way to create an index that would make this query be efficient
> >> and not perform a sequential scan?
> >>
> >> SELECT count(*) AS count,id FROM sometable GROUP BY id;
> >
> >Indexes cannot be used for retrieving results...

I'm not sure if it would improve performance at all, given the entire
table needs to be scanned anyway, but add a where clause "where id > 0"
should allow an index on id to be used.

Possibly a bit like speeding up "select max(id) from ..."

Someone who knows more about the internals of Postgis can prob comment on
the validity/idiocy of this suggestion :-)

Brent Wood

Re: Can this be indexed?

From
"Jerry III"
Date:
But if you do build an index over "id" then pgsql would only have to do a
sequential scan on that index, which might be a lot faster if your table
contains a lot of other data, won't it?

Jerry

""Ed L."" <pgsql@bluepolka.net> wrote in message
news:200411060930.30859.pgsql@bluepolka.net...
> On Saturday November 6 2004 7:34, Net Virtual Mailing Lists wrote:
>> Is there a way to create an index that would make this query be efficient
>> and not perform a sequential scan?
>>
>> SELECT count(*) AS count,id FROM sometable GROUP BY id;
>>
>> .. I've considered creating a rule on this table which would put the
>> results of this into another table anytime it is updated, but I thought
>> there might be an easier way.
>
> Since you have no "WHERE" clause and you want to group by id, I believe
> pgsql has to scan all id values.  Those id values are only fully stored in
> the table, so I don't think so.
>
> Ed
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>



Re: Can this be indexed?

From
Tatsuo Ishii
Date:
> Is there a way to create an index that would make this query be efficient
> and not perform a sequential scan?
>
> SELECT count(*) AS count,id FROM sometable GROUP BY id;
>
> .. I've considered creating a rule on this table which would put the
> results of this into another table anytime it is updated, but I thought
> there might be an easier way.

Sure. Try to create an index on id. Another way to improve this query
is to use HashAggregate (this is new in 7.4). Sometimes it is much
faster than group-by-using-index-scan. To enable HashAggregate
you might want to increase sort_mem.
--
Tatsuo Ishii

Re: Can this be indexed?

From
"Markus Wollny"
Date:
PostgreSQL doesn't provide pre-configured support for materialized views as such, but using some PL/pgSQL and triggers,
onecan easily implement any kind of materialized view as seen fit for the specific intended purpose (Snapshot, Eager,
Lazy,Very Lazy). 

You may find an excellent tutorial on materialized views with PostgreSQL here:
http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html 

> -----Ursprüngliche Nachricht-----
> Von: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] Im Auftrag von
> Net Virtual Mailing Lists
> Gesendet: Samstag, 6. November 2004 16:49
> An: Matteo Beccati
> Betreff: Re: [GENERAL] Can this be indexed?
>
> I am not clear how to use a trigger for this, I will need to
> look into that....
>
> It is my understanding that Postgres does not have
> materialized views though (which I believe would solve this
> problem nicely) - am I mistaken?...
>
>
> - Greg
>

Re: Can this be indexed?

From
"Net Virtual Mailing Lists"
Date:
Markus,

Thank you for your hint!

I spent the better part of last night working on this and finally was
able to get it to work the way I wanted.   The short version: I am
continually amazed by the flexibility in Postgres, this isn't the sort of
thing I'd want to go back years from now and digest what I did but this
really has given me the best of both worlds: *very* quick query times
(4631ms down to 2 ms!) and when the data changes the trigger which does
the update is very quick, which beats my previous plan which involved
rerunning the query again and then caching the result for subsequent
queries....

Really this is great stuff!  I simply cannot thank you (and all the other
folks on this list who have helped me) enough!

- Greg

>PostgreSQL doesn't provide pre-configured support for materialized views
>as such, but using some PL/pgSQL and triggers, one can easily implement
>any kind of materialized view as seen fit for the specific intended
>purpose (Snapshot, Eager, Lazy, Very Lazy).
>
>You may find an excellent tutorial on materialized views with PostgreSQL
>here: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html
>
>> -----Ursprüngliche Nachricht-----
>> Von: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] Im Auftrag von
>> Net Virtual Mailing Lists
>> Gesendet: Samstag, 6. November 2004 16:49
>> An: Matteo Beccati
>> Betreff: Re: [GENERAL] Can this be indexed?
>>
>> I am not clear how to use a trigger for this, I will need to
>> look into that....
>>
>> It is my understanding that Postgres does not have
>> materialized views though (which I believe would solve this
>> problem nicely) - am I mistaken?...
>>
>>
>> - Greg
>>



Re: Can this be indexed?

From
Bruno Wolff III
Date:
On Sun, Nov 07, 2004 at 09:29:30 +0000,
  Jerry III <jerryiii@hotmail.com> wrote:
> But if you do build an index over "id" then pgsql would only have to do a
> sequential scan on that index, which might be a lot faster if your table
> contains a lot of other data, won't it?

A full table index scan will be slower than a sequential scan; typically by
a lot. In the old days a sort step would have been needed and that would have
slowed things down. Now a method using hashing is available that will
work unless there is an extremely large number of unique values for "id".