Josh,
I am no database expert, and even less knowledgeable about the internals
of postgresql, so I'll trust you on the 2 points you make below.
Are you saying the 7.4 'group by' trick would be faster than the simple select count(*)?
That seems hard to believe, being that the request now has to fetch / sort the data.
I must be missing something.
The kind of requests that I am really interested in are:
select count(*) from table where table.column like 'pattern%'
These seems to go much master on mysql (which I guess it not a MVCC database? or wasn't
the Innobase supposed to make it so?), than on postgresql.
So, in the meantime, I've decided to split up my data into two sets,
the static big tables which are handled by mysql, and the rest of it handled
by postgresql....
ps: apologies for the cross-posting.
> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: Tuesday, April 08, 2003 2:53 PM
> To: Denis; pgsql-performance@postgresql.org
> Subject: Re: [SQL] Yet Another (Simple) Case of Index not used
>
>
> Dennis,
>
> > I'm running into a quite puzzling simple example where the index I've
> > created on a fairly big table (465K entries) is not used, against all common
> > sense expectations:
> > The query I am trying to do (fast) is:
> >
> > select count(*) from addresses;
>
> PostgreSQL is currently unable to use indexes on aggregate queries. This is
> because of two factors:
> 1) MVCC means that the number of rows must be recalculated for each
> connection's current transaction, and cannot be "cached" anywhere by the
> database system;
> 2) Our extensible model of user-defined aggregates means that each aggregate
> is a "black box" whose internal operations are invisible to the planner.
>
> This is a known performance issue for Postgres, and I believe that a couple of
> people on Hackers are looking at modifying aggregate implementation for 8.0
> to use appropriate available indexes, at least for MIN, MAX and COUNT. Until
> then, you will need to either put up with the delay, or create a
> trigger-driven aggregates caching table.
>
> If you are trying to do a correlated count, like "SELECT type, count(*) from
> aggregates GROUP BY type", Tom Lane has already added a hash-aggregates
> structure in the 7.4 source that will speed this type of query up
> considerably for systems with lots of RAM.
>
> (PS: in the future, please stick to posting questions to one list at a time,
> thanks)
>
> --
> -Josh Berkus
> Aglio Database Solutions
> San Francisco
>