Agreed.
We originally created the indexes this way because we sometimes do
searches where one of the columns is constrained using =, and the other
using a range search, but it's not clear to me how much Postgres
understands multi-column indexes. Will I get the gain I'd expect from a
(member_id, group_id) index on a query like "where member_id = ? and
group_id > ?"?
I've since found a few other often-used tables where the reltuples
counts generated by 'analyze' are off by a factor of 5 or more. In the
short term, I'm just trying to eliminate the automatic-analyzes where
possible and make sure they're followed up quickly with a 'vacuum' where
it's not possible.
Is "analyze generating bad stats" a known issue? Is there anything I
could be doing to aggravate or work around the problem?
mike
-----Original Message-----
From: Rod Taylor [mailto:ports@rbt.ca]
Sent: Friday, June 04, 2004 5:27 PM
To: Michael Nonemacher
Cc: Postgresql Performance
Subject: Re: [PERFORM] postgres performance: comparing 2 data centers
> The members table contains about 500k rows. It has an index on
> (group_id, member_id) and on (member_id, group_id).
Yes, bad stats are causing it to pick a poor plan, but you're giving it
too many options (which doesn't help) and using space up unnecessarily.
Keep (group_id, member_id)
Remove (member_id, group_id)
Add (member_id)
An index on just member_id is actually going to perform better than
member_id, group_id since it has a smaller footprint on the disk.
Anytime where both group_id and member_id are in the query, the
(group_id, member_id) index will likely be used.
--
Rod Taylor <rbt [at] rbt [dot] ca>
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL PGP Key:
http://www.rbt.ca/signature.asc