On Tue, Jan 18, 2011 at 8:35 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011:
>>
>> Hello Postgres Hackers,
>>
>> In reference to this todo item about clustering system table indexes,
>> ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php )
>> I have been studying the system tables to see which would benefit from
>> clustering. I have some index suggestions and a question if you have a
>> moment.
>
> Wow, this is really old stuff. I don't know if this is really of any
> benefit, given that these catalogs are loaded into syscaches anyway.
> Furthermore, if you cluster at initdb time, they will soon lose the
> ordering, given that updates move tuples around and inserts put them
> anywhere. So you'd need the catalogs to be re-clustered once in a
> while, and I don't see how you'd do that (except by asking the user to
> do it, which doesn't sound so great).
The idea of the TODO seems to have been to set the default clustering
to something reasonable. That doesn't necessarily seem like a bad
idea even if we can't automatically maintain the cluster order, but
there's some question in my mind whether we'd get any measurable
benefit from the clustering. Even on a database with a gigantic
number of tables, it seems likely that the relevant system catalogs
will stay fully cached and, as you point out, the system caches will
further blunt the impact of any work in this area. I think the first
thing to do would be to try to come up with a reproducible test case
where clustering the tables improves performance. If we can't, that
might mean it's time to remove this TODO.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company