Re: ToDo List Item - System Table Index Clustering - Mailing list pgsql-hackers

From Robert Haas
Subject Re: ToDo List Item - System Table Index Clustering
Date
Msg-id AANLkTi=irkZGyFCwmNYyQcxpzeon8LF+DbdtufpxsgQG@mail.gmail.com
Whole thread Raw
In response to Re: ToDo List Item - System Table Index Clustering  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: ToDo List Item - System Table Index Clustering  ("Simone Aiken" <saiken@ulfheim.net>)
Re: ToDo List Item - System Table Index Clustering  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Tue, Jan 18, 2011 at 8:35 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011:
>>
>> Hello Postgres Hackers,
>>
>> In reference to this todo item about clustering system table indexes,
>> ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php )
>> I have been studying the system tables to see which would benefit  from
>> clustering.  I have some index suggestions and a question if you have a
>> moment.
>
> Wow, this is really old stuff.  I don't know if this is really of any
> benefit, given that these catalogs are loaded into syscaches anyway.
> Furthermore, if you cluster at initdb time, they will soon lose the
> ordering, given that updates move tuples around and inserts put them
> anywhere.  So you'd need the catalogs to be re-clustered once in a
> while, and I don't see how you'd do that (except by asking the user to
> do it, which doesn't sound so great).

The idea of the TODO seems to have been to set the default clustering
to something reasonable.  That doesn't necessarily seem like a bad
idea even if we can't automatically maintain the cluster order, but
there's some question in my mind whether we'd get any measurable
benefit from the clustering.  Even on a database with a gigantic
number of tables, it seems likely that the relevant system catalogs
will stay fully cached and, as you point out, the system caches will
further blunt the impact of any work in this area.  I think the first
thing to do would be to try to come up with a reproducible test case
where clustering the tables improves performance.  If we can't, that
might mean it's time to remove this TODO.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: texteq/byteaeq: avoid detoast [REVIEW]
Next
From: Bruce Momjian
Date:
Subject: Re: test_fsync open_sync test