Re: Eliminating VACUUM FULL WAS: remove flatfiles.c - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Eliminating VACUUM FULL WAS: remove flatfiles.c
Date
Msg-id 603c8f071001311458i767e8c04l69c38c00053287b3@mail.gmail.com
Whole thread Raw
In response to Re: Eliminating VACUUM FULL WAS: remove flatfiles.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Jan 31, 2010 at 3:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Back in September I wrote:
>> ... The sticking point --- not only for pg_class,
>> but for shared catalogs such as pg_database --- is the lack of a
>> way to track relfilenode if it ever changes.  What if we keep
>> the relfilenode of these critical tables someplace else?  For
>> instance, we could have a "map" file in each database holding
>> the relfilenode of pg_class, and one in $PGDATA/global holding
>> the relfilenodes of the shared catalogs and indexes.  It'd be
>> possible to update a map file atomically via the rename(2) trick.
>> Then we teach relcache or some similar place to believe the map
>> files over the contents of pg_class.
>
> Thinking about this some more, I can see one small disadvantage:
> for the relations that we use the map file for, pg_class.relfilenode
> would not be trustworthy.  This would not affect most of the system
> internals (which will be looking at the relcache's copy, which would
> be kept valid by the relcache code).  But it would affect user queries,
> such as for example attempts to use contrib/oid2name to identify a
> file on-disk.  The main case where pg_class.relfilenode would be
> likely to be out-of-sync is for shared catalogs.  We could keep it
> up to date in most cases for local catalogs, but there's no hope
> of reaching into other databases' pg_class when a shared catalog
> is relocated.
>
> What I'd suggest doing about this is:
>
> (1) Store zero in pg_class.relfilenode for those catalogs for which
> the map is used.  This at least makes it obvious that the value
> you're looking at isn't valid.
>
> (2) Provide a SQL function to extract the real relfilenode of any
> specified pg_class entry.  We'd have to modify oid2name and
> pg_dump to know to use the function instead of looking at the
> column.
>
> There might be some other client-side code that would be broken
> until it got taught about the function, but hopefully not much.
>
> Thoughts?

Seems reasonable to me (assuming there's no way to avoid changing the
relfilenode, which I assume is the case but don't actually know the
code well enough to say with certainty).

...Robert


pgsql-hackers by date:

Previous
From: Mark Cave-Ayland
Date:
Subject: Re: development setup and libdir
Next
From: Simon Riggs
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Augment WAL records for btree delete with GetOldestXmin() to