Re: Tablespaces - Mailing list pgsql-hackers

From Gavin Sherry
Subject Re: Tablespaces
Date
Msg-id Pine.LNX.4.58.0402270823180.20164@linuxworld.com.au
In response to Re: Tablespaces  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Tablespaces  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Tablespaces  (Greg Stark <gsstark@mit.edu>)
Re: Tablespaces  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
On Thu, 26 Feb 2004, Tom Lane wrote:

> Gavin Sherry <swm@linuxworld.com.au> writes:
> > A table space is a directory structure. The directory structure is as
> > follows:
> > [swm@dev /path/to/tblspc]$ ls
> > OID1/    OID2/
> > OID1 and OID2 are the OIDs of databases which have created a table space
> > against this file system location. In this respect, a table space
> > resembles $PGDATA/base. I thought it useful to keep this kind of
> > namespace mechanism in place ...
>
> Actually, this is *necessary* AFAICT.  The case that forces it is DROP
> DATABASE.  Since you have to execute that from another database, there's
> no reasonable way to look into the target database's catalogs.  That
> means that the OID of the database has to be sufficient information to
> get rid of all its files.  You can do this fairly easily if in each
> tablespace (whose locations you know from the shared pg_tablespace
> table) you can look for a subdirectory matching the target database's
> OID.  If we tried to put the database's files just "loose" in each
> tablespace directory then we'd be in trouble.

Ahhh. Yes.
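
To make the DROP DATABASE argument concrete, here is a minimal sketch of the cleanup, not actual backend code. It assumes the tablespace locations have already been read from the shared pg_tablespace table; the function name, the OIDs, and the directory names are hypothetical.

```python
import os
import shutil
import tempfile


def drop_database_files(tablespace_locations, db_oid):
    """Remove the per-database subdirectory from every known tablespace.

    tablespace_locations stands in for the locations recorded in the shared
    pg_tablespace table. Nothing from the target database's own catalogs is
    consulted: the database OID alone identifies what to delete.
    """
    removed = []
    for location in tablespace_locations:
        db_dir = os.path.join(location, str(db_oid))
        if os.path.isdir(db_dir):
            shutil.rmtree(db_dir)
            removed.append(db_dir)
    return removed


# Demo with made-up OIDs: two tablespaces, database 17001 has files in both.
root = tempfile.mkdtemp()
ts_a = os.path.join(root, "ts_a")
ts_b = os.path.join(root, "ts_b")
for ts, oids in ((ts_a, (17001, 17002)), (ts_b, (17001,))):
    for oid in oids:
        os.makedirs(os.path.join(ts, str(oid)))

removed = drop_database_files([ts_a, ts_b], 17001)
remaining_a = sorted(os.listdir(ts_a))  # database 17002 is left untouched
remaining_b = sorted(os.listdir(ts_b))
```

If the files sat loose in the tablespace directory, there would be no such subdirectory to match on, which is the trouble described above.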

>
> I think this is also an implementation reason for favoring cluster-wide
> tablespaces over database-local ones.  I'm not sure how you drop a
> database from outside if you can't see where its tablespaces are.

Naturally.

>
> I believe that it will be necessary to expand RelFileNode to three OIDs
> (tablespace, database, relation).  I had once hoped that it could be
> kept at two (tablespace, relation) but with a physical layout like this
> you more or less have to have three.

Yes. I agree.
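
A small sketch of the three-OID key, purely for illustration. The field names and OID values are my own, and the case of two databases holding a relation with the same file number in one tablespace is an assumption made for the example.

```python
from typing import NamedTuple


class RelFileNode(NamedTuple):
    """Hypothetical three-part identifier: (tablespace, database, relation)."""
    tablespace: int
    database: int
    relation: int


# Two databases share tablespace 5001 and, by assumption, each has a
# relation whose file number is 16384.
a = RelFileNode(tablespace=5001, database=17001, relation=16384)
b = RelFileNode(tablespace=5001, database=17002, relation=16384)

# With all three fields the two relations are distinguishable.
distinct_with_three = a != b

# Dropping the database field makes the two keys identical, and it also
# leaves no way to name the per-database subdirectory.
collide_with_two = (a.tablespace, a.relation) == (b.tablespace, b.relation)
```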

>
> One issue that needs to be agreed on early is how the low-level file
> access code finds a tablespace.  What I would personally like is for
> $PGDATA to contain symlinks to the tablespace top directories.  The
> actual access path for any relation could then be built trivially from
> its RelFileNode:
>     $PGDATA/tablespaces/TBOID/DBOID/RELFILENODE
>     -------------------------
> The underlined part references a symlink that leads to the directory
> containing the per-database subdirectories.
>
> I am expecting to hear some bleating about this from people whose
> preferred platforms don't support symlinks ;-).  However, if we don't

Actually, I think that's a pretty good idea :-). It solves a bunch of
issues in the backend (postmaster startup can recurse through
$PGDATA/tablespaces looking for postmaster.pid files) and will also assist
admins with complex configurations (perhaps).
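
As a rough sketch of the symlink scheme (assuming a platform with POSIX symlinks), the access path can be built by plain concatenation and the operating system follows the link to the real tablespace directory. The helper name, the OIDs, and the "fast_disk" location are hypothetical.

```python
import os
import tempfile


def relation_path(pgdata, tb_oid, db_oid, relfilenode):
    """Build $PGDATA/tablespaces/TBOID/DBOID/RELFILENODE from its parts."""
    return os.path.join(
        pgdata, "tablespaces", str(tb_oid), str(db_oid), str(relfilenode)
    )


# Demo layout: the tablespace lives outside the data directory and
# $PGDATA/tablespaces/5001 is a symlink pointing at it.
root = tempfile.mkdtemp()
pgdata = os.path.join(root, "pgdata")
external = os.path.join(root, "fast_disk", "tblspc")
os.makedirs(os.path.join(pgdata, "tablespaces"))
os.makedirs(os.path.join(external, "17001"))
os.symlink(external, os.path.join(pgdata, "tablespaces", "5001"))

# Creating the relation file through the $PGDATA path puts it on the
# external location without the caller knowing where that is.
path = relation_path(pgdata, 5001, 17001, 16384)
with open(path, "wb") as f:
    f.write(b"")

physical = os.path.join(external, "17001", "16384")
```

The low-level code never needs to look up the tablespace location itself; only whatever creates the tablespace has to set up the link.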

> Speaking of locking, can we do anything to prevent people from shooting
> themselves in the foot by changing active tablespaces?  Are we even
> going to have a DROP TABLESPACE command, and if so what would it do?

Ahh. I forgot to detail my ideas on this. It seems to me that we cannot
drop a table space until the directory is empty. We will need a shared
invalidation message so that backends do not attempt to create an object
just after we drop the table space.
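
A minimal sketch of the "only drop when empty" rule, with a hypothetical function name and made-up directories. It does not model the shared invalidation message, so in a real system another backend could still create an object between the check and the removal.

```python
import os
import tempfile


def drop_tablespace(location):
    """Remove the tablespace directory only if it has no entries.

    Returns True if the directory was removed, False if it still contains
    per-database subdirectories (or anything else).
    """
    if os.listdir(location):
        return False
    os.rmdir(location)
    return True


# Demo: one tablespace still holds a database subdirectory, one is empty.
root = tempfile.mkdtemp()
busy = os.path.join(root, "busy")
empty = os.path.join(root, "empty")
os.makedirs(os.path.join(busy, "17001"))
os.makedirs(empty)

busy_dropped = drop_tablespace(busy)
empty_dropped = drop_tablespace(empty)
```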

Thanks,

Gavin

