Re: For the ametures. (related to "Are we losing - Mailing list pgsql-hackers

From cbbrowne@cbbrowne.com
Subject Re: For the ametures. (related to "Are we losing
Date
Msg-id 20030418123225.4CAD6586AB@cbbrowne.com
Whole thread Raw
In response to Re: For the ametures. (related to "Are we losing momentum?")  (Kevin Brown <kevin@sysexperts.com>)
List pgsql-hackers
Kevin Brown wrote:
> Tom Lane wrote:
> > Kevin Brown <kevin@sysexperts.com> writes:
> > > It occurs to me that we could make it possible to get some of the
> > > performance gains MySQL gets through its naming conventions by
> > > including the type of object in the path of the object.
> > 
> > "Performance gains"?  Name one.
> 
> Instead of tables and their indexes being on the same platter, you'd
> be able to put them on separate platters.  Sounds like it would likely
> yield a performance gain to me...
> 
> > We have been there and done that.  I see no reason to go back.
> 
> I'm not proposing that we return to calling the individual files (or
> the database they reside in) by name, only that we include a "type"
> identifier in the path so that objects of different types can be
> located on different spindles if the DBA so desires.  As it is right
> now, tables and indexes are all stored in the same directory, and
> moving the indexes to a different spindle is an uncertain operation at
> best (you get to shut down the database in order to move any
> newly-created indexes, and dropping a moved index will not free the
> space occupied by the index as it'll only remove the symlink).

The thing is, this isn't necessarily particularly useful in managing the 
partitioning of data across disks.

If I have, defined, /disk1, /disk2, /disk3, /disk4, and /disk5, it is highly 
unlikely that my partitioning will be based on the notion of "put indices on 
disk1, tables on disk2, and, well, skip the others."

I'm liable to want WAL separate from all the others, for a start, but then 
look for what to put on different disks based on selecting particular tables 
and indices as candidates.

Your observation about the dropping of a moved index is well taken; that would 
point to the idea that the top level "thing" containing each table/index 
perhaps should be a directory, with two interesting properties:

- By being a directory, and putting files in it, this allows extensions to be 
more clearly tied to the table/index when a file grows towards the 
not-uncommon 2GB barrier;

- In order for the linking to physical devices to be kept under control, 
particularly if an index gets dropped and recreated, the postmaster needs to 
be able to establish the links, suggesting an extension to syntax.  At first 
blush:
 CREATE INDEX FROBOZZ_IDX LOCATION '/disk1/pgindices' on FROBOZZ(ID);

Supposing the OID number was 234231, the postmaster would then create the 
symbolic link from $PGDATA/base/234231 to the freshly-created directory 
/disk1/pgindices/234231, where the index would reside.  (And if the directory 
exists, there should be some complaint :-).)

I have made that up out of whole cloth; it _doesn't_ take into consideration 
how you would specify the location of implicitly-created indices.

But it seems a useful approach that can be robust, and where it's even 
plausible that the postmaster could cope with a request to shift a table or 
index to another location.  (Which would, quite naturally, put a lock on 
access to the object for the duration of the operation.)
--
output = reverse("gro.gultn@" "enworbbc")
http://www.ntlug.org/~cbbrowne/
"The dinosaurs died because they didn't have a space program."
-- Arthur C Clarke



pgsql-hackers by date:

Previous
From: Kevin Brown
Date:
Subject: Re: pg_clog woes with 7.3.2 - Episode 2
Next
From: cbbrowne@cbbrowne.com
Date:
Subject: Re: pg_clog woes with 7.3.2 - Episode 2