Thread: RE: Multiple Spindles ( Was: Re: [HACKERS] [hackers]development suggestion needed )

RE: Multiple Spindles ( Was: Re: [HACKERS] [hackers]development suggestion needed )

From
"Ansley, Michael"
Date:
14 January 2000, 01:44:06
But it's not quite this simple.  In our production system, we have reference
data in one tablespace, reference indices in another, working data in a
third space, and work indices in a fourth.  This is because the amount of
working data throughput is extremely high, while the reference data,
although changing reasonably frequently, changes significantly less than the
working data.  This is then normally spread across five spindles, with
Oracle being in the fifth.  On a 32-processor HP.

I think a good solution is to be able to specify where on the disk the table
is created (absolute paths only), and then postgres symlinks that file in
the main data directory, so from that point on it referenced without the
path name.  That's probably a significant start.

Alternatively, we could create "directoryspaces", which treats a directory
as a tablespace.  Then you do this:

CREATE TABLE foo (id_foo int, name varchar(30)) TABLESPACE
"/data/pgdata/sys1ref";

to create the new file /data/pgdata/sys1ref/foo, and a symlink is created in
the main db directory, so that you can just SELECT * FROM foo;

This is not difficult at all, or am I missing something?  Only real issue
(possibly) is security regarding the tablespace.  It might be an idea to
allow only the superuser to add allowed directories (i.e.: "create"
tablespaces), and assign user access to those tablespaces.


MikeA




>> -----Original Message-----
>> From: The Hermit Hacker [mailto:scrappy@hub.org]
>> Sent: Friday, January 14, 2000 6:32 AM
>> To: Don Baccus
>> Cc: Tom Lane; Xun Cheng; pgsql-hackers@postgreSQL.org
>> Subject: Multiple Spindles ( Was: Re: [HACKERS] [hackers]development
>> suggestion needed )
>> 
>> 
>> On Thu, 13 Jan 2000, Don Baccus wrote:
>> 
>> > My site's still in the experimental stage, being used by a couple
>> > dozen folks to record bird distribution data in the Pacific NW, so
>> > I don't personally have real-world data to get a feeling for how
>> > important this might become.  Still, Oracle DBA docs talk a lot
>> > about it so in some real-world scenarios being able to distribute
>> > tables and indices on different spindles must pay off.
>> 
>> What would it take to break the data/base/<database> 
>> directory down?  To
>> something like, maybe:
>> 
>> data/base/<database>/pg_*
>>                     /tables/*
>>                     /indices/*
>> 
>> Then, one could easily mount a drive as /tables and another one as
>> /indices ...
>> 
>> We know the difference between a table and an index, so I 
>> wouldn't think
>> it would be *too* hard add /tables/ internally to the existing
>> path...would it?
>> 
>> You'd basically have somethign like:
>> 
>> sprintf("%s/data/base/%s/tables/%s", data_dir, database, tablename);
>> 
>> Instead of:
>> 
>> sprintf("%s/data/base/%s/%s", data_dir, database, tablename);
>> 
>> I know, I'm being simplistic here, but...
>> 
>> Or, a different way:
>> 
>> if(table) sprintf("%s/data/base/table/%s/%s", 
>> data_dir,database,tablename);
>> else if(index) sprintf("%s/data/base/index/%s/%s", 
>> data_dir,database,tablename);
>> else sprintf("%s/data/base/sys/%s/%s", data_dir,database,sysfile);
>> 
>> This would give you the ability to put all table from all 
>> databass onto
>> one file system, and all indexes onto another, and all 
>> system files onto a
>> third...
>> 
>> I don't know, I'm oversimplying and spewing thoughts out
>> again...but...*shrug*
>> 
>> Marc G. Fournier                   ICQ#7615664               
>> IRC Nick: Scrappy
>> Systems Administrator @ hub.org 
>> primary: scrappy@hub.org           secondary: 
>> scrappy@{freebsd|postgresql}.org 
>> 
>> 
>> ************
>>