Re: Storage Location Patch Proposal for V7.3 - Mailing list pgsql-hackers

From Jim Buttafuoco
Subject Re: Storage Location Patch Proposal for V7.3
Date
Msg-id 20020303103408.M70619@buttafuoco.net
Whole thread Raw
In response to Re: Storage Location Patch Proposal for V7.3  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Storage Location Patch Proposal for V7.3  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
All,

I still believe that postgresql needs this feature.   I have many postgresql
systems that have over 500GB of data+indexes.  Using symbolic links is a BIG
pain in the A??.  Every time I run vacuum I have to go and fix the links
again.  Also I have many disks that are running out of space.  This patch
would allow me the ability to move my tables and indexes around.   I
personally don't see the difference between my patch and what people are
calling "Tablespaces" .  Oracle's definition is "A group of files that contain
database objects" ,  under my patch tablespaces and locations are the same
thing except postgresql uses file system directories to contain the group of
objects.  

To recap my patch (location = tablespace here)

Allow the DBA to create locations with a CREATE LOCATION command or CREATE
TABLESPACE command if you like tablespace instead of LOCATION.

Then for DATABASES (and schemas when available)  CREATE DATABASE WITH
DATA_LOCATION = XXX and INDEX_LOCATION  = YYY where XXX and YYY are the
DEFAULT values for OBJECT creation if not LOCATION is given.

CREATE TABLE and CREATE INDEX will create tables and indexes in the defaults
from the CREATE DATABASE/SCHEMA commands above.

CREATE TABLE WITH LOCATION=AAA and CREATE INDEX WITH LOCATION BBB would create
the table/index with the alternate location (only if the location was created
with a CREATE LOCATION command)


The create table command would also have to be change to support primary key/
unique index syntax.

create table SAMPLE
(c1    text primary key location CCC,c2          text unique  location DDD
);


I hope this explains my patch better.   As  I said before and I believe this
to be true,  This patch will enable the DBA to place tables/indexes on any
disk either for performance and/or space reasons.  Also I believe this is
another check off item for people looking at  postgresql when comparing with
Oracle/Sybase/DB2 ... 

Thanks for your time
Jim




> Jim, I see now that you submitted a new version.  Folks, do we have a
> direction for this patch.  Discussion of the patch is at:
> 
>     http://candle.pha.pa.us/cgi-bin/pgpatches2
> 
> ---------------------------------------------------------------------------
> 
> Jim Buttafuoco wrote:
> > Hi all,  
> > 
> > The following is a description of a patch I am proposing for 7.3.  
> > Please read and comment.
> > 
> > Thanks
> > Jim
> > 
> > 
> > This proposal covers the ability to allow a DBA (and general users) to 
> > specify where a database and it's individual objects will reside.  I 
> > propose to add a default data location, index and temporary locations 
> > to the pg_shadow table to allow a DBA to specify locations for each 
> > user when they create databases, tables and indexes or need temporary 
> > disk storage (either for temporary tables or sort files).  The "CREATE 
> > DATABASE" command will be changed to also take an INDEX location and 
> > temporary location. All 3 locations will default to the values from 
> > pg_shadow for the user that is creating the database.  Both the "CREATE 
> > TABLE" and "CREATE INDEX" commands will be changed to add "WITH 
> > LOCATION" optional argument (location will default to values from 
> > PG_DATABASE which were set by the "CREATE DATABASE" command).
> > 
> > The following system tables will be changed as follows
> > PG_SHADOW add dat_location, idx_location, tmp_location (all default to 
> > PG_DATA)
> > PG_DATABASE add dat_location, idx_location, tmp_location (all default 
> > to same from PG_SHADOW)
> > PG_CLASS add rellocation (default to dat_location for tables, 
> > idx_location for indexes from PG_DATABASE)
> > 
> > 
> > Add a GLOBAL table pg_locations to track valid locations
> > 
> > Add the following commands to manage locations
> > CREATE LOCATION locname PATH 'file system directory';
> > DROP LOCATION locname;   (this will have to look into each db to make 
> > sure that any objects are not using it.  Don't know how this will be 
> > done yet!)
> > 
> > I propose to change the names of the on disk directories from 999999 to 
> > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
> > PG_DATABASE).  A SYMLINK from 99999_INDEX and 99999_TEMP will be made 
> > back to 99999_DATA will be made so the WAL functions will continue to 
> > work.
> > 
> > 
> > Again from my earlier attempt at this patch, I believe this capability 
> > will not only improve performance (see my earlier emails.  Where 
> > depending on the type of disks the improvement was between 0% and 100% 
> > performance gain running pg_bench) but also give DBA's the flexibility 
> > to spread the data files over multiple disks without having to "hack" 
> > the system using symbolic links. 
> > 
> > 
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > 
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026





pgsql-hackers by date:

Previous
From: Michael Meskes
Date:
Subject: Re: Oracle vs PostgreSQL in real life : NEWS!!!
Next
From: Tom Lane
Date:
Subject: Re: new hashing function