Thread: Storage Location Patch Proposal for V7.3

Storage Location Patch Proposal for V7.3

From
"Jim Buttafuoco"
Date:
Hi all,  

The following is a description of a patch I am proposing for 7.3.  
Please read and comment.

Thanks
Jim


This proposal covers the ability to allow a DBA (and general users) to 
specify where a database and it's individual objects will reside.  I 
propose to add a default data location, index and temporary locations 
to the pg_shadow table to allow a DBA to specify locations for each 
user when they create databases, tables and indexes or need temporary 
disk storage (either for temporary tables or sort files).  The "CREATE 
DATABASE" command will be changed to also take an INDEX location and 
temporary location. All 3 locations will default to the values from 
pg_shadow for the user that is creating the database.  Both the "CREATE 
TABLE" and "CREATE INDEX" commands will be changed to add "WITH 
LOCATION" optional argument (location will default to values from 
PG_DATABASE which were set by the "CREATE DATABASE" command).

The following system tables will be changed as follows
PG_SHADOW add dat_location, idx_location, tmp_location (all default to 
PG_DATA)
PG_DATABASE add dat_location, idx_location, tmp_location (all default 
to same from PG_SHADOW)
PG_CLASS add rellocation (default to dat_location for tables, 
idx_location for indexes from PG_DATABASE)


Add a GLOBAL table pg_locations to track valid locations

Add the following commands to manage locations
CREATE LOCATION locname PATH 'file system directory';
DROP LOCATION locname;   (this will have to look into each db to make 
sure that any objects are not using it.  Don't know how this will be 
done yet!)

I propose to change the names of the on disk directories from 999999 to 
99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
PG_DATABASE).  A SYMLINK from 99999_INDEX and 99999_TEMP will be made 
back to 99999_DATA will be made so the WAL functions will continue to 
work.


Again from my earlier attempt at this patch, I believe this capability 
will not only improve performance (see my earlier emails.  Where 
depending on the type of disks the improvement was between 0% and 100% 
performance gain running pg_bench) but also give DBA's the flexibility 
to spread the data files over multiple disks without having to "hack" 
the system using symbolic links. 




Re: Storage Location Patch Proposal for V7.3

From
Tom Lane
Date:
"Jim Buttafuoco" <jim@buttafuoco.net> writes:
> I propose to add a default data location, index and temporary locations 
> to the pg_shadow table to allow a DBA to specify locations for each 
> user when they create databases, tables and indexes or need temporary 
> disk storage (either for temporary tables or sort files).

Have you read any of the previous discussions about tablespaces?
This seems to be tablespaces with an off-the-cuff syntax.  I'd
suggest taking a hard look at Oracle's tablespace facility and
seeing how closely we want to duplicate that.

> PG_SHADOW add dat_location, idx_location, tmp_location (all default to 
> PG_DATA)

What does location have to do with users?

> I propose to change the names of the on disk directories from 999999 to 
> 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
> PG_DATABASE).

No, that doesn't scale to arbitrary locations; furthermore it requires
an unseemly amount of knowledge in low-level file access code about
exactly what kind of object each table is.  The symlinks should just
be named after the OIDs of the locations' rows in pg_location.

The direction I've been envisioning for this is that each table has
a logical identification <pg_database OID>, <pg_class OID> as well
as a physical identification <pg_location OID>, <relfilenode OID>.
The path to access the table can be constructed entirely from the
physical identification: $PGDATA/base/<pg_location OID>/<relfilenode OID>.

One problem to be addressed if multiple databases can share a single
physical location is how to prevent relfilenode collisions.  Perhaps
we could avoid the issue by adding another layer of subdirectories:
$PGDATA/base/<pg_location OID>/<pg_database OID>/<relfilenode OID>.
That is, each database would have a subdirectory within each location
that it's ever used.  (This would make DROP DATABASE a lot easier,
among other things.)
        regards, tom lane


Re: Storage Location Patch Proposal for V7.3

From
"Jim Buttafuoco"
Date:
Tom,

Yes, locations = tablespaces (I really don't care if we call them
locaitons or tablespaces, I was just using LOCATIONS because that's what
we have now...)  is there a SQL standard for this???.

As for locations and user,  Under Oracle a user is assigned a default
tablespace and a temporary tablespace via the "CREATE USER" command. 
Also "CREATE DATABASE" allows you to specify the SYSTEM tablespace where
all objects will go unless a storage clause is added duration object
creation.  "CREATE TABLE" and "CREATE INDEX" both take a storage clause.

 As for the actual data file location, I believe under each loc oid we
would have pg_port #/DB OID/pg_class OID might be the way to go.  

The example below has 3 tablespaces/locations PGDATA/DB1/DB2
PG_LOCATIONS (or PG_TABLESPACES) would have the following rows
PGDATA | /usr/local/pgsql/data
DB1    | /db1
DB2    | /db2


/usr/local/pgsql/data/5432/1       <<template1                  ^-----------    <<default location/tablespace
          ^---------  <<Default PG Port
 

/db1/data/5432     ^-------------------------<< second location default PG PORT
/db1/data/5432/65894834/99999999                                ^------<< somedb/sometable
/db1/data/5432/65894834/88888888                        ^------<< somedb/someindex                
/db2/data/5432      ^-------------------------<< DB2


> "Jim Buttafuoco" <jim@buttafuoco.net> writes:
> > I propose to add a default data location, index and temporary
locations 
> > to the pg_shadow table to allow a DBA to specify locations for each 
> > user when they create databases, tables and indexes or need
temporary 
> > disk storage (either for temporary tables or sort files).
> 
> Have you read any of the previous discussions about tablespaces?
> This seems to be tablespaces with an off-the-cuff syntax.  I'd
> suggest taking a hard look at Oracle's tablespace facility and
> seeing how closely we want to duplicate that.
> 
> > PG_SHADOW add dat_location, idx_location, tmp_location (all default
to 
> > PG_DATA)
> 
> What does location have to do with users?
> 
> > I propose to change the names of the on disk directories from 999999
to 
> > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
> > PG_DATABASE).
> 
> No, that doesn't scale to arbitrary locations; furthermore it requires
> an unseemly amount of knowledge in low-level file access code about
> exactly what kind of object each table is.  The symlinks should just
> be named after the OIDs of the locations' rows in pg_location.
> 
> The direction I've been envisioning for this is that each table has
> a logical identification <pg_database OID>, <pg_class OID> as well
> as a physical identification <pg_location OID>, <relfilenode OID>.
> The path to access the table can be constructed entirely from the
> physical identification: $PGDATA/base/<pg_location OID>/<relfilenode
OID>.
> 
> One problem to be addressed if multiple databases can share a single
> physical location is how to prevent relfilenode collisions.  Perhaps
> we could avoid the issue by adding another layer of subdirectories:
> $PGDATA/base/<pg_location OID>/<pg_database OID>/<relfilenode OID>.
> That is, each database would have a subdirectory within each location
> that it's ever used.  (This would make DROP DATABASE a lot easier,
> among other things.)
> 
>             regards, tom lane
> 
> 




Re: Storage Location Patch Proposal for V7.3

From
Tom Lane
Date:
"Jim Buttafuoco" <jim@buttafuoco.net> writes:
>   As for the actual data file location, I believe under each loc oid we
> would have pg_port #/DB OID/pg_class OID might be the way to go.  

Introducing pg_port into the paths would be a bad idea, since it
would prevent restarting a postmaster with a different port number.
I think if a DBA is running multiple postmasters, it's up to him
to avoid pointing more than one of them at the same "location"
directory.  (Maybe we could enforce that with lock files?  Not
sure it's worth the trouble though.)
        regards, tom lane


Re: Storage Location Patch Proposal for V7.3

From
"Jim Buttafuoco"
Date:
Tom,

I was just looking for a way to keep different postmasters out of each
others way.  If you think the DBA can do it great or if the DBA changes
port have him/her fix the locations.

Jim

> "Jim Buttafuoco" <jim@buttafuoco.net> writes:
> >   As for the actual data file location, I believe under each loc oid
we
> > would have pg_port #/DB OID/pg_class OID might be the way to go.  
> 
> Introducing pg_port into the paths would be a bad idea, since it
> would prevent restarting a postmaster with a different port number.
> I think if a DBA is running multiple postmasters, it's up to him
> to avoid pointing more than one of them at the same "location"
> directory.  (Maybe we could enforce that with lock files?  Not
> sure it's worth the trouble though.)
> 
>             regards, tom lane
> 
> 




Re: Storage Location Patch Proposal for V7.3

From
mlw
Date:
Tom Lane wrote:
> 
> "Jim Buttafuoco" <jim@buttafuoco.net> writes:
> > I propose to add a default data location, index and temporary locations
> > to the pg_shadow table to allow a DBA to specify locations for each
> > user when they create databases, tables and indexes or need temporary
> > disk storage (either for temporary tables or sort files).
> 
> Have you read any of the previous discussions about tablespaces?
> This seems to be tablespaces with an off-the-cuff syntax.  I'd
> suggest taking a hard look at Oracle's tablespace facility and
> seeing how closely we want to duplicate that.

Sorry I missed the conversation about tablespaces. One of the reasons I think
Postgres is so usable is because it does not require the use of tablespace
files. If by tablespace, you mean to declare a directory on a device as a
tablespace, then cool. If you want to create tablespace "files" ala Oracle, you
are heading toward an administration nightmare. Don't get me wrong, the ability
to use a file as a tablespace would be kind of cool, i.e. you can probably use
raw devices, but please to not abandon the way postgres currently works.

On our Oracle server, we have run out of space on our tablespace files and not
known it was coming. I am the system architect, not the DBA, so I don't have
(nor want) direct control over the oracle database operation. Our newbe DBA did
not make the table correctly, so they did not grow. Alas he was laid off, thus
we were left trying to figure out what was happening.

Postgres is easier to configure and get right. IMHO that is one of its very
important strengths. It is almost trivial to get a working SQL system up and
running which performs well.


Re: Storage Location Patch Proposal for V7.3

From
Tom Lane
Date:
mlw <markw@mohawksoft.com> writes:
> Tom Lane wrote:
>> This seems to be tablespaces with an off-the-cuff syntax.  I'd
>> suggest taking a hard look at Oracle's tablespace facility and
>> seeing how closely we want to duplicate that.

> Sorry I missed the conversation about tablespaces. One of the reasons I think
> Postgres is so usable is because it does not require the use of tablespace
> files. If by tablespace, you mean to declare a directory on a device as a
> tablespace, then cool. If you want to create tablespace "files" ala Oracle, you
> are heading toward an administration nightmare.

No, that's not one of the parts of Oracle's facility that I want to
duplicate.

I think our idea of a tablespace/location/whatchacallit should just be
a directory somewhere that table files can be created in.  What seems
worthwhile to steal from Oracle is the syntax that assigns particular
tables to particular tablespaces.  If we're compatible on syntax, that
should ease porting of existing applications --- and as far as I can see
at the moment, there's no reason *not* to be compatible at that level.
I don't want to borrow Oracle's ideas about space management semantics,
however.
        regards, tom lane


Re: Storage Location Patch Proposal for V7.3

From
"Jim Buttafuoco"
Date:
Mark,

This is why I choose to use the term "LOCATION" instead of "TABLESPACE"
.  A "LOCATION" is a directory just like Postgresql has today.  All the
patch would add is the ability to put object under different "LOCATION"
for the same database.

Jim




> Tom Lane wrote:
> > 
> > "Jim Buttafuoco" <jim@buttafuoco.net> writes:
> > > I propose to add a default data location, index and temporary
locations
> > > to the pg_shadow table to allow a DBA to specify locations for
each
> > > user when they create databases, tables and indexes or need
temporary
> > > disk storage (either for temporary tables or sort files).
> > 
> > Have you read any of the previous discussions about tablespaces?
> > This seems to be tablespaces with an off-the-cuff syntax.  I'd
> > suggest taking a hard look at Oracle's tablespace facility and
> > seeing how closely we want to duplicate that.
> 
> Sorry I missed the conversation about tablespaces. One of the reasons
I think
> Postgres is so usable is because it does not require the use of
tablespace
> files. If by tablespace, you mean to declare a directory on a device
as a
> tablespace, then cool. If you want to create tablespace "files" ala
Oracle, you
> are heading toward an administration nightmare. Don't get me wrong,
the ability
> to use a file as a tablespace would be kind of cool, i.e. you can
probably use
> raw devices, but please to not abandon the way postgres currently
works.
> 
> On our Oracle server, we have run out of space on our tablespace files
and not
> known it was coming. I am the system architect, not the DBA, so I
don't have
> (nor want) direct control over the oracle database operation. Our
newbe DBA did
> not make the table correctly, so they did not grow. Alas he was laid
off, thus
> we were left trying to figure out what was happening.
> 
> Postgres is easier to configure and get right. IMHO that is one of its
very
> important strengths. It is almost trivial to get a working SQL system
up and
> running which performs well.
> 
> 




Re: Storage Location Patch Proposal for V7.3

From
mlw
Date:
Jim Buttafuoco wrote:
> 
> Mark,
> 
> This is why I choose to use the term "LOCATION" instead of "TABLESPACE"
> .  A "LOCATION" is a directory just like Postgresql has today.  All the
> patch would add is the ability to put object under different "LOCATION"
> for the same database.

That is a very excellent point. While I am not in the circle that makes these
decisions, I hope your words are heard.

I understand the desire to stay with "standards" and it is impossible to deny
defacto standards, but I do understand that defacto standards have to be
challenged when they don't make sense. A prime example is PostgreSQL's
inner/outer join syntax. It is incompatible with Oracle, but compatible with
the documented SQL standard.

Since "tablespace" is not part of the SQL standard, maybe it makes sense to
define a more specific syntax. The term "location" makes sense, because it is
not a tablespace as Oracle defines it. There is a real danger is trying to
support a different interpretation of an existing "defacto" syntax, in that it
will behave differently than expected.


Re: Storage Location Patch Proposal for V7.3

From
"Zeugswetter Andreas SB SD"
Date:
 
> Since "tablespace" is not part of the SQL standard, maybe it makes
sense to
> define a more specific syntax. The term "location" makes sense,
because it is
> not a tablespace as Oracle defines it.

It *is* an "OS managed tablespace" in terms of IBM DB2.
Methinks the term "TABLESPACE" is perfect for PostgreSQL.
The fact whether it is a directory, a file or even a raw device
depends on how you create the tablespace.

The point is, that the syntax for "create table" and "create index"
can be compatible in this case, imho without confusing many. 
Not the "create tablespace" syntax, but that is imho not an issue.

Andreas


Re: Storage Location Patch Proposal for V7.3

From
Stefan Rindeskar
Date:
I just wanted to affirm that Tom's description sounds like av very good
way to go.

You get the best of two worlds with the possibility to tune servers and
yet still
very easy to manage. i.e. If you don't need it, don't mess with it and
everything
will work just fine.
I don't either see any reason not to use the Oracle syntax since it is
so widely used
and it works very well for those of us that also work on Oracle (but in
postgresql
without the extent and storage clauses).

Regards
Stefan

Tom Lane wrote:

> mlw <markw@mohawksoft.com> writes:
> > Tom Lane wrote:
> >> This seems to be tablespaces with an off-the-cuff syntax.  I'd
> >> suggest taking a hard look at Oracle's tablespace facility and
> >> seeing how closely we want to duplicate that.
>
> > Sorry I missed the conversation about tablespaces. One of the reasons I think
> > Postgres is so usable is because it does not require the use of tablespace
> > files. If by tablespace, you mean to declare a directory on a device as a
> > tablespace, then cool. If you want to create tablespace "files" ala Oracle, you
> > are heading toward an administration nightmare.
>
> No, that's not one of the parts of Oracle's facility that I want to
> duplicate.
>
> I think our idea of a tablespace/location/whatchacallit should just be
> a directory somewhere that table files can be created in.  What seems
> worthwhile to steal from Oracle is the syntax that assigns particular
> tables to particular tablespaces.  If we're compatible on syntax, that
> should ease porting of existing applications --- and as far as I can see
> at the moment, there's no reason *not* to be compatible at that level.
> I don't want to borrow Oracle's ideas about space management semantics,
> however.
>
>                         regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html


Re: Storage Location Patch Proposal for V7.3

From
mlw
Date:
Stefan Rindeskar wrote:
> 
> I just wanted to affirm that Tom's description sounds like av very good
> way to go.
> 
> You get the best of two worlds with the possibility to tune servers and
> yet still
> very easy to manage. i.e. If you don't need it, don't mess with it and
> everything
> will work just fine.
> I don't either see any reason not to use the Oracle syntax since it is
> so widely used
> and it works very well for those of us that also work on Oracle (but in
> postgresql
> without the extent and storage clauses).
> 

I absolutely agree with the concept of defining a location for data from within
the database. No argument.

The only two issues I can see are:

(1) Do not require the use of files as table spaces ala Oracle. That is an
admin nightmare. (Again, it would be cool, however, to be able to use table
space files so that PostgreSQL could have raw access as long as it is not a
requirement.) I don't think Tom is thinking about table space files, so I'm not
worried.

(2) I have a concern about expected behavior vs existing syntax. If PostgreSQL
uses "create tablespace" in such a way that an Oracle DBA will expect it to
work as Oracle does, it may cause a bit of confusion. We all know that
"confusion" between an open source solution and a "defacto" solution is used as
club.


Re: Storage Location Patch Proposal for V7.3

From
"Jim Buttafuoco"
Date:
I am at home now,  Tomorrow I will get the exact Oracle syntax and post
example commands for CREATE TABLESPACE, CREATE DATABASE, CREATE TABLE
and CREATE INDEX, CREATE USER.  



Jim





>  
> > Since "tablespace" is not part of the SQL standard, maybe it makes
> sense to
> > define a more specific syntax. The term "location" makes sense,
> because it is
> > not a tablespace as Oracle defines it.
> 
> It *is* an "OS managed tablespace" in terms of IBM DB2.
> Methinks the term "TABLESPACE" is perfect for PostgreSQL.
> The fact whether it is a directory, a file or even a raw device
> depends on how you create the tablespace.
> 
> The point is, that the syntax for "create table" and "create index"
> can be compatible in this case, imho without confusing many. 
> Not the "create tablespace" syntax, but that is imho not an issue.
> 
> Andreas
> 
> 




Re: Storage Location Patch Proposal for V7.3

From
Mathijs Brands
Date:
On Mon, Nov 05, 2001 at 12:26:17PM -0500, Jim Buttafuoco allegedly wrote:
> The example below has 3 tablespaces/locations PGDATA/DB1/DB2
> PG_LOCATIONS (or PG_TABLESPACES) would have the following rows
> PGDATA | /usr/local/pgsql/data
> DB1    | /db1
> DB2    | /db2

<SNIP>

> /db1/data/5432
>       ^-------------------------<< second location default PG PORT
> /db1/data/5432/65894834/99999999        
>                          ^------<< somedb/sometable
> /db1/data/5432/65894834/88888888
>                          ^------<< somedb/someindex
>                  
> /db2/data/5432 
>       ^-------------------------<< DB2

Should data/ even be in there? /db2/5432 seems to be the correct value.
Either that or change the location to /db2/data. Implicitly creating an
extra directory isn't something I would like to happen, especially if it
doesn't happen for PGDATA itself.

My $.02,

Mathijs
--
And the beast shall be made legion. Its numbers shall be increased a
thousand thousand fold. The din of a million keyboards like unto a great
storm shall cover the earth, and the followers of Mammon shall tremble.


Re: Storage Location Patch Proposal for V7.3

From
Bruce Momjian
Date:
Jim, I see now that you submitted a new version.  Folks, do we have a
direction for this patch.  Discussion of the patch is at:
http://candle.pha.pa.us/cgi-bin/pgpatches2

---------------------------------------------------------------------------

Jim Buttafuoco wrote:
> Hi all,  
> 
> The following is a description of a patch I am proposing for 7.3.  
> Please read and comment.
> 
> Thanks
> Jim
> 
> 
> This proposal covers the ability to allow a DBA (and general users) to 
> specify where a database and it's individual objects will reside.  I 
> propose to add a default data location, index and temporary locations 
> to the pg_shadow table to allow a DBA to specify locations for each 
> user when they create databases, tables and indexes or need temporary 
> disk storage (either for temporary tables or sort files).  The "CREATE 
> DATABASE" command will be changed to also take an INDEX location and 
> temporary location. All 3 locations will default to the values from 
> pg_shadow for the user that is creating the database.  Both the "CREATE 
> TABLE" and "CREATE INDEX" commands will be changed to add "WITH 
> LOCATION" optional argument (location will default to values from 
> PG_DATABASE which were set by the "CREATE DATABASE" command).
> 
> The following system tables will be changed as follows
> PG_SHADOW add dat_location, idx_location, tmp_location (all default to 
> PG_DATA)
> PG_DATABASE add dat_location, idx_location, tmp_location (all default 
> to same from PG_SHADOW)
> PG_CLASS add rellocation (default to dat_location for tables, 
> idx_location for indexes from PG_DATABASE)
> 
> 
> Add a GLOBAL table pg_locations to track valid locations
> 
> Add the following commands to manage locations
> CREATE LOCATION locname PATH 'file system directory';
> DROP LOCATION locname;   (this will have to look into each db to make 
> sure that any objects are not using it.  Don't know how this will be 
> done yet!)
> 
> I propose to change the names of the on disk directories from 999999 to 
> 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
> PG_DATABASE).  A SYMLINK from 99999_INDEX and 99999_TEMP will be made 
> back to 99999_DATA will be made so the WAL functions will continue to 
> work.
> 
> 
> Again from my earlier attempt at this patch, I believe this capability 
> will not only improve performance (see my earlier emails.  Where 
> depending on the type of disks the improvement was between 0% and 100% 
> performance gain running pg_bench) but also give DBA's the flexibility 
> to spread the data files over multiple disks without having to "hack" 
> the system using symbolic links. 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Storage Location Patch Proposal for V7.3

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Jim, I see now that you submitted a new version.  Folks, do we have a
> direction for this patch.

I didn't like it at the time, and still don't.  We are not that far away
from having proper tablespaces, and I think that kluges that provide
part of the functionality will just get in the way when it comes time
to do it right.
        regards, tom lane


Re: Storage Location Patch Proposal for V7.3

From
"Marc G. Fournier"
Date:
On Fri, 22 Feb 2002, Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Jim, I see now that you submitted a new version.  Folks, do we have a
> > direction for this patch.
>
> I didn't like it at the time, and still don't.  We are not that far away
> from having proper tablespaces, and I think that kluges that provide
> part of the functionality will just get in the way when it comes time
> to do it right.

What kind of time frame is "not that far away"?  For v7.3?

If not, and someone can clarify what I'm understanding this patch will do,
its essentially going to setup a directory structure of:

data/base/<dboid>/<tbloid>.idx/indx

?

If we aren't going to have tablespaces for v7.3, there we are talking 6->8
months before we do, and the above sounds like a reasonable interim
solution for this ...



Re: Storage Location Patch Proposal for V7.3

From
Tom Lane
Date:
"Marc G. Fournier" <scrappy@hub.org> writes:
> On Fri, 22 Feb 2002, Tom Lane wrote:
>> I didn't like it at the time, and still don't.  We are not that far away
>> from having proper tablespaces, and I think that kluges that provide
>> part of the functionality will just get in the way when it comes time
>> to do it right.

> What kind of time frame is "not that far away"?  For v7.3?

My guess is that any of the inner circle of hackers could make this
happen with about a week's work.  Whether someone will find time before
7.3 is unknown (particularly seeing that we haven't set a target date
for 7.3).  Personally, schemas are a higher priority for me ...
        regards, tom lane


Re: Storage Location Patch Proposal for V7.3

From
"Jim Buttafuoco"
Date:
All,

I still believe that postgresql needs this feature.   I have many postgresql
systems that have over 500GB of data+indexes.  Using symbolic links is a BIG
pain in the A??.  Every time I run vacuum I have to go and fix the links
again.  Also I have many disks that are running out of space.  This patch
would allow me the ability to move my tables and indexes around.   I
personally don't see the difference between my patch and what people are
calling "Tablespaces" .  Oracle's definition is "A group of files that contain
database objects" ,  under my patch tablespaces and locations are the same
thing except postgresql uses file system directories to contain the group of
objects.  

To recap my patch (location = tablespace here)

Allow the DBA to create locations with a CREATE LOCATION command or CREATE
TABLESPACE command if you like tablespace instead of LOCATION.

Then for DATABASES (and schemas when available)  CREATE DATABASE WITH
DATA_LOCATION = XXX and INDEX_LOCATION  = YYY where XXX and YYY are the
DEFAULT values for OBJECT creation if not LOCATION is given.

CREATE TABLE and CREATE INDEX will create tables and indexes in the defaults
from the CREATE DATABASE/SCHEMA commands above.

CREATE TABLE WITH LOCATION=AAA and CREATE INDEX WITH LOCATION BBB would create
the table/index with the alternate location (only if the location was created
with a CREATE LOCATION command)


The create table command would also have to be change to support primary key/
unique index syntax.

create table SAMPLE
(c1    text primary key location CCC,c2          text unique  location DDD
);


I hope this explains my patch better.   As  I said before and I believe this
to be true,  This patch will enable the DBA to place tables/indexes on any
disk either for performance and/or space reasons.  Also I believe this is
another check off item for people looking at  postgresql when comparing with
Oracle/Sybase/DB2 ... 

Thanks for your time
Jim




> Jim, I see now that you submitted a new version.  Folks, do we have a
> direction for this patch.  Discussion of the patch is at:
> 
>     http://candle.pha.pa.us/cgi-bin/pgpatches2
> 
> ---------------------------------------------------------------------------
> 
> Jim Buttafuoco wrote:
> > Hi all,  
> > 
> > The following is a description of a patch I am proposing for 7.3.  
> > Please read and comment.
> > 
> > Thanks
> > Jim
> > 
> > 
> > This proposal covers the ability to allow a DBA (and general users) to 
> > specify where a database and it's individual objects will reside.  I 
> > propose to add a default data location, index and temporary locations 
> > to the pg_shadow table to allow a DBA to specify locations for each 
> > user when they create databases, tables and indexes or need temporary 
> > disk storage (either for temporary tables or sort files).  The "CREATE 
> > DATABASE" command will be changed to also take an INDEX location and 
> > temporary location. All 3 locations will default to the values from 
> > pg_shadow for the user that is creating the database.  Both the "CREATE 
> > TABLE" and "CREATE INDEX" commands will be changed to add "WITH 
> > LOCATION" optional argument (location will default to values from 
> > PG_DATABASE which were set by the "CREATE DATABASE" command).
> > 
> > The following system tables will be changed as follows
> > PG_SHADOW add dat_location, idx_location, tmp_location (all default to 
> > PG_DATA)
> > PG_DATABASE add dat_location, idx_location, tmp_location (all default 
> > to same from PG_SHADOW)
> > PG_CLASS add rellocation (default to dat_location for tables, 
> > idx_location for indexes from PG_DATABASE)
> > 
> > 
> > Add a GLOBAL table pg_locations to track valid locations
> > 
> > Add the following commands to manage locations
> > CREATE LOCATION locname PATH 'file system directory';
> > DROP LOCATION locname;   (this will have to look into each db to make 
> > sure that any objects are not using it.  Don't know how this will be 
> > done yet!)
> > 
> > I propose to change the names of the on disk directories from 999999 to 
> > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
> > PG_DATABASE).  A SYMLINK from 99999_INDEX and 99999_TEMP will be made 
> > back to 99999_DATA will be made so the WAL functions will continue to 
> > work.
> > 
> > 
> > Again from my earlier attempt at this patch, I believe this capability 
> > will not only improve performance (see my earlier emails.  Where 
> > depending on the type of disks the improvement was between 0% and 100% 
> > performance gain running pg_bench) but also give DBA's the flexibility 
> > to spread the data files over multiple disks without having to "hack" 
> > the system using symbolic links. 
> > 
> > 
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > 
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026





Re: Storage Location Patch Proposal for V7.3

From
Bruce Momjian
Date:
I think Jim has some very good points here.  What does his
implementation lack?  Seems pretty valuable to me.

---------------------------------------------------------------------------

Jim Buttafuoco wrote:
> All,
> 
> I still believe that postgresql needs this feature.   I have many postgresql
> systems that have over 500GB of data+indexes.  Using symbolic links is a BIG
> pain in the A??.  Every time I run vacuum I have to go and fix the links
> again.  Also I have many disks that are running out of space.  This patch
> would allow me the ability to move my tables and indexes around.   I
> personally don't see the difference between my patch and what people are
> calling "Tablespaces" .  Oracle's definition is "A group of files that contain
> database objects" ,  under my patch tablespaces and locations are the same
> thing except postgresql uses file system directories to contain the group of
> objects.  
> 
> To recap my patch (location = tablespace here)
> 
> Allow the DBA to create locations with a CREATE LOCATION command or CREATE
> TABLESPACE command if you like tablespace instead of LOCATION.
> 
> Then for DATABASES (and schemas when available)  CREATE DATABASE WITH
> DATA_LOCATION = XXX and INDEX_LOCATION  = YYY where XXX and YYY are the
> DEFAULT values for OBJECT creation if not LOCATION is given.
> 
> CREATE TABLE and CREATE INDEX will create tables and indexes in the defaults
> from the CREATE DATABASE/SCHEMA commands above.
> 
> CREATE TABLE WITH LOCATION=AAA and CREATE INDEX WITH LOCATION BBB would create
> the table/index with the alternate location (only if the location was created
> with a CREATE LOCATION command)
> 
> 
> The create table command would also have to be change to support primary key/
> unique index syntax.
> 
> create table SAMPLE
> (
>     c1    text primary key location CCC,
>     c2          text unique  location DDD
> );
> 
> 
> I hope this explains my patch better.   As  I said before and I believe this
> to be true,  This patch will enable the DBA to place tables/indexes on any
> disk either for performance and/or space reasons.  Also I believe this is
> another check off item for people looking at  postgresql when comparing with
> Oracle/Sybase/DB2 ... 
> 
> Thanks for your time
> Jim
> 
> 
> 
> 
> > Jim, I see now that you submitted a new version.  Folks, do we have a
> > direction for this patch.  Discussion of the patch is at:
> > 
> >     http://candle.pha.pa.us/cgi-bin/pgpatches2
> > 
> > ---------------------------------------------------------------------------
> > 
> > Jim Buttafuoco wrote:
> > > Hi all,  
> > > 
> > > The following is a description of a patch I am proposing for 7.3.  
> > > Please read and comment.
> > > 
> > > Thanks
> > > Jim
> > > 
> > > 
> > > This proposal covers the ability to allow a DBA (and general users) to 
> > > specify where a database and it's individual objects will reside.  I 
> > > propose to add a default data location, index and temporary locations 
> > > to the pg_shadow table to allow a DBA to specify locations for each 
> > > user when they create databases, tables and indexes or need temporary 
> > > disk storage (either for temporary tables or sort files).  The "CREATE 
> > > DATABASE" command will be changed to also take an INDEX location and 
> > > temporary location. All 3 locations will default to the values from 
> > > pg_shadow for the user that is creating the database.  Both the "CREATE 
> > > TABLE" and "CREATE INDEX" commands will be changed to add "WITH 
> > > LOCATION" optional argument (location will default to values from 
> > > PG_DATABASE which were set by the "CREATE DATABASE" command).
> > > 
> > > The following system tables will be changed as follows
> > > PG_SHADOW add dat_location, idx_location, tmp_location (all default to 
> > > PG_DATA)
> > > PG_DATABASE add dat_location, idx_location, tmp_location (all default 
> > > to same from PG_SHADOW)
> > > PG_CLASS add rellocation (default to dat_location for tables, 
> > > idx_location for indexes from PG_DATABASE)
> > > 
> > > 
> > > Add a GLOBAL table pg_locations to track valid locations
> > > 
> > > Add the following commands to manage locations
> > > CREATE LOCATION locname PATH 'file system directory';
> > > DROP LOCATION locname;   (this will have to look into each db to make 
> > > sure that any objects are not using it.  Don't know how this will be 
> > > done yet!)
> > > 
> > > I propose to change the names of the on disk directories from 999999 to 
> > > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from 
> > > PG_DATABASE).  A SYMLINK from 99999_INDEX and 99999_TEMP will be made 
> > > back to 99999_DATA will be made so the WAL functions will continue to 
> > > work.
> > > 
> > > 
> > > Again from my earlier attempt at this patch, I believe this capability 
> > > will not only improve performance (see my earlier emails.  Where 
> > > depending on the type of disks the improvement was between 0% and 100% 
> > > performance gain running pg_bench) but also give DBA's the flexibility 
> > > to spread the data files over multiple disks without having to "hack" 
> > > the system using symbolic links. 
> > > 
> > > 
> > > 
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 2: you can get off all lists at once with the unregister command
> > >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > > 
> > 
> > -- 
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 853-3000
> >   +  If your life is a hard drive,     |  830 Blythe Avenue
> >   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> 
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Storage Location Patch Proposal for V7.3

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I think Jim has some very good points here.  What does his
> implementation lack?

Forward compatibility to a future tablespace implementation.
If we do this, we'll be stuck with supporting this feature set,
not to mention this syntax; neither of which have garnered any
support from the assembled hackers.

I went back to look at TODO.detail/tablespaces, and find that it's
badly in need of editing.  Much of the discussion there is
back-and-forthing about the question of naming files by OID,
which is now a done deal.  But it is clear that people wanted to
have a notion of tablespaces as objects somewhat orthogonal to
databases.  I didn't see any support for hard-wiring tablespace
assignments on the basis of "tables here, indexes there", either.
        regards, tom lane


Re: Storage Location Patch Proposal for V7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I think Jim has some very good points here.  What does his
> > implementation lack?
> 
> Forward compatibility to a future tablespace implementation.
> If we do this, we'll be stuck with supporting this feature set,
> not to mention this syntax; neither of which have garnered any
> support from the assembled hackers.
> 
> I went back to look at TODO.detail/tablespaces, and find that it's
> badly in need of editing.  Much of the discussion there is
> back-and-forthing about the question of naming files by OID,

Agreed.

> which is now a done deal.  But it is clear that people wanted to
> have a notion of tablespaces as objects somewhat orthogonal to
> databases.  I didn't see any support for hard-wiring tablespace
> assignments on the basis of "tables here, indexes there", either.

OK, I read through it.  Wow, it was long.  Exactly what is missing from
the patch that he can add?  It is mostly having tablespaces independent
of databases?   There were so many proposals in there I am not sure
where it all landed.  The cleaned out version link to from the TODO list
should be slightly easier reading, but it clearly goes all over the
place.  I can try and read through it and distill down the ideas if that
would help.


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Storage Location Patch Proposal for V7.3

From
Thomas Lockhart
Date:
...
> Forward compatibility to a future tablespace implementation.
> If we do this, we'll be stuck with supporting this feature set,
> not to mention this syntax; neither of which have garnered any
> support from the assembled hackers.

The feature set (in some incarnation) is exactly something we should
have. "Tablespace" could mean almost anything, since (I recall that) we
are not slavishly copying the Oracle features having a similar name. The
syntax (or something similar) seems acceptable to me. I haven't looked
at the implementation itself.

So, I'll guess that the particular objection to this implementation is
along the lines of wanting to be able to manage tablespaces/locations as
a single entity? So that one could issue commands like (forgive the
syntax) "move tablespace xxx to yyy;" and be able to yank the entire
contents from one place to another in a single line?

Jim's patches don't explicitly tie the pieces residing in a single
location together. Is that the objection? In all other respects (and
perhaps in all respects period) it seems to be a good starting point at
least.

I know that you have said that you want to look at "tablespaces" for
7.3. If we get there with a feature set we all find acceptable, then
great. If we don't, then Jim's subset of features would be great to
have.

Comments?
                      - Thomas