Thread: Storage Location Patch Proposal for V7.3
Hi all, The following is a description of a patch I am proposing for 7.3. Please read and comment. Thanks Jim This proposal covers the ability to allow a DBA (and general users) to specify where a database and it's individual objects will reside. I propose to add a default data location, index and temporary locations to the pg_shadow table to allow a DBA to specify locations for each user when they create databases, tables and indexes or need temporary disk storage (either for temporary tables or sort files). The "CREATE DATABASE" command will be changed to also take an INDEX location and temporary location. All 3 locations will default to the values from pg_shadow for the user that is creating the database. Both the "CREATE TABLE" and "CREATE INDEX" commands will be changed to add "WITH LOCATION" optional argument (location will default to values from PG_DATABASE which were set by the "CREATE DATABASE" command). The following system tables will be changed as follows PG_SHADOW add dat_location, idx_location, tmp_location (all default to PG_DATA) PG_DATABASE add dat_location, idx_location, tmp_location (all default to same from PG_SHADOW) PG_CLASS add rellocation (default to dat_location for tables, idx_location for indexes from PG_DATABASE) Add a GLOBAL table pg_locations to track valid locations Add the following commands to manage locations CREATE LOCATION locname PATH 'file system directory'; DROP LOCATION locname; (this will have to look into each db to make sure that any objects are not using it. Don't know how this will be done yet!) I propose to change the names of the on disk directories from 999999 to 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from PG_DATABASE). A SYMLINK from 99999_INDEX and 99999_TEMP will be made back to 99999_DATA will be made so the WAL functions will continue to work. Again from my earlier attempt at this patch, I believe this capability will not only improve performance (see my earlier emails. Where depending on the type of disks the improvement was between 0% and 100% performance gain running pg_bench) but also give DBA's the flexibility to spread the data files over multiple disks without having to "hack" the system using symbolic links.
"Jim Buttafuoco" <jim@buttafuoco.net> writes: > I propose to add a default data location, index and temporary locations > to the pg_shadow table to allow a DBA to specify locations for each > user when they create databases, tables and indexes or need temporary > disk storage (either for temporary tables or sort files). Have you read any of the previous discussions about tablespaces? This seems to be tablespaces with an off-the-cuff syntax. I'd suggest taking a hard look at Oracle's tablespace facility and seeing how closely we want to duplicate that. > PG_SHADOW add dat_location, idx_location, tmp_location (all default to > PG_DATA) What does location have to do with users? > I propose to change the names of the on disk directories from 999999 to > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from > PG_DATABASE). No, that doesn't scale to arbitrary locations; furthermore it requires an unseemly amount of knowledge in low-level file access code about exactly what kind of object each table is. The symlinks should just be named after the OIDs of the locations' rows in pg_location. The direction I've been envisioning for this is that each table has a logical identification <pg_database OID>, <pg_class OID> as well as a physical identification <pg_location OID>, <relfilenode OID>. The path to access the table can be constructed entirely from the physical identification: $PGDATA/base/<pg_location OID>/<relfilenode OID>. One problem to be addressed if multiple databases can share a single physical location is how to prevent relfilenode collisions. Perhaps we could avoid the issue by adding another layer of subdirectories: $PGDATA/base/<pg_location OID>/<pg_database OID>/<relfilenode OID>. That is, each database would have a subdirectory within each location that it's ever used. (This would make DROP DATABASE a lot easier, among other things.) regards, tom lane
Tom, Yes, locations = tablespaces (I really don't care if we call them locaitons or tablespaces, I was just using LOCATIONS because that's what we have now...) is there a SQL standard for this???. As for locations and user, Under Oracle a user is assigned a default tablespace and a temporary tablespace via the "CREATE USER" command. Also "CREATE DATABASE" allows you to specify the SYSTEM tablespace where all objects will go unless a storage clause is added duration object creation. "CREATE TABLE" and "CREATE INDEX" both take a storage clause. As for the actual data file location, I believe under each loc oid we would have pg_port #/DB OID/pg_class OID might be the way to go. The example below has 3 tablespaces/locations PGDATA/DB1/DB2 PG_LOCATIONS (or PG_TABLESPACES) would have the following rows PGDATA | /usr/local/pgsql/data DB1 | /db1 DB2 | /db2 /usr/local/pgsql/data/5432/1 <<template1 ^----------- <<default location/tablespace ^--------- <<Default PG Port /db1/data/5432 ^-------------------------<< second location default PG PORT /db1/data/5432/65894834/99999999 ^------<< somedb/sometable /db1/data/5432/65894834/88888888 ^------<< somedb/someindex /db2/data/5432 ^-------------------------<< DB2 > "Jim Buttafuoco" <jim@buttafuoco.net> writes: > > I propose to add a default data location, index and temporary locations > > to the pg_shadow table to allow a DBA to specify locations for each > > user when they create databases, tables and indexes or need temporary > > disk storage (either for temporary tables or sort files). > > Have you read any of the previous discussions about tablespaces? > This seems to be tablespaces with an off-the-cuff syntax. I'd > suggest taking a hard look at Oracle's tablespace facility and > seeing how closely we want to duplicate that. > > > PG_SHADOW add dat_location, idx_location, tmp_location (all default to > > PG_DATA) > > What does location have to do with users? > > > I propose to change the names of the on disk directories from 999999 to > > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from > > PG_DATABASE). > > No, that doesn't scale to arbitrary locations; furthermore it requires > an unseemly amount of knowledge in low-level file access code about > exactly what kind of object each table is. The symlinks should just > be named after the OIDs of the locations' rows in pg_location. > > The direction I've been envisioning for this is that each table has > a logical identification <pg_database OID>, <pg_class OID> as well > as a physical identification <pg_location OID>, <relfilenode OID>. > The path to access the table can be constructed entirely from the > physical identification: $PGDATA/base/<pg_location OID>/<relfilenode OID>. > > One problem to be addressed if multiple databases can share a single > physical location is how to prevent relfilenode collisions. Perhaps > we could avoid the issue by adding another layer of subdirectories: > $PGDATA/base/<pg_location OID>/<pg_database OID>/<relfilenode OID>. > That is, each database would have a subdirectory within each location > that it's ever used. (This would make DROP DATABASE a lot easier, > among other things.) > > regards, tom lane > >
"Jim Buttafuoco" <jim@buttafuoco.net> writes: > As for the actual data file location, I believe under each loc oid we > would have pg_port #/DB OID/pg_class OID might be the way to go. Introducing pg_port into the paths would be a bad idea, since it would prevent restarting a postmaster with a different port number. I think if a DBA is running multiple postmasters, it's up to him to avoid pointing more than one of them at the same "location" directory. (Maybe we could enforce that with lock files? Not sure it's worth the trouble though.) regards, tom lane
Tom, I was just looking for a way to keep different postmasters out of each others way. If you think the DBA can do it great or if the DBA changes port have him/her fix the locations. Jim > "Jim Buttafuoco" <jim@buttafuoco.net> writes: > > As for the actual data file location, I believe under each loc oid we > > would have pg_port #/DB OID/pg_class OID might be the way to go. > > Introducing pg_port into the paths would be a bad idea, since it > would prevent restarting a postmaster with a different port number. > I think if a DBA is running multiple postmasters, it's up to him > to avoid pointing more than one of them at the same "location" > directory. (Maybe we could enforce that with lock files? Not > sure it's worth the trouble though.) > > regards, tom lane > >
Tom Lane wrote: > > "Jim Buttafuoco" <jim@buttafuoco.net> writes: > > I propose to add a default data location, index and temporary locations > > to the pg_shadow table to allow a DBA to specify locations for each > > user when they create databases, tables and indexes or need temporary > > disk storage (either for temporary tables or sort files). > > Have you read any of the previous discussions about tablespaces? > This seems to be tablespaces with an off-the-cuff syntax. I'd > suggest taking a hard look at Oracle's tablespace facility and > seeing how closely we want to duplicate that. Sorry I missed the conversation about tablespaces. One of the reasons I think Postgres is so usable is because it does not require the use of tablespace files. If by tablespace, you mean to declare a directory on a device as a tablespace, then cool. If you want to create tablespace "files" ala Oracle, you are heading toward an administration nightmare. Don't get me wrong, the ability to use a file as a tablespace would be kind of cool, i.e. you can probably use raw devices, but please to not abandon the way postgres currently works. On our Oracle server, we have run out of space on our tablespace files and not known it was coming. I am the system architect, not the DBA, so I don't have (nor want) direct control over the oracle database operation. Our newbe DBA did not make the table correctly, so they did not grow. Alas he was laid off, thus we were left trying to figure out what was happening. Postgres is easier to configure and get right. IMHO that is one of its very important strengths. It is almost trivial to get a working SQL system up and running which performs well.
mlw <markw@mohawksoft.com> writes: > Tom Lane wrote: >> This seems to be tablespaces with an off-the-cuff syntax. I'd >> suggest taking a hard look at Oracle's tablespace facility and >> seeing how closely we want to duplicate that. > Sorry I missed the conversation about tablespaces. One of the reasons I think > Postgres is so usable is because it does not require the use of tablespace > files. If by tablespace, you mean to declare a directory on a device as a > tablespace, then cool. If you want to create tablespace "files" ala Oracle, you > are heading toward an administration nightmare. No, that's not one of the parts of Oracle's facility that I want to duplicate. I think our idea of a tablespace/location/whatchacallit should just be a directory somewhere that table files can be created in. What seems worthwhile to steal from Oracle is the syntax that assigns particular tables to particular tablespaces. If we're compatible on syntax, that should ease porting of existing applications --- and as far as I can see at the moment, there's no reason *not* to be compatible at that level. I don't want to borrow Oracle's ideas about space management semantics, however. regards, tom lane
Mark, This is why I choose to use the term "LOCATION" instead of "TABLESPACE" . A "LOCATION" is a directory just like Postgresql has today. All the patch would add is the ability to put object under different "LOCATION" for the same database. Jim > Tom Lane wrote: > > > > "Jim Buttafuoco" <jim@buttafuoco.net> writes: > > > I propose to add a default data location, index and temporary locations > > > to the pg_shadow table to allow a DBA to specify locations for each > > > user when they create databases, tables and indexes or need temporary > > > disk storage (either for temporary tables or sort files). > > > > Have you read any of the previous discussions about tablespaces? > > This seems to be tablespaces with an off-the-cuff syntax. I'd > > suggest taking a hard look at Oracle's tablespace facility and > > seeing how closely we want to duplicate that. > > Sorry I missed the conversation about tablespaces. One of the reasons I think > Postgres is so usable is because it does not require the use of tablespace > files. If by tablespace, you mean to declare a directory on a device as a > tablespace, then cool. If you want to create tablespace "files" ala Oracle, you > are heading toward an administration nightmare. Don't get me wrong, the ability > to use a file as a tablespace would be kind of cool, i.e. you can probably use > raw devices, but please to not abandon the way postgres currently works. > > On our Oracle server, we have run out of space on our tablespace files and not > known it was coming. I am the system architect, not the DBA, so I don't have > (nor want) direct control over the oracle database operation. Our newbe DBA did > not make the table correctly, so they did not grow. Alas he was laid off, thus > we were left trying to figure out what was happening. > > Postgres is easier to configure and get right. IMHO that is one of its very > important strengths. It is almost trivial to get a working SQL system up and > running which performs well. > >
Jim Buttafuoco wrote: > > Mark, > > This is why I choose to use the term "LOCATION" instead of "TABLESPACE" > . A "LOCATION" is a directory just like Postgresql has today. All the > patch would add is the ability to put object under different "LOCATION" > for the same database. That is a very excellent point. While I am not in the circle that makes these decisions, I hope your words are heard. I understand the desire to stay with "standards" and it is impossible to deny defacto standards, but I do understand that defacto standards have to be challenged when they don't make sense. A prime example is PostgreSQL's inner/outer join syntax. It is incompatible with Oracle, but compatible with the documented SQL standard. Since "tablespace" is not part of the SQL standard, maybe it makes sense to define a more specific syntax. The term "location" makes sense, because it is not a tablespace as Oracle defines it. There is a real danger is trying to support a different interpretation of an existing "defacto" syntax, in that it will behave differently than expected.
> Since "tablespace" is not part of the SQL standard, maybe it makes sense to > define a more specific syntax. The term "location" makes sense, because it is > not a tablespace as Oracle defines it. It *is* an "OS managed tablespace" in terms of IBM DB2. Methinks the term "TABLESPACE" is perfect for PostgreSQL. The fact whether it is a directory, a file or even a raw device depends on how you create the tablespace. The point is, that the syntax for "create table" and "create index" can be compatible in this case, imho without confusing many. Not the "create tablespace" syntax, but that is imho not an issue. Andreas
I just wanted to affirm that Tom's description sounds like av very good way to go. You get the best of two worlds with the possibility to tune servers and yet still very easy to manage. i.e. If you don't need it, don't mess with it and everything will work just fine. I don't either see any reason not to use the Oracle syntax since it is so widely used and it works very well for those of us that also work on Oracle (but in postgresql without the extent and storage clauses). Regards Stefan Tom Lane wrote: > mlw <markw@mohawksoft.com> writes: > > Tom Lane wrote: > >> This seems to be tablespaces with an off-the-cuff syntax. I'd > >> suggest taking a hard look at Oracle's tablespace facility and > >> seeing how closely we want to duplicate that. > > > Sorry I missed the conversation about tablespaces. One of the reasons I think > > Postgres is so usable is because it does not require the use of tablespace > > files. If by tablespace, you mean to declare a directory on a device as a > > tablespace, then cool. If you want to create tablespace "files" ala Oracle, you > > are heading toward an administration nightmare. > > No, that's not one of the parts of Oracle's facility that I want to > duplicate. > > I think our idea of a tablespace/location/whatchacallit should just be > a directory somewhere that table files can be created in. What seems > worthwhile to steal from Oracle is the syntax that assigns particular > tables to particular tablespaces. If we're compatible on syntax, that > should ease porting of existing applications --- and as far as I can see > at the moment, there's no reason *not* to be compatible at that level. > I don't want to borrow Oracle's ideas about space management semantics, > however. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html
Stefan Rindeskar wrote: > > I just wanted to affirm that Tom's description sounds like av very good > way to go. > > You get the best of two worlds with the possibility to tune servers and > yet still > very easy to manage. i.e. If you don't need it, don't mess with it and > everything > will work just fine. > I don't either see any reason not to use the Oracle syntax since it is > so widely used > and it works very well for those of us that also work on Oracle (but in > postgresql > without the extent and storage clauses). > I absolutely agree with the concept of defining a location for data from within the database. No argument. The only two issues I can see are: (1) Do not require the use of files as table spaces ala Oracle. That is an admin nightmare. (Again, it would be cool, however, to be able to use table space files so that PostgreSQL could have raw access as long as it is not a requirement.) I don't think Tom is thinking about table space files, so I'm not worried. (2) I have a concern about expected behavior vs existing syntax. If PostgreSQL uses "create tablespace" in such a way that an Oracle DBA will expect it to work as Oracle does, it may cause a bit of confusion. We all know that "confusion" between an open source solution and a "defacto" solution is used as club.
I am at home now, Tomorrow I will get the exact Oracle syntax and post example commands for CREATE TABLESPACE, CREATE DATABASE, CREATE TABLE and CREATE INDEX, CREATE USER. Jim > > > Since "tablespace" is not part of the SQL standard, maybe it makes > sense to > > define a more specific syntax. The term "location" makes sense, > because it is > > not a tablespace as Oracle defines it. > > It *is* an "OS managed tablespace" in terms of IBM DB2. > Methinks the term "TABLESPACE" is perfect for PostgreSQL. > The fact whether it is a directory, a file or even a raw device > depends on how you create the tablespace. > > The point is, that the syntax for "create table" and "create index" > can be compatible in this case, imho without confusing many. > Not the "create tablespace" syntax, but that is imho not an issue. > > Andreas > >
On Mon, Nov 05, 2001 at 12:26:17PM -0500, Jim Buttafuoco allegedly wrote: > The example below has 3 tablespaces/locations PGDATA/DB1/DB2 > PG_LOCATIONS (or PG_TABLESPACES) would have the following rows > PGDATA | /usr/local/pgsql/data > DB1 | /db1 > DB2 | /db2 <SNIP> > /db1/data/5432 > ^-------------------------<< second location default PG PORT > /db1/data/5432/65894834/99999999 > ^------<< somedb/sometable > /db1/data/5432/65894834/88888888 > ^------<< somedb/someindex > > /db2/data/5432 > ^-------------------------<< DB2 Should data/ even be in there? /db2/5432 seems to be the correct value. Either that or change the location to /db2/data. Implicitly creating an extra directory isn't something I would like to happen, especially if it doesn't happen for PGDATA itself. My $.02, Mathijs -- And the beast shall be made legion. Its numbers shall be increased a thousand thousand fold. The din of a million keyboards like unto a great storm shall cover the earth, and the followers of Mammon shall tremble.
Jim, I see now that you submitted a new version. Folks, do we have a direction for this patch. Discussion of the patch is at: http://candle.pha.pa.us/cgi-bin/pgpatches2 --------------------------------------------------------------------------- Jim Buttafuoco wrote: > Hi all, > > The following is a description of a patch I am proposing for 7.3. > Please read and comment. > > Thanks > Jim > > > This proposal covers the ability to allow a DBA (and general users) to > specify where a database and it's individual objects will reside. I > propose to add a default data location, index and temporary locations > to the pg_shadow table to allow a DBA to specify locations for each > user when they create databases, tables and indexes or need temporary > disk storage (either for temporary tables or sort files). The "CREATE > DATABASE" command will be changed to also take an INDEX location and > temporary location. All 3 locations will default to the values from > pg_shadow for the user that is creating the database. Both the "CREATE > TABLE" and "CREATE INDEX" commands will be changed to add "WITH > LOCATION" optional argument (location will default to values from > PG_DATABASE which were set by the "CREATE DATABASE" command). > > The following system tables will be changed as follows > PG_SHADOW add dat_location, idx_location, tmp_location (all default to > PG_DATA) > PG_DATABASE add dat_location, idx_location, tmp_location (all default > to same from PG_SHADOW) > PG_CLASS add rellocation (default to dat_location for tables, > idx_location for indexes from PG_DATABASE) > > > Add a GLOBAL table pg_locations to track valid locations > > Add the following commands to manage locations > CREATE LOCATION locname PATH 'file system directory'; > DROP LOCATION locname; (this will have to look into each db to make > sure that any objects are not using it. Don't know how this will be > done yet!) > > I propose to change the names of the on disk directories from 999999 to > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from > PG_DATABASE). A SYMLINK from 99999_INDEX and 99999_TEMP will be made > back to 99999_DATA will be made so the WAL functions will continue to > work. > > > Again from my earlier attempt at this patch, I believe this capability > will not only improve performance (see my earlier emails. Where > depending on the type of disks the improvement was between 0% and 100% > performance gain running pg_bench) but also give DBA's the flexibility > to spread the data files over multiple disks without having to "hack" > the system using symbolic links. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Jim, I see now that you submitted a new version. Folks, do we have a > direction for this patch. I didn't like it at the time, and still don't. We are not that far away from having proper tablespaces, and I think that kluges that provide part of the functionality will just get in the way when it comes time to do it right. regards, tom lane
On Fri, 22 Feb 2002, Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Jim, I see now that you submitted a new version. Folks, do we have a > > direction for this patch. > > I didn't like it at the time, and still don't. We are not that far away > from having proper tablespaces, and I think that kluges that provide > part of the functionality will just get in the way when it comes time > to do it right. What kind of time frame is "not that far away"? For v7.3? If not, and someone can clarify what I'm understanding this patch will do, its essentially going to setup a directory structure of: data/base/<dboid>/<tbloid>.idx/indx ? If we aren't going to have tablespaces for v7.3, there we are talking 6->8 months before we do, and the above sounds like a reasonable interim solution for this ...
"Marc G. Fournier" <scrappy@hub.org> writes: > On Fri, 22 Feb 2002, Tom Lane wrote: >> I didn't like it at the time, and still don't. We are not that far away >> from having proper tablespaces, and I think that kluges that provide >> part of the functionality will just get in the way when it comes time >> to do it right. > What kind of time frame is "not that far away"? For v7.3? My guess is that any of the inner circle of hackers could make this happen with about a week's work. Whether someone will find time before 7.3 is unknown (particularly seeing that we haven't set a target date for 7.3). Personally, schemas are a higher priority for me ... regards, tom lane
All, I still believe that postgresql needs this feature. I have many postgresql systems that have over 500GB of data+indexes. Using symbolic links is a BIG pain in the A??. Every time I run vacuum I have to go and fix the links again. Also I have many disks that are running out of space. This patch would allow me the ability to move my tables and indexes around. I personally don't see the difference between my patch and what people are calling "Tablespaces" . Oracle's definition is "A group of files that contain database objects" , under my patch tablespaces and locations are the same thing except postgresql uses file system directories to contain the group of objects. To recap my patch (location = tablespace here) Allow the DBA to create locations with a CREATE LOCATION command or CREATE TABLESPACE command if you like tablespace instead of LOCATION. Then for DATABASES (and schemas when available) CREATE DATABASE WITH DATA_LOCATION = XXX and INDEX_LOCATION = YYY where XXX and YYY are the DEFAULT values for OBJECT creation if not LOCATION is given. CREATE TABLE and CREATE INDEX will create tables and indexes in the defaults from the CREATE DATABASE/SCHEMA commands above. CREATE TABLE WITH LOCATION=AAA and CREATE INDEX WITH LOCATION BBB would create the table/index with the alternate location (only if the location was created with a CREATE LOCATION command) The create table command would also have to be change to support primary key/ unique index syntax. create table SAMPLE (c1 text primary key location CCC,c2 text unique location DDD ); I hope this explains my patch better. As I said before and I believe this to be true, This patch will enable the DBA to place tables/indexes on any disk either for performance and/or space reasons. Also I believe this is another check off item for people looking at postgresql when comparing with Oracle/Sybase/DB2 ... Thanks for your time Jim > Jim, I see now that you submitted a new version. Folks, do we have a > direction for this patch. Discussion of the patch is at: > > http://candle.pha.pa.us/cgi-bin/pgpatches2 > > --------------------------------------------------------------------------- > > Jim Buttafuoco wrote: > > Hi all, > > > > The following is a description of a patch I am proposing for 7.3. > > Please read and comment. > > > > Thanks > > Jim > > > > > > This proposal covers the ability to allow a DBA (and general users) to > > specify where a database and it's individual objects will reside. I > > propose to add a default data location, index and temporary locations > > to the pg_shadow table to allow a DBA to specify locations for each > > user when they create databases, tables and indexes or need temporary > > disk storage (either for temporary tables or sort files). The "CREATE > > DATABASE" command will be changed to also take an INDEX location and > > temporary location. All 3 locations will default to the values from > > pg_shadow for the user that is creating the database. Both the "CREATE > > TABLE" and "CREATE INDEX" commands will be changed to add "WITH > > LOCATION" optional argument (location will default to values from > > PG_DATABASE which were set by the "CREATE DATABASE" command). > > > > The following system tables will be changed as follows > > PG_SHADOW add dat_location, idx_location, tmp_location (all default to > > PG_DATA) > > PG_DATABASE add dat_location, idx_location, tmp_location (all default > > to same from PG_SHADOW) > > PG_CLASS add rellocation (default to dat_location for tables, > > idx_location for indexes from PG_DATABASE) > > > > > > Add a GLOBAL table pg_locations to track valid locations > > > > Add the following commands to manage locations > > CREATE LOCATION locname PATH 'file system directory'; > > DROP LOCATION locname; (this will have to look into each db to make > > sure that any objects are not using it. Don't know how this will be > > done yet!) > > > > I propose to change the names of the on disk directories from 999999 to > > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from > > PG_DATABASE). A SYMLINK from 99999_INDEX and 99999_TEMP will be made > > back to 99999_DATA will be made so the WAL functions will continue to > > work. > > > > > > Again from my earlier attempt at this patch, I believe this capability > > will not only improve performance (see my earlier emails. Where > > depending on the type of disks the improvement was between 0% and 100% > > performance gain running pg_bench) but also give DBA's the flexibility > > to spread the data files over multiple disks without having to "hack" > > the system using symbolic links. > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 2: you can get off all lists at once with the unregister command > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
I think Jim has some very good points here. What does his implementation lack? Seems pretty valuable to me. --------------------------------------------------------------------------- Jim Buttafuoco wrote: > All, > > I still believe that postgresql needs this feature. I have many postgresql > systems that have over 500GB of data+indexes. Using symbolic links is a BIG > pain in the A??. Every time I run vacuum I have to go and fix the links > again. Also I have many disks that are running out of space. This patch > would allow me the ability to move my tables and indexes around. I > personally don't see the difference between my patch and what people are > calling "Tablespaces" . Oracle's definition is "A group of files that contain > database objects" , under my patch tablespaces and locations are the same > thing except postgresql uses file system directories to contain the group of > objects. > > To recap my patch (location = tablespace here) > > Allow the DBA to create locations with a CREATE LOCATION command or CREATE > TABLESPACE command if you like tablespace instead of LOCATION. > > Then for DATABASES (and schemas when available) CREATE DATABASE WITH > DATA_LOCATION = XXX and INDEX_LOCATION = YYY where XXX and YYY are the > DEFAULT values for OBJECT creation if not LOCATION is given. > > CREATE TABLE and CREATE INDEX will create tables and indexes in the defaults > from the CREATE DATABASE/SCHEMA commands above. > > CREATE TABLE WITH LOCATION=AAA and CREATE INDEX WITH LOCATION BBB would create > the table/index with the alternate location (only if the location was created > with a CREATE LOCATION command) > > > The create table command would also have to be change to support primary key/ > unique index syntax. > > create table SAMPLE > ( > c1 text primary key location CCC, > c2 text unique location DDD > ); > > > I hope this explains my patch better. As I said before and I believe this > to be true, This patch will enable the DBA to place tables/indexes on any > disk either for performance and/or space reasons. Also I believe this is > another check off item for people looking at postgresql when comparing with > Oracle/Sybase/DB2 ... > > Thanks for your time > Jim > > > > > > Jim, I see now that you submitted a new version. Folks, do we have a > > direction for this patch. Discussion of the patch is at: > > > > http://candle.pha.pa.us/cgi-bin/pgpatches2 > > > > --------------------------------------------------------------------------- > > > > Jim Buttafuoco wrote: > > > Hi all, > > > > > > The following is a description of a patch I am proposing for 7.3. > > > Please read and comment. > > > > > > Thanks > > > Jim > > > > > > > > > This proposal covers the ability to allow a DBA (and general users) to > > > specify where a database and it's individual objects will reside. I > > > propose to add a default data location, index and temporary locations > > > to the pg_shadow table to allow a DBA to specify locations for each > > > user when they create databases, tables and indexes or need temporary > > > disk storage (either for temporary tables or sort files). The "CREATE > > > DATABASE" command will be changed to also take an INDEX location and > > > temporary location. All 3 locations will default to the values from > > > pg_shadow for the user that is creating the database. Both the "CREATE > > > TABLE" and "CREATE INDEX" commands will be changed to add "WITH > > > LOCATION" optional argument (location will default to values from > > > PG_DATABASE which were set by the "CREATE DATABASE" command). > > > > > > The following system tables will be changed as follows > > > PG_SHADOW add dat_location, idx_location, tmp_location (all default to > > > PG_DATA) > > > PG_DATABASE add dat_location, idx_location, tmp_location (all default > > > to same from PG_SHADOW) > > > PG_CLASS add rellocation (default to dat_location for tables, > > > idx_location for indexes from PG_DATABASE) > > > > > > > > > Add a GLOBAL table pg_locations to track valid locations > > > > > > Add the following commands to manage locations > > > CREATE LOCATION locname PATH 'file system directory'; > > > DROP LOCATION locname; (this will have to look into each db to make > > > sure that any objects are not using it. Don't know how this will be > > > done yet!) > > > > > > I propose to change the names of the on disk directories from 999999 to > > > 99999_DATA, 99999_INDEX and 99999_TEMP (where 99999 is the OID from > > > PG_DATABASE). A SYMLINK from 99999_INDEX and 99999_TEMP will be made > > > back to 99999_DATA will be made so the WAL functions will continue to > > > work. > > > > > > > > > Again from my earlier attempt at this patch, I believe this capability > > > will not only improve performance (see my earlier emails. Where > > > depending on the type of disks the improvement was between 0% and 100% > > > performance gain running pg_bench) but also give DBA's the flexibility > > > to spread the data files over multiple disks without having to "hack" > > > the system using symbolic links. > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 2: you can get off all lists at once with the unregister command > > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 853-3000 > > + If your life is a hard drive, | 830 Blythe Avenue > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I think Jim has some very good points here. What does his > implementation lack? Forward compatibility to a future tablespace implementation. If we do this, we'll be stuck with supporting this feature set, not to mention this syntax; neither of which have garnered any support from the assembled hackers. I went back to look at TODO.detail/tablespaces, and find that it's badly in need of editing. Much of the discussion there is back-and-forthing about the question of naming files by OID, which is now a done deal. But it is clear that people wanted to have a notion of tablespaces as objects somewhat orthogonal to databases. I didn't see any support for hard-wiring tablespace assignments on the basis of "tables here, indexes there", either. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I think Jim has some very good points here. What does his > > implementation lack? > > Forward compatibility to a future tablespace implementation. > If we do this, we'll be stuck with supporting this feature set, > not to mention this syntax; neither of which have garnered any > support from the assembled hackers. > > I went back to look at TODO.detail/tablespaces, and find that it's > badly in need of editing. Much of the discussion there is > back-and-forthing about the question of naming files by OID, Agreed. > which is now a done deal. But it is clear that people wanted to > have a notion of tablespaces as objects somewhat orthogonal to > databases. I didn't see any support for hard-wiring tablespace > assignments on the basis of "tables here, indexes there", either. OK, I read through it. Wow, it was long. Exactly what is missing from the patch that he can add? It is mostly having tablespaces independent of databases? There were so many proposals in there I am not sure where it all landed. The cleaned out version link to from the TODO list should be slightly easier reading, but it clearly goes all over the place. I can try and read through it and distill down the ideas if that would help. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
... > Forward compatibility to a future tablespace implementation. > If we do this, we'll be stuck with supporting this feature set, > not to mention this syntax; neither of which have garnered any > support from the assembled hackers. The feature set (in some incarnation) is exactly something we should have. "Tablespace" could mean almost anything, since (I recall that) we are not slavishly copying the Oracle features having a similar name. The syntax (or something similar) seems acceptable to me. I haven't looked at the implementation itself. So, I'll guess that the particular objection to this implementation is along the lines of wanting to be able to manage tablespaces/locations as a single entity? So that one could issue commands like (forgive the syntax) "move tablespace xxx to yyy;" and be able to yank the entire contents from one place to another in a single line? Jim's patches don't explicitly tie the pieces residing in a single location together. Is that the objection? In all other respects (and perhaps in all respects period) it seems to be a good starting point at least. I know that you have said that you want to look at "tablespaces" for 7.3. If we get there with a feature set we all find acceptable, then great. If we don't, then Jim's subset of features would be great to have. Comments? - Thomas