Thread: Alternative database locations are broken
With the new oid file naming, the alternative database location feature has disappeared. Not good. Also, is there any merit in keeping the #ifdef OLD_FILE_NAMING code path? I could probably go through and fix this, but I'm not fully aware about the larger plan of table spaces that's apparently sneaking in here (cf. RelFileNode.tblNode). -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
> With the new oid file naming, the alternative database > location feature has disappeared. Not good. > > Also, is there any merit in keeping the #ifdef > OLD_FILE_NAMING code path? No one. I've removed some of old code but not all, sorry. > I could probably go through and fix this, but I'm not fully > aware about the larger plan of table spaces that's apparently > sneaking in here (cf. RelFileNode.tblNode). This would be very appreciated. Table spaces will be in 7.2, hopefully. For the moment tblNode is just database OID (InvalidOid for shared relations). I think that to handle locations we could symlink catalogs - ln -s path_to_database_in_some_location .../base/DatabaseOid TIA, Vadim
Mikheev, Vadim writes: > > I could probably go through and fix this, but I'm not fully > > aware about the larger plan of table spaces that's apparently > > sneaking in here (cf. RelFileNode.tblNode). > > This would be very appreciated. Table spaces will be in 7.2, > hopefully. For the moment tblNode is just database OID > (InvalidOid for shared relations). I think we have a bit of a problem here. In order to restore the previously existing alternative location feature we'd somehow have to stick this information into RelFileNode. Firstly, alternative locations were referenced as text strings (usually environment variable names), which doesn't seem appropriate to stick into RelFileNode. We could make a separate system catalog (as I have suggested several times) to assign oids to these locations. But RelFileNode already claims to store the identity of the table space, being the database oid. This doesn't work because a location can contain more than one database. So effectively we'd need to redefine RelFileNode something like 'struct { locationid, dbid, relid }'. I'm afraid I feel incompetent here. RelFileNode is used in too many places that I don't understand. > I think that to handle locations we could symlink catalogs - ln -s > path_to_database_in_some_location .../base/DatabaseOid But that's a kludge. We ought to discourage people from messing with the storage internals. -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes: > But RelFileNode already claims to store the identity of the table space, > being the database oid. This doesn't work because a location can contain > more than one database. So effectively we'd need to redefine RelFileNode > something like 'struct { locationid, dbid, relid }'. No, I don't think so. The direction we want to head in is that RelFileNode should identify a tablespace (physical storage location) and a table. There isn't any need for a hardwired association between tablespaces and databases, at least not at this level. IIRC, the proposed design that Vadim was basing this on is that the actual path to a particular file would be$PGDATA/base/TABLESPACE/TABLE or for a segmented relation$PGDATA/base/TABLESPACE/TABLE.SEGMENT where TABLESPACE, TABLE, and SEGMENT are all numeric strings --- the first two come from RelFileNode and the last from the target block #. In this design, the tablespace directories appearing under $PGDATA/base can either be plain subdirectories, or symlinks to directories that live elsewhere. The low-level file access code doesn't know or care which. The questions you are asking seem to concern design of a user interface that lets these directories or symlinks get set up via SQL commands rather than direct manual intervention. I agree that's a good thing to have, but it's completely separate from the low-level access code. The current implementation has one physical-subdirectory tablespace per database, but I don't see any reason that multiple databases couldn't share a tablespace, or that tables in a database couldn't be scattered across multiple tablespaces. We just need to design the commands that let the dbadmin control this. BTW, we could eliminate special-casing for the shared system relations if we treat them as stored in another tablespace. For example, make $PGDATA/base/0 be a symlink to ../global, or just move the stuff currently in $PGDATA/global to a subdirectory of $PGDATA/base. >> I think that to handle locations we could symlink catalogs - ln -s >> path_to_database_in_some_location .../base/DatabaseOid > But that's a kludge. We ought to discourage people from messing with the > storage internals. It's not a kluge, it's a perfectly fine implementation. The only kluge here is if people have to reach in and establish such symlinks by hand. We want to set up a user interface that hides the implementation. regards, tom lane
> >> I think that to handle locations we could symlink catalogs - ln -s > >> path_to_database_in_some_location .../base/DatabaseOid > > > But that's a kludge. We ought to discourage people from messing with the > > storage internals. > > It's not a kluge, it's a perfectly fine implementation. The only kluge > here is if people have to reach in and establish such symlinks by hand. > We want to set up a user interface that hides the implementation. Agreed. And I don't see problems with handling this at CREATE DATABASE time. Create database dir in specified location, create symlink from base dir and remember location name in pg_database.datpath. Vadim
On Sat, Nov 04, 2000 at 10:09:16PM -0800, Vadim Mikheev wrote: > > >> I think that to handle locations we could symlink catalogs - ln -s > > >> path_to_database_in_some_location .../base/DatabaseOid > > > > > But that's a kludge. We ought to discourage people from messing with the > > > storage internals. > > > > It's not a kluge, it's a perfectly fine implementation. The only kluge > > here is if people have to reach in and establish such symlinks by hand. > > We want to set up a user interface that hides the implementation. > > Agreed. And I don't see problems with handling this at CREATE DATABASE > time. Create database dir in specified location, create symlink from > base dir and remember location name in pg_database.datpath. > Hmm, I know NT's not really a target, supported OS, but enshrining symlinks in a the design of a backend feature makes it really difficult to keep even the semblance of support. Vadim's work _finally_ stomped the mixed case tablename bug ("Test" and "test" would collide because of NTFSi being case insensitive). Symlinks are, I think, only supported via a Cygwin kludge. 'Course, one could argue that running pgsql via Cygwin is all a big kludge. I'm not even sure why I keep coming to the defense of the NT port: I'm not using it myself. I keep getting the feeling that there's a real opportunity there: get pgsql onto developer's NT boxes, when their projects need a real database, rather than springing for an MS-SQL or Oracle license. Makes moving over to a _real_ operating system (when they start to notice those stability problems) that much easier. But seriously, there was a long thread concerning the appropriateness of using symlinks to manage storage, which I don't recall as coming to a conclusion. Admittedly, the opinion of those who take the bull by the horns and actually write the code matters more (rough concensus and working code, as they say). Ross -- Open source code is like a natural resource, it's the result of providing food and sunshine to programmers, and then staying out of their way. [...] [It] is not going away because it has utility for both the developers and users independent of economic motivations. Jim Flynn, Sunnyvale, Calif.
"Ross J. Reedstrom" <reedstrm@rice.edu> writes: > Hmm, I know NT's not really a target, supported OS, but enshrining > symlinks in a the design of a backend feature makes it really difficult > to keep even the semblance of support. As long as we don't require symlinks to exist for standard setups, I have no trouble at all with decreeing that alternate database locations won't work on NT. regards, tom lane
On Mon, Nov 06, 2000 at 12:13:31PM -0500, Tom Lane wrote: > "Ross J. Reedstrom" <reedstrm@rice.edu> writes: > > Hmm, I know NT's not really a target, supported OS, but enshrining > > symlinks in a the design of a backend feature makes it really difficult > > to keep even the semblance of support. > > As long as we don't require symlinks to exist for standard setups, > I have no trouble at all with decreeing that alternate database > locations won't work on NT. Just tested 'ln -s' under Cygwin bash, and at least less and cat follow the link, notepad.exe and wordpad.exe open the link file. Checking the Cygwin FAQ, I see that this was a red herring: postgres, being a cygwin app, should follow symlinks just fine. Ross -- Open source code is like a natural resource, it's the result of providing food and sunshine to programmers, and then staying out of their way. [...] [It] is not going away because it has utility for both the developers and users independent of economic motivations. Jim Flynn, Sunnyvale, Calif.
Okay, so we'll do the symlinks. CREATE DATABASE xxx WITH LOCATION='/else/where'; will clone ('cp -r') template1 in /else/where/base/<id> and create a symlink to there from $PGDATA/base/<id>. The '/else/where' location will be stored in pg_database.datpath. How do we control the allowed paths? Should we continue with the environment variables? Perhaps a config option listing the allowed directories? A system catalog? Somehow I also get the feeling that pg_dumpall should be saving these paths... -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes: > How do we control the allowed paths? Should we continue with the > environment variables? Perhaps a config option listing the allowed > directories? A system catalog? The environment variables are a pretty sucky mechanism, IMHO; an installation-wide catalog would be nicer. HOWEVER: I do not think it's reasonable to try to make that happen for 7.1, considering how close we are to beta. So I recommend that we continue to base allowed paths on environment variables for this release. > Somehow I also get the feeling that pg_dumpall should be saving these > paths... Yup, probably so. If you stick the LOCATION string into pg_database.datpath (which no longer has any other use) then it'd be easy to make pg_dumpall do so. regards, tom lane