Thread: database files
I am investigating the possibility of using pgsql as the database in an application. I have some unusual requirements thatI'd like to ask you all about. I apologize in advance if my terminology is a little "off", I'm not familiar with pgsql(yet). I need to be able to move the database files, as normal user-visible files, between machines. I.e. given a database on machineA, I want to be able to copy either a single file (ideally) or a single directory (less ideal but still ok) to, say,a zip drive, bring it over to another machine (with pgsql also installed), start up my application and have it accessthe copied database through pgsql. Is this sort of thing possible? Is a database stored in a single file or multiple files? Can the location of the file(s)be controlled? Are the files accessible and consistent while pgsql is running? I assume not all the time, but isthere a reliable way to make them accessible (i.e. copyable) and consistent short of shutting down pgsql? Is the file format of the pgsql database files compatible between OS's? E.g. could I take some database files from Linuxand use them on Windows? Thanks in advance for any advice, Gail Zacharias
> -----Original Message----- > From: Gail Zacharias [mailto:gz@clozure.com] > Sent: Wednesday, October 22, 2003 12:42 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] database files > > > I am investigating the possibility of using pgsql as the > database in an application. I have some unusual requirements > that I'd like to ask you all about. I apologize in advance > if my terminology is a little "off", I'm not familiar with > pgsql (yet). > > I need to be able to move the database files, as normal > user-visible files, between machines. I.e. given a database > on machine A, I want to be able to copy either a single file > (ideally) or a single directory (less ideal but still ok) to, > say, a zip drive, bring it over to another machine (with > pgsql also installed), start up my application and have it > access the copied database through pgsql. > > Is this sort of thing possible? Is a database stored in a > single file or multiple files? Can the location of the > file(s) be controlled? Are the files accessible and > consistent while pgsql is running? I assume not all the time, > but is there a reliable way to make them accessible (i.e. > copyable) and consistent short of shutting down pgsql? > > Is the file format of the pgsql database files compatible > between OS's? E.g. could I take some database files from > Linux and use them on Windows? The generic way to accomplish what you want is with the COPY command. http://developer.postgresql.org/docs/postgres/sql-copy.html
Gail Zacharias <gz@clozure.com> writes: > I am investigating the possibility of using pgsql as the database in > an application. I have some unusual requirements that I'd like to > ask you all about. I apologize in advance if my terminology is a > little "off", I'm not familiar with pgsql (yet). I think your terminology is fine. > I need to be able to move the database files, as normal user-visible > files, between machines. I.e. given a database on machine A, I want > to be able to copy either a single file (ideally) or a single > directory (less ideal but still ok) to, say, a zip drive, bring it > over to another machine (with pgsql also installed), start up my > application and have it access the copied database through pgsql. The way to do this with PostgreSQL is to make a backup of the database and then load it into the other machine. For example on the master database you would do: pg_dumpall --clean --verbose > backup.sql You would then put that backup.sql file on your zip disk or whatever and carry it to your new machine where you would do something like: psql -U postgres template1 -f backup.sql > Is this sort of thing possible? Is a database stored in a single > file or multiple files? Can the location of the file(s) be > controlled? Are the files accessible and consistent while pgsql is > running? I assume not all the time, but is there a reliable way to > make them accessible (i.e. copyable) and consistent short of > shutting down pgsql? Databases are stored in multiple files in a directory plus the log files and whatnot are stored in another part of the directory structure. It is theoretically possible to shutdown your postmaster and then copy the files to another location, but I wouldn't recommend it. pg_dumpall works well, and it is far more fullproof. > Is the file format of the pgsql database files compatible between > OS's? E.g. could I take some database files from Linux and use them > on Windows? I don't know the answer to that, but I would be interested in finding out. My theory is that file formats and other arcana are far better left to Tom Lane and the rest of the PostgreSQL hackers. This is especially true considering the fact that on disk formats change between versions. Did I mention that pg_dumpall will solve your problem handily? > Thanks in advance for any advice, Jason
jearl@bullysports.com writes: > Gail Zacharias <gz@clozure.com> writes: >> Is the file format of the pgsql database files compatible between >> OS's? E.g. could I take some database files from Linux and use them >> on Windows? > I don't know the answer to that, but I would be interested in finding > out. I don't think there are any OS dependencies per se, but there are certainly hardware dependencies (forget moving between Intel and PPC without a dump/reload, for example). And there are compiler dependencies, so theoretically you could run into problems even for two different systems on the same hardware platform. I concur with the recommendation to use pg_dump scripts as the vehicle for moving data. regards, tom lane
On 22/10/2003 20:41 Gail Zacharias wrote: > I am investigating the possibility of using pgsql as the database in an > application. I have some unusual requirements that I'd like to ask you > all about. I apologize in advance if my terminology is a little "off", > I'm not familiar with pgsql (yet). > > I need to be able to move the database files, as normal user-visible > files, between machines. I.e. given a database on machine A, I want to be > able to copy either a single file (ideally) or a single directory (less > ideal but still ok) to, say, a zip drive, bring it over to another > machine (with pgsql also installed), start up my application and have it > access the copied database through pgsql. > > Is this sort of thing possible? Is a database stored in a single file or > multiple files? Can the location of the file(s) be controlled? Are the > files accessible and consistent while pgsql is running? I assume not all > the time, but is there a reliable way to make them accessible (i.e. > copyable) and consistent short of shutting down pgsql? AFAIK, each database has its own directory and each table or index has its own file but that won't help you much as they're given numeric names names on disk (I thinks they use the objects OID) and I doubt that anyone outside of the core developers would have the knowledge to find out which files to copy, copy them and then manually edit the system catalogs on the target machine so that the data can be read. Plus of course, on a *nix machine, the data and directories are accessible only to the postgres user! The correct way to do this is using pg_dump which can dump either a whole database or a single file and then restore onto the other machine using either psql or pg_restore (which one you use depends on the options you supply to pg_dump). pg_dump runs inside a transaction which ensures a consistent view of the dumped data. > > Is the file format of the pgsql database files compatible between OS's? > E.g. could I take some database files from Linux and use them on Windows? The only issue there _might_ be is the newline character if you dump to ascii files but you can simply run unix2dos on the dump file in that case. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
> On 22/10/2003 20:41 Gail Zacharias wrote: > > >I need to be able to move the database files, as normal user-visible > >files, between machines. I.e. given a database on machine A, I want to be > >able to copy either a single file (ideally) or a single directory (less > >ideal but still ok) to, say, a zip drive, bring it over to another > >machine (with pgsql also installed), start up my application and have it > >access the copied database through pgsql. > > > >Is this sort of thing possible? Is a database stored in a single file or > >multiple files? Can the location of the file(s) be controlled? Are the > >files accessible and consistent while pgsql is running? I assume not all > >the time, but is there a reliable way to make them accessible (i.e. > >copyable) and consistent short of shutting down pgsql? Yes, multiple, yes up to a point, no, no. A possible mechanism would be: - suppose you want to copy data from server A to server B - server A is running - server B is stopped - checkpoint server A - rsync the files from server A to server B - stop the postmaster at A - rsync again (should not take much time) - start both postmasters Note that between both rsyncs the data in server B is not usable (i.e. it is corrupt). You _have_ to do the last rsync with A's postmaster stopped to make sure the files are right. Note that you have to copy the whole PGDATA, including pg_clog and pg_xlog. This means server B cannot have anything beyond what is on server A. You should probably discard the pg_dump route and erServer before trying to do this ... -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) Major Fambrough: You wish to see the frontier? John Dunbar: Yes sir, before it's gone.
On Wed, 22 Oct 2003, Gail Zacharias wrote: > I need to be able to move the database files, as normal user-visible > files, between machines. I.e. given a database on machine A, I want to > be able to copy either a single file (ideally) or a single directory > (less ideal but still ok) to, say, a zip drive, bring it over to another > machine (with pgsql also installed), start up my application and have it > access the copied database through pgsql. While you think this is the preferred method, for postgresql is most certainly is not. what you need to do is read up a bit on pg_dump and how to use it to accomplish your goals. For instance, suppose I have two machines, A and B, and I want to copy the table accounts from the test database on A to B. Assuming that the test database exists, but the table accounts doesn't, I can do this (Note these are all command line programs, not psql): pg_dump -h A test -t accounts |psql -h B test Or, if I want to move a whole single database over: createdb -h B dbname pg_dump -h A dbname |psql -h B dbname (This assumes the database dbname didn't exist.) or, the biggie, assuming B is a freshly initdb'd database, and I want to move ALL the databases from A to B: pg_dumpall -h A|psql -h B Moving individual database files around is a certifiably Bad idea.
> On 22/10/2003 20:41 Gail Zacharias wrote: > >> I am investigating the possibility of using pgsql as the database in an >> application. I have some unusual requirements that I'd like to ask you >> all about. I apologize in advance if my terminology is a little "off", >> I'm not familiar with pgsql (yet). >> >> I need to be able to move the database files, as normal user-visible >> files, between machines. I.e. given a database on machine A, I want to be >> able to copy either a single file (ideally) or a single directory (less >> ideal but still ok) to, say, a zip drive, bring it over to another >> machine (with pgsql also installed), start up my application and have it >> access the copied database through pgsql. >> >> Is this sort of thing possible? Is a database stored in a single file or >> multiple files? Can the location of the file(s) be controlled? Are the >> files accessible and consistent while pgsql is running? I assume not all >> the time, but is there a reliable way to make them accessible (i.e. >> copyable) and consistent short of shutting down pgsql? Couldn't one ... export PGDATA2 = /usr/local/database create the original database in the PGDATA2 and then when you want to copy it, stop postgres, recursively copy the database directory, start postgres. On new machine... export PGDATA2 = /usr/local/database copy original database to new machine start postrges Would not postgres be able to connect to this copied database on the new machine?
gz@clozure.com (Gail Zacharias) wrote: > I am investigating the possibility of using pgsql as the database in > an application. I have some unusual requirements that I'd like to > ask you all about. I apologize in advance if my terminology is a > little "off", I'm not familiar with pgsql (yet). > I need to be able to move the database files, as normal user-visible > files, between machines. I.e. given a database on machine A, I want > to be able to copy either a single file (ideally) or a single > directory (less ideal but still ok) to, say, a zip drive, bring it > over to another machine (with pgsql also installed), start up my > application and have it access the copied database through pgsql. > > Is this sort of thing possible? Many things are possible. Not all are sensible. The approach you seem to want to take appears to fit into the "not sensible" category. > Is a database stored in a single file or multiple files? Lots of files. > Can the location of the file(s) be controlled? Yes, to a degree, either by fancy footwork when the database is shut down, or, in the case of specific data files, via how you create them. > Are the files accessible and consistent while pgsql is running? Only if you have some sort of logical volume manager around that can copy a whole filesystem around atomically. > I assume not all the time, but is there a reliable way to make them > accessible (i.e. copyable) and consistent short of shutting down > pgsql? If you are storing all of the data atop some logical volume manager system such as Veritas or Tru64 "AdvFs" or Linux LVM, then there is probably a way, but I'm not sure there is any equivalent on Windows, so it seems unlikely that this could be practical. > Is the file format of the pgsql database files compatible between > OS's? E.g. could I take some database files from Linux and use them > on Windows? Not generally, no. It sounds as though the things you are trying to do are more or less the exact opposite of what is generally considered "reasonable usage." If you're at clozure, you're doubtless aware of the notion of taking different approaches with different languages. Good Common Lisp code isn't written the same way as colloquial Scheme which doesn't look at all like colloquial C++ or Java. There is an approach to doing this that _would_ provide consistent copies, dumped into one file, of all of the data, that could indeed be loaded onto another system without need to shut the database down. Look at the documentation for pg_dump; that does what you _actually want_, albeit not in the way you are asking to do it. -- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org'). http://www.ntlug.org/~cbbrowne/spiritual.html "... the most important thing in the programming language is the name. A language will not succeed without a good name. I have recently invented a very good name and now I am looking for a suitable language." -- D. E. Knuth, 1967