Thread: database files

database files

From
Gail Zacharias
Date:
I am investigating the possibility of using pgsql as the database in an application.  I have some unusual requirements
thatI'd like to ask you all about.  I apologize in advance if my terminology is a little "off", I'm not familiar with
pgsql(yet). 

I need to be able to move the database files, as normal user-visible files, between machines. I.e. given a database on
machineA, I want to be able to copy either a single file (ideally) or a single directory (less ideal but still ok) to,
say,a zip drive, bring it over to another machine (with pgsql also installed), start up my application and have it
accessthe copied database through pgsql. 

Is this sort of thing possible?  Is a database stored in a single file or multiple files?  Can the location of the
file(s)be controlled?  Are the files accessible and consistent while pgsql is running? I assume not all the time, but
isthere a reliable way to make them accessible (i.e. copyable) and consistent short of shutting down pgsql? 

Is the file format of the pgsql database files compatible between OS's?  E.g. could I take some database files from
Linuxand use them on Windows? 

Thanks in advance for any advice,

Gail Zacharias


Re: database files

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Gail Zacharias [mailto:gz@clozure.com]
> Sent: Wednesday, October 22, 2003 12:42 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] database files
>
>
> I am investigating the possibility of using pgsql as the
> database in an application.  I have some unusual requirements
> that I'd like to ask you all about.  I apologize in advance
> if my terminology is a little "off", I'm not familiar with
> pgsql (yet).
>
> I need to be able to move the database files, as normal
> user-visible files, between machines. I.e. given a database
> on machine A, I want to be able to copy either a single file
> (ideally) or a single directory (less ideal but still ok) to,
> say, a zip drive, bring it over to another machine (with
> pgsql also installed), start up my application and have it
> access the copied database through pgsql.
>
> Is this sort of thing possible?  Is a database stored in a
> single file or multiple files?  Can the location of the
> file(s) be controlled?  Are the files accessible and
> consistent while pgsql is running? I assume not all the time,
> but is there a reliable way to make them accessible (i.e.
> copyable) and consistent short of shutting down pgsql?
>
> Is the file format of the pgsql database files compatible
> between OS's?  E.g. could I take some database files from
> Linux and use them on Windows?

The generic way to accomplish what you want is with the COPY command.
http://developer.postgresql.org/docs/postgres/sql-copy.html

Re: database files

From
jearl@bullysports.com
Date:
Gail Zacharias <gz@clozure.com> writes:

> I am investigating the possibility of using pgsql as the database in
> an application.  I have some unusual requirements that I'd like to
> ask you all about.  I apologize in advance if my terminology is a
> little "off", I'm not familiar with pgsql (yet).

I think your terminology is fine.

> I need to be able to move the database files, as normal user-visible
> files, between machines. I.e. given a database on machine A, I want
> to be able to copy either a single file (ideally) or a single
> directory (less ideal but still ok) to, say, a zip drive, bring it
> over to another machine (with pgsql also installed), start up my
> application and have it access the copied database through pgsql.

The way to do this with PostgreSQL is to make a backup of the database
and then load it into the other machine.  For example on the master
database you would do:

pg_dumpall --clean --verbose > backup.sql

You would then put that backup.sql file on your zip disk or whatever
and carry it to your new machine where you would do something like:

psql -U postgres template1 -f backup.sql

> Is this sort of thing possible?  Is a database stored in a single
> file or multiple files?  Can the location of the file(s) be
> controlled?  Are the files accessible and consistent while pgsql is
> running? I assume not all the time, but is there a reliable way to
> make them accessible (i.e. copyable) and consistent short of
> shutting down pgsql?

Databases are stored in multiple files in a directory plus the log
files and whatnot are stored in another part of the directory
structure.  It is theoretically possible to shutdown your postmaster
and then copy the files to another location, but I wouldn't recommend
it.  pg_dumpall works well, and it is far more fullproof.

> Is the file format of the pgsql database files compatible between
> OS's?  E.g. could I take some database files from Linux and use them
> on Windows?

I don't know the answer to that, but I would be interested in finding
out.  My theory is that file formats and other arcana are far better
left to Tom Lane and the rest of the PostgreSQL hackers.  This is
especially true considering the fact that on disk formats change
between versions.

Did I mention that pg_dumpall will solve your problem handily?

> Thanks in advance for any advice,

Jason

Re: database files

From
Tom Lane
Date:
jearl@bullysports.com writes:
> Gail Zacharias <gz@clozure.com> writes:
>> Is the file format of the pgsql database files compatible between
>> OS's?  E.g. could I take some database files from Linux and use them
>> on Windows?

> I don't know the answer to that, but I would be interested in finding
> out.

I don't think there are any OS dependencies per se, but there are
certainly hardware dependencies (forget moving between Intel and PPC
without a dump/reload, for example).  And there are compiler
dependencies, so theoretically you could run into problems even for
two different systems on the same hardware platform.

I concur with the recommendation to use pg_dump scripts as the vehicle
for moving data.

            regards, tom lane

Re: database files

From
Paul Thomas
Date:
On 22/10/2003 20:41 Gail Zacharias wrote:
> I am investigating the possibility of using pgsql as the database in an
> application.  I have some unusual requirements that I'd like to ask you
> all about.  I apologize in advance if my terminology is a little "off",
> I'm not familiar with pgsql (yet).
>
> I need to be able to move the database files, as normal user-visible
> files, between machines. I.e. given a database on machine A, I want to be
> able to copy either a single file (ideally) or a single directory (less
> ideal but still ok) to, say, a zip drive, bring it over to another
> machine (with pgsql also installed), start up my application and have it
> access the copied database through pgsql.
>
> Is this sort of thing possible?  Is a database stored in a single file or
> multiple files?  Can the location of the file(s) be controlled?  Are the
> files accessible and consistent while pgsql is running? I assume not all
> the time, but is there a reliable way to make them accessible (i.e.
> copyable) and consistent short of shutting down pgsql?

AFAIK, each database has its own directory and each table or index has its
own file but that won't help you much as they're given numeric names names
on disk (I thinks they use the objects OID) and I doubt that anyone
outside of the core developers would have the knowledge to find out which
files to copy, copy them and then manually edit the system catalogs on the
target machine so that the data can be read. Plus of course, on a *nix
machine, the data and directories are accessible only to the postgres user!

The correct way to do this is using pg_dump which can dump either a whole
database or a single file and then restore onto the other machine using
either psql or pg_restore (which one you use depends on the options you
supply to pg_dump). pg_dump runs inside a transaction which ensures a
consistent view of the dumped data.
>
> Is the file format of the pgsql database files compatible between OS's?
> E.g. could I take some database files from Linux and use them on Windows?

The only issue there _might_ be is the newline character if you dump to
ascii files but you can simply run unix2dos on the dump file in that case.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: database files

From
Alvaro Herrera
Date:
> On 22/10/2003 20:41 Gail Zacharias wrote:
>
> >I need to be able to move the database files, as normal user-visible
> >files, between machines. I.e. given a database on machine A, I want to be
> >able to copy either a single file (ideally) or a single directory (less
> >ideal but still ok) to, say, a zip drive, bring it over to another
> >machine (with pgsql also installed), start up my application and have it
> >access the copied database through pgsql.
> >
> >Is this sort of thing possible?  Is a database stored in a single file or
> >multiple files?  Can the location of the file(s) be controlled?  Are the
> >files accessible and consistent while pgsql is running? I assume not all
> >the time, but is there a reliable way to make them accessible (i.e.
> >copyable) and consistent short of shutting down pgsql?

Yes, multiple, yes up to a point, no, no.

A possible mechanism would be:
- suppose you want to copy data from server A to server B
- server A is running
- server B is stopped
- checkpoint server A
- rsync the files from server A to server B
- stop the postmaster at A
- rsync again (should not take much time)
- start both postmasters

Note that between both rsyncs the data in server B is not usable (i.e.
it is corrupt).  You _have_ to do the last rsync with A's postmaster
stopped to make sure the files are right.

Note that you have to copy the whole PGDATA, including pg_clog and
pg_xlog.  This means server B cannot have anything beyond what is on
server A.

You should probably discard the pg_dump route and erServer before trying
to do this ...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Major Fambrough: You wish to see the frontier?
John Dunbar: Yes sir, before it's gone.

Re: database files

From
"scott.marlowe"
Date:
On Wed, 22 Oct 2003, Gail Zacharias wrote:

> I need to be able to move the database files, as normal user-visible
> files, between machines. I.e. given a database on machine A, I want to
> be able to copy either a single file (ideally) or a single directory
> (less ideal but still ok) to, say, a zip drive, bring it over to another
> machine (with pgsql also installed), start up my application and have it
> access the copied database through pgsql.

While you think this is the preferred method, for postgresql is most
certainly is not.  what you need to do is read up a bit on pg_dump and how
to use it to accomplish your goals.  For instance, suppose I have two
machines, A and B, and I want to copy the table accounts from the test
database on A to B.  Assuming that the test database exists, but the table
accounts doesn't, I can do this (Note these are all command line
programs, not psql):

pg_dump -h A test -t accounts |psql -h B test

Or, if I want to move a whole single database over:

createdb -h B dbname
pg_dump -h A dbname |psql -h B dbname

(This assumes the database dbname didn't exist.)

or, the biggie, assuming B is a freshly initdb'd database, and I want to
move ALL the databases from A to B:

pg_dumpall -h A|psql -h B

Moving individual database files around is a certifiably Bad idea.


Re: database files

From
Ken Godee
Date:
> On 22/10/2003 20:41 Gail Zacharias wrote:
>
>> I am investigating the possibility of using pgsql as the database in an
>> application.  I have some unusual requirements that I'd like to ask you
>> all about.  I apologize in advance if my terminology is a little "off",
>> I'm not familiar with pgsql (yet).
>>
>> I need to be able to move the database files, as normal user-visible
>> files, between machines. I.e. given a database on machine A, I want to be
>> able to copy either a single file (ideally) or a single directory (less
>> ideal but still ok) to, say, a zip drive, bring it over to another
>> machine (with pgsql also installed), start up my application and have it
>> access the copied database through pgsql.
>>
>> Is this sort of thing possible?  Is a database stored in a single file or
>> multiple files?  Can the location of the file(s) be controlled?  Are the
>> files accessible and consistent while pgsql is running? I assume not all
>> the time, but is there a reliable way to make them accessible (i.e.
>> copyable) and consistent short of shutting down pgsql?

Couldn't one ...
export PGDATA2 = /usr/local/database
create the original database in the PGDATA2
and then when you want to copy it, stop postgres,
recursively copy the database directory, start postgres.

On new machine...
export PGDATA2 = /usr/local/database
copy original database to new machine
start postrges

Would not postgres be able to connect to this
copied database on the new machine?




Re: database files

From
Christopher Browne
Date:
gz@clozure.com (Gail Zacharias) wrote:
> I am investigating the possibility of using pgsql as the database in
> an application.  I have some unusual requirements that I'd like to
> ask you all about.  I apologize in advance if my terminology is a
> little "off", I'm not familiar with pgsql (yet).

> I need to be able to move the database files, as normal user-visible
> files, between machines. I.e. given a database on machine A, I want
> to be able to copy either a single file (ideally) or a single
> directory (less ideal but still ok) to, say, a zip drive, bring it
> over to another machine (with pgsql also installed), start up my
> application and have it access the copied database through pgsql.
>
> Is this sort of thing possible?

Many things are possible.  Not all are sensible.  The approach you
seem to want to take appears to fit into the "not sensible" category.

> Is a database stored in a single file or multiple files?

Lots of files.

> Can the location of the file(s) be controlled?

Yes, to a degree, either by fancy footwork when the database is shut
down, or, in the case of specific data files, via how you create them.

> Are the files accessible and consistent while pgsql is running?

Only if you have some sort of logical volume manager around that can
copy a whole filesystem around atomically.

> I assume not all the time, but is there a reliable way to make them
> accessible (i.e. copyable) and consistent short of shutting down
> pgsql?

If you are storing all of the data atop some logical volume manager
system such as Veritas or Tru64 "AdvFs" or Linux LVM, then there is
probably a way, but I'm not sure there is any equivalent on Windows,
so it seems unlikely that this could be practical.

> Is the file format of the pgsql database files compatible between
> OS's?  E.g. could I take some database files from Linux and use them
> on Windows?

Not generally, no.

It sounds as though the things you are trying to do are more or less
the exact opposite of what is generally considered "reasonable usage."

If you're at clozure, you're doubtless aware of the notion of taking
different approaches with different languages.  Good Common Lisp code
isn't written the same way as colloquial Scheme which doesn't look at
all like colloquial C++ or Java.

There is an approach to doing this that _would_ provide consistent
copies, dumped into one file, of all of the data, that could indeed be
loaded onto another system without need to shut the database down.

Look at the documentation for pg_dump; that does what you _actually
want_, albeit not in the way you are asking to do it.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://www.ntlug.org/~cbbrowne/spiritual.html
"... the most important thing in the programming language is the name.  A
language  will  not succeed  without  a  good name.   I  have recently
invented  a very  good  name  and  now I   am looking for  a  suitable
language."  -- D. E. Knuth, 1967