Re: backup and restore questions - Mailing list pgsql-general

From scott.marlowe
Subject Re: backup and restore questions
Msg-id Pine.LNX.4.33.0402230908330.28821-100000@css120.ihs.com
In response to backup and restore questions  ("Sally Sally" <dedeb17@hotmail.com>)
List pgsql-general
On Fri, 20 Feb 2004, Sally Sally wrote:

> Thank you all for replying. I appreciate the tips. Apologies to those who
> were offended by the html formatting.
> Scott, quick question. The reason I assumed insert would be "safer" than
> copy is because the docs say that in the case of copy it fails on a single
> corrupted row whereas insert won't?

Right.  What that means in plain terms, though, is that a single bad row
causes the entire import of a table to fail, while individual inserts,
each handled in its own transaction, can fail individually.

You can, however, edit the dump, or extract a portion of it, and wrap it in
a begin / commit pair.  Note that PostgreSQL will not commit any transaction
that hit an error, so you don't have to worry about it accidentally
committing if the data errors out.
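
As a rough illustration of the wrapping step, here is a minimal Python
sketch that takes statements pulled out of a dump and emits them as one
begin / commit script.  The function name, the table "items" and the sample
statements are all made up for the example; it only builds text and does
not talk to a server.

```python
def wrap_in_transaction(statements):
    """Return SQL text that runs the given statements as one transaction."""
    body = []
    for stmt in statements:
        stmt = stmt.strip()
        if not stmt:
            continue  # skip blank lines from the dump excerpt
        if not stmt.endswith(";"):
            stmt += ";"  # make sure every statement is terminated
        body.append(stmt)
    return "\n".join(["BEGIN;"] + body + ["COMMIT;"]) + "\n"


# Hypothetical excerpt from a dump made with INSERT statements.
excerpt = [
    "INSERT INTO items (id, name) VALUES (1, 'a')",
    "INSERT INTO items (id, name) VALUES (2, 'b');",
]
script = wrap_in_transaction(excerpt)
print(script)
```

Feeding the resulting script to psql means that if any statement in it
fails, none of its rows are kept.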

Also, collecting as many inserts as possible in a transaction will
generally make PostgreSQL faster, up to a point.  While there's no great
gain in inserting any more than a few thousand rows at a time, there's no
real harm in inserting many more (millions).  Unlike Oracle, which uses
rollback segments, PostgreSQL just uses free disk space to add new
tuples, so there's no real-world limit to the size of your transactions,
except that other clients cannot see the rows until the transaction
commits, which may be an issue if you need to see the data as it comes in.
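
If you want the data to show up in chunks rather than all at once, one
option is to split the inserts into batches of a few thousand and wrap each
batch in its own transaction.  A minimal Python sketch of that, again only
generating text; the function name, the batch size of 1000 and the "items"
table are assumptions for the example, not anything taken from a real dump.

```python
def batched_transactions(statements, batch_size=1000):
    """Yield SQL scripts, each wrapping up to batch_size statements
    in its own BEGIN / COMMIT pair."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(statements), batch_size):
        chunk = statements[start:start + batch_size]
        yield "BEGIN;\n" + "\n".join(chunk) + "\nCOMMIT;\n"


# Hypothetical input: 2500 single-row inserts.
inserts = [f"INSERT INTO items (id) VALUES ({i});" for i in range(2500)]
scripts = list(batched_transactions(inserts, batch_size=1000))
batch_sizes = [s.count("INSERT INTO") for s in scripts]
print(batch_sizes)
```

The trade-off is that a bad row now only throws away its own batch, so the
load is no longer all-or-nothing for the whole table.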

In terms of importing, it may often be that you just want the good rows,
dump the bad, and move on.  If this is the case, individual inserts are
the best choice.  They are also fairly slow, because PostgreSQL
must set up and tear down a transaction for each row.
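
To make the "keep the good rows, dump the bad" pattern concrete, here is a
small Python sketch that commits each row separately and collects the ones
that fail.  It uses the sqlite3 module from the standard library purely as
a stand-in so that it runs without a server; that is an assumption of the
example, and with PostgreSQL you would use a PostgreSQL driver instead.
The table and the sample rows are invented.

```python
import sqlite3


def load_rows_individually(conn, rows):
    """Insert each row in its own transaction; return the rows that failed."""
    rejected = []
    for row in rows:
        try:
            conn.execute("INSERT INTO items (id, name) VALUES (?, ?)", row)
            conn.commit()      # one transaction per row
        except sqlite3.IntegrityError:
            conn.rollback()    # discard only the failing row
            rejected.append(row)
    return rejected


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Two good rows, one with a missing name, one with a duplicate key.
rows = [(1, "a"), (2, None), (1, "dup"), (3, "c")]
rejected = load_rows_individually(conn, rows)
loaded = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
print(loaded, rejected)
```

The per-row commit is exactly the overhead that makes this approach slow
on a large table.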

You may also have data where every row must go in, or you don't want any of
it.  If this is the case, either the copy command or a series of inserts
inside the same transaction is the best choice.  These are also
the fastest, with copy slightly outperforming the inserts, at least in the
past.  I haven't really tested one against the other lately because with
7.4 it's all so damned fast that it only takes about 15 minutes to back up
or restore our whole database.

