Thread: dealing with file size when archiving databases

dealing with file size when archiving databases

From: "Andrew L. Gould"
I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R.  Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media.  One of my
compressed database backups is greater than 1GB; and the results of a
gzipped pg_dumpall is approximately 3.5GB.  The processes for creating
the iso image and burning the image to DVD-R finish without any
problems; but the resulting file is unreadable/unusable.

My proposed solution is to modify my python script to:

1. use pg_dump to dump each database's tables individually, including
both the database and table name in the file name;
2. use 'pg_dumpall -g' to dump the global information; and
3. burn the backup directories, files and a recovery script to DVD-R.

The script will pipe pg_dump into gzip to compress the files.
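
For illustration, a shell equivalent of that plan might look roughly like
this (the database list, the table query and the backup path are
placeholders, not the actual script):

    # dump global objects (roles, tablespaces) once
    pg_dumpall -g | gzip > /backup/globals.sql.gz

    # dump each table of each database separately
    for db in mydb otherdb; do
        for tbl in $(psql -At -d "$db" -c \
                "select tablename from pg_tables where schemaname = 'public'"); do
            pg_dump -t "$tbl" "$db" | gzip > "/backup/$db.$tbl.sql.gz"
        done
    done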

My questions are:

1. Will 'pg_dumpall -g' dump everything not dumped by pg_dump?  Will I
be missing anything?
2. Does anyone foresee any problems with the solution above?

Thanks,

Andrew Gould

Re: dealing with file size when archiving databases

From: Tom Lane
"Andrew L. Gould" <algould@datawok.com> writes:
> I've been backing up my databases by piping pg_dump into gzip and
> burning the resulting files to a DVD-R.  Unfortunately, FreeBSD has
> problems dealing with very large files (>1GB?) on DVD media.  One of my
> compressed database backups is greater than 1GB; and the results of a
> gzipped pg_dumpall is approximately 3.5GB.  The processes for creating
> the iso image and burning the image to DVD-R finish without any
> problems; but the resulting file is unreadable/unusable.

Yech.  However, I think you are reinventing the wheel in your proposed
solution.  Why not just use split(1) to divide the output of pg_dump or
pg_dumpall into slices that the DVD software won't choke on?  See
notes at
http://developer.postgresql.org/docs/postgres/backup.html#BACKUP-DUMP-LARGE
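
For example, something along these lines (file names and the slice size
are just placeholders):

    # dump, compress, and slice into pieces the DVD software can handle
    pg_dump mydb | gzip | split -b 1000m - mydb.sql.gz.

    # to restore, reassemble the slices and feed them to psql
    cat mydb.sql.gz.* | gunzip | psql mydb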

            regards, tom lane

Re: dealing with file size when archiving databases

From: Alvaro Herrera
On Mon, Jun 20, 2005 at 09:28:51PM -0500, Andrew L. Gould wrote:
> I've been backing up my databases by piping pg_dump into gzip and
> burning the resulting files to a DVD-R.  Unfortunately, FreeBSD has
> problems dealing with very large files (>1GB?) on DVD media.  One of my
> compressed database backups is greater than 1GB; and the results of a
> gzipped pg_dumpall is approximately 3.5GB.  The processes for creating
> the iso image and burning the image to DVD-R finish without any
> problems; but the resulting file is unreadable/unusable.

Tom's response is certainly something to consider; also, note that if
you "pg_dump -t" each table separately, the dumps are not necessarily
consistent with one another, meaning that you could end up with an
unrecoverable backup if a transaction modifying two (foreign-key)
dependent tables happens to run after backing up one but before backing
up the other.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)

Re: dealing with file size when archiving databases

From: "Andrew L. Gould"
On Monday 20 June 2005 09:53 pm, Tom Lane wrote:
> "Andrew L. Gould" <algould@datawok.com> writes:
> > I've been backing up my databases by piping pg_dump into gzip and
> > burning the resulting files to a DVD-R.  Unfortunately, FreeBSD has
> > problems dealing with very large files (>1GB?) on DVD media.  One
> > of my compressed database backups is greater than 1GB; and the
> > results of a gzipped pg_dumpall is approximately 3.5GB.  The
> > processes for creating the iso image and burning the image to DVD-R
> > finish without any problems; but the resulting file is
> > unreadable/unusable.
>
> Yech.  However, I think you are reinventing the wheel in your
> proposed solution.  Why not just use split(1) to divide the output of
> pg_dump or pg_dumpall into slices that the DVD software won't choke
> on?  See notes at
> http://developer.postgresql.org/docs/postgres/backup.html#BACKUP-DUMP-LARGE
>
>             regards, tom lane

Thanks, Tom!  The split option actually fixes the problem, whereas my
"solution" only delays it until a single table gets too large.  Of
course, at that point, I should probably use something other than
DVDs.

Andrew Gould

Re: dealing with file size when archiving databases

From: "Andrew L. Gould"
On Monday 20 June 2005 10:14 pm, Alvaro Herrera wrote:
> On Mon, Jun 20, 2005 at 09:28:51PM -0500, Andrew L. Gould wrote:
> > I've been backing up my databases by piping pg_dump into gzip and
> > burning the resulting files to a DVD-R.  Unfortunately, FreeBSD has
> > problems dealing with very large files (>1GB?) on DVD media.  One
> > of my compressed database backups is greater than 1GB; and the
> > results of a gzipped pg_dumpall is approximately 3.5GB.  The
> > processes for creating the iso image and burning the image to DVD-R
> > finish without any problems; but the resulting file is
> > unreadable/unusable.
>
> Tom's response is certainly something to consider; also, note that if
> you "pg_dump -t" each table separately, the dumps are not necessarily
> consistent with one another, meaning that you could end up with an
> unrecoverable backup if a transaction modifying two (foreign-key)
> dependent tables happens to run after backing up one but before
> backing up the other.

Ouch!  I hadn't thought of that.  (Isn't it wonderful that I've only had
to restore dumps for upgrades?!)

Thanks.

Andrew Gould

Re: dealing with file size when archiving databases

From: Tino Wildenhain
On Monday, 20.06.2005, at 21:28 -0500, Andrew L. Gould wrote:
> I've been backing up my databases by piping pg_dump into gzip and
> burning the resulting files to a DVD-R.  Unfortunately, FreeBSD has
> problems dealing with very large files (>1GB?) on DVD media.  One of my
> compressed database backups is greater than 1GB; and the results of a
> gzipped pg_dumpall is approximately 3.5GB.  The processes for creating
> the iso image and burning the image to DVD-R finish without any
> problems; but the resulting file is unreadable/unusable.
>
> My proposed solution is to modify my python script to:
>
> 1. use pg_dump to dump each database's tables individually, including
> both the database and table name in the file name;
> 2. use 'pg_dumpall -g' to dump the global information; and
> 3. burn the backup directories, files and a recovery script to DVD-R.
>
> The script will pipe pg_dump into gzip to compress the files.

I'd use pg_dump -Fc instead. It is compressed, and you get some more
restore options for free (selective restore, for example).
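
A rough sketch of that approach (database and table names are
placeholders):

    # custom-format dump; compressed by default
    pg_dump -Fc mydb > mydb.dump

    # later: restore everything, or just a single table
    pg_restore -d mydb mydb.dump
    pg_restore -d mydb -t mytable mydb.dump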

> My questions are:
>
> 1. Will 'pg_dumpall -g' dump everything not dumped by pg_dump?  Will I
> be missing anything?
> 2. Does anyone foresee any problems with the solution above?

Yes, the individual files might still be too big to fit on one DVD.



Re: dealing with file size when archiving databases

From: Vivek Khera
On Jun 20, 2005, at 10:28 PM, Andrew L. Gould wrote:

> compressed database backups is greater than 1GB; and the results of a
> gzipped pg_dumpall is approximately 3.5GB.  The processes for creating
> the iso image and burning the image to DVD-R finish without any
> problems; but the resulting file is unreadable/unusable.

I ran into this as well.  Apparently FreeBSD will not read a large
file on an ISO file system even though on a standard UFS or UFS2 fs
it will read files larger than you can make :-).

What I used to do was "split -b 1024m my.dump my.dump-split-" to
create multiple files and burn those to the DVD.  To restore, you
"cat my.dump-split-?? | pg_restore" with appropriate options to
pg_restore.
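
In other words, something like this (assuming a custom-format dump and a
target database called mydb):

    # slice the dump into 1GB pieces and burn those to DVD
    split -b 1024m my.dump my.dump-split-

    # later, reassemble the pieces and restore into the target database
    cat my.dump-split-?? | pg_restore -d mydb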

My ultimate fix was to start burning and reading the DVD's on my
MacOS desktop instead, which can read/write these large files just
fine :-)


Vivek Khera, Ph.D.
+1-301-869-4449 x806