Re: disk backups - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: disk backups
Msg-id 395CB9AE.7250D597@cupid.suninternet.com
In response to disk backups  (mikeo <mikeo@spectrumtelecorp.com>)
Responses Re: disk backups  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general

Tom Lane wrote:
>
> Martijn van Oosterhout <kleptog@cupid.suninternet.com> writes:
> > Is there a better way? Here pg_dumping the DB takes over half an hour
> > (mainly because pg_dump chews all available memory).
>
> pg_dump shouldn't be a performance hog if you are using the default
> COPY-based style of data export.  I'd only expect memory problems
> if you are using INSERT-based export (-d or -D switch to pg_dump).
> For now, the answer is "don't do that" ... at least not on big tables.

Aha! Thanks for that! Last time I asked here nobody answered...
So it only happens with an INSERT-based export; I didn't know
that (though I can't see why there would be a difference...)
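For anyone else following along, the difference shows up in the
dump file itself; roughly (table and data made up):

    -- default, COPY-based export (columns separated by a tab):
    COPY pages (id, body) FROM stdin;
    1	<p>hello</p>
    \.

    -- with -d/-D, one INSERT per row instead:
    INSERT INTO pages (id, body) VALUES (1, '<p>hello</p>');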

Yes, we are using -D, mainly because we've had "issues" with
the COPY-based export, i.e., it wouldn't read the resulting file
back in. Admittedly that was a while ago now and I haven't checked
since.

The data in question was a long text field containing (long)
snippets of HTML: quotes, backslashes, tabs, newlines, etc.,
the works. The INSERT-style dump never had a problem...
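For the record, COPY's text format is supposed to escape exactly
those characters: an embedded tab becomes \t, a newline \n, and a
backslash \\ (quotes pass through as-is), so a correct dump should
round-trip. Something like:

    COPY pages (id, body) FROM stdin;
    2	<a href="x">line one\nline two\tand a \\ for luck</a>
    \.

Presumably we hit a bug back then.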

> This could be fixed in either of two ways:
>
> 1. recode pg_dump to use DECLARE CURSOR and FETCH to grab table contents
> in reasonable-size chunks (instead of with an all-at-once SELECT);
>
> 2. add an API to libpq that allows a select result to be retrieved
> on-the-fly rather than accumulating it in libpq's memory.
>
> The second is more work but would be more widely useful.
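(Option 1 is easy enough to picture at the SQL level; inside a
transaction, with a made-up table name, something like:

    BEGIN;
    DECLARE dump_cur CURSOR FOR SELECT * FROM bigtable;
    FETCH 1000 FROM dump_cur;   -- repeat until no rows come back
    CLOSE dump_cur;
    COMMIT;

Option 2 would have to live inside libpq itself, so there is
nothing to sketch at the SQL level.)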

I was thinking of writing my own version of pg_dump that would
do that, but also allow specifying an ordering constraint, i.e.,
clustering. Maybe it would be better to just switch to the
other output format...
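(With the ordering bolted on, the cursor above would become
something like this, column made up:

    DECLARE dump_cur CURSOR FOR
        SELECT * FROM pages ORDER BY category_id;

then FETCH in chunks as before and the rows come out pre-clustered.)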

>
> However, it's not been much of a priority, since insert-based data
> export is so slow to reload that no sensible person uses it for
> big tables anyway ;-)

Well, there's slow, and then there's something that works. But thanks
for clearing this up, because it isn't mentioned anywhere...
--
Martijn van Oosterhout <kleptog@cupid.suninternet.com>
http://cupid.suninternet.com/~kleptog/
