Re: Backup strategies with significant bytea data - Mailing list pgsql-general

From Leigh Dyer
Subject Re: Backup strategies with significant bytea data
Date
Msg-id 4B4F39A8.20AD.00BA.0@healthscope.com.au
Whole thread Raw
In response to Backup strategies with significant bytea data  ("Leigh Dyer" <Leigh.Dyer@healthscope.com.au>)
List pgsql-general
Ivan Voras wrote:
> Leigh Dyer wrote:
>> Hi,
>>
>> For years now I've simply backed up my databases by doing a nightly
>> pg_dump, but since we added the ability for users to import binary files
>> into our application, which are stored in bytea fields, the dump
>> sizes have gone through the roof — even with gzip compression, they're
>> significantly larger than the on-disk size of the database. My guess is
>> that this is due to the way that the binary data from the bytea fields
>> is encoded in the dump file when it's produced.
>
> Have you tried another dump format? E.g. "-F c" argument to pg_dump?

Hi Ivan,

I have tried the custom dump format, but it made no difference to the
backup sizes. Here are some numbers:

Database size on disk (according to pg_database_size): 1017MB
pg_dump size (standard format): 3475MB
pg_dump size (standard format, gzip-compressed): 1132MB
pg_dump size (custom format, i.e. "-F c" option): 1134MB

This testing was done on my development server, with the database
restored from a backup of the production server, so the database size on
disk might be a bit smaller than it would be on a running database with
some dead tuples in there. Still, there's a very big difference between
the raw data size and the dump sizes.
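For what it's worth, the guess about encoding overhead is plausible. A rough back-of-the-envelope sketch (illustrative only, not PostgreSQL's actual code, and assuming the traditional "escape" bytea output format): a non-printable byte is emitted as an octal escape \ooo, and the COPY text format then doubles the backslash, so a single input byte can occupy five characters in a plain-text dump.

```python
# Illustrative sketch (assumption: traditional "escape" bytea output as
# written into a plain-text COPY dump) of why text dumps inflate binary data.
import os

def dumped_length(byte: int) -> int:
    """Approximate characters one bytea byte occupies in a text dump."""
    if byte == 0x5C:            # backslash: \\ in bytea, \\\\ after COPY escaping
        return 4
    if 0x20 <= byte <= 0x7E:    # printable ASCII passes through as one character
        return 1
    return 5                    # \ooo in bytea, \\ooo after COPY escaping

data = os.urandom(1_000_000)    # stand-in for imported binary file content
factor = sum(dumped_length(b) for b in data) / len(data)
print(f"expansion factor for random bytes: ~{factor:.2f}x")
```

For random binary data that works out to roughly 3.5x, which is in the same ballpark as the 1017MB-to-3475MB growth above; gzip claws most of it back afterwards, but compressing the already-expanded text can't do better than compressing the raw bytes directly.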

Thanks
Leigh



