Re: incremental dumps - Mailing list pgsql-general

From Jeff Janes
Subject Re: incremental dumps
Date
Msg-id CAMkU=1xy1-jOYBB=pCXiw2qFPZp-sq6u=vtT=U3TU6mxUQZFNQ@mail.gmail.com
Whole thread Raw
In response to incremental dumps  (hamann.w@t-online.de)
List pgsql-general
On Thu, Aug 1, 2013 at 1:59 AM,  <hamann.w@t-online.de> wrote:
> Hi,
> I want to store copies of our data on a remote machine as a security measure.

Can you describe what your security concerns are?  Are you worried
about long-lasting malicious tampering with the data that you need to
be able to recover from?  Simple loss of data from natural disaster?

> My first attempt was a full dump (which takes too long to upload)
> followed by diffs between the pgdump files.
> This provides readable / searchable versioned data (I could alway apply
> the diffs on the remote machine and search the text file, without having
> an instance of postgres running on that machine)

I think that optimizing with the intention of not using PostgreSQL is
probably the wrong approach.    You find it valuable to use PostgreSQL
on your production server, why would you not also find it valuable to
use it on the remote?  I like the ability to use pg_dump to get human
readable data, and I use it often.  But I usually start with a binary
image recovered to the point I want, and then take a "fresh" pg_dump
out of that for inspection, rather than trying to save pg_dumps at
every time that might be of interest.

>
> However, the diff files seem to be considerably larger than one would expect.
> One obvious part of the problem is the fact that diff shows old and new text,
> so e.g. changing the amount of stock for a product with a 1kB description
> would generate at least 2kb of text in the diff file.

Usually a 1kb product description would not be in the same table as a
current stock count would.  Anyway, there are diff tools that are not
line-oriented which could compress well on this type of data, but if
the rows are not in the same order between dumps, they do poorly.
There is some more discussion of this here:

http://www.postgresql.org/message-id/CAMkU=1z0+=M-2g-N3+y=d-QJu-qcYTGXbw_h1E5g5Gr+rCny6A@mail.gmail.com

Cheers,

Jeff


pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: c++ convert wchar_t to UTF-8 for DB
Next
From: John R Pierce
Date:
Subject: Re: Need Help