PITR - "Rewind to snapshot" scheme - Mailing list pgsql-general

From Martin Langhoff
Subject PITR - "Rewind to snapshot" scheme
Date
Msg-id 46a038f90704160132p288707aaidcb28529537429a3@mail.gmail.com
Responses Re: PITR - "Rewind to snapshot" scheme  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: PITR - "Rewind to snapshot" scheme  (Richard Huxton <dev@archonet.com>)
List pgsql-general
I have been following and experimenting a bit with PITR for a while,
and I wonder whether it is practical to use the PITR hooks to roll
back the database to a known state. The scenario is that I am
developing a script that will be massaging data in a medium-sized
database. A pg_restore of the pristine data takes ~35 minutes to
complete. If I can take a snapshot right after pg_restore, and use it
to later "rewind" to that point, I'll save 35 minutes every time I
need to test it.

The dev box where Pg runs has plenty of disk space to spare - and
it'll be a dedicated Pg instance. I've already raised wal_buffers to
20000 and also turned off fsync.

So my back-of-the-envelope plan is to

 - run pg_restore
 - set up WAL archiving so that the logs aren't deleted
 - pg_start_backup('label'); cp -pr pgdata pgdata-snapshot ;
pg_stop_backup()
 - somehow remember the transaction identifier
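
The snapshot steps above can be sketched as a small shell function.
This is only a sketch under stated assumptions: take_snapshot and its
arguments are hypothetical names, WAL archiving is assumed to be
configured already, and psql is assumed to connect as a superuser.
It uses pg_start_backup() and pg_stop_backup() as named in this
post; recent PostgreSQL releases renamed these functions.

```shell
# Hypothetical helper: copy the data directory while the server is in
# backup mode. Usage: take_snapshot PGDATA SNAPSHOT [LABEL]
take_snapshot() {
    pgdata=$1
    snapshot=$2
    label=${3:-pristine}

    # Refuse to reuse an existing target: cp -pr into an existing
    # directory would nest the copy inside it.
    if [ -e "$snapshot" ]; then
        echo "snapshot already exists: $snapshot" >&2
        return 1
    fi

    # Enter backup mode so the file-level copy can be made consistent
    # by replaying archived WAL later.
    psql -At -c "SELECT pg_start_backup('$label');" || return 1

    cp -pr "$pgdata" "$snapshot"
    copy_status=$?

    # Always leave backup mode, even if the copy failed.
    psql -At -c "SELECT pg_stop_backup();" || return 1

    return $copy_status
}
```

It would be called as, for example,
take_snapshot /path/to/pgdata /path/to/pgdata-snapshot
with both paths being placeholders.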

At this stage, I can run my data-garbling script, and to "rewind" I
should be able to

 - stop Pg
 - rm -rf pgdata; cp -pr pgdata-snapshot pgdata
 - install an appropriate recovery.conf in pgdata that stops at the
correct transaction identifier
 - start Pg
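
A minimal sketch of the rewind steps follows. The function name
rewind_to_snapshot, its arguments and the archive directory are
hypothetical; restore_command and recovery_target_xid are the
recovery.conf settings of the releases current at the time of this
post (later releases moved recovery settings out of recovery.conf).
The old data directory is removed before copying, since cp -pr into
an existing pgdata would nest the snapshot inside it, and
recovery.conf is written after the copy because it lives inside the
data directory.

```shell
# Hypothetical helper.
# Usage: rewind_to_snapshot PGDATA SNAPSHOT ARCHIVE_DIR TARGET_XID
rewind_to_snapshot() {
    pgdata=$1
    snapshot=$2
    archive=$3
    target_xid=$4

    if [ ! -d "$snapshot" ]; then
        echo "no snapshot at $snapshot" >&2
        return 1
    fi

    pg_ctl stop -D "$pgdata" -m fast || return 1

    # Discard the garbled cluster and put the snapshot in its place.
    rm -rf "$pgdata"
    cp -pr "$snapshot" "$pgdata" || return 1

    # The snapshot was copied from a running server, so it may carry
    # a stale pid file.
    rm -f "$pgdata/postmaster.pid"

    # Tell recovery where the archived WAL is and where to stop.
    cat > "$pgdata/recovery.conf" <<EOF
restore_command = 'cp $archive/%f "%p"'
recovery_target_xid = '$target_xid'
EOF

    pg_ctl start -D "$pgdata"
}
```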

Would something like this work? My only worries at the moment seem trivial:

 - getting the transaction identifier
 - pruning the non-current timelines to avoid the archived logfiles
from eating me alive
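
For the pruning worry, one possible approach relies on the fact that
archived WAL segment and timeline history file names begin with the
timeline ID as 8 hex digits. The sketch below only lists the files
that belong to timelines other than the one to keep; it deletes
nothing. The function name and the archive path are hypothetical,
and whether the listed files are really safe to remove depends on
the setup.

```shell
# Hypothetical helper: print archived files whose timeline prefix
# differs from KEEP_TLI (8 hex digits, e.g. 00000001).
# Usage: list_other_timelines ARCHIVE_DIR KEEP_TLI
list_other_timelines() {
    archive=$1
    keep_tli=$2

    for f in "$archive"/*; do
        name=${f##*/}
        case $name in
            # Files on the timeline we want to keep.
            "$keep_tli"*) ;;
            # Anything else that starts with 8 hex digits belongs to
            # another timeline (segments and .history files).
            [0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]*)
                echo "$f" ;;
        esac
    done
}
```

The output could be reviewed by hand before anything is removed,
for example with list_other_timelines /path/to/archive 00000001.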

cheers


martin
