Re: CLONE DATABASE (with copy on write?) - Mailing list pgsql-general

From Clark C. Evans
Subject Re: CLONE DATABASE (with copy on write?)
Date
Msg-id 1321196860.2745.140660998306237@webmail.messagingengine.com
Whole thread Raw
In response to Re: CLONE DATABASE (with copy on write?)  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: [SPAM?]: Re: CLONE DATABASE (with copy on write?)  (Simon Riggs <simon@2ndQuadrant.com>)
Re: CLONE DATABASE (with copy on write?)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Sunday, November 13, 2011 7:33 AM, "Simon Riggs"
<simon@2ndQuadrant.com> wrote:
> On Sat, Nov 12, 2011 at 9:40 PM, Clark C. Evans <cce@clarkevans.com>
> > [We] should be using "CREATE DATABASE ... WITH TEMPLATE".
> > However, this has two big disadvantages.  First, it only works
> > if you can kick the users off the clone.  Secondly, it still
> > takes time, uses disk space, etc.  We have some big databases.
>
> An interesting proposal. Thanks for taking the time to raise this.

Thank you for responding Simon.

> The existing situation is that you need to either:
> 1) quiesce the database so it can be copied locally
> 2) take a hot backup to create a clone on another server
>
> (1) currently involves disconnection. Would a command to quiesce
> sessions without disconnection be useful? We could get sessions to
> sleep until woken after the copy. With large databases we would still
> need to copy while sessions sleep to ensure a consistent database
> after the copy.

Could their be a way to put the database in "read only" mode,
where it rejects all attempts to change database state with an
appropriate application level error message?  We could then
update our application to behave appropriately while the copy
is being performed.   Something like this could be broadly
useful in other contexts as well, for example, having a replica
that you brought up for reporting purposes.

Even so, the CREATE DATABASE... WITH TEMPLATE still has a set of
additional issues with it.  It ties up the hard drive with activity
and then extra space while it duplicates data.  Further, it causes
the shared memory cache to be split between the original and the
replica, this causes both databases to be much slower.  Finally,
it creates a ton of WAL traffic (perhaps we could suspend this?)

> Is (2) a problem for you? In what way?

Due to our configuration, yes.  Being able to CLONE the
database in the same cluster is much preferred.  Our user
configuration, deliberately, does not involve hot backups.

Hot backups to another server won't work for us since our
servers are encrypted and isolated behind client firewalls.
Data that leaves the box has to be encrypted where the
decrypt key is only available upon hardware failure, etc.
Our upstream pipe isn't huge... which is why the WAL traffic
for backups is also problematic.

Perhaps we could create two PostgreSQL clusters on each server.
One of them would be production, the other would be for staging.
This involves some logistics...  the advantage of this approach
is that we could limit resource usage on the slave and turn off
backups on it, reducing our disk usage and WAL traffic.   We'd
keep shared memory on the slave to a minimum.   This solution
still chews up 2x disk space and doubles the disk activity.

Could "WITH TEMPLATE" reach into another cluster's storage?

Best,

Clark


pgsql-general by date:

Previous
From: Andy Colson
Date:
Subject: Re: Incremental backup with RSYNC or something?
Next
From: Phoenix Kiula
Date:
Subject: Re: Incremental backup with RSYNC or something?