Replication/cloning: rsync vs modification dates? - Mailing list pgsql-general

From	Chris Angelico
Subject	Replication/cloning: rsync vs modification dates?
Date	July 16, 2012 05:28:50
Msg-id	CAPTjJmq9jwPCBVoffFNOmnVnXfvTrGEcTLKdEhyaL0G5dv_0tw@mail.gmail.com
Responses	Re: Replication/cloning: rsync vs modification dates?
List	pgsql-general

I'm speccing up a three-node database for reliability, making use of
streaming replication. It's all working, but I have a bit of a
performance concern.

Suppose a node dies and is removed from the cluster, but then returns
(say, a day or two later). I could, of course, utterly wipe the
existing data on that node and take a fresh copy from the master, but
that would entail transferring the entire content of the database. The
recommended option appears to be rsync, which saves on network
traffic, but still has to read and hash every byte of data.

Can the individual files' modification timestamps be relied upon? If
so, it'd potentially mean a lot of savings, as the directory entries
can be read fairly efficiently. I could still then use rsync to
transfer those files (so if it's only a small part that's changed, we
take advantage of its optimizations too).
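
To make the idea concrete, here is a minimal sketch of the directory scan, not a tested resync procedure. It assumes that file modification times are trustworthy, which is exactly the open question, and the function name and the cutoff parameter are made up for illustration.

```python
import os

def files_modified_since(root, cutoff_epoch):
    """Yield paths (relative to root) of files whose mtime is >= cutoff_epoch.

    Only directory entries and inode metadata are read; file contents
    are never opened. ASSUMPTION: mtimes reliably reflect every change.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except FileNotFoundError:
                # The file vanished while we were scanning a live directory.
                continue
            if st.st_mtime >= cutoff_epoch:
                yield os.path.relpath(full, root)
```

The resulting relative paths could be written one per line to a file and handed to rsync via its `--files-from` option, so that rsync's delta transfer still applies to each selected file. The cutoff would need to be safely earlier than the moment the node dropped out.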

This may be digging too deep into the internals to be dependable for
future versions. If so, I'd rather put the extra load on the servers
than risk a future upgrade breaking replication subtly.

Chris Angelico
