Re: Streaming Replication: Observations, Questions and Comments - Mailing list pgsql-general

From Samba
Subject Re: Streaming Replication: Observations, Questions and Comments
Msg-id CAKgWO9LMrmoDc=uPNOhJhZUmS7=PMeQzcJ58YOWSUSgz5PUxEw@mail.gmail.com
In response to Re: Streaming Replication: Observations, Questions and Comments  (Alan Hodgson <ahodgson@simkin.ca>)
List pgsql-general
The problem with maintaining a separate archive is that one needs to write additional scripts to periodically remove older log files from the archive, and that gets complicated in a setup with one master and multiple slaves.

I think it is a better idea to combine compression and cleanup in the core itself, maybe in a later release. A better approach to cleanup is for the walsender process to decide when to clean up a particular log file based on feedback from all the registered slaves. If a slave is not reachable or falls behind for too long, then that slave should be banned from the setup (log the event in pg_replication.log ???). The replication status for each slave can be maintained in something like a pg_slave_replica_status catalog table.
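
To make the cleanup rule concrete, here is a rough sketch in Python (not C, and not a patch). Everything in it is an assumption made for illustration: the SlaveStatus fields stand in for a row of the proposed status table, log segments are reduced to plain integers, and max_silence is a made-up threshold for "unreachable for too long".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SlaveStatus:
    # Hypothetical fields standing in for one row of the proposed status table.
    name: str
    oldest_needed_segment: int    # lowest log segment this slave still has to receive
    seconds_since_feedback: float


def plan_cleanup(segments: List[int], slaves: List[SlaveStatus],
                 max_silence: float) -> Tuple[List[int], List[str]]:
    """Return (segments safe to remove, names of slaves to ban)."""
    # A slave that has been silent for too long no longer holds segments back.
    banned = [s.name for s in slaves if s.seconds_since_feedback > max_silence]
    active = [s for s in slaves if s.seconds_since_feedback <= max_silence]
    if not active:
        # Conservative choice for this sketch: with no live slave, remove nothing.
        return [], banned
    # Only segments older than what the slowest live slave needs may go.
    horizon = min(s.oldest_needed_segment for s in active)
    removable = [seg for seg in segments if seg < horizon]
    return removable, banned
```

The point of the sketch is only that the slowest reachable slave decides what may be removed, while a slave that stops reporting is dropped from that calculation and its name is returned so that the event can be logged.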

When it comes to compression, walsender can compress each chunk of data that it streams (increasing the streaming_delay may improve the compression ratio, hence a balance has to be struck between compression and sustainable-data-loss-in-case-of-failure).
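
The trade-off can be illustrated with a small Python experiment, again only as a sketch: the chunk size stands in for how much data accumulates before it is sent, the payload is made-up repetitive text rather than real transaction log data, and zlib is just a convenient compressor for the demonstration.

```python
import zlib


def compressed_size(data: bytes, chunk_size: int) -> int:
    """Total bytes sent if every chunk is compressed independently."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += len(zlib.compress(data[start:start + chunk_size]))
    return total


def sample_stream(rows: int = 5000) -> bytes:
    # Made-up, repetitive payload standing in for a stream of log records.
    return b"".join(b"INSERT INTO t VALUES (%d);\n" % i for i in range(rows))
```

Comparing compressed_size(sample_stream(), 64) with compressed_size(sample_stream(), 8192) shows the larger chunks producing fewer total bytes, because each chunk carries its own compression overhead and cannot reuse patterns seen in earlier chunks. The cost is that more data is waiting on the master, unsent, at any given moment.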

Although I can see that this design would be much better than leaving it to external utilities, I'm not that good at C and hence am only proposing a design and not a patch. I hope my suggestion will be received in good spirit.

Thanks and Regards,
Samba

PS:
I wrongly stated that the master server had to be restarted in case of long disconnects; sorry, that was not true. But I still feel that requiring a restart of the standby server to resume replication should be avoided, if possible.

And I strongly feel that a breakage in replication must be logged by both the master server and the concerned slave servers.

------------------------------------------------------------------------
On Wed, Aug 24, 2011 at 11:03 PM, Alan Hodgson <ahodgson@simkin.ca> wrote:
On August 24, 2011 08:33:17 AM Samba wrote:
> One strange thing I noticed is that the pg_xlogs on the master have
> outsized the actual data stored in the database by at least 3-4 times,
> which was quite surprising. I'm not sure if 'restore_command' has anything
> to do with it. I did not understand why transaction logs would need to be
> so many times larger than the actual size of the database, have I done
> something wrong somewhere?

If you archive them instead of keeping them in pg_xlog, you can gzip them.
They compress reasonably well.
