Re: Synchronous replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Synchronous replication
Date
Msg-id AANLkTikCdC2IJeh5fGHYvhmAcLOfTF31GfQRjSMBlaVl@mail.gmail.com
In response to Re: Synchronous replication  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Synchronous replication
List pgsql-hackers
On Thu, Jul 15, 2010 at 12:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jul 14, 2010 at 2:50 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> The patch have no features for performance improvement of synchronous
>> replication. I admit that currently the performance overhead in the
>> master is terrible. We need to address the following TODO items in the
>> subsequent CF.
>>
>> * Change the poll loop in the walsender
>> * Change the poll loop in the backend
>> * Change the poll loop in the startup process
>> * Change the poll loop in the walreceiver
>> * Perform the WAL write and replication concurrently
>> * Send WAL from not only disk but also WAL buffers
>
> I have a feeling that if we don't have a design for these last two
> before we start committing things, we're possibly going to regret it
> later.

Yeah, I'll give it a try.

The problem is that the standby could apply WAL that has not yet been
fsync'd on the master. So if we allow walsender to send non-fsync'd WAL,
we should make walsender also send the current fsync location, and
prevent the standby from applying any WAL newer than that location.
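
To make the rule concrete, here is a rough sketch of the bookkeeping the
standby would need. It is not taken from the patch: the class and method
names are hypothetical, and WAL locations are modeled as plain integers
rather than XLogRecPtr values.

```python
from dataclasses import dataclass


@dataclass
class StandbyState:
    """Hypothetical standby-side bookkeeping; locations are plain integers."""
    received_lsn: int = 0      # end of WAL received from walsender so far
    master_fsync_lsn: int = 0  # latest fsync location reported by the master
    applied_lsn: int = 0       # end of WAL already replayed

    def on_wal_data(self, end_lsn: int) -> None:
        # WAL may arrive before the master has fsync'd it.
        self.received_lsn = max(self.received_lsn, end_lsn)

    def on_fsync_location(self, lsn: int) -> None:
        # The reported fsync location only ever moves forward.
        self.master_fsync_lsn = max(self.master_fsync_lsn, lsn)

    def apply_limit(self) -> int:
        # Replay must not pass what was received or what the master fsync'd.
        return min(self.received_lsn, self.master_fsync_lsn)

    def apply_available(self) -> int:
        # Advance replay up to the limit and return the new applied location.
        limit = self.apply_limit()
        if limit > self.applied_lsn:
            self.applied_lsn = limit
        return self.applied_lsn
```

The point is only that replay is capped by the smaller of the two
locations, so received but not yet fsync'd WAL stays unapplied until the
master reports that it is durable.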

A new message type for sending the fsync location would be required in
the Streaming Replication Protocol. But sometimes the location might be
sent along with the XLogData message.
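
As an illustration of the two options (a standalone message, or the
location carried together with the WAL data), here is a sketch of a
possible encoding. The type bytes 'f' and 'd' and the field layout are
made up for this example; this is not the actual XLogData format.

```python
import struct

# Hypothetical message type bytes, chosen only for this sketch.
MSG_FSYNC_LOCATION = b"f"
MSG_WAL_DATA = b"d"


def encode_fsync_location(fsync_lsn: int) -> bytes:
    """Standalone message carrying only the master's fsync location."""
    return MSG_FSYNC_LOCATION + struct.pack("!Q", fsync_lsn)


def encode_wal_data(start_lsn: int, fsync_lsn: int, payload: bytes) -> bytes:
    """WAL data message that also carries the fsync location."""
    return MSG_WAL_DATA + struct.pack("!QQ", start_lsn, fsync_lsn) + payload


def decode(msg: bytes) -> dict:
    """Decode either hypothetical message into a dict."""
    kind = msg[:1]
    if kind == MSG_FSYNC_LOCATION:
        (fsync_lsn,) = struct.unpack("!Q", msg[1:9])
        return {"type": "fsync", "fsync_lsn": fsync_lsn}
    if kind == MSG_WAL_DATA:
        start_lsn, fsync_lsn = struct.unpack("!QQ", msg[1:17])
        return {
            "type": "wal",
            "start_lsn": start_lsn,
            "fsync_lsn": fsync_lsn,
            "payload": msg[17:],
        }
    raise ValueError("unknown message type: %r" % kind)
```

The standalone form would be useful when the master fsyncs without
having new WAL to send; the combined form avoids an extra message in the
common case.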

After the master crashes and walreceiver is terminated, the standby
currently attempts to replay the WAL in pg_xlog and the archive.
Since WAL in the archive is guaranteed to have already been fsync'd by
the master, it's not a problem for the standby to apply that WAL. OTOH,
WAL records in the standby's pg_xlog directory might not exist on the
crashed master. So we should always prevent the standby from applying
any WAL in pg_xlog unless walreceiver is running. That is, if there is
no WAL available in the archive, the standby ignores pg_xlog and starts
the walreceiver process to request WAL streaming.
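
The decision described above could be sketched as follows. The enum and
function names are hypothetical and do not correspond to anything in the
startup process code; the sketch only encodes the ordering of the checks.

```python
from enum import Enum


class RecoveryAction(Enum):
    """Hypothetical actions for the standby's startup process."""
    RESTORE_FROM_ARCHIVE = "restore_from_archive"
    READ_PG_XLOG = "read_pg_xlog"
    START_WALRECEIVER = "start_walreceiver"


def next_recovery_action(archive_has_wal: bool,
                         walreceiver_running: bool) -> RecoveryAction:
    # Archived WAL was already fsync'd by the master, so it is always safe.
    if archive_has_wal:
        return RecoveryAction.RESTORE_FROM_ARCHIVE
    # pg_xlog may be read only while walreceiver is running.
    if walreceiver_running:
        return RecoveryAction.READ_PG_XLOG
    # Otherwise ignore leftover pg_xlog contents and request streaming.
    return RecoveryAction.START_WALRECEIVER
```

Note that leftover WAL in pg_xlog is never consulted on its own: with no
archived WAL and no walreceiver, the only permitted step is to start
streaming again.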

This idea is a little inefficient because already-sent WAL might
be sent again when the master is restarted. But since it ensures
that the standby never applies WAL that was not fsync'd on the master,
it's quite safe.

What about this idea?

This idea doesn't conflict with the patch I submitted for CF 2010-07.
So please feel free to review the patch :) But if you think that the
patch is not reviewable until that idea has been implemented, I'll
try to implement that ASAP.

PS. I probably won't be able to reply to mail until July 21. Sorry.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

