Re: Sync Rep: First Thoughts on Code - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Sync Rep: First Thoughts on Code
Msg-id 3f0b79eb0812022238hdbd5172v6f8281e8016f5149@mail.gmail.com
In response to Re: Sync Rep: First Thoughts on Code  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Sync Rep: First Thoughts on Code
List pgsql-hackers
Hello,

On Tue, Dec 2, 2008 at 10:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > The reaction to replication_timeout may need to be configurable. I might
>> > not want to keep on processing if the information didn't reach the
>> > standby.
>>
>> OK. I will add new GUC variable (PGC_SIGHUP) to specify the reaction for
>> the timeout.
>>
>> > I would prefer in many cases that the transactions that were
>> > waiting for walsender would abort, but the walsender kept processing.
>>
>> Is it dangerous to abort the transaction with replication continued when
>> the timeout occurs? I think that the WAL consistency between two servers
>> might be broken. Because the WAL writing and sending are done concurrently,
>> and the backend might already write the WAL to disk on the primary when
>> waiting for walsender.
>
> The issue I see is that we might want to keep wal_sender_delay small so
> that transaction times are not increased. But we also want
> wal_sender_delay high so that replication never breaks.

Are you assuming only the asynchronous case? In the synchronous case,
walsender is woken by a signal from the backend, so we don't need to keep
the delay that small. Also, wal_sender_delay has nothing to do with
replication being terminated by mistake.
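
To illustrate the point about the signal, here is a toy sketch (plain
Python, not PostgreSQL code). The names WAL_SENDER_DELAY, wakeup and
walsender_wait_once are made up for the example, and a threading.Event
stands in for the backend's signal. It only shows that a sleeper woken
by a signal does not depend on the length of its timeout:

```python
import threading

# Hypothetical stand-in for wal_sender_delay, deliberately large (seconds).
WAL_SENDER_DELAY = 5.0

wakeup = threading.Event()   # plays the role of the signal sent by the backend
woken_by = []                # records why the simulated walsender woke up


def walsender_wait_once():
    # Event.wait() returns True if the event was set and False if the
    # timeout expired first.
    signalled = wakeup.wait(timeout=WAL_SENDER_DELAY)
    woken_by.append("signal" if signalled else "timeout")


sender = threading.Thread(target=walsender_wait_once)
sender.start()
wakeup.set()    # a backend waiting on its commit record wakes the walsender
sender.join()

print(woken_by[0])   # prints "signal"
```

The delay here only acts as an upper bound on how long the sender sleeps
when nobody signals it.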

> It seems better
> to have the action on wal_sender_delay configurable if we have an
> unsteady network (like the internet). Marcus made some comments on line
> dropping that seem relevant here; we should listen to his experience.

OK, I will look for his comments. Please let me know which thread they
are in, if you know.

>
> Hmmm, dangerous? Well assuming we're linking commits with replication
> sends then it sounds it. We might end up committing to disk and then
> deciding to abort instead. But remember we don't remove the xid from
> procarray or mark the result in clog until the flush is over, so it is
> possible. But I think we should discuss this in more detail when the
> main patch is committed.

If the transaction is aborted while the backend is waiting for replication,
the commit command returns a failure indication to the client. But the
commit record might already have been written on both the primary and the
standby. As you say, it may not be dangerous as long as the primary is
alive. But when we recover the failed primary, the transaction is marked
as committed in clog because of the commit record. Is that safe?

And, in that case, the transaction is treated as committed on the standby
and is visible to read-only queries there. On the other hand, it is
invisible on the primary. Isn't that dangerous?
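
A toy model of the scenario I am worried about (plain Python, not
PostgreSQL code). All names are hypothetical, the WAL is just a list of
records, and "clog" is just a dict. It assumes the commit record has
already been written locally and sent before the timeout fires:

```python
# Toy model only: every name here is hypothetical.

def replay(wal):
    """Rebuild commit status from WAL: a commit record marks the xid committed."""
    clog = {}
    for record_type, xid in wal:
        if record_type == "commit":
            clog[xid] = "committed"
    return clog


def commit_with_sync_rep(xid, primary_wal, standby_wal, primary_clog, standby_acked):
    """Write and send the commit record, then react to the standby's ack."""
    record = ("commit", xid)
    primary_wal.append(record)    # commit record is already on the primary's disk
    standby_wal.append(record)    # ... and may already have reached the standby
    if not standby_acked:         # the timeout expired while waiting
        primary_clog[xid] = "aborted"   # the "abort on timeout" reaction
        return False                    # the client is told the commit failed
    primary_clog[xid] = "committed"
    return True


primary_wal, standby_wal, primary_clog = [], [], {}
ok = commit_with_sync_rep(100, primary_wal, standby_wal, primary_clog,
                          standby_acked=False)

standby_clog = replay(standby_wal)     # standby applies the commit record
recovered_clog = replay(primary_wal)   # primary after crash recovery

print(ok, primary_clog[100], standby_clog[100], recovered_clog[100])
# prints: False aborted committed committed
```

So in this model the client sees a failure and the running primary treats
the transaction as aborted, while the standby and the recovered primary
both treat it as committed.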

>
>> > Do we need to worry about periodic
>> > renegotiation of keys in be-secure.c?
>>
>> What is "keys" you mean?
>
> See the notes in that file for explanation.

Thanks! I will check it.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

