Re: Synchronization levels in SR - Mailing list pgsql-hackers

From	Boszormenyi Zoltan
Subject	Re: Synchronization levels in SR
Date	September 3, 2010 09:57:47
Msg-id	4C80F0BD.1080109@cybertec.at
In response to	Re: Synchronization levels in SR (Fujii Masao <masao.fujii@gmail.com>)
List	pgsql-hackers

Fujii Masao wrote:
> On Fri, Sep 3, 2010 at 6:43 PM, Boszormenyi Zoltan <zb@cybertec.at> wrote:
>   
>> In my patch, when the transactions were waiting for ack from
>> the standby, they have already released all their locks, the wait
>> happened at the latest possible point in CommitTransaction().
>>
>> In Fujii's patch (I am looking at synch_rep_0722.patch, is there
>> a newer one?)
>>     
>
> No ;)
>
> We'll have to create the patch based on the result of the recent
> discussion held on other thread.
>
>   
>> the wait happens in RecordTransactionCommit()
>> so other transactions still see the sync transaction and most
>> importantly, the locks held by the sync transaction will make
>> the async  transactions waiting for the same lock wait too.
>>     
>
> The transaction should be invisible to other transactions until
> its replication has been completed.

Invisible? How can it be invisible? You are in RecordTransactionCommit(),
even before calling ProcArrayEndTransaction(MyProc, latestXid) and
releasing the locks the transaction holds.

>  So I put the wait before
> CommitTransaction() calls ProcArrayEndTransaction(). Is this unsafe?
>   

I don't know whether it's unsafe. In my patch, I only registered the Xid
at the point where you call WaitXLogSend(); this was the safe point
to set up the wait for the sync ack. When the Xid registration for the
sync ack was instead done in CommitTransaction(), later than
RecordTransactionCommit(), there was a race between the primary and
the standby. The scenario was that the standby received and processed
the COMMIT of a certain Xid before the backend on the primary had
properly registered that Xid, so the backend set up its wait for the
sync ack only after the Xid had already been acked by the standby.
The result was stuck backends.
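
To make the ordering problem concrete, here is a toy model of it. This
is only a sketch, not code from either patch: the Primary class, its
methods and the Xid value 100 are hypothetical, and it assumes that an
ack for an Xid that is not yet registered is simply dropped.

```python
class Primary:
    """Toy bookkeeping for sync acks on the primary (hypothetical model)."""

    def __init__(self):
        # Xids whose backends are registered as waiting for a sync ack.
        self.waiting = set()

    def register(self, xid):
        self.waiting.add(xid)

    def ack_from_standby(self, xid):
        # Assumption: an ack only clears an existing registration;
        # an ack for an unregistered Xid has nothing to clear and is lost.
        self.waiting.discard(xid)

    def is_stuck(self, xid):
        # Still registered after all events: nobody will ever wake it up.
        return xid in self.waiting


def run(order, xid=100):
    """Replay the events in the given order and report whether xid is stuck."""
    primary = Primary()
    for step in order:
        if step == "register":
            primary.register(xid)
        elif step == "ack":
            primary.ack_from_standby(xid)
    return primary.is_stuck(xid)


late = run(["ack", "register"])    # registration happens after the ack
early = run(["register", "ack"])   # registration happens before the ack
print("late registration stuck:", late)
print("early registration stuck:", early)
```

With the late registration the ack has already come and gone, so the
backend in this model waits forever; with the early registration the
ack clears the entry.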

My idea of splitting the registration for the wait from the wait itself
would allow a transaction-level synchronous setting. In my patch the
transaction released its locks and did all the post-commit cleanup,
and only *then* waited for the sync ack if needed. In the meantime,
because the locks were already released, other transactions could
get on with their work, so that e.g. async transactions could
proceed and theoretically finish sooner than the sync transaction
that was still waiting for the ack.

The solution in my patch was not racy: registration of the Xid
was done before XLogInsert() in RecordTransactionCommit().
If the standby acked the Xid to the primary before the backend reached
the end of CommitTransaction(), then the backend didn't even need
to wait, because the Xid was found in its PGPROC structure
and the wait for the sync ack was torn down.
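
The register-first pattern can be sketched roughly like this. Again
this is an illustration and not the patch itself: the pending dict
stands in for the per-backend state in PGPROC, and the function names
and the Xid value 42 are made up.

```python
import threading

# xid -> Event; a stand-in for the per-backend flag kept in PGPROC.
pending = {}


def register_for_ack(xid):
    # Done before the commit record is written, so an ack can never
    # arrive for an Xid that is not registered yet.
    pending[xid] = threading.Event()


def standby_ack(xid):
    # Called when the standby acks the Xid; marks the registration done.
    event = pending.get(xid)
    if event is not None:
        event.set()


def wait_for_ack(xid, timeout):
    # Done at the very end of commit, after locks are released.
    # Returns True if the ack arrived, without blocking if it is
    # already there.
    event = pending.pop(xid)
    return event.wait(timeout)


register_for_ack(42)                  # before writing the commit record
standby_ack(42)                       # ack arrives during post-commit cleanup
acked = wait_for_ack(42, timeout=0)   # nothing left to wait for
print("acked without waiting:", acked)
```

Because the registration exists before the ack can possibly arrive,
the final wait either finds the ack already recorded or blocks until
it comes; it cannot miss it.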

But with LSNs, you are waiting for XactLastRecEnd, which is set by
XLogInsert(), and I don't know if it's safe to call
WaitXLogSend() after XLogFlush() in RecordTransactionCommit().
I remember that in previous versions of my patch, even if I
put the wait for the sync ack directly after
    latestXid = RecordTransactionCommit();
in CommitTransaction(), there were cases when I got stuck
backends after a pgbench run. I had the primary and the standbys
on the same machine on different ports, so the ack was almost
instant, which wouldn't be the case with a real network. But the
race condition is still there; it just doesn't show up when the
network is slower than memory. In your patch, the wait happens
almost at the end of RecordTransactionCommit(), so theoretically
it has the same race condition. Am I missing something?

Best regards,
Zoltán Böszörményi

> Regards,
>
>