Re: Transactions involving multiple postgres foreign servers - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Transactions involving multiple postgres foreign servers
Date
Msg-id CAD21AoDxrW6HoGgJ62_Vkjd5ioqhZS5em57_TJ=Yz9X8+VSRnQ@mail.gmail.com
Whole thread Raw
In response to Re: Transactions involving multiple postgres foreign servers  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Transactions involving multiple postgres foreign servers  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Fri, Oct 7, 2016 at 4:25 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, Oct 6, 2016 at 2:52 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2016/10/06 17:45, Ashutosh Bapat wrote:
>>> On Thu, Oct 6, 2016 at 1:34 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>> On Thu, Oct 6, 2016 at 1:41 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
>>>>>> My understanding is that basically the local server can not return
>>>>>> COMMIT to the client until 2nd phase is completed.
>>>>>
>>>>> If we do that, the local server may not return to the client at all,
>>>>> if the foreign server crashes and never comes up. Practically, it may
>>>>> take much longer to finish a COMMIT, depending upon how long it takes
>>>>> for the foreign server to reply to a COMMIT message.
>>>>
>>>> Yes, I think 2PC behaves so, please refer to [1].
>>>> To prevent local server stops forever due to communication failure.,
>>>> we could provide the timeout on coordinator side or on participant
>>>> side.
>>>
>>> This too, looks like a heuristic and shouldn't be the default
>>> behaviour and hence not part of the first version of this feature.
>>
>> At any rate, the coordinator should not return to the client until after
>> the 2nd phase is completed, which was the original point.  If COMMIT
>> taking longer is an issue, then it could be handled with one of the
>> approaches mentioned so far (even if not in the first version), but no
>> version of this feature should really return COMMIT to the client only
>> after finishing the first phase.  Am I missing something?
>
> There is small time window between actual COMMIT and a commit message
> returned. An actual commit happens when we insert a WAL saying
> transaction X committed and then we return to the client saying a
> COMMIT happened. Note that a transaction may be committed but we will
> never return to the client with a commit message, because connection
> was lost or the server crashed. I hope we agree on this.

Agree.

> COMMITTING the foreign prepared transactions happens after we COMMIT
> the local transaction. If we do it before COMMITTING local transaction
> and the local server crashes, we will roll back local transaction
> during subsequence recovery while the foreign segments have committed
> resulting in an inconsistent state.
>
> If we are successful in COMMITTING foreign transactions during
> post-commit phase, COMMIT message will be returned after we have
> committed all foreign transactions. But in case we can not reach a
> foreign server, and request times out, we can not revert back our
> decision that we are going to commit the transaction. That's my answer
> to the timeout based heuristic.

IIUC 2PC is the protocol that assumes that all of the foreign server live.
In case we can not reach a foreign server during post-commit phase,
basically the transaction and following transaction should stop until
the crashed server revived. This is the first place to implement 2PC
for FDW, I think. The heuristically determination approach I mentioned
is one of the optimization idea to avoid holding up transaction in
case a foreign server crashed.

> I don't see much point in holding up post-commit processing for a
> non-responsive foreign server, which may not respond for days
> together. Can you please elaborate a use case? Which commercial
> transaction manager does that?

For example, the client updates a data on foreign server and then
commits. And the next transaction from the same client selects new
data which was updated on previous transaction. In this case, because
the first transaction is committed the second transaction should be
able to see updated data, but it can see old data in your idea. Since
these is obviously order between first transaction and second
transaction I think that It's not problem of providing consistent
view.

I guess transaction manager of Postgres-XC behaves so, no?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Victor Wagner
Date:
Subject: Re: Patch: Implement failover on libpq connect level.
Next
From: Ashutosh Bapat
Date:
Subject: Re: Transactions involving multiple postgres foreign servers