Re: Global snapshots - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Global snapshots |
Date | |
Msg-id | 057004b6-68bf-98ee-7c57-91c418f6e221@oss.nttdata.com Whole thread Raw |
In response to | Re: Global snapshots (Alexey Kondratov <a.kondratov@postgrespro.ru>) |
Responses |
Re: Global snapshots
|
List | pgsql-hackers |
On 2020/09/05 3:31, Alexey Kondratov wrote: > Hi, > > On 2020-07-27 09:44, Andrey V. Lepikhov wrote: >> On 7/27/20 11:22 AM, tsunakawa.takay@fujitsu.com wrote: >>> >>> US8356007B2 - Distributed transaction management for database systems with multiversioning - Google Patents >>> https://patents.google.com/patent/US8356007 >>> >>> >>> If it is, can we circumvent this patent? >>> >> >> Thank you for the research (and previous links too). >> I haven't seen this patent before. This should be carefully studied. > > I had a look on the patch set, although it is quite outdated, especially on 0003. > > Two thoughts about 0003: > > First, IIUC atomicity of the distributed transaction in the postgres_fdw is achieved by the usage of 2PC. I think thatthis postgres_fdw 2PC support should be separated from global snapshots. Agreed. > It could be useful to have such atomic distributed transactions even without a proper visibility, which is guaranteed bythe global snapshot. Especially taking into account the doubts about Clock-SI and general questions about algorithm choosingcriteria above in the thread. > > Thus, I propose to split 0003 into two parts and add a separate GUC 'postgres_fdw.use_twophase', which could be turnedon independently from 'postgres_fdw.use_global_snapshots'. Of course if the latter is enabled, then 2PC should be forcedlyturned on as well. > > Second, there are some problems with errors handling in the 0003 (thanks to Arseny Sher for review). > > +error: > + if (!res) > + { > + sql = psprintf("ABORT PREPARED '%s'", fdwTransState->gid); > + BroadcastCmd(sql); > + elog(ERROR, "Failed to PREPARE transaction on remote node"); > + } > > It seems that we should never reach this point, just because BroadcastStmt will throw an ERROR if it fails to prepare transactionon the foreign server: > > + if (PQresultStatus(result) != expectedStatus || > + (handler && !handler(result, arg))) > + { > + elog(WARNING, "Failed command %s: status=%d, expected status=%d", sql, PQresultStatus(result), expectedStatus); > + pgfdw_report_error(ERROR, result, entry->conn, true, sql); > + allOk = false; > + } > > Moreover, It doesn't make much sense to try to abort prepared xacts, since if we failed to prepare it somewhere, then someforeign servers may become unavailable already and this doesn't provide us a 100% guarantee of clean up. > > + /* COMMIT open transaction of we were doing 2PC */ > + if (fdwTransState->two_phase_commit && > + (event == XACT_EVENT_PARALLEL_COMMIT || event == XACT_EVENT_COMMIT)) > + { > + BroadcastCmd(psprintf("COMMIT PREPARED '%s'", fdwTransState->gid)); > + } > > At this point, the host (local) transaction is already committed and there is no way to abort it gracefully. However, BroadcastCmdmay rise an ERROR that will cause a PANIC, since it is non-recoverable state: > > PANIC: cannot abort transaction 487, it was already committed > > Attached is a patch, which implements a plain 2PC in the postgres_fdw and adds a GUC 'postgres_fdw.use_twophase'. Alsoit solves these errors handling issues above and tries to add proper comments everywhere. I think, that 0003 should berebased on the top of it, or it could be a first patch in the set, since it may be used independently. What do you think? Thanks for the patch! Sawada-san was proposing another 2PC patch at [1]. Do you have any thoughts about pros and cons between your patch and Sawada-san's? Regards, [1] https://www.postgresql.org/message-id/CA+fd4k4z6_B1ETEvQamwQhu4RX7XsrN5ORL7OhJ4B5B6sW-RgQ@mail.gmail.com -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
pgsql-hackers by date: