Re: Global snapshots - Mailing list pgsql-hackers

From Alexey Kondratov
Subject Re: Global snapshots
Date
Msg-id 3ef7877bfed0582019eab3d462a43275@postgrespro.ru
Whole thread Raw
In response to Re: Global snapshots  ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
Responses Re: Global snapshots  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
Hi,

On 2020-07-27 09:44, Andrey V. Lepikhov wrote:
> On 7/27/20 11:22 AM, tsunakawa.takay@fujitsu.com wrote:
>> 
>> US8356007B2 - Distributed transaction management for database systems 
>> with multiversioning - Google Patents
>> https://patents.google.com/patent/US8356007
>> 
>> 
>> If it is, can we circumvent this patent?
>> 
> 
> Thank you for the research (and previous links too).
> I haven't seen this patent before. This should be carefully studied.

I had a look on the patch set, although it is quite outdated, especially 
on 0003.

Two thoughts about 0003:

First, IIUC atomicity of the distributed transaction in the postgres_fdw 
is achieved by the usage of 2PC. I think that this postgres_fdw 2PC 
support should be separated from global snapshots. It could be useful to 
have such atomic distributed transactions even without a proper 
visibility, which is guaranteed by the global snapshot. Especially 
taking into account the doubts about Clock-SI and general questions 
about algorithm choosing criteria above in the thread.

Thus, I propose to split 0003 into two parts and add a separate GUC 
'postgres_fdw.use_twophase', which could be turned on independently from 
'postgres_fdw.use_global_snapshots'. Of course if the latter is enabled, 
then 2PC should be forcedly turned on as well.

Second, there are some problems with errors handling in the 0003 (thanks 
to Arseny Sher for review).

+error:
+            if (!res)
+            {
+                sql = psprintf("ABORT PREPARED '%s'", fdwTransState->gid);
+                BroadcastCmd(sql);
+                elog(ERROR, "Failed to PREPARE transaction on remote node");
+            }

It seems that we should never reach this point, just because 
BroadcastStmt will throw an ERROR if it fails to prepare transaction on 
the foreign server:

+            if (PQresultStatus(result) != expectedStatus ||
+                (handler && !handler(result, arg)))
+            {
+                elog(WARNING, "Failed command %s: status=%d, expected status=%d", 
sql, PQresultStatus(result), expectedStatus);
+                pgfdw_report_error(ERROR, result, entry->conn, true, sql);
+                allOk = false;
+            }

Moreover, It doesn't make much sense to try to abort prepared xacts, 
since if we failed to prepare it somewhere, then some foreign servers 
may become unavailable already and this doesn't provide us a 100% 
guarantee of clean up.

+    /* COMMIT open transaction of we were doing 2PC */
+    if (fdwTransState->two_phase_commit &&
+        (event == XACT_EVENT_PARALLEL_COMMIT || event == XACT_EVENT_COMMIT))
+    {
+        BroadcastCmd(psprintf("COMMIT PREPARED '%s'", fdwTransState->gid));
+    }

At this point, the host (local) transaction is already committed and 
there is no way to abort it gracefully. However, BroadcastCmd may rise 
an ERROR that will cause a PANIC, since it is non-recoverable state:

PANIC:  cannot abort transaction 487, it was already committed

Attached is a patch, which implements a plain 2PC in the postgres_fdw 
and adds a GUC 'postgres_fdw.use_twophase'. Also it solves these errors 
handling issues above and tries to add proper comments everywhere. I 
think, that 0003 should be rebased on the top of it, or it could be a 
first patch in the set, since it may be used independently. What do you 
think?


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company
Attachment

pgsql-hackers by date:

Previous
From: "Hsu, John"
Date:
Subject: Improve pg_dump dumping publication tables
Next
From: Jeff Davis
Date:
Subject: Re: Disk-based hash aggregate's cost model