Re: Transactions involving multiple postgres foreign servers, take 2 - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Transactions involving multiple postgres foreign servers, take 2
Date
Msg-id CAExHW5uDP9z-td+vsCJcC5gGxf1WsZRaWsXNmo+XaJNRgP1f9A@mail.gmail.com
In response to RE: Transactions involving multiple postgres foreign servers, take 2  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: Transactions involving multiple postgres foreign servers, take 2  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
List pgsql-hackers
On Wed, Sep 23, 2020 at 2:13 AM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
> > parallelism here has both pros and cons. If one of the servers errors
> > out while preparing for a transaction, there is no point in preparing
> > the transaction on other servers. In parallel execution we will
> > prepare on multiple servers before realising that one of them has
> > failed to do so. On the other hand preparing on multiple servers in
> > parallel provides a speed up.
>
> And pros are dominant in practice.  If many transactions are erroring out
> (during prepare), the system is not functioning for the user.  Such an
> application should be corrected before they are put into production.
>
>
> > But this can be an improvement on version 1. The current approach
> > doesn't render such an improvement impossible. So if that's something
> > hard to do, we should do that in the next version rather than
> > complicating this patch.
>
> Could you share your idea on how the current approach could enable
> parallelism?  This is an important point, because (1) the FDW may not lead
> us to a seriously competitive scale-out DBMS, and (2) a better FDW API
> and/or implementation could be considered for non-parallel interaction if
> we have the realization of parallelism in mind.  I think that kind of
> consideration is the design (for the future).
>

The way I am looking at it is to put the parallelism in the resolution
workers and not in the FDW. If we use multiple resolution workers, they
can fire commit/abort on multiple foreign servers at a time.
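
To make the idea concrete, here is a minimal model of it, not PostgreSQL
code. Threads stand in for resolution workers (which would be separate
processes in the real thing), and resolve_prepared_sync() is a hypothetical
stand-in for a blocking FDW call. The FDW side stays synchronous; the
concurrency comes only from running several workers.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_prepared_sync(server, gid, commit):
    # Hypothetical blocking call: resolves ONE prepared transaction on ONE
    # foreign server and returns only when that server has answered.
    action = "COMMIT PREPARED" if commit else "ROLLBACK PREPARED"
    return f"{server}: {action} '{gid}'"


def resolve_with_workers(pending, num_workers):
    """pending is a list of (server, gid, commit) tuples."""
    # Each worker handles one entry at a time, synchronously. Several
    # workers running side by side give the parallelism.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(resolve_prepared_sync, *entry) for entry in pending]
        return [f.result() for f in futures]


pending = [("srv_a", "fx_1", True), ("srv_b", "fx_1", True)]
results = resolve_with_workers(pending, num_workers=2)
```

With num_workers=1 the same code degrades to one-by-one resolution, which
is why this route needs no change on the FDW side.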

But if we want parallelism within a single resolution worker, we will
need separate FDW APIs for firing asynchronous commit/abort of prepared
transactions and for fetching their results. But given the variety of
FDWs, not all of them will support an asynchronous API, so we have to
support the synchronous API anyway, which is what can be targeted in the
first version.
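
A rough sketch of what such a pair of calls could look like from the
worker's side, again as a model and not PostgreSQL code. The names
resolve_sync, resolve_async_start and resolve_async_result are made up for
illustration, and a thread pool merely simulates a non-blocking send. The
point is the shape: one call to fire, one to fetch, and a synchronous
fallback for FDWs that offer neither.

```python
from concurrent.futures import ThreadPoolExecutor

# Simulates "request is in flight"; a real FDW would use its own
# non-blocking client interface instead (assumption for this sketch).
_pool = ThreadPoolExecutor(max_workers=4)


def resolve_sync(server, gid, commit):
    # Hypothetical blocking resolution of one prepared transaction.
    return (server, gid, "committed" if commit else "aborted")


def resolve_async_start(server, gid, commit):
    # Fire the commit/abort and return a handle without waiting.
    return _pool.submit(resolve_sync, server, gid, commit)


def resolve_async_result(handle):
    # Wait for, and return, the outcome of a previously fired request.
    return handle.result()


def resolve_all(pending, fdw_supports_async):
    """Single worker: overlap the requests if the FDW allows it."""
    if fdw_supports_async:
        handles = [resolve_async_start(*entry) for entry in pending]
        return [resolve_async_result(h) for h in handles]
    # Fallback that every FDW has to support anyway.
    return [resolve_sync(*entry) for entry in pending]


pending = [("srv_a", "fx_1", True), ("srv_b", "fx_1", False)]
async_outcome = resolve_all(pending, fdw_supports_async=True)
sync_outcome = resolve_all(pending, fdw_supports_async=False)
```

Both paths return the same outcomes in the same order; only the waiting
time differs.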

Thinking more about it, the core may support an API which accepts a
list of prepared transactions, their foreign servers and user mappings
and lets the FDW resolve all of those either in parallel or one by one.
So parallelism is the responsibility of the FDW and not the core. But
then we lose parallelism across FDWs, which may not be a common case.
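
Sketched as a model (not PostgreSQL code, all names hypothetical, user
mappings left out for brevity), the batch variant would look roughly like
this: the core groups pending entries by FDW and hands each FDW its whole
list, and the FDW decides how to work through it.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def resolve_one(entry):
    # Hypothetical stand-in for one commit/abort round trip.
    server, gid, commit = entry
    return (server, gid, "committed" if commit else "aborted")


class SerialFdw:
    """An FDW that resolves its batch one entry at a time."""

    def resolve_batch(self, entries):
        return [resolve_one(e) for e in entries]


class ParallelFdw:
    """An FDW that chooses to resolve its batch concurrently."""

    def resolve_batch(self, entries):
        if not entries:
            return []
        with ThreadPoolExecutor(max_workers=len(entries)) as pool:
            return list(pool.map(resolve_one, entries))


def core_resolve(pending, fdw_of_server):
    # The core only groups entries and hands each FDW its whole batch.
    by_fdw = defaultdict(list)
    for entry in pending:
        by_fdw[fdw_of_server[entry[0]]].append(entry)
    results = []
    # FDWs are visited one after another, so nothing overlaps ACROSS FDWs.
    for fdw, entries in by_fdw.items():
        results.extend(fdw.resolve_batch(entries))
    return results


pg, other = ParallelFdw(), SerialFdw()
fdw_of_server = {"pg1": pg, "pg2": pg, "file1": other}
pending = [("pg1", "fx_7", True), ("file1", "fx_7", True), ("pg2", "fx_7", True)]
outcome = core_resolve(pending, fdw_of_server)
```

The loop over FDWs in core_resolve() is where the cross-FDW parallelism is
lost: pg1 and pg2 can overlap, but file1 waits until both are done.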

Given the complications around this, I think we should go ahead with
supporting the synchronous API first and introduce an optional
asynchronous API in the second version.

--
Best Wishes,
Ashutosh Bapat


