Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit
Date
Msg-id CAPmGK14rr1R_t4xxWyJVng04nDjGnxi=J5fz-TBNBH9Xn0OFYQ@mail.gmail.com
In response to Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit
List pgsql-hackers
On Mon, Nov 8, 2021 at 1:13 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> On 2021/11/07 18:06, Etsuro Fujita wrote:
> > On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >> Could you tell me why the parameter is necessary?
> >> Can't we always enable the feature?

> > I think it might be OK to enable it by
> > default.  But my concern about doing so is the remote side: during
> > those functions, if there are a lot of (sub)transactions on a single
> > remote server that need to be committed, parallel commit would
> > increase the remote server’s load at (sub)transaction end more than
> > serial commit (the existing implementation) would, as the requests to
> > commit those (sub)transactions are sent to the remote server at the
> > same time, which some users might want to avoid.
>
> Thanks for explaining this! But probably I failed to get your point.
> Sorry... Whether we use the parallel or the serial commit approach,
> the number of transactions to commit on the remote server is the
> same, isn't it? For example, imagine the case where a client sends
> ten transactions per second to the local server. Each transaction
> accesses the foreign table, which means that ten transaction commit
> operations per second are requested to the remote server. Unless I'm
> missing something, this number doesn't change whether the foreign
> transactions are committed in parallel or not.

Sorry, my explanation was not sufficient; I don’t think this is always
true.  Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts.  This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid load spikes on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
feature is enabled.  But I might be worrying too much about this, so I
would like to hear other people’s opinions.

> IMO it's better to implement and commit these features gradually
> if possible, which would simplify the patch and make it easier
> to review. So I think that it's better to implement this feature
> as a separate patch.

Ok, I'll create a patch for abort cleanup separately.

> >> +       /* Consume whatever data is available from the socket */
> >> +       if (!PQconsumeInput(conn))
> >> +               pgfdw_report_error(ERROR, NULL, conn, false, sql);
> >>
> >> Without the patch, PQconsumeInput() is not called before pgfdw_get_result().
> >> But could you tell me why you added PQconsumeInput() there?
> >
> > The reason is that the result might already be available before
> > calling pgfdw_get_result(), in which case PQconsumeInput() followed
> > by PQisBusy() would allow us to call PQgetResult() without doing
> > WaitLatchOrSocket(), which I think is rather expensive.
>
> Understood. It would be helpful to add a comment explaining why
> PQconsumeInput() is called there.

Ok.

> Also, could you tell me how expensive it is?

IIUC, the overhead of WaitLatchOrSocket(), which incurs a series of
epoll system calls, is much larger than the overhead of
PQconsumeInput(), which incurs a single recv system call in
non-blocking mode when no data is available.  I didn’t do any testing,
though.

Actually, we already use this optimization in libpqrcv_receive(), so
that its caller can avoid doing WaitLatchOrSocket().
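
To make the idea concrete, here is a rough sketch of that fast path.
This is not the actual patch; the function name is made up for
illustration, and it assumes postgres_fdw’s existing
pgfdw_report_error() and the backend latch machinery are in scope:

#include "postgres.h"

#include "libpq-fe.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/latch.h"

/* Hypothetical illustration, not the actual patch. */
static PGresult *
pgfdw_get_result_fast_path(PGconn *conn, const char *query)
{
    /* Consume whatever data is already available from the socket. */
    if (!PQconsumeInput(conn))
        pgfdw_report_error(ERROR, NULL, conn, false, query);

    /*
     * If the complete result has already arrived, PQisBusy() returns
     * false and we can fetch it right away, skipping the latch/epoll
     * round trip in the loop below.
     */
    while (PQisBusy(conn))
    {
        int         wc;

        /* Result not complete yet; sleep until the socket is readable. */
        wc = WaitLatchOrSocket(MyLatch,
                               WL_LATCH_SET | WL_SOCKET_READABLE |
                               WL_EXIT_ON_PM_DEATH,
                               PQsocket(conn),
                               -1L, PG_WAIT_EXTENSION);
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();

        if (wc & WL_SOCKET_READABLE)
        {
            if (!PQconsumeInput(conn))
                pgfdw_report_error(ERROR, NULL, conn, false, query);
        }
    }

    /* Simplified: a real caller keeps calling PQgetResult() until NULL. */
    return PQgetResult(conn);
}

The point is the PQconsumeInput()/PQisBusy() pair before the loop: when
the remote server has already replied, which is likely after the commit
commands were sent out earlier in parallel, the wait primitive is never
entered.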

> >> When ignore_errors argument is true, the error reported by
> >> PQconsumeInput() should be ignored?
> >
> > I’m not sure about that, because the error might be caused by, e.g.,
> > OOM in the local server, in which case I don’t think it is safe to
> > ignore it and continue the (sub)transaction-end processing.
>
> But the existing code ignores the error entirely, doesn't it?
> If it's unsafe to do that, probably we should fix that first?

I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field complaining that we ignore all kinds of errors when cleaning up
the prepared statements, so maybe we don’t need to change that, and
2) we have already committed at least one of the remote transactions,
so it’s not good to abort the local transaction unless we really have
to.
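
For reference, here is a minimal sketch of what ignoring the error at
that call site could look like.  This is hypothetical (the updated
patch may do it differently) and assumes the surrounding function has
an ignore_errors flag, as the existing cleanup functions do:

    /* Consume whatever data is available from the socket. */
    if (!PQconsumeInput(conn))
    {
        /*
         * During (sub)transaction-end cleanup, demote the failure to a
         * WARNING and carry on, as we already do for other errors
         * there; a real version would presumably also stop waiting for
         * the result on this connection.
         */
        if (ignore_errors)
            pgfdw_report_error(WARNING, NULL, conn, false, sql);
        else
            pgfdw_report_error(ERROR, NULL, conn, false, sql);
    }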

Thanks!

Best regards,
Etsuro Fujita


