Re: [HACKERS] Optimization for updating foreign tables in PostgresFDW - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] Optimization for updating foreign tables in PostgresFDW
Date
Msg-id 20170418.130022.222486079.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
At Tue, 18 Apr 2017 09:12:07 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRcRjohUZaCNon1zkP7c6fo2qd5hnfHCT6t_39SGtS_oKQ@mail.gmail.com>
> On Tue, Apr 18, 2017 at 8:12 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g@mail.gmail.com>
> >> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI
> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoaxnNmuONgP=bXJojrgbnMPTi6Ms8OSwZBC2YQ2ueUiSg@mail.gmail.com>
> >> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier
> >> >> > <michael.paquier@gmail.com> wrote:
> >> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita
> >> >> >> <fujita.etsuro@lab.ntt.co.jp> wrote:
> >> >> >>> Attached is an updated version of the patch, which modified Michael's
> >> >> >>> version of the patch, as I proposed in [1] (see "Other changes:").  I
> >> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because
> >> >> >>> words like "non-blocking mode" there seems confusing (note that we have
> >> >> >>> PQsetnonbloking).
> >> >> >>
> >> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY
> >> >> >> are moved to their correct places. That's fine for me. Thanks for
> >> >> >> gathering everything in a single patch and correcting it.
> >> >> >
> >> >> > I have committed this patch.  Thanks for working on this.  Sorry for the delay.
> >> >>
> >> >> This 9.6-era patch, as it turns out, has a problem, which is that we
> >> >> now respond to an interrupt by sending a cancel request and a
> >> >> NON-interruptible ABORT TRANSACTION command to the remote side.  If
> >> >> the reason that the user is trying to interrupt is that the network
> >> >> connection has been cut, they interrupt the original query only to get
> >> >> stuck in a non-interruptible wait for ABORT TRANSACTION.  That is
> >> >> non-optimal.
> >> >
> >> > Agreed.
> >> >
> >> >> It is not exactly clear to me how to fix this.  Could we get by with
> >> >> just slamming the remote connection shut, instead of sending an
> >> >> explicit ABORT TRANSACTION?  The remote side ought to treat a
> >> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our
> >> >> local state would get confused.  (I have not checked.)
> >> >>
> >> >> Thoughts?
> >> >
> >> > Perhaps we will get stuck at query cancellation before ABORT
> >> > TRANSACTION in the case. A connection will be shut down when
> >> > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on
> >> > aborting local transactoin . So I don't think fdw gets confused
> >> > or sholdn't be confused by shutting down there.
> >> >
> >> > The most significant issue I can see is that the same thing
> >> > happens in the case of graceful ABORT TRANSACTION. It could be a
> >> > performance issue.
> >> >
> >> > We could set timeout here but maybe we can just slamming the
> >> > connection down instead of sending a query cancellation. It is
> >> > caused only by timeout or interrupts so I suppose it is not a
> >> > problem *with a few connections*.
> >> >
> >> >
> >> > Things are a bit diffent with hundreds of connections. The
> >> > penalty of reconnection would be very high in the case.
> >> >
> >> > If we are not willing to pay such high penalty, maybe we are to
> >> > manage busy-idle time of each connection and trying graceful
> >> > abort if it is short enough, maybe having a shoft timeout.
> >> >
> >> > Furthermore, if most or all of the hundreds of connections get
> >> > stuck, such timeout will accumulate up like a mountain...
> >>
> >> Even when the transaction is aborted because a user cancels a query,
> >> we do want to preserve the connection, if possible, to avoid
> >
> > Yes.
> >
> >> reconnection. If the request to cancel the query itself fails, we
> >> should certainly drop the connection. Here's the patch to do that.
> >
> > A problem I think on this would be that we still try to make
> > another connection for canceling and it would stall for several
> > minutes per connection on a packet stall, which should be done in
> > a second on ordinary circumstances. Perhaps we might want here is
> > async canceling with timeout.
> 
> I am not sure what do you mean "make another connection for
> cancelling.". Do you mean to say that PQcancel would make another
> connection?

Yes. It will take about 3 minutes on standard settings when no
ACK returned. I thought that this discussion is based on such
situation.

> The patch proposed simply improves the condition when PQcancel has
> returned a failure. Right now we ignore that failure and try to ABORT
> the transaction, which is most probably going to get stalled or fail
> to be dispatched. With the patch such a connection will be reset.

Ah, I understand that. It is surely an improvement since it
avoids useless ABORT TRANSACTION that is known to stall.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] snapbuild woes