Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply
Date
Msg-id 20230413.110031.1757135784921343648.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply
List pgsql-hackers
At Wed, 12 Apr 2023 23:39:29 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> BTW, you can reproduce the issue even when using a TCP connection
> instead of a Unix domain socket by specifying a very large number
> in the "keepalives" connection parameter for the foreign server.
> Here is an example:
> 
> -----------------
> create server loopback foreign data wrapper postgres_fdw options (host
> '127.0.0.1', port '5432', keepalives '99999999999');
> -----------------

Mmm..

> The reason behind this issue is that PQconnectPoll() parses
> the "keepalives" parameter value by simply using strtol(),
> while PQgetCancel() uses parse_int_param(). To fix this issue,
> it might be better to update PQconnectPoll() so that it uses
> parse_int_param() for parsing the "keepalives" parameter.

Agreed, it seems to be a leftover when we moved to parse_int_param()
in that area.

> > the real issue here is that PGgetCancel is unnecessarily checking its
> > value and failing as a result. Otherwise there would be no room for
> > failure in the call to PQgetCancel() at that point in the example
> > case.
> > PQconnectPoll should remove the ignored parameters at connection or
> > PQgetCancel should ingore the unreferenced (or unchecked)
> > parameters. For example, the below diff takes the latter way and seems
> > working (for at least AF_UNIX connections)
> 
> To clarify, are you suggesting that PQgetCancel() should
> only parse the parameters for TCP connections
> if cancel->raddr.addr.ss_family != AF_UNIX?
> If so, I think that's a good idea.

You're right. I used connip in the diff because I thought it provided
the same condition, but in a simpler way.

> 
> > Of course, it's not great that pgfdw_cancel_query_begin() ignores the
> > result from PQgetCancel(), but I think we don't need another ereport.
> 
> Can you please clarify why you suggest avoiding outputting
> the warning message when PQgetCancel() returns NULL?

No.  I suggested merging the two failures that emit the "same" message
because I believed that they were identical. I had this in my mind:

  calcel = PQgetCancel();
  if (!cancel || PQcancel())
  {
     ereport(); return false;
  }
  PQfreeCancel()

However, I notcied that PQgetCancel() doesn't set errbuf.. So, I'm
fine with your proposal.

> I think it is important to inform the user when an error
> occurs and a cancel request cannot be sent, as this information
> can help them identify the cause of the problem (such as
> setting an overly large value for the keepalives parameter).

Although I view it as an internal error, I agree with emitting some
error messages in that situation.

regrads.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
Next
From: Thomas Munro
Date:
Subject: Re: Direct I/O