Thread: Issue in postgres_fdw causing unnecessary wait for cancel request reply

Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Fujii Masao
Date:
Hi,

When using postgres_fdw, in the event of a local transaction being
aborted while a query is running on a remote server,
postgres_fdw sends a cancel request to the remote server.
However, if PQgetCancel() returned NULL and no cancel request was issued,
I found that postgres_fdw could still wait for the reply to
the cancel request, causing unnecessary wait time with a 30 second timeout.

For example, the following queries can reproduce the issue:

----------------------------
create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options (tcp_user_timeout 'a');
create user mapping for public server loopback;
create view t as select 1 as i from pg_sleep(100);
create foreign table ft (i int) server loopback options (table_name 't');
select * from ft;

Press Ctrl-C while running the above SELECT query.
----------------------------

Attached patch fixes this issue. It ensures that postgres_fdw only waits
for a reply if a cancel request is actually issued. Additionally,
it improves PQgetCancel() to set error messages in certain error cases,
such as when out of memory occurs and malloc() fails. Moreover,
it enhances postgres_fdw to report a warning message when PQgetCancel()
returns NULL, explaining the reason for the NULL value.

Thought?

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachment

Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Kyotaro Horiguchi
Date:
At Wed, 12 Apr 2023 03:36:01 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> Attached patch fixes this issue. It ensures that postgres_fdw only
> waits
> for a reply if a cancel request is actually issued. Additionally,
> it improves PQgetCancel() to set error messages in certain error
> cases,
> such as when out of memory occurs and malloc() fails. Moreover,
> it enhances postgres_fdw to report a warning message when
> PQgetCancel()
> returns NULL, explaining the reason for the NULL value.
> 
> Thought?

I wondered why the connection didn't fail in the first place. After
digging into it, I found (or remembered) that local (or AF_UNIX)
connections ignore the timeout value at making a connection. I think
the real issue here is that PGgetCancel is unnecessarily checking its
value and failing as a result. Otherwise there would be no room for
failure in the call to PQgetCancel() at that point in the example
case.

PQconnectPoll should remove the ignored parameters at connection or
PQgetCancel should ingore the unreferenced (or unchecked)
parameters. For example, the below diff takes the latter way and seems
working (for at least AF_UNIX connections)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 40fef0e2c8..30e2ab54ba 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4718,6 +4718,10 @@ PQgetCancel(PGconn *conn)
        cancel->keepalives_idle = -1;
        cancel->keepalives_interval = -1;
        cancel->keepalives_count = -1;
+
+       if (conn->connip == 0)
+               return cancel;
+
        if (conn->pgtcp_user_timeout != NULL)
        {
                if (!parse_int_param(conn->pgtcp_user_timeout,

Of course, it's not great that pgfdw_cancel_query_begin() ignores the
result from PQgetCancel(), but I think we don't need another ereport.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Fujii Masao
Date:

On 2023/04/12 12:00, Kyotaro Horiguchi wrote:
> At Wed, 12 Apr 2023 03:36:01 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Attached patch fixes this issue. It ensures that postgres_fdw only
>> waits
>> for a reply if a cancel request is actually issued. Additionally,
>> it improves PQgetCancel() to set error messages in certain error
>> cases,
>> such as when out of memory occurs and malloc() fails. Moreover,
>> it enhances postgres_fdw to report a warning message when
>> PQgetCancel()
>> returns NULL, explaining the reason for the NULL value.
>>
>> Thought?
> 
> I wondered why the connection didn't fail in the first place. After
> digging into it, I found (or remembered) that local (or AF_UNIX)
> connections ignore the timeout value at making a connection. I think

BTW, you can reproduce the issue even when using a TCP connection
instead of a Unix domain socket by specifying a very large number
in the "keepalives" connection parameter for the foreign server.
Here is an example:

-----------------
create server loopback foreign data wrapper postgres_fdw options (host '127.0.0.1', port '5432', keepalives
'99999999999');
-----------------

The reason behind this issue is that PQconnectPoll() parses
the "keepalives" parameter value by simply using strtol(),
while PQgetCancel() uses parse_int_param(). To fix this issue,
it might be better to update PQconnectPoll() so that it uses
parse_int_param() for parsing the "keepalives" parameter.



> the real issue here is that PGgetCancel is unnecessarily checking its
> value and failing as a result. Otherwise there would be no room for
> failure in the call to PQgetCancel() at that point in the example
> case.
> 
> PQconnectPoll should remove the ignored parameters at connection or
> PQgetCancel should ingore the unreferenced (or unchecked)
> parameters. For example, the below diff takes the latter way and seems
> working (for at least AF_UNIX connections)

To clarify, are you suggesting that PQgetCancel() should
only parse the parameters for TCP connections
if cancel->raddr.addr.ss_family != AF_UNIX?
If so, I think that's a good idea.


> Of course, it's not great that pgfdw_cancel_query_begin() ignores the
> result from PQgetCancel(), but I think we don't need another ereport.

Can you please clarify why you suggest avoiding outputting
the warning message when PQgetCancel() returns NULL?
I think it is important to inform the user when an error
occurs and a cancel request cannot be sent, as this information
can help them identify the cause of the problem (such as
setting an overly large value for the keepalives parameter).

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Kyotaro Horiguchi
Date:
At Wed, 12 Apr 2023 23:39:29 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> BTW, you can reproduce the issue even when using a TCP connection
> instead of a Unix domain socket by specifying a very large number
> in the "keepalives" connection parameter for the foreign server.
> Here is an example:
> 
> -----------------
> create server loopback foreign data wrapper postgres_fdw options (host
> '127.0.0.1', port '5432', keepalives '99999999999');
> -----------------

Mmm..

> The reason behind this issue is that PQconnectPoll() parses
> the "keepalives" parameter value by simply using strtol(),
> while PQgetCancel() uses parse_int_param(). To fix this issue,
> it might be better to update PQconnectPoll() so that it uses
> parse_int_param() for parsing the "keepalives" parameter.

Agreed, it seems to be a leftover when we moved to parse_int_param()
in that area.

> > the real issue here is that PGgetCancel is unnecessarily checking its
> > value and failing as a result. Otherwise there would be no room for
> > failure in the call to PQgetCancel() at that point in the example
> > case.
> > PQconnectPoll should remove the ignored parameters at connection or
> > PQgetCancel should ingore the unreferenced (or unchecked)
> > parameters. For example, the below diff takes the latter way and seems
> > working (for at least AF_UNIX connections)
> 
> To clarify, are you suggesting that PQgetCancel() should
> only parse the parameters for TCP connections
> if cancel->raddr.addr.ss_family != AF_UNIX?
> If so, I think that's a good idea.

You're right. I used connip in the diff because I thought it provided
the same condition, but in a simpler way.

> 
> > Of course, it's not great that pgfdw_cancel_query_begin() ignores the
> > result from PQgetCancel(), but I think we don't need another ereport.
> 
> Can you please clarify why you suggest avoiding outputting
> the warning message when PQgetCancel() returns NULL?

No.  I suggested merging the two failures that emit the "same" message
because I believed that they were identical. I had this in my mind:

  calcel = PQgetCancel();
  if (!cancel || PQcancel())
  {
     ereport(); return false;
  }
  PQfreeCancel()

However, I notcied that PQgetCancel() doesn't set errbuf.. So, I'm
fine with your proposal.

> I think it is important to inform the user when an error
> occurs and a cancel request cannot be sent, as this information
> can help them identify the cause of the problem (such as
> setting an overly large value for the keepalives parameter).

Although I view it as an internal error, I agree with emitting some
error messages in that situation.

regrads.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Etsuro Fujita
Date:
Hi Fujii-san,

On Wed, Apr 12, 2023 at 3:36 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> However, if PQgetCancel() returned NULL and no cancel request was issued,
> I found that postgres_fdw could still wait for the reply to
> the cancel request, causing unnecessary wait time with a 30 second timeout.

Good catch!

> Attached patch fixes this issue.

I am not 100% sure that it is a good idea to use the same error
message "could not send cancel request" for the PQgetCancel() and
PQcancel() cases, because they are different functions.  How about
"could not create PGcancel structure” or something like that, for the
former case, so we can distinguish the former error from the latter?

Best regards,
Etsuro Fujita



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Fujii Masao
Date:

On 2023/04/13 11:00, Kyotaro Horiguchi wrote:
> Agreed, it seems to be a leftover when we moved to parse_int_param()
> in that area.

It looks like there was an oversight in commit e7a2217978. I've attached a patch (0002) that updates PQconnectPoll() to
useparse_int_param() for parsing the keepalives parameter.
 

As this change is not directly related to the bug fix, it may not be necessary to back-patch it to the stable versions,
Ithink. Thought?
 


>> To clarify, are you suggesting that PQgetCancel() should
>> only parse the parameters for TCP connections
>> if cancel->raddr.addr.ss_family != AF_UNIX?
>> If so, I think that's a good idea.
> 
> You're right. I used connip in the diff because I thought it provided
> the same condition, but in a simpler way.

I made a modification to the 0001 patch. It will now allow PQgetCancel() to parse and interpret TCP connection
parametersonly when the connection is not made through a Unix-domain socket.
 


> However, I notcied that PQgetCancel() doesn't set errbuf.. So, I'm
> fine with your proposal.

Ok.


>> I think it is important to inform the user when an error
>> occurs and a cancel request cannot be sent, as this information
>> can help them identify the cause of the problem (such as
>> setting an overly large value for the keepalives parameter).
> 
> Although I view it as an internal error, I agree with emitting some
> error messages in that situation.

Ok.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Attachment

Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Fujii Masao
Date:

On 2023/04/13 15:13, Etsuro Fujita wrote:
> I am not 100% sure that it is a good idea to use the same error
> message "could not send cancel request" for the PQgetCancel() and
> PQcancel() cases, because they are different functions.  How about
> "could not create PGcancel structure” or something like that, for the

The primary message basically should avoid reference to implementation details such as specific structure names like
PGcancel,shouldn't it, as per the error message style guide?
 


> former case, so we can distinguish the former error from the latter?

Although the primary message is the same, the supplemental message provides additional context that can help
distinguishwhich function is reporting the message. Therefore, I'm fine with the current primary message in the 0001
patch.However, I'm open to any better message ideas.
 

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Etsuro Fujita
Date:
On Fri, Apr 14, 2023 at 3:19 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> On 2023/04/13 15:13, Etsuro Fujita wrote:
> > I am not 100% sure that it is a good idea to use the same error
> > message "could not send cancel request" for the PQgetCancel() and
> > PQcancel() cases, because they are different functions.  How about
> > "could not create PGcancel structure” or something like that, for the
>
> The primary message basically should avoid reference to implementation details such as specific structure names like
PGcancel,shouldn't it, as per the error message style guide? 

I do not think that PGcancel is that specific, as it is described in
the user-facing documentation [1].  (In addition, the error message I
proposed was created by copying the existing error message "could not
create OpenSSL BIO structure" in contrib/sslinfo.c.)

> > former case, so we can distinguish the former error from the latter?
>
> Although the primary message is the same, the supplemental message provides additional context that can help
distinguishwhich function is reporting the message. 

If the user is familiar with the PQgetCancel/PQcancel internals, this
is true, but if not, I do not think this is always true.  Consider
this error message, for example:

2023-04-14 17:48:55.862 JST [24344] WARNING:  could not send cancel
request: invalid integer value "99999999999" for connection option
"keepalives"

It would be hard for users without the knowledge about those internals
to distinguish that from this message.  For average users, I think it
would be good to use a more distinguishable error message.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/docs/current/libpq-cancel.html



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Fujii Masao
Date:

On 2023/04/14 18:59, Etsuro Fujita wrote:
>> The primary message basically should avoid reference to implementation details such as specific structure names like
PGcancel,shouldn't it, as per the error message style guide?
 
> 
> I do not think that PGcancel is that specific, as it is described in
> the user-facing documentation [1].  (In addition, the error message I
> proposed was created by copying the existing error message "could not
> create OpenSSL BIO structure" in contrib/sslinfo.c.)

I think that mentioning PGcancel in the error message could be confusing for average users who are just running a query
ona foreign table and encounter the error message after pressing Ctrl-C. They may not understand why the PGcancel
structis referenced in the error message while accessing foreign tables. It could be viewed as an internal detail that
isnot necessary for the user to know.
 


>> Although the primary message is the same, the supplemental message provides additional context that can help
distinguishwhich function is reporting the message.
 
> 
> If the user is familiar with the PQgetCancel/PQcancel internals, this
> is true, but if not, I do not think this is always true.  Consider
> this error message, for example:
> 
> 2023-04-14 17:48:55.862 JST [24344] WARNING:  could not send cancel
> request: invalid integer value "99999999999" for connection option
> "keepalives"
> 
> It would be hard for users without the knowledge about those internals
> to distinguish that from this message.  For average users, I think it
> would be good to use a more distinguishable error message.

In this case, I believe that they should be able to understand that an invalid integer value "99999999999" was
specifiedin the "keepalives" connection option, which caused the warning message. Then, they would need to check the
settingof the "keepalives" option and correct it if necessary.
 

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Etsuro Fujita
Date:
On Fri, Apr 14, 2023 at 11:28 PM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
> On 2023/04/14 18:59, Etsuro Fujita wrote:
> >> The primary message basically should avoid reference to implementation details such as specific structure names
likePGcancel, shouldn't it, as per the error message style guide? 
> >
> > I do not think that PGcancel is that specific, as it is described in
> > the user-facing documentation [1].  (In addition, the error message I
> > proposed was created by copying the existing error message "could not
> > create OpenSSL BIO structure" in contrib/sslinfo.c.)
>
> I think that mentioning PGcancel in the error message could be confusing for average users who are just running a
queryon a foreign table and encounter the error message after pressing Ctrl-C. They may not understand why the PGcancel
structis referenced in the error message while accessing foreign tables. It could be viewed as an internal detail that
isnot necessary for the user to know. 

Ok, understood.  I do not think it is wrong to use "could not send
cancel request” for PQgetCancel as well, but I feel that that is not
perfect for PQgetCancel, because that function never sends a cancel
request; that function just initializes the request.  So how about
"could not initialize cancel request”, instead?

> >> Although the primary message is the same, the supplemental message provides additional context that can help
distinguishwhich function is reporting the message. 
> >
> > If the user is familiar with the PQgetCancel/PQcancel internals, this
> > is true, but if not, I do not think this is always true.  Consider
> > this error message, for example:
> >
> > 2023-04-14 17:48:55.862 JST [24344] WARNING:  could not send cancel
> > request: invalid integer value "99999999999" for connection option
> > "keepalives"
> >
> > It would be hard for users without the knowledge about those internals
> > to distinguish that from this message.  For average users, I think it
> > would be good to use a more distinguishable error message.
>
> In this case, I believe that they should be able to understand that an invalid integer value "99999999999" was
specifiedin the "keepalives" connection option, which caused the warning message. Then, they would need to check the
settingof the "keepalives" option and correct it if necessary. 

Maybe my explanation was not clear.  Let me explain.  Assume that a
user want to identify the place where the above error was thrown.
Using grep with ”could not send cancel request”, the user can find the
two places emitting the message in pgfdw_cancel_query_begin: one for
PQgetCancel and one for PQcancel.  If the user are familiar with the
PQgetCancel/PQcancel internals, the user can determine, from the
supplemental message, that the error was thrown by the former.  But if
not, the user cannot do so.  To support the unfamiliar user as well, I
think it would be a good idea to use a more appropriate message for
PQgetCancel that is different from "could not send cancel request”.

(I agree that most users would not care about the places where errors
were thrown, but I think some users would, and actually, I do when
investigating unfamiliar errors.)

Best regards,
Etsuro Fujita



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Kyotaro Horiguchi
Date:
At Mon, 17 Apr 2023 15:21:02 +0900, Etsuro Fujita <etsuro.fujita@gmail.com> wrote in 
> > >> Although the primary message is the same, the supplemental message pro=
> vides additional context that can help distinguish which function is report=
> ing the message.
> > >
> > > If the user is familiar with the PQgetCancel/PQcancel internals, this
> > > is true, but if not, I do not think this is always true.  Consider
> > > this error message, for example:
> > >
> > > 2023-04-14 17:48:55.862 JST [24344] WARNING:  could not send cancel
> > > request: invalid integer value "99999999999" for connection option
> > > "keepalives"
> > >
> > > It would be hard for users without the knowledge about those internals
> > > to distinguish that from this message.  For average users, I think it
> > > would be good to use a more distinguishable error message.
> >
> > In this case, I believe that they should be able to understand that an in=
> valid integer value "99999999999" was specified in the "keepalives" connect=
> ion option, which caused the warning message. Then, they would need to chec=
> k the setting of the "keepalives" option and correct it if necessary.
> 
> Maybe my explanation was not clear.  Let me explain.  Assume that a
> user want to identify the place where the above error was thrown.
> Using grep with =E2=80=9Dcould not send cancel request=E2=80=9D, the user c=
> an find the
> two places emitting the message in pgfdw_cancel_query_begin: one for
> PQgetCancel and one for PQcancel.  If the user are familiar with the
> PQgetCancel/PQcancel internals, the user can determine, from the
> supplemental message, that the error was thrown by the former.  But if
> not, the user cannot do so.  To support the unfamiliar user as well, I
> think it would be a good idea to use a more appropriate message for
> PQgetCancel that is different from "could not send cancel request=E2=80=9D.
> 
> (I agree that most users would not care about the places where errors
> were thrown, but I think some users would, and actually, I do when
> investigating unfamiliar errors.)

If PGgetCancel() fails due to invalid keepliave-related values, It
seems like a bug that needs fixing, regardless of whether we display
an error message when PGgetCacncel() fails.  The only error case of
PGgetCancel() that could occur in pgfdw_cancel_query_begin() is a
malloc() failure, which currently does not set an error message (I'm
not sure we can do that in that case, though..).

In my opinion, PQconnectPoll and PQgetCancel should use the same
parsing function or PQconnectPoll should set parsed values, making
unnecessary for PQgetCancel to parse the same parameter
again. Additionally, PQgetCancel should set appropriate error messages
for all failure modes.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Jelte Fennema
Date:
> In my opinion, PQconnectPoll and PQgetCancel should use the same
> parsing function or PQconnectPoll should set parsed values, making
> unnecessary for PQgetCancel to parse the same parameter
> again.

Yes, I totally agree. So I think patch 0002 looks fine.

> Additionally, PQgetCancel should set appropriate error messages
> for all failure modes.

I don't think that PQgetCancel should ever set error messages on the
provided conn object though. It's not part of the documented API and
it's quite confusing since there's actually no error on the connection
itself. That this happens for the keepalive parameter was an
unintended sideeffect of 5987feb70b combined with the fact that the
parsing is different. All those parsing functions should never error,
because setting up the connection should already have checked them.

So I think the newly added libpq_append_conn_error calls in patch 0001
should be removed. The AF_UNIX check and the new WARNING in pg_fdw
seem fine though. It would probably make sense to have them be
separate patches though, because they are pretty unrelated.



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Fujii Masao
Date:

On 2023/04/21 16:39, Jelte Fennema wrote:
>> In my opinion, PQconnectPoll and PQgetCancel should use the same
>> parsing function or PQconnectPoll should set parsed values, making
>> unnecessary for PQgetCancel to parse the same parameter
>> again.
> 
> Yes, I totally agree. So I think patch 0002 looks fine.

It seems like we have reached a consensus to push the 0002 patch.
As for back-patching, although the issue it fixes is trivial,
it may be a good idea to back-patch to v12 where parse_int_param()
was added, for easier back-patching in the future. Therefore
I'm thinking to push the 0002 patch at first and back-patch to v12.


>> Additionally, PQgetCancel should set appropriate error messages
>> for all failure modes.
> 
> I don't think that PQgetCancel should ever set error messages on the
> provided conn object though. It's not part of the documented API and
> it's quite confusing since there's actually no error on the connection
> itself. That this happens for the keepalive parameter was an
> unintended sideeffect of 5987feb70b combined with the fact that the
> parsing is different. All those parsing functions should never error,
> because setting up the connection should already have checked them.
> 
> So I think the newly added libpq_append_conn_error calls in patch 0001
> should be removed. The AF_UNIX check and the new WARNING in pg_fdw
> seem fine though.

Sounds reasonable to me.

Regarding the WARNING message, another idea is to pass the return value
of PQgetCancel() directly to PQcancel() as follows. If NULL is passed,
PQcancel() will detect it and set the proper error message to errbuf.
Then the warning message "WARNING:  could not send cancel request:
PQcancel() -- no cancel object supplied" is output. This approach is
similar to how dblink_cancel_query() does. Thought?

----------------
    cancel = PQgetCancel(conn);
    if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
    {
        ereport(WARNING,
                (errcode(ERRCODE_CONNECTION_FAILURE),
                 errmsg("could not send cancel request: %s",
                        errbuf)));
        PQfreeCancel(cancel);
        return false;
    }
----------------

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

From
Kuwamura Masaki
Date:
Hi, Fujii-san

> Regarding the WARNING message, another idea is to pass the return value
> of PQgetCancel() directly to PQcancel() as follows. If NULL is passed,
> PQcancel() will detect it and set the proper error message to errbuf.
> Then the warning message "WARNING:  could not send cancel request:
> PQcancel() -- no cancel object supplied" is output.

I agree to go with this.

With this approach, the information behind the error (e.g., "out of memory") will disappear, I guess.
I think we have to deal with it eventually. (I'm sorry, I don't have a good idea right now)
However, the original issue is unnecessary waiting, and this should be fixed soon.
So it is better to fix the problem this way and discuss retaining information in another patch IMO.

I'm afraid I'm new to reviewing.
If I'm misunderstanding something, please let me know.

Masaki Kuwamura
On Thu, Apr 13, 2023 at 2:04 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >> To clarify, are you suggesting that PQgetCancel() should
> >> only parse the parameters for TCP connections
> >> if cancel->raddr.addr.ss_family != AF_UNIX?
> >> If so, I think that's a good idea.
> >
> > You're right. I used connip in the diff because I thought it provided
> > the same condition, but in a simpler way.
>
> I made a modification to the 0001 patch. It will now allow PQgetCancel() to parse and interpret TCP connection
parametersonly when the connection is not made through a Unix-domain socket. 

I don't really like this change. It seems to me that what this does is
decide that it's not an error to set tcp_user_timeout='a' when making
a cancel request if the connection doesn't actually use TCP. I agree
that we shouldn't try to *use* the values if they don't apply, but I'm
not sure it's a good idea to skip *sanity-checking* them when they
don't apply. For instance you can't set work_mem=ssdgjsjdg in
postgresql.conf even if you never run a query that needs work_mem.

--
Robert Haas
EDB: http://www.enterprisedb.com



On Thu, 13 Apr 2023 at 23:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2023/04/13 11:00, Kyotaro Horiguchi wrote:
> > Agreed, it seems to be a leftover when we moved to parse_int_param()
> > in that area.
>
> It looks like there was an oversight in commit e7a2217978. I've attached a patch (0002) that updates PQconnectPoll()
touse parse_int_param() for parsing the keepalives parameter.
 
>
> As this change is not directly related to the bug fix, it may not be necessary to back-patch it to the stable
versions,I think. Thought?
 
>
>
> >> To clarify, are you suggesting that PQgetCancel() should
> >> only parse the parameters for TCP connections
> >> if cancel->raddr.addr.ss_family != AF_UNIX?
> >> If so, I think that's a good idea.
> >
> > You're right. I used connip in the diff because I thought it provided
> > the same condition, but in a simpler way.
>
> I made a modification to the 0001 patch. It will now allow PQgetCancel() to parse and interpret TCP connection
parametersonly when the connection is not made through a Unix-domain socket.
 
>
>
> > However, I notcied that PQgetCancel() doesn't set errbuf.. So, I'm
> > fine with your proposal.
>
> Ok.
>
>
> >> I think it is important to inform the user when an error
> >> occurs and a cancel request cannot be sent, as this information
> >> can help them identify the cause of the problem (such as
> >> setting an overly large value for the keepalives parameter).
> >
> > Although I view it as an internal error, I agree with emitting some
> > error messages in that situation.
>
> Ok.

I have changed the status of the patch to "Waiting on Author" as all
the issues are not addressed. Feel free to address them and change the
status accordingly.

Regards,
Vignesh



On Thu, 11 Jan 2024 at 20:00, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 13 Apr 2023 at 23:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >
> >
> >
> > On 2023/04/13 11:00, Kyotaro Horiguchi wrote:
> > > Agreed, it seems to be a leftover when we moved to parse_int_param()
> > > in that area.
> >
> > It looks like there was an oversight in commit e7a2217978. I've attached a patch (0002) that updates
PQconnectPoll()to use parse_int_param() for parsing the keepalives parameter.
 
> >
> > As this change is not directly related to the bug fix, it may not be necessary to back-patch it to the stable
versions,I think. Thought?
 
> >
> >
> > >> To clarify, are you suggesting that PQgetCancel() should
> > >> only parse the parameters for TCP connections
> > >> if cancel->raddr.addr.ss_family != AF_UNIX?
> > >> If so, I think that's a good idea.
> > >
> > > You're right. I used connip in the diff because I thought it provided
> > > the same condition, but in a simpler way.
> >
> > I made a modification to the 0001 patch. It will now allow PQgetCancel() to parse and interpret TCP connection
parametersonly when the connection is not made through a Unix-domain socket.
 
> >
> >
> > > However, I notcied that PQgetCancel() doesn't set errbuf.. So, I'm
> > > fine with your proposal.
> >
> > Ok.
> >
> >
> > >> I think it is important to inform the user when an error
> > >> occurs and a cancel request cannot be sent, as this information
> > >> can help them identify the cause of the problem (such as
> > >> setting an overly large value for the keepalives parameter).
> > >
> > > Although I view it as an internal error, I agree with emitting some
> > > error messages in that situation.
> >
> > Ok.
>
> I have changed the status of the patch to "Waiting on Author" as all
> the issues are not addressed. Feel free to address them and change the
> status accordingly.

The patch which you submitted has been awaiting your attention for
quite some time now.  As such, we have moved it to "Returned with
Feedback" and removed it from the reviewing queue. Depending on
timing, this may be reversible.  Kindly address the feedback you have
received, and resubmit the patch to the next CommitFest.

Regards,
Vignesh