Thread: Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
From
Fujii Masao
Date:
On 2024/10/01 14:11, Yuto Sasaki (Fujitsu) wrote: > Hi hackers, > I've discovered a bug in ECPG that causes database connection failures. This issue > appears to stem from libpq layer. > Found bug: The EXEC SQL CONNECT TO statement fails to connect to the database. > Specifically, the following code: > ```c > EXEC SQL CONNECT TO tcp:postgresql://localhost:5432/postgres?keepalives=1 > &keepalives_idle=1 USER yuto; > ``` > When precompiled and executed, this code fails to connect to the database. > Error message: > ``` > failed: keepalives parameter must be an integer > ``` > Steps to reproduce: The issue can be reproduced using the attached pgc file. > Root cause: The method for parsing the keepalives parameter in the useKeepalives > function of the libpq library is not appropriate. Specifically, it doesn't > account for whitespace following the numeric value. Is a connection URL with whitespace, like "tcp:postgresql://localhost:5432/postgres?keepalives=1 & ...", considered valid? If not, the issue seems to be that ecpg adds unnecessary whitespace to the connection URL, especially after the "&" character. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Fujii Masao <masao.fujii@oss.nttdata.com> writes: > On 2024/10/01 14:11, Yuto Sasaki (Fujitsu) wrote: >> Root cause: The method for parsing the keepalives parameter in the useKeepalives >> function of the libpq library is not appropriate. Specifically, it doesn't >> account for whitespace following the numeric value. > Is a connection URL with whitespace, like "tcp:postgresql://localhost:5432/postgres?keepalives=1 & ...", > considered valid? If not, the issue seems to be that ecpg adds unnecessary whitespace > to the connection URL, especially after the "&" character. I agree with Sasaki-san that useKeepalives seems rather bogus: almost every other place in fe-connect.c uses pqParseIntParam rather than calling strtol directly, so why not this one? We might have some work to do in ecpg as well, though. regards, tom lane
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
From
Fujii Masao
Date:
On 2024/10/02 11:35, Michael Paquier wrote: > On Tue, Oct 01, 2024 at 12:29:15PM -0400, Tom Lane wrote: >> Fujii Masao <masao.fujii@oss.nttdata.com> writes: >>> Is a connection URL with whitespace, like "tcp:postgresql://localhost:5432/postgres?keepalives=1 & ...", >>> considered valid? If not, the issue seems to be that ecpg adds unnecessary whitespace >>> to the connection URL, especially after the "&" character. >> >> I agree with Sasaki-san that useKeepalives seems rather bogus: almost >> every other place in fe-connect.c uses pqParseIntParam rather than >> calling strtol directly, so why not this one? I have no objection to improving the handling of the keepalives parameter. OTOH, I think ecpg might have an issue when converting the connection URI. > Yes, it is a mistake to not use pqParseIntParam(), or > parse_int_param() depending on the branch. This stuff has been > introduced by 4f4061b2dde1, where I've spent some time making sure > that leading and trailing whitespaces are discarded in this routine. > > See also these examples where whitespaces are OK in a connection URL: > https://www.postgresql.org/message-id/20191021024020.GF1542%40paquier.xyz For example, ecpg converts: EXEC SQL CONNECT TO tcp:postgresql://localhost:5432/postgres?keepalives=1&keepalives_idle=1&keepalives_interval=1&keepalives_count=2USER postgres; into: { ECPGconnect(__LINE__, 0, "tcp:postgresql://localhost:5432/postgres?keepalives=1 & keepalives_idle=1 & keepalives_interval=1& keepalives_count=2" , "postgres" , NULL , NULL, 0); In the converted URI, whitespace is added before and after the ? character. In my quick test, ECPGconnect() seems to handle this without error, but when I tried the same URI in psql, it returned an error: $ psql "postgresql://localhost:5432/postgres?keepalives=1 & keepalives_idle=1 & keepalives_interval=1 & keepalives_count=2" psql: error: invalid URI query parameter: " keepalives_idle" It seems that libpq may consider this URI invalid. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Fujii Masao <masao.fujii@oss.nttdata.com> writes: > On 2024/10/02 11:35, Michael Paquier wrote: >> On Tue, Oct 01, 2024 at 12:29:15PM -0400, Tom Lane wrote: >>> I agree with Sasaki-san that useKeepalives seems rather bogus: almost >>> every other place in fe-connect.c uses pqParseIntParam rather than >>> calling strtol directly, so why not this one? > I have no objection to improving the handling of the keepalives parameter. > OTOH, I think ecpg might have an issue when converting the connection URI. I went ahead and pushed Sasaki-san's patch. I think anything we might do in ecpg is probably just cosmetic and wouldn't get back-patched. > In the converted URI, whitespace is added before and after the ? character. > In my quick test, ECPGconnect() seems to handle this without error, > but when I tried the same URI in psql, it returned an error: > $ psql "postgresql://localhost:5432/postgres?keepalives=1 & keepalives_idle=1 & keepalives_interval=1 & keepalives_count=2" > psql: error: invalid URI query parameter: " keepalives_idle" Interesting. This is unhappy about the space before a parameter name, not the space after a parameter value, so it's a different issue. But it's weird that ecpg takes it while libpq doesn't. Could libecpg be modifying/reassembling the URI string? I didn't look. regards, tom lane
I poked into the ecpg end of this and found that the extra space is coming from one production in ecpg.trailer that's carelessly using cat_str (which inserts spaces) instead of makeN_str (which doesn't). So it's pretty trivial to fix, as attached. I do not think we could rip out ECPGconnect's logic to remove the spaces at runtime, because that would break existing ecpg applications until they're recompiled. It might be worth adding a comment there about why it's being done, though. I don't have a strong opinion one way or the other about whether we should make libpq permissive about extra spaces (as per Michael's patch). I guess you could argue that all of these fixes are consistent with the principle of "be conservative with what you send and liberal with what you accept". But at most I'd fix these remaining things in HEAD. regards, tom lane diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer index b6233e5e53..f3ab73bed6 100644 --- a/src/interfaces/ecpg/preproc/ecpg.trailer +++ b/src/interfaces/ecpg/preproc/ecpg.trailer @@ -348,7 +348,7 @@ connect_options: ColId opt_opt_value if (strcmp($3, "&") != 0) mmerror(PARSE_ERROR, ET_ERROR, "unrecognized token \"%s\"", $3); - $$ = cat_str(3, make2_str($1, $2), $3, $4); + $$ = make3_str(make2_str($1, $2), $3, $4); } ; diff --git a/src/interfaces/ecpg/test/expected/connect-test5.c b/src/interfaces/ecpg/test/expected/connect-test5.c index ec1514ed9a..fc650f5cd5 100644 --- a/src/interfaces/ecpg/test/expected/connect-test5.c +++ b/src/interfaces/ecpg/test/expected/connect-test5.c @@ -117,7 +117,7 @@ main(void) #line 56 "test5.pgc" - { ECPGconnect(__LINE__, 0, "unix:postgresql://localhost/ecpg2_regression?connect_timeout=180 & client_encoding=sql_ascii", "regress_ecpg_user1" , "connectpw" , "main", 0); } + { ECPGconnect(__LINE__, 0, "unix:postgresql://localhost/ecpg2_regression?connect_timeout=180&client_encoding=sql_ascii", "regress_ecpg_user1" , "connectpw", "main", 0); } #line 58 "test5.pgc" { ECPGdisconnect(__LINE__, "main");} diff --git a/src/interfaces/ecpg/test/expected/connect-test5.stderr b/src/interfaces/ecpg/test/expected/connect-test5.stderr index 51cc18916a..037db21758 100644 --- a/src/interfaces/ecpg/test/expected/connect-test5.stderr +++ b/src/interfaces/ecpg/test/expected/connect-test5.stderr @@ -50,7 +50,7 @@ [NO_PID]: sqlca: code: 0, state: 00000 [NO_PID]: ecpg_finish: connection main closed [NO_PID]: sqlca: code: 0, state: 00000 -[NO_PID]: ECPGconnect: opening database ecpg2_regression on <DEFAULT> port <DEFAULT> with options connect_timeout=180 &client_encoding=sql_ascii for user regress_ecpg_user1 +[NO_PID]: ECPGconnect: opening database ecpg2_regression on <DEFAULT> port <DEFAULT> with options connect_timeout=180&client_encoding=sql_asciifor user regress_ecpg_user1 [NO_PID]: sqlca: code: 0, state: 00000 [NO_PID]: ecpg_finish: connection main closed [NO_PID]: sqlca: code: 0, state: 00000
Michael Paquier <michael@paquier.xyz> writes: > On Thu, Oct 03, 2024 at 11:57:16AM -0400, Tom Lane wrote: >> I don't have a strong opinion one way or the other about whether >> we should make libpq permissive about extra spaces (as per >> Michael's patch). I guess you could argue that all of these >> fixes are consistent with the principle of "be conservative >> with what you send and liberal with what you accept". But at >> most I'd fix these remaining things in HEAD. > Removing this extra whitespace from the ECPG strings sounds good here. > FWIW, my argument about doing this in libpq is not really related to > ECPG: it feels inconsistent to apply one rule for the parameters and a > different one for the values in URIs. So I'd be OK to see how this > goes on as a HEAD-only change. OK, if there's no objections let's push both remaining patches to HEAD only. regards, tom lane
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
From
"Yuto Sasaki (Fujitsu)"
Date:
The patch looks good to me.
The URI parsing process has become more stringent, allowing for accurate
handling of leading and trailing whitespace. Additionally, a feature has
been implemented to detect errors when extraneous data is present at the
end of the URI. The proper functioning of these changes has been verified
through newly added test cases. I've also tested by removing the whitespace
handling code from ECPGConnect() and applying your patch to libpq, which
worked successfully.
差出人: Michael Paquier
送信: 2024 年 10 月 3 日 (木曜日) 9:42
宛先: Tom Lane
Cc: Fujii Masao; Sasaki, Yuto/佐佐木 悠人; pgsql-hackers@lists.postgresql.org
件名: Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
On Wed, Oct 02, 2024 at 05:39:31PM -0400, Tom Lane wrote:
> Interesting. This is unhappy about the space before a parameter name,
> not the space after a parameter value, so it's a different issue.
conninfo_uri_parse_options() parses the URI as a set of option/values,
where conninfo_uri_parse_params. If we were to be careful about
trailing and leading whitespaces for the parameter names, we need to
be careful about the special JDBC cases for "ssl" and "requiressl",
meaning that we should add more logic in conninfo_uri_decode() to
discard these. That would apply a extra layer of sanity into the
values as well.
> But it's weird that ecpg takes it while libpq doesn't. Could libecpg
> be modifying/reassembling the URI string? I didn't look.
ECPGconnect() has some custom logic to discard trailing and leading
spaces:
/* Skip spaces before keyword */
for (token1 = str; *token1 == ' '; token1++)
[...]
token1[e] = '\0'; //skips trailing spaces.
The argument for libpq where we could be consistent is appealing. How
about lifting things in libpq like the attached? I wouldn't backpatch
that, but we have tests for URIs and I didn't break anything.
--
Michael
> Interesting. This is unhappy about the space before a parameter name,
> not the space after a parameter value, so it's a different issue.
conninfo_uri_parse_options() parses the URI as a set of option/values,
where conninfo_uri_parse_params. If we were to be careful about
trailing and leading whitespaces for the parameter names, we need to
be careful about the special JDBC cases for "ssl" and "requiressl",
meaning that we should add more logic in conninfo_uri_decode() to
discard these. That would apply a extra layer of sanity into the
values as well.
> But it's weird that ecpg takes it while libpq doesn't. Could libecpg
> be modifying/reassembling the URI string? I didn't look.
ECPGconnect() has some custom logic to discard trailing and leading
spaces:
/* Skip spaces before keyword */
for (token1 = str; *token1 == ' '; token1++)
[...]
token1[e] = '\0'; //skips trailing spaces.
The argument for libpq where we could be consistent is appealing. How
about lifting things in libpq like the attached? I wouldn't backpatch
that, but we have tests for URIs and I didn't break anything.
--
Michael
Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value
From
Fujii Masao
Date:
On 2024/10/06 18:35, Michael Paquier wrote: > On Thu, Oct 03, 2024 at 08:12:28PM -0400, Tom Lane wrote: >> OK, if there's no objections let's push both remaining patches >> to HEAD only. > > Done as of f22e84df1dea and 430ce189fc45. Commit 430ce189fc45 unexpectedly caused psql to report the error "error: trailing data found" when a connection URI contains a whitespace, e.g., in a parameter value. For example, the following command used to work but no longer does after this commit: $ psql -d "postgresql://localhost:5432/postgres?application_name=a b" I'm not sure if this URI format is valid (according to RFC 3986), though. + for (const char *s = q; *s == ' '; s++) + { + q++; + continue; + } Is the "continue" really necessary? Also could we simplify it like this? for (; *q == ' '; q++); Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION