Thread: libpq stricter integer parsing

libpq stricter integer parsing

From
Fabien COELHO
Date:
Follow up on a patch and discussion with Tom, currently integer parsing on 
keywords in libpq is quite loose, resulting in trailing garbage being 
ignored and allowing to hide bugs, eg:

   sh> psql "connect_timeout=2,port=5433"

The timeout is set to 2, and the port directive is silently ignored.
However, URL parsing is stricter, eg on "port".

The attached patch checks integer syntax errors and overflows, and report 
errors.

The pros is that it helps detect bugs. The cons is that some people may 
not want to know about these if it works in the end.

-- 
Fabien.
Attachment

Re: libpq stricter integer parsing

From
Peter Eisentraut
Date:
On 17/08/2018 12:13, Fabien COELHO wrote:
>    sh> psql "connect_timeout=2,port=5433"
> 
> The timeout is set to 2, and the port directive is silently ignored.
> However, URL parsing is stricter, eg on "port".
> 
> The attached patch checks integer syntax errors and overflows, and report 
> errors.

This looks useful and the patch looks reasonable, but it doesn't apply
anymore.  Can you send in an update?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: libpq stricter integer parsing

From
Fabien COELHO
Date:
Hello Peter,

>> The timeout is set to 2, and the port directive is silently ignored.
>> However, URL parsing is stricter, eg on "port".
>>
>> The attached patch checks integer syntax errors and overflows, and report
>> errors.
>
> This looks useful and the patch looks reasonable, but it doesn't apply
> anymore.  Can you send in an update?

Hmmm. It does apply on a test I just did right know...

Usually it does not apply with "git apply" if your MUA does not know that 
MIME requires CR LF eol on text files, and that it is its responsability 
to switch to something else if desired. Pg automatic patch checker does 
not know about that so complains unduly. Some MUA send attachements which 
violates MIME (text attachements with LF only eol), thus hidding the 
issue.

Otherwise, the "patch" command should work?

-- 
Fabien.


Re: libpq stricter integer parsing

From
Michael Paquier
Date:
On Fri, Sep 07, 2018 at 05:13:17PM +0200, Fabien COELHO wrote:
> Hmmm. It does apply on a test I just did right know...

That's weird, it does not apply for me either with patch -p1 and there
is on conflict in fe-connect.c, which looks easy enough to fix by the
way.
--
Michael

Attachment

Re: libpq stricter integer parsing

From
Fabien COELHO
Date:
Hello Michael,

>> Hmmm. It does apply on a test I just did right know...
>
> That's weird, it does not apply for me either with patch -p1 and there
> is on conflict in fe-connect.c, which looks easy enough to fix by the
> way.

Weird indeed. However, even if the patch applied cleanly for me, it was 
not compiling anymore. Attached an updated version, in which I also tried 
to improve the error message on trailing garbage.

-- 
Fabien
Attachment

Re: libpq stricter integer parsing

From
Michael Paquier
Date:
On Fri, Sep 07, 2018 at 11:22:14PM +0200, Fabien COELHO wrote:
> Weird indeed. However, even if the patch applied cleanly for me, it was not
> compiling anymore. Attached an updated version, in which I also tried to
> improve the error message on trailing garbage.

At least now it applies for me, thanks.

+    Integer values expected for keywords <literal>port</literal>,
+    <literal>connect_timeout</literal>,
+    <literal>keepalives_idle</literal>,
+    <literal>keepalives_interval</literal> and
+    <literal>keepalives_timeout</literal> are parsed strictly.
Wouldn't it be better to actually document what "parsed strictly" means?

A wild though: we already have things like pg_strtoint32 or pg_atoi
which do similar error handling in smarter ways.  Wouldn't we want to
refactor the code so as libpq makes use of such routines?  This would
mean that the error string should be moved around, and allowing
frontends to use those utilities requires some extra engineering.

Not that this patch should reinvent the world...
--
Michael

Attachment

Re: libpq stricter integer parsing

From
Fabien COELHO
Date:
Hello Michael,

> On Fri, Sep 07, 2018 at 11:22:14PM +0200, Fabien COELHO wrote:
>> Weird indeed. However, even if the patch applied cleanly for me, it was not
>> compiling anymore. Attached an updated version, in which I also tried to
>> improve the error message on trailing garbage.
>
> At least now it applies for me, thanks.
>
> +    Integer values expected for keywords <literal>port</literal>,
> +    <literal>connect_timeout</literal>,
> +    <literal>keepalives_idle</literal>,
> +    <literal>keepalives_interval</literal> and
> +    <literal>keepalives_timeout</literal> are parsed strictly.

> Wouldn't it be better to actually document what "parsed strictly" means?

Hmmm. This is what the sentence following the above tries to explain 
implicitely:

   Versions of <application>libpq</application> before
   <product>PostgreSQL 12</product> accepted trailing garbage or overflows.

Maybe I can rephrase it in one sentence, eg:

"From PostgreSQL 12, integer values for keywords ... are parsed strictly, 
i.e. trailing garbage and errors on overflows are not accepted anymore."

> A wild though: we already have things like pg_strtoint32 or pg_atoi
> which do similar error handling in smarter ways.  Wouldn't we want to
> refactor the code so as libpq makes use of such routines?  This would
> mean that the error string should be moved around, and allowing
> frontends to use those utilities requires some extra engineering.

I thought of that but rejected it.

The pg_* function you mention are in the backend code, where error 
handling is managed differently, and this function is really only about 
error handling. Moreover "strtol" is already used "libpq/fe-connect.c" for 
the same purpose of parsing int and detecting errors (URL port, keep 
alives).

So the implied changes to try to factor out the functionnality looked like 
a bad idea (where should I put it [probably pgport?]? how should I manage 
errors in a client/server compatible way [no simple idea]?), hence the 
local re-inventation, also to avoid any impact on the backend code.

> Not that this patch should reinvent the world...

Sure.

-- 
Fabien.


Re: libpq stricter integer parsing

From
Michael Paquier
Date:
On Sun, Sep 09, 2018 at 09:01:15AM +0200, Fabien COELHO wrote:
> Hmmm. This is what the sentence following the above tries to explain
> implicitely:
>
>   Versions of <application>libpq</application> before
>   <product>PostgreSQL 12</product> accepted trailing garbage or overflows.
>
> Maybe I can rephrase it in one sentence, eg:
>
> "From PostgreSQL 12, integer values for keywords ... are parsed strictly,
> i.e. trailing garbage and errors on overflows are not accepted
> anymore."

Okay, I am including that formulation.  I have not put yet much thoughts
into locating this in another place of the docs.  Or perhaps we could
just discard it from the final commit.

I have been reviewing your patch a bit more, and I have found an issue:
overflows are not correctly detected.  For example by specifying
something like port=5000000000 I would have expected an error but the
parsing code failed to detect that.  Values like -1 need to be accepted
though are equivalent to an unknown state when it comes to keepalive_*.

In conclusion, I finish with the simplified patch attached.  Fabien, is
that acceptable to you?
--
Michael

Attachment

Re: libpq stricter integer parsing

From
Peter Eisentraut
Date:
On 11/09/2018 11:00, Michael Paquier wrote:
> diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
> index 5e7931ba90..bc7836d103 100644
> --- a/doc/src/sgml/libpq.sgml
> +++ b/doc/src/sgml/libpq.sgml
> @@ -1591,6 +1591,15 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
>      </varlistentry>
>      </variablelist>
>     </para>
> +
> +   <para>
> +    Integer values expected for keywords <literal>port</literal>,
> +    <literal>connect_timeout</literal>, <literal>keepalives_idle</literal>,
> +    <literal>keepalives_interval</literal> and
> +    <literal>keepalives_timeout</literal> are parsed more strictly as
> +    of <product>PostgreSQL<product> 12, i.e. values including trailing garbage
> +    or overflowing are rejected.
> +   </para>
>    </sect2>
>   </sect1>

I would leave this out.  We don't need to document every single
refinement of parsing rules.  This might better belong in the release notes.

> +    appendPQExpBuffer(&conn->errorMessage,
> +                      libpq_gettext("invalid value for keyword \"%s\"\n"),
> +                      context);

Add the actual invalid value to the error message.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: libpq stricter integer parsing

From
Fabien COELHO
Date:
>> +   <para>...</para>
>
> I would leave this out.  We don't need to document every single
> refinement of parsing rules.  This might better belong in the release notes.

Why not, I'd be fine with that. The fact that the libpq parser was quite 
fuzzy was not documented, the behavior was really a side effect of the 
implementation which never suggested that it would work.

>> +    appendPQExpBuffer(&conn->errorMessage,
>> +                      libpq_gettext("invalid value for keyword \"%s\"\n"),
>> +                      context);
>
> Add the actual invalid value to the error message.

Indeed.

Attached Michael's simplified version updated with your remarks.

-- 
Fabien.
Attachment

Re: libpq stricter integer parsing

From
Michael Paquier
Date:
On Tue, Sep 11, 2018 at 07:03:41PM +0200, Fabien COELHO wrote:
> Attached Michael's simplified version updated with your remarks.

Okay, this version looks good to me so pushed.  Thanks Fabien and
Peter.
--
Michael

Attachment