Thread: Portability concerns over pq_sendbyte?

Portability concerns over pq_sendbyte?

From
Andrew Gierth
Date:
In PG11, pq_sendbyte got changed from taking an int parameter to taking
an int8.

While that seems to work in general, it does mean that there are now
several places in the code that do the equivalent of:

    unsigned char x = 128;
    pq_sendbyte(&buf, x);

which I believe is not well-defined since pq_sendbyte takes an int8, and
conversions of unrepresentable values to _signed_ integer types are
(iirc) implementation-dependent.

There are also some cases where pq_sendint16 is being called for an
unsigned value or a value that might exceed 32767.

Would it be better for these to take unsigned values, or have unsigned
variants?

-- 
Andrew (irc:RhodiumToad)


Re: Portability concerns over pq_sendbyte?

From
Michael Paquier
Date:
On Thu, May 24, 2018 at 06:13:23PM +0100, Andrew Gierth wrote:
> In PG11, pq_sendbyte got changed from taking an int parameter to taking
> an int8.

From 1de09ad8.

> While that seems to work in general, it does mean that there are now
> several places in the code that do the equivalent of:
>
>     unsigned char x = 128;
>     pq_sendbyte(&buf, x);
>
> which I believe is not well-defined since pq_sendbyte takes an int8, and
> conversions of unrepresentable values to _signed_ integer types are
> (iirc) implementation-dependent.

Good point.

> There are also some cases where pq_sendint16 is being called for an
> unsigned value or a value that might exceed 32767.

If kept, some safeguards based on PG_INT*_[MIN|MAX] could be
appropriate.

> Would it be better for these to take unsigned values, or have unsigned
> variants?

It seems to me that it is not too late to change those interfaces so
as they use unsigned values in input so as they are consistent, it would
be dangerous on the contrary to keep those as they are.
--
Michael

Attachment

Re: Portability concerns over pq_sendbyte?

From
Alvaro Herrera
Date:
On 2018-May-24, Andrew Gierth wrote:

> In PG11, pq_sendbyte got changed from taking an int parameter to taking
> an int8.

> Would it be better for these to take unsigned values, or have unsigned
> variants?

Do you have an answer to this question?  Does anybody else?

(My guts tell me it'd be better to change these routines to take
unsigned values, without creating extra variants.  But guts frequently
misspeak.)

Andres, you own this open item, unless one of the other committers who
have participated in this thread would like to take it as their own.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Portability concerns over pq_sendbyte?

From
Michael Paquier
Date:
On Wed, Jun 06, 2018 at 12:27:58PM -0400, Alvaro Herrera wrote:
> Do you have an answer to this question?  Does anybody else?
>
> (My guts tell me it'd be better to change these routines to take
> unsigned values, without creating extra variants.  But guts frequently
> misspeak.)

My guts are telling me as well to not have more variants.  On top of
that it seems to me that we'd want to rename any new routines to include
"uint" in their name instead of "int", and for compatibility with past
code pq_sendint should not be touched.
--
Michael

Attachment

Re: Portability concerns over pq_sendbyte?

From
Michael Paquier
Date:
On Mon, Jun 11, 2018 at 02:25:44PM +0900, Michael Paquier wrote:
> On Wed, Jun 06, 2018 at 12:27:58PM -0400, Alvaro Herrera wrote:
>> Do you have an answer to this question?  Does anybody else?
>>
>> (My guts tell me it'd be better to change these routines to take
>> unsigned values, without creating extra variants.  But guts frequently
>> misspeak.)
>
> My guts are telling me as well to not have more variants.  On top of
> that it seems to me that we'd want to rename any new routines to include
> "uint" in their name instead of "int", and for compatibility with past
> code pq_sendint should not be touched.

And also pq_sendint64 needs to be kept around for compatibility.  I have
quickly looked at how much code would be involved here and there are
quite close to 240 code paths which involve the new routines.  Please
see attached for reference, I have not put much thoughts into it to be
honest, so that's really at an early stage.
--
Michael

Attachment

Re: Portability concerns over pq_sendbyte?

From
Andres Freund
Date:
Hi,

On 2018-05-24 18:13:23 +0100, Andrew Gierth wrote:
> In PG11, pq_sendbyte got changed from taking an int parameter to taking
> an int8.
> 
> While that seems to work in general, it does mean that there are now
> several places in the code that do the equivalent of:
> 
>     unsigned char x = 128;
>     pq_sendbyte(&buf, x);
> 
> which I believe is not well-defined since pq_sendbyte takes an int8, and
> conversions of unrepresentable values to _signed_ integer types are
> (iirc) implementation-dependent.

It's not implementation defined in postgres' dialect of C - we rely on
accurate signed->unsigned conversions in a number of places.  But I
doin't think we should increase that reliance, so I think you're right
we should do something about this.


> There are also some cases where pq_sendint16 is being called for an
> unsigned value or a value that might exceed 32767.

Hm, which case were you thinking of here?  The calls usually are exactly
the types that the wire protocol expects, no?


> Would it be better for these to take unsigned values, or have unsigned
> variants?

I wonder if we should just take 'int' out of the name. Say,
pg_send{8,16,32,64}(unsigned ...).

Greetings,

Andres Freund


Re: Portability concerns over pq_sendbyte?

From
Andres Freund
Date:
On 2018-06-13 14:10:37 +0900, Michael Paquier wrote:
> On Mon, Jun 11, 2018 at 02:25:44PM +0900, Michael Paquier wrote:
> > On Wed, Jun 06, 2018 at 12:27:58PM -0400, Alvaro Herrera wrote:
> >> Do you have an answer to this question?  Does anybody else?
> >> 
> >> (My guts tell me it'd be better to change these routines to take
> >> unsigned values, without creating extra variants.  But guts frequently
> >> misspeak.)
> > 
> > My guts are telling me as well to not have more variants.

Agreed.


> > On top of that it seems to me that we'd want to rename any new
> > routines to include "uint" in their name instead of "int", and for
> > compatibility with past code pq_sendint should not be touched.

I'm very doubtful about this one, unless you mean that just the
signature shouldn't be touched.  Otherwise we'll just increase code
duplication unnecessarily?


> And also pq_sendint64 needs to be kept around for compatibility.

:(. Wonder if it's better to just break people's code.

Greetings,

Andres Freund


Re: Portability concerns over pq_sendbyte?

From
Andrew Gierth
Date:
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:

 >> unsigned char x = 128;
 >> pq_sendbyte(&buf, x);
 >> 
 >> which I believe is not well-defined since pq_sendbyte takes an int8,
 >> and conversions of unrepresentable values to _signed_ integer types
 >> are (iirc) implementation-dependent.

 Andres> It's not implementation defined in postgres' dialect of C - we
 Andres> rely on accurate signed->unsigned conversions in a number of
 Andres> places.

Converting signed integer to unsigned is ok as I understand it - what's
happening here is the reverse, converting an unrepresentable unsigned
value to a signed type.

 >> There are also some cases where pq_sendint16 is being called for an
 >> unsigned value or a value that might exceed 32767.

 Andres> Hm, which case were you thinking of here? The calls usually are
 Andres> exactly the types that the wire protocol expects, no?

There are cases where it's not actually clear what the wire protocol
expects - I'm thinking in particular of the number of entries in a list
of parameter types/formats.

-- 
Andrew (irc:RhodiumToad)


Re: Portability concerns over pq_sendbyte?

From
Andres Freund
Date:
Hi,

On 2018-06-13 22:02:13 +0100, Andrew Gierth wrote:
> >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:
> 
>  >> unsigned char x = 128;
>  >> pq_sendbyte(&buf, x);
>  >> 
>  >> which I believe is not well-defined since pq_sendbyte takes an int8,
>  >> and conversions of unrepresentable values to _signed_ integer types
>  >> are (iirc) implementation-dependent.
> 
>  Andres> It's not implementation defined in postgres' dialect of C - we
>  Andres> rely on accurate signed->unsigned conversions in a number of
>  Andres> places.
> 
> Converting signed integer to unsigned is ok as I understand it - what's
> happening here is the reverse, converting an unrepresentable unsigned
> value to a signed type.

Err, yes, I was thinking about that conversion. Sorry for the confusion.


>  >> There are also some cases where pq_sendint16 is being called for an
>  >> unsigned value or a value that might exceed 32767.
> 
>  Andres> Hm, which case were you thinking of here? The calls usually are
>  Andres> exactly the types that the wire protocol expects, no?
> 
> There are cases where it's not actually clear what the wire protocol
> expects - I'm thinking in particular of the number of entries in a list
> of parameter types/formats.

But that didn't change around the pq_send* changes?  So I'm not sure I
understand how this is related?  I mean I'm all for documenting the wire
protocol more extensively, but we can't just change the width?

Greetings,

Andres Freund


Re: Portability concerns over pq_sendbyte?

From
Michael Paquier
Date:
On Wed, Jun 13, 2018 at 11:53:21AM -0700, Andres Freund wrote:
> On 2018-06-13 14:10:37 +0900, Michael Paquier wrote:
>> On Mon, Jun 11, 2018 at 02:25:44PM +0900, Michael Paquier wrote:
>>> On top of that it seems to me that we'd want to rename any new
>>> routines to include "uint" in their name instead of "int", and for
>>> compatibility with past code pq_sendint should not be touched.
>
> I'm very doubtful about this one, unless you mean that just the
> signature shouldn't be touched.  Otherwise we'll just increase code
> duplication unnecessarily?

Yeah, actually that would be assuming that many modules use it, but that
does not seem to be much the case, at least from github's point of view.

>> And also pq_sendint64 needs to be kept around for compatibility.
>
> :(. Wonder if it's better to just break people's code.

Indeed.  At least breaking compilation has the advantage of making
people directly aware of the change and think hopefully about them.

A research on github shows a bunch of people having copied of pqformat.h
as there are a bunch of copies of Postgres so with this much noise it is
not easy to find out what would be broken.  In-core contrib and test
modules don't make use of those interfaces as well, except for hstore.
So that could be acceptable.

For pq_sendint there are many matches with printsimple.c.
--
Michael

Attachment

Re: Portability concerns over pq_sendbyte?

From
Alvaro Herrera
Date:
Hello

How about not renaming the functions, but just change argument types?
Having to change all callsites to cope with some new naming convention
does not strike me as a great idea ... it may collide with any
backpatching in the area, for one.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Portability concerns over pq_sendbyte?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> How about not renaming the functions, but just change argument types?

Yeah, I didn't understand why anything else would be on the table.
We already changed their arg types for 11, no?  This is just second
thoughts about what to change them to.

            regards, tom lane


Re: Portability concerns over pq_sendbyte?

From
Andres Freund
Date:
Hi,

On 2018-06-14 16:17:28 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > How about not renaming the functions, but just change argument types?

Yea, I'm in favor of this. I don't think the 'u' in there would benefit
us, and the cast from signed to unsigned is well defined, so it's safe
to call the functions with signed input.


> Yeah, I didn't understand why anything else would be on the table.

Because it sounds a bit weird for a function with just 'int' in the name
to take unsigned ints as a parameter? I don't think that's enough
justification to rename everything, but it's not completely crazy either.

> We already changed their arg types for 11, no?

No, not really. We added new functions.

Greetings,

Andres Freund


Re: Portability concerns over pq_sendbyte?

From
Andres Freund
Date:
Hi,

On 2018-06-14 13:25:30 -0700, Andres Freund wrote:
> On 2018-06-14 16:17:28 -0400, Tom Lane wrote:
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > How about not renaming the functions, but just change argument types?
> 
> Yea, I'm in favor of this. I don't think the 'u' in there would benefit
> us, and the cast from signed to unsigned is well defined, so it's safe
> to call the functions with signed input.

Nobody argued against, thus I've pushed a patch doing so.

Looking at surrounding code I found a few more oddities, but of older
vintage:
- pq_sendfloat4 uses an uint32 in the union, but float8 uses a int64.
- same with pq_getmsgfloat[48]
- pq_getmsgint64 returns a int64, should probably also be uint64

Given they're practially harmless I'm inclined to only fix them in
master?

Greetings,

Andres Freund


Re: Portability concerns over pq_sendbyte?

From
Michael Paquier
Date:
On Tue, Jun 26, 2018 at 11:51:49PM -0700, Andres Freund wrote:
> Looking at surrounding code I found a few more oddities, but of older
> vintage:
> - pq_sendfloat4 uses an uint32 in the union, but float8 uses a int64.
> - same with pq_getmsgfloat[48]
> - pq_getmsgint64 returns a int64, should probably also be uint64
>
> Given they're practially harmless I'm inclined to only fix them in
> master?

Doing those three things on HEAD only looks like a good plan to me.  I
have not spotted more inconsistencies in pqformat.c.
--
Michael

Attachment