Thread: sockaddr_un.sun_len vs. reality

sockaddr_un.sun_len vs. reality

From
Tom Lane
Date:
For a very long time, the AIX 7.1 buildfarm members (sungazer and tern)
have complained about

ip.c:236:17: warning: conversion from 'long unsigned int' to 'uchar_t' {aka 'unsigned char'} changes value from '1025'
to'1' [-Woverflow] 

I'd ignored this as not being very interesting, but I got motivated to
recheck it today.  The warning is coming from

#ifdef HAVE_STRUCT_SOCKADDR_STORAGE_SS_LEN
    unp->sun_len = sizeof(struct sockaddr_un);
#endif

so we can infer that the sun_len field is unsigned char and the
size of struct sockaddr_un doesn't fit.

Poking around a bit, I find that sun_len exists on most BSD-ish
platforms and it seems to be unsigned char (almost?) everywhere.
But on other platforms sizeof(struct sockaddr_un) is only a bit
over 100, so there's not an overflow problem.

It's not real clear that there's any problem on AIX either;
given that the regression tests pass, I strongly suspect that
nothing is paying attention to the sun_len field.  But if we're
going to fill it in at all, we should probably try to fill it
in with something less misleading than "1".

Googling finds that this seems to be a sore spot for various
people, eg [1] mentions the existence of the SUN_LEN() macro
on some platforms and then says why you shouldn't use it.

I'm leaning to adding a compile-time clamp on the value,
that is

    unp->sun_len = Min(sizeof(struct sockaddr_un),
                       (1 << (sizeof(unp->sun_len) * 8)) - 1);

(This might need a little bit of refinement if sun_len
could be as wide as int, but in any case it should still
reduce to a compile-time constant.)

Or we could use something involving strlen(unp->sun_path), perhaps
trying to use SUN_LEN() if it exists.  But that implies expending
extra run-time cycles for strlen(), and I've seen no indication
that there's any value here except suppressing a compiler warning.

Thoughts?

            regards, tom lane

[1] http://mail-index.netbsd.org/tech-net/2006/10/11/0008.html



Re: sockaddr_un.sun_len vs. reality

From
Thomas Munro
Date:
On Mon, Feb 14, 2022 at 7:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> For a very long time, the AIX 7.1 buildfarm members (sungazer and tern)
> have complained about
>
> ip.c:236:17: warning: conversion from 'long unsigned int' to 'uchar_t' {aka 'unsigned char'} changes value from
'1025'to '1' [-Woverflow]
 
>
> I'd ignored this as not being very interesting, but I got motivated to
> recheck it today.  The warning is coming from
>
> #ifdef HAVE_STRUCT_SOCKADDR_STORAGE_SS_LEN
>         unp->sun_len = sizeof(struct sockaddr_un);
> #endif
>
> so we can infer that the sun_len field is unsigned char and the
> size of struct sockaddr_un doesn't fit.

Yeah, it's the GCC AIX animals.  I wondered if xlc might be seeing
different structs or something but no, I tried on an AIX machine and
it just doesn't warn about that.

> Poking around a bit, I find that sun_len exists on most BSD-ish
> platforms and it seems to be unsigned char (almost?) everywhere.
> But on other platforms sizeof(struct sockaddr_un) is only a bit
> over 100, so there's not an overflow problem.
>
> It's not real clear that there's any problem on AIX either;
> given that the regression tests pass, I strongly suspect that
> nothing is paying attention to the sun_len field.  But if we're
> going to fill it in at all, we should probably try to fill it
> in with something less misleading than "1".
>
> Googling finds that this seems to be a sore spot for various
> people, eg [1] mentions the existence of the SUN_LEN() macro
> on some platforms and then says why you shouldn't use it.
>
> I'm leaning to adding a compile-time clamp on the value,
> that is
>
>         unp->sun_len = Min(sizeof(struct sockaddr_un),
>                            (1 << (sizeof(unp->sun_len) * 8)) - 1);
>
> (This might need a little bit of refinement if sun_len
> could be as wide as int, but in any case it should still
> reduce to a compile-time constant.)
>
> Or we could use something involving strlen(unp->sun_path), perhaps
> trying to use SUN_LEN() if it exists.  But that implies expending
> extra run-time cycles for strlen(), and I've seen no indication
> that there's any value here except suppressing a compiler warning.
>
> Thoughts?

Any system that has sun_len, and has expanded sun_path so that the
struct size doesn't fit in sun_len, must surely be ignoring sun_len
but retaining it for binary compatibility.  Otherwise there would be
no way to use the extra bytes of sun_path!  It's tempting to suggest
setting sun_len to zero in this case...

Huh, apparently AIX expanded sun_path in 5.3, also creating a
contradiction with sockaddr_storage and crashing PostgreSQL[1].

[1] https://www.postgresql.org/docs/8.4/installation-platform-notes.html



Re: sockaddr_un.sun_len vs. reality

From
Robert Haas
Date:
On Mon, Feb 14, 2022 at 1:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> For a very long time, the AIX 7.1 buildfarm members (sungazer and tern)
> have complained about
>
> ip.c:236:17: warning: conversion from 'long unsigned int' to 'uchar_t' {aka 'unsigned char'} changes value from
'1025'to '1' [-Woverflow]
 
>
> I'd ignored this as not being very interesting, but I got motivated to
> recheck it today.  The warning is coming from
>
> #ifdef HAVE_STRUCT_SOCKADDR_STORAGE_SS_LEN
>         unp->sun_len = sizeof(struct sockaddr_un);
> #endif
>
> so we can infer that the sun_len field is unsigned char and the
> size of struct sockaddr_un doesn't fit.
>
> Poking around a bit, I find that sun_len exists on most BSD-ish
> platforms and it seems to be unsigned char (almost?) everywhere.
> But on other platforms sizeof(struct sockaddr_un) is only a bit
> over 100, so there's not an overflow problem.
>
> It's not real clear that there's any problem on AIX either;
> given that the regression tests pass, I strongly suspect that
> nothing is paying attention to the sun_len field.  But if we're
> going to fill it in at all, we should probably try to fill it
> in with something less misleading than "1".

It's not real clear to me that it's worth complicating the code to
avoid a harmless compiler warning on an 11-year-old operating system
with minimal real-world usage. On the other hand, if you really feel
motivated to do something about it, I'm not here to argue. My one
suggestion is that if you do add some strange incantation here along
the lines you propose, you should at least add a comment explaining
that it was done to suppress a warning on AIX 7.1. That way, there's a
chance someone might be brave enough to try removing it in the future
at such time as nobody cares about AIX 7.1 any more.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: sockaddr_un.sun_len vs. reality

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Feb 14, 2022 at 1:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> For a very long time, the AIX 7.1 buildfarm members (sungazer and tern)
>> have complained about
>> ip.c:236:17: warning: conversion from 'long unsigned int' to 'uchar_t' {aka 'unsigned char'} changes value from
'1025'to '1' [-Woverflow] 

> It's not real clear to me that it's worth complicating the code to
> avoid a harmless compiler warning on an 11-year-old operating system
> with minimal real-world usage. On the other hand, if you really feel
> motivated to do something about it, I'm not here to argue.

Basically, yesterday's discussion motivated me to try to clean up some
of the stray buildfarm warnings I've been ignoring for a long time.
I agree it doesn't look like this change would have any real-world
impact.  But we do set some value on warning-free builds, if only to
save ourselves having to remember "this warning is harmless".

> My one
> suggestion is that if you do add some strange incantation here along
> the lines you propose, you should at least add a comment explaining
> that it was done to suppress a warning on AIX 7.1.

But of course.

            regards, tom lane



Re: sockaddr_un.sun_len vs. reality

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Feb 14, 2022 at 7:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm leaning to adding a compile-time clamp on the value,
>> that is
>> 
>>     unp->sun_len = Min(sizeof(struct sockaddr_un),
>>                        (1 << (sizeof(unp->sun_len) * 8)) - 1);

> Any system that has sun_len, and has expanded sun_path so that the
> struct size doesn't fit in sun_len, must surely be ignoring sun_len
> but retaining it for binary compatibility.  Otherwise there would be
> no way to use the extra bytes of sun_path!  It's tempting to suggest
> setting sun_len to zero in this case...

Yeah, I thought about doing that or just skipping the assignment
altogether.  However, we'd need just as much code, because the
change would have to be conditional on more or less this same
computation as to whether sizeof(struct sockaddr_un) fits into
the field.

            regards, tom lane



Re: sockaddr_un.sun_len vs. reality

From
Thomas Munro
Date:
On Tue, Feb 15, 2022 at 3:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
> It's not real clear to me that it's worth complicating the code to
> avoid a harmless compiler warning on an 11-year-old operating system
> with minimal real-world usage. On the other hand, if you really feel
> motivated to do something about it, I'm not here to argue. My one
> suggestion is that if you do add some strange incantation here along
> the lines you propose, you should at least add a comment explaining
> that it was done to suppress a warning on AIX 7.1. That way, there's a
> chance someone might be brave enough to try removing it in the future
> at such time as nobody cares about AIX 7.1 any more.

Just for the record, it's not just 11-year-old 7.1; 7.2 contains this
contradiction too, but our 7.2 animal doesn't complain because it's
using a different compiler.  I bet you one internet point 7.3 (which
just dropped in December, but isn't yet available for open source
hackers to scrounge an account on in the GCC farm) is the same.

By the way, speaking of open source on AIX, this distribution is
coming to an end:

http://www.bullfreeware.com/newsPage



Re: sockaddr_un.sun_len vs. reality

From
Thomas Munro
Date:
On Tue, Feb 15, 2022 at 4:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Mon, Feb 14, 2022 at 7:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I'm leaning to adding a compile-time clamp on the value,
> >> that is
> >>
> >>      unp->sun_len = Min(sizeof(struct sockaddr_un),
> >>                         (1 << (sizeof(unp->sun_len) * 8)) - 1);
>
> > Any system that has sun_len, and has expanded sun_path so that the
> > struct size doesn't fit in sun_len, must surely be ignoring sun_len
> > but retaining it for binary compatibility.  Otherwise there would be
> > no way to use the extra bytes of sun_path!  It's tempting to suggest
> > setting sun_len to zero in this case...
>
> Yeah, I thought about doing that or just skipping the assignment
> altogether.  However, we'd need just as much code, because the
> change would have to be conditional on more or less this same
> computation as to whether sizeof(struct sockaddr_un) fits into
> the field.

I was nerd-sniped by the historical context of this single line of
code.  I'd already wondered many times (not just in PostgreSQL)
whether and when that became a cargo-cult practice, replicated from
other software and older books like Stevens.  I failed to find any
sign of an OS that needs it today, or likely even needed it this
millennium.  Now I'd like to propose removing it.

I believe we have the complete set of surviving systems with sun_len.
These are the systems with sockets descended from 4.4BSD, plus AIX
when using its socket API in 4.4-compatible mode:

AIX: no sun_len if -DCOMPAT_43[1], so surely ignored by kernel!
NetBSD: manual says it's ignored[2]
OpenBSD: ditto[3]
FreeBSD: ditto[4]
DragonFlyBSD: clobbered[5] (like FreeBSD, just not documented)
macOS: ditto[6]

I know that between '88 and '97 some of these would fail with EINVAL
if sun_len was out of range.  The code is still there to do that in
some cases, but at various points in the 90s they started clobbering
it on entry to connect() and bind(), probably to ease porting pain
from Solaris and Linux.  I think it'd be pretty clear on the build
farm if it turns out I'm wrong here, because the zero-initialised
sun_len would be rejected as invalid on a hypothetical system that
didn't change that policy.  I've tested on all but AIX.

[1] https://www.postgresql.org/message-id/IKEPJJEJDCJPKMLEECEDGEIHCCAA.vvanwynsberghe%40ccncsi.net
[2] https://man.netbsd.org/unix.4
[3] https://man.openbsd.org/connect.2
[4] https://www.freebsd.org/cgi/man.cgi?connect
[5] https://github.com/DragonFlyBSD/DragonFlyBSD/blob/master/sys/kern/uipc_syscalls.c#L1516
[6] https://github.com/apple/darwin-xnu/blob/main/bsd/kern/uipc_syscalls.c#L2817

Attachment

Re: sockaddr_un.sun_len vs. reality

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I was nerd-sniped by the historical context of this single line of
> code.  I'd already wondered many times (not just in PostgreSQL)
> whether and when that became a cargo-cult practice, replicated from
> other software and older books like Stevens.  I failed to find any
> sign of an OS that needs it today, or likely even needed it this
> millennium.  Now I'd like to propose removing it.

Seems worth a try.

            regards, tom lane



Re: sockaddr_un.sun_len vs. reality

From
Thomas Munro
Date:
On Tue, Aug 23, 2022 at 3:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I was nerd-sniped by the historical context of this single line of
> > code.  I'd already wondered many times (not just in PostgreSQL)
> > whether and when that became a cargo-cult practice, replicated from
> > other software and older books like Stevens.  I failed to find any
> > sign of an OS that needs it today, or likely even needed it this
> > millennium.  Now I'd like to propose removing it.
>
> Seems worth a try.

Pushed, and build farm looks good.  For the benefit of anyone else
researching this topic, I should add that Stevens in fact said it's OK
to skip this, and if I had opened UNIX Network Programming (3rd ed)
volume I to page 99 I could have saved myself some time: "Even if the
length field is present, we need never set it and need never examine
it, unless we are dealing with routing sockets ...".