Thread: AIX 4.3 getaddrinfo busted

AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
AIX 4.3 was released in late 1999, so I thought it was worth mentioning.  I only have AIX 4.3 and 6.1, so I have no
ideahow AIX 5 handles this.  AIX 6.1 works fine.
 

Anyways, the service argument to getaddrinfo is busted on AIX 4.3, thus  src/backend/libpq/ip.c pg_getaddrinfo_all() is
bustedon this 
 
platform.  It fails with EAI_NODATA "Host not found".   If this argument 
is left NULL, everything works.

I can supply a patch to fix this.  My suggestion would be to always 
supply a NULL service to getaddrinfo.  After a successful call, set the 
port if it was provided ... htons((unsigned short)atoi(servname)).  This 
approach avoids a configure check.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Bruce Momjian
Date:
Andrew Chernow wrote:
> AIX 4.3 was released in late 1999, so I thought it was worth mentioning. 
>   I only have AIX 4.3 and 6.1, so I have no idea how AIX 5 handles this. 
>   AIX 6.1 works fine.
> 
> Anyways, the service argument to getaddrinfo is busted on AIX 4.3, thus 
>   src/backend/libpq/ip.c pg_getaddrinfo_all() is busted on this 
> platform.  It fails with EAI_NODATA "Host not found".   If this argument 
> is left NULL, everything works.
> 
> I can supply a patch to fix this.  My suggestion would be to always 
> supply a NULL service to getaddrinfo.  After a successful call, set the 
> port if it was provided ... htons((unsigned short)atoi(servname)).  This 
> approach avoids a configure check.

Why would we risk breaking other platforms to avoid an AIX bug?  At best
we can put a code comment in that section of the code.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: AIX 4.3 getaddrinfo busted

From
Christopher Browne
Date:
On Fri, Jan 23, 2009 at 9:18 AM, Andrew Chernow <ac@esilo.com> wrote:
> AIX 4.3 was released in late 1999, so I thought it was worth mentioning.  I
> only have AIX 4.3 and 6.1, so I have no idea how AIX 5 handles this.  AIX
> 6.1 works fine.
>
> Anyways, the service argument to getaddrinfo is busted on AIX 4.3, thus
>  src/backend/libpq/ip.c pg_getaddrinfo_all() is busted on this platform.  It
> fails with EAI_NODATA "Host not found".   If this argument is left NULL,
> everything works.
>
> I can supply a patch to fix this.  My suggestion would be to always supply a
> NULL service to getaddrinfo.  After a successful call, set the port if it
> was provided ... htons((unsigned short)atoi(servname)).  This approach
> avoids a configure check.

FYI:  There are AIX 5.3 nodes on BuildFarm - if the change is a
regression, it will be noticed :-).

-- 
http://linuxfinances.info/info/linuxdistributions.html
George Burns  - "You can't help getting older, but you don't have to get old."


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Bruce Momjian wrote:
> Andrew Chernow wrote:
>> AIX 4.3 was released in late 1999, so I thought it was worth mentioning. 
>>   I only have AIX 4.3 and 6.1, so I have no idea how AIX 5 handles this. 
>>   AIX 6.1 works fine.
>>
>> Anyways, the service argument to getaddrinfo is busted on AIX 4.3, thus 
>>   src/backend/libpq/ip.c pg_getaddrinfo_all() is busted on this 
>> platform.  It fails with EAI_NODATA "Host not found".   If this argument 
>> is left NULL, everything works.
>>
>> I can supply a patch to fix this.  My suggestion would be to always 
>> supply a NULL service to getaddrinfo.  After a successful call, set the 
>> port if it was provided ... htons((unsigned short)atoi(servname)).  This 
>> approach avoids a configure check.
> 
> Why would we risk breaking other platforms to avoid an AIX bug?  At best
> we can put a code comment in that section of the code.
> 

IMO, there is no risk.  getaddrinfo allows a NULL second argument on 
every platform I have worked with.  I can't see this breaking anything.  Our internal libraries do it this way for this
exactreason, and it 
 
works on a large number of platforms.

But, it can be done so that it only affects AIX machines.  There is 
already a #ifdef _AIX block in this function so we can handle all AIX 
versions as I have suggested and let everyone else run the same old code.

#ifdef _AIX
getaddrinfo(hostname, NULL /* servname */, ....);
// manually set the port
#else
getaddrinfo(hostname, servname, ....); /* same code as now */
#endif

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Christopher Browne wrote:
> FYI:  There are AIX 5.3 nodes on BuildFarm - if the change is a
> regression, it will be noticed :-).
> 

This confirms that its an isolated AIX 4.3 bug.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Bruce Momjian
Date:
Andrew Chernow wrote:
> Bruce Momjian wrote:
> > Andrew Chernow wrote:
> >> AIX 4.3 was released in late 1999, so I thought it was worth mentioning. 
> >>   I only have AIX 4.3 and 6.1, so I have no idea how AIX 5 handles this. 
> >>   AIX 6.1 works fine.
> >>
> >> Anyways, the service argument to getaddrinfo is busted on AIX 4.3, thus 
> >>   src/backend/libpq/ip.c pg_getaddrinfo_all() is busted on this 
> >> platform.  It fails with EAI_NODATA "Host not found".   If this argument 
> >> is left NULL, everything works.
> >>
> >> I can supply a patch to fix this.  My suggestion would be to always 
> >> supply a NULL service to getaddrinfo.  After a successful call, set the 
> >> port if it was provided ... htons((unsigned short)atoi(servname)).  This 
> >> approach avoids a configure check.
> > 
> > Why would we risk breaking other platforms to avoid an AIX bug?  At best
> > we can put a code comment in that section of the code.
> > 
> 
> IMO, there is no risk.  getaddrinfo allows a NULL second argument on 
> every platform I have worked with.  I can't see this breaking anything. 
>   Our internal libraries do it this way for this exact reason, and it 
> works on a large number of platforms.
> 
> But, it can be done so that it only affects AIX machines.  There is 
> already a #ifdef _AIX block in this function so we can handle all AIX 
> versions as I have suggested and let everyone else run the same old code.
> 
> #ifdef _AIX
> getaddrinfo(hostname, NULL /* servname */, ....);
> // manually set the port
> #else
> getaddrinfo(hostname, servname, ....); /* same code as now */
> #endif

Well, the platform is from 1999 and we have never had bug reports on
this, even though the code has been there for quite a few years;  I just
don't see the motivation for the change.

If you really want this platform to work, I would submit a patch that
tests for a C compiler symbol or #define that is only defined for that
platform and make service = null in that case.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: AIX 4.3 getaddrinfo busted

From
Bruce Momjian
Date:
Andrew Chernow wrote:
> Christopher Browne wrote:
> > FYI:  There are AIX 5.3 nodes on BuildFarm - if the change is a
> > regression, it will be noticed :-).
> > 
> 
> This confirms that its an isolated AIX 4.3 bug.

It confirms that the bug exists in 4.3 but not on 5.3;  not sure how you
can make assumptions about intermediate releases.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Bruce Momjian wrote:
> 
> If you really want this platform to work, I would submit a patch that
> tests for a C compiler symbol or #define that is only defined for that
> platform and make service = null in that case.
> 

I am not aware of such an animal.  I looked at the output of " touch x.c 
&& gcc -v -dM -E x.c" but didn't find anything when compared to AIX 6.1.  We could set one in configure (like
_AIXVER43)or use _AIX + uname().
 

How about checking for the failure case, where getaddrinfo returns 
EAI_NODATA on _AIX with a non-NULL servname.  If that happens, call 
getaddrinfo again with a NULL servname.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Tom Lane
Date:
Andrew Chernow <ac@esilo.com> writes:
> Bruce Momjian wrote:
>> Why would we risk breaking other platforms to avoid an AIX bug?  At best
>> we can put a code comment in that section of the code.

> IMO, there is no risk.  getaddrinfo allows a NULL second argument on 
> every platform I have worked with.

The portability risk is in the "manually set the port" part.
        regards, tom lane


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Tom Lane wrote:
> Andrew Chernow <ac@esilo.com> writes:
>> Bruce Momjian wrote:
>>> Why would we risk breaking other platforms to avoid an AIX bug?  At best
>>> we can put a code comment in that section of the code.
> 
>> IMO, there is no risk.  getaddrinfo allows a NULL second argument on 
>> every platform I have worked with.
> 
> The portability risk is in the "manually set the port" part.
> 

Right.  If this method is limited to _AIX and only when the failure case 
occurs, there are no portability issues.  I've already confirmed my fix 
works on 4.3 and 6.1.  The buildfarm will tell us if 5.3 works (very 
likely it will).

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Tom Lane
Date:
Andrew Chernow <ac@esilo.com> writes:
> Tom Lane wrote:
>> The portability risk is in the "manually set the port" part.

> Right.  If this method is limited to _AIX and only when the failure case 
> occurs, there are no portability issues.

That seems like unnecessary complexity (which carries its own risks).
I thought this sketch was about right:

#ifdef _AIX
getaddrinfo(hostname, NULL /* servname */, ....);
// manually set the port
#else
getaddrinfo(hostname, servname, ....); /* same code as now */
#endif

although please flip it around to ifndef.  The simpler and more general
case should usually be first to aid the reader.
        regards, tom lane


Re: AIX 4.3 getaddrinfo busted

From
Tom Lane
Date:
BTW, what about the comments in ip.c to the effect that some versions of
AIX fail when getaddrinfo's second argument *is* null?

Right at the moment I'm wondering why we are going to change the code
now to support a ten-year-old OS version that evidently no one has tried
to use Postgres on before.
        regards, tom lane


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Tom Lane wrote:
> BTW, what about the comments in ip.c to the effect that some versions of
> AIX fail when getaddrinfo's second argument *is* null?
> 

For starters, it indicates that sin_port is not zero'd properly.  That 
wouldn't matter here since the plan is to manually set the port in this 
case, since providing it to getaddrinfo is broken.

Also, this is why I suggested only doing a NULL 2nd arg within the 
failure case.

rc = getaddrinfo(hostname, service, ....);

#ifdef _AIX
if(rc == EAI_NODATA && servname && *servname)  if(!(rc = getaddrinfo(hostname, NULL, ....))) /* try again */    /* set
sin_portor sin6_port */
 
#endif

// code continues as is from here down.

Or, set a define in configure indicating aix 4.3 or indicating the 2nd 
arg to getaddrinfo must be NULL (the former being much easier).  I don't 
think we have to resort to configure checks though.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Merlin Moncure
Date:
On 1/23/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  Right at the moment I'm wondering why we are going to change the code
>  now to support a ten-year-old OS version that evidently no one has tried
>  to use Postgres on before.

I'd like to address this observation.  You may have noticed that eSilo
has been contributing a number of patches to Postgres involving legacy
systems.  Understandably, there is very little overlap between modern
versions of Postgres and these aging unixen.  However, quite a few of
these systems are still in production serving legacy applications.

eSilo provides backup software.  Based on customer feedback we came up
with a list of systems that are under-represented by current backup
solutions in the industry .  For various reasons, we decided to
involve libpq in our backup client and market aggressively to
platforms for which there are very few backup options.  libpq is quite
portable, but there are a few understandable nits that have popped up
here and there over time for older systems.  We are providing fixes
for those nits to the community.

merlin


Re: AIX 4.3 getaddrinfo busted

From
Tom Lane
Date:
Andrew Chernow <ac@esilo.com> writes:
> Tom Lane wrote:
>> BTW, what about the comments in ip.c to the effect that some versions of
>> AIX fail when getaddrinfo's second argument *is* null?

> For starters, it indicates that sin_port is not zero'd properly.  That 
> wouldn't matter here since the plan is to manually set the port in this 
> case, since providing it to getaddrinfo is broken.

Hmm ... so actually we could get *rid* of that special case if we added
this one.  Okay, I withdraw the complaint.  We should simply not rely on
getaddrinfo to do anything right at all w.r.t. the port if we are
running on AIX.  Pass NULL for servname and set the port ourselves in
all cases.
        regards, tom lane


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Tom Lane wrote:
> Andrew Chernow <ac@esilo.com> writes:
>> Tom Lane wrote:
>>> BTW, what about the comments in ip.c to the effect that some versions of
>>> AIX fail when getaddrinfo's second argument *is* null?
>
>> For starters, it indicates that sin_port is not zero'd properly.  That
>> wouldn't matter here since the plan is to manually set the port in this
>> case, since providing it to getaddrinfo is broken.
>
> Hmm ... so actually we could get *rid* of that special case if we added
> this one.  Okay, I withdraw the complaint.  We should simply not rely on
> getaddrinfo to do anything right at all w.r.t. the port if we are
> running on AIX.  Pass NULL for servname and set the port ourselves in
> all cases.
>
>             regards, tom lane
>

Done.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
Index: src/backend/libpq/ip.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/libpq/ip.c,v
retrieving revision 1.43
diff -C6 -r1.43 ip.c
*** src/backend/libpq/ip.c    1 Jan 2009 17:23:42 -0000    1.43
--- src/backend/libpq/ip.c    23 Jan 2009 19:14:07 -0000
***************
*** 71,112 ****

  #ifdef HAVE_UNIX_SOCKETS
      if (hintp->ai_family == AF_UNIX)
          return getaddrinfo_unix(servname, hintp, result);
  #endif

      /* NULL has special meaning to getaddrinfo(). */
      rc = getaddrinfo((!hostname || hostname[0] == '\0') ? NULL : hostname,
                       servname, hintp, result);

! #ifdef _AIX

      /*
       * It seems some versions of AIX's getaddrinfo don't reliably zero
!      * sin_port when servname is NULL, so clean up after it.
       */
!     if (servname == NULL && rc == 0)
      {
          struct addrinfo *addr;

          for (addr = *result; addr; addr = addr->ai_next)
          {
              switch (addr->ai_family)
              {
                  case AF_INET:
!                     ((struct sockaddr_in *) addr->ai_addr)->sin_port = htons(0);
                      break;
  #ifdef HAVE_IPV6
                  case AF_INET6:
!                     ((struct sockaddr_in6 *) addr->ai_addr)->sin6_port = htons(0);
                      break;
  #endif
              }
          }
      }
! #endif

      return rc;
  }


  /*
--- 71,124 ----

  #ifdef HAVE_UNIX_SOCKETS
      if (hintp->ai_family == AF_UNIX)
          return getaddrinfo_unix(servname, hintp, result);
  #endif

+ #ifndef _AIX
      /* NULL has special meaning to getaddrinfo(). */
      rc = getaddrinfo((!hostname || hostname[0] == '\0') ? NULL : hostname,
                       servname, hintp, result);

! #else
!     /* NULL hostname has special meaning to getaddrinfo().  We have to
!      * set servname to NULL because some AIX versions, like 4.3, always
!      * fail with EAI_NODATA if not NULL.
!    */
!     rc = getaddrinfo((!hostname || hostname[0] == '\0') ? NULL : hostname,
!                      NULL, hintp, result);

      /*
       * It seems some versions of AIX's getaddrinfo don't reliably zero
!      * sin_port when servname is NULL, so clean up after it.  Also,
!      * manually set the port when servname is provided.
       */
!     if(rc == 0)
      {
          struct addrinfo *addr;
+         unsigned short port = 0;
+
+         if(servname && *servname)
+             port = htons((unsigned short)atoi(servname));

          for (addr = *result; addr; addr = addr->ai_next)
          {
              switch (addr->ai_family)
              {
                  case AF_INET:
!                     ((struct sockaddr_in *) addr->ai_addr)->sin_port = port;
                      break;
  #ifdef HAVE_IPV6
                  case AF_INET6:
!                     ((struct sockaddr_in6 *) addr->ai_addr)->sin6_port = port;
                      break;
  #endif
              }
          }
      }
! #endif /* !_AIX */

      return rc;
  }


  /*

Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Merlin Moncure wrote:
> On 1/23/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>  Right at the moment I'm wondering why we are going to change the code
>>  now to support a ten-year-old OS version that evidently no one has tried
>>  to use Postgres on before.
> 
> I'd like to address this observation.  You may have noticed that eSilo
> has been contributing a number of patches to Postgres involving legacy
> systems.  Understandably, there is very little overlap between modern
> versions of Postgres and these aging unixen.  However, quite a few of
> these systems are still in production serving legacy applications.
> 
> eSilo provides backup software.  Based on customer feedback we came up
> with a list of systems that are under-represented by current backup
> solutions in the industry .  For various reasons, we decided to
> involve libpq in our backup client and market aggressively to
> platforms for which there are very few backup options.  libpq is quite
> portable, but there are a few understandable nits that have popped up
> here and there over time for older systems.  We are providing fixes
> for those nits to the community.
> 
> merlin
> 

I will add that its not our hobby to find obscure libpq and/or platform 
bugs on fossil machines.  Actually, its a rather horrible and 
frustrating task.  We simply have some unusual requirements for our new 
product.  What we found amazing is how many requests we get for these 
old systems.  We are just painfully filling that need.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Tom Lane
Date:
Andrew Chernow <ac@esilo.com> writes:
> Tom Lane wrote:
>> Hmm ... so actually we could get *rid* of that special case if we added
>> this one.  Okay, I withdraw the complaint.

> Done.

Were you hoping this would get back-patched, and if so how far?
        regards, tom lane


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Tom Lane wrote:
> Andrew Chernow <ac@esilo.com> writes:
>> Tom Lane wrote:
>>> Hmm ... so actually we could get *rid* of that special case if we added
>>> this one.  Okay, I withdraw the complaint.
> 
>> Done.
> 
> Were you hoping this would get back-patched, and if so how far?
> 
> 

No, we don't need a back-patch.  We need way too many features in 8.4 
... like the really amazing libpq-events feature :)

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/


Re: AIX 4.3 getaddrinfo busted

From
Tom Lane
Date:
Andrew Chernow <ac@esilo.com> writes:
> Tom Lane wrote:
>> Were you hoping this would get back-patched, and if so how far?

> No, we don't need a back-patch.  We need way too many features in 8.4 
> ... like the really amazing libpq-events feature :)

OK, applied to HEAD with some tiny editorialization.
        regards, tom lane


Re: AIX 4.3 getaddrinfo busted

From
Bruce Momjian
Date:
Merlin Moncure wrote:
> On 1/23/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >  Right at the moment I'm wondering why we are going to change the code
> >  now to support a ten-year-old OS version that evidently no one has tried
> >  to use Postgres on before.
> 
> I'd like to address this observation.  You may have noticed that eSilo
> has been contributing a number of patches to Postgres involving legacy
> systems.  Understandably, there is very little overlap between modern
> versions of Postgres and these aging unixen.  However, quite a few of
> these systems are still in production serving legacy applications.
> 
> eSilo provides backup software.  Based on customer feedback we came up
> with a list of systems that are under-represented by current backup
> solutions in the industry .  For various reasons, we decided to
> involve libpq in our backup client and market aggressively to
> platforms for which there are very few backup options.  libpq is quite
> portable, but there are a few understandable nits that have popped up
> here and there over time for older systems.  We are providing fixes
> for those nits to the community.

Well, this helps explain why were are getting these problems reports
only now.  How many hacks do you have that we don't support, aside from
the threading one for HPUX?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: AIX 4.3 getaddrinfo busted

From
Andrew Chernow
Date:
Bruce Momjian wrote:
> 
> Well, this helps explain why were are getting these problems reports
> only now.  How many hacks do you have that we don't support, aside from
> the threading one for HPUX?
> 

We submit them as we find them.  We've submitted for windows, hpux, 
solaris and aix.  Still have not tested SCO openserver and unixware yet. 
The only thing we have that we have not submitted, are features we don't 
feel are of general interest: query white list system and bandwidth 
throttling in the backend on a per session basis.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/