Re: [CORE] 7.4RC2 regression failur and not running stats collector process - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [CORE] 7.4RC2 regression failur and not running stats collector process
Msg-id 5039.1068842569@sss.pgh.pa.us
In response to Re: [CORE] 7.4RC2 regression failur and not running stats collector process  (Christopher Browne <cbbrowne@libertyrms.info>)
Responses Re: [CORE] 7.4RC2 regression failur and not running stats  ("Joshua D. Drake" <jd@commandprompt.com>)
List pgsql-hackers
Christopher Browne <cbbrowne@libertyrms.info> writes:
> For what it's worth, I have been running regression on Solaris with
> numerous of the betas, and RC1 and [just now] RC2, with NO problems.

It seems clear that some Solaris installations are affected and some
are not.  Presumably there is some version difference or some local
configuration difference ... but since we don't know what the critical
factor is, we have no basis for guessing what fraction of Solaris
installations will see the problem.

> (And in that case, I would be quick to test the patch to ensure it
> causes no adverse side-effects.)

Here is the proposed patch --- please test it ASAP if you can.
This is against RC2.

            regards, tom lane

*** src/backend/postmaster/pgstat.c.orig    Fri Nov  7 16:55:50 2003
--- src/backend/postmaster/pgstat.c    Fri Nov 14 15:02:14 2003
***************
*** 203,208 ****
--- 203,216 ----
          goto startup_failed;
      }

+     /*
+      * On some platforms, getaddrinfo_all() may return multiple addresses
+      * only one of which will actually work (eg, both IPv6 and IPv4 addresses
+      * when kernel will reject IPv6).  Worse, the failure may occur at the
+      * bind() or perhaps even connect() stage.  So we must loop through the
+      * results till we find a working combination.  We will generate LOG
+      * messages, but no error, for bogus combinations.
+      */
      for (addr = addrs; addr; addr = addr->ai_next)
      {
  #ifdef HAVE_UNIX_SOCKETS
***************
*** 210,262 ****
          if (addr->ai_family == AF_UNIX)
              continue;
  #endif
!         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
!             break;
!     }

!     if (!addr || pgStatSock < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not create socket for statistics collector: %m")));
!         goto startup_failed;
!     }

!     /*
!      * Bind it to a kernel assigned port on localhost and get the assigned
!      * port via getsockname().
!      */
!     if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!                  errmsg("could not bind socket for statistics collector: %m")));
!         goto startup_failed;
!     }

!     freeaddrinfo_all(hints.ai_family, addrs);
!     addrs = NULL;

!     alen = sizeof(pgStatAddr);
!     if (getsockname(pgStatSock, (struct sockaddr *) & pgStatAddr, &alen) < 0)
!     {
!         ereport(LOG,
!                 (errcode_for_socket_access(),
!           errmsg("could not get address of socket for statistics collector: %m")));
!         goto startup_failed;
      }

!     /*
!      * Connect the socket to its own address.  This saves a few cycles by
!      * not having to respecify the target address on every send. This also
!      * provides a kernel-level check that only packets from this same
!      * address will be received.
!      */
!     if (connect(pgStatSock, (struct sockaddr *) & pgStatAddr, alen) < 0)
      {
          ereport(LOG,
                  (errcode_for_socket_access(),
!                  errmsg("could not connect socket for statistics collector: %m")));
          goto startup_failed;
      }

--- 218,285 ----
          if (addr->ai_family == AF_UNIX)
              continue;
  #endif
!         /*
!          * Create the socket.
!          */
!         if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not create socket for statistics collector: %m")));
!             continue;
!         }

!         /*
!          * Bind it to a kernel assigned port on localhost and get the assigned
!          * port via getsockname().
!          */
!         if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not bind socket for statistics collector: %m")));
!             closesocket(pgStatSock);
!             pgStatSock = -1;
!             continue;
!         }

!         alen = sizeof(pgStatAddr);
!         if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not get address of socket for statistics collector: %m")));
!             closesocket(pgStatSock);
!             pgStatSock = -1;
!             continue;
!         }

!         /*
!          * Connect the socket to its own address.  This saves a few cycles by
!          * not having to respecify the target address on every send. This also
!          * provides a kernel-level check that only packets from this same
!          * address will be received.
!          */
!         if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
!         {
!             ereport(LOG,
!                     (errcode_for_socket_access(),
!                      errmsg("could not connect socket for statistics collector: %m")));
!             closesocket(pgStatSock);
!             pgStatSock = -1;
!             continue;
!         }

!         /* If we get here, we have a working socket */
!         break;
      }

!     /* Did we find a working address? */
!     if (!addr || pgStatSock < 0)
      {
          ereport(LOG,
                  (errcode_for_socket_access(),
!                  errmsg("disabling statistics collector for lack of working socket")));
          goto startup_failed;
      }

***************
*** 284,289 ****
--- 307,314 ----
            errmsg("could not create pipe for statistics collector: %m")));
          goto startup_failed;
      }
+
+     freeaddrinfo_all(hints.ai_family, addrs);

      return;


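For readers who want to try the retry logic outside the PostgreSQL tree, here is a minimal standalone sketch of the same idea: walk every address the resolver returns and keep the first one for which socket(), bind(), getsockname() and connect() all succeed. It is not the patch itself. The function name open_loopback_udp_socket is hypothetical, plain getaddrinfo()/freeaddrinfo() and close() stand in for the backend's getaddrinfo_all()/freeaddrinfo_all() and closesocket(), perror() stands in for ereport(LOG, ...), and the loopback address is requested with a NULL node name rather than by looking up a host name.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): return a UDP socket bound
 * to a kernel-assigned loopback port and connected to itself, or -1 if no
 * returned address yields a working socket.
 */
int
open_loopback_udp_socket(void)
{
    struct addrinfo hints;
    struct addrinfo *addrs = NULL;
    struct addrinfo *addr;
    struct sockaddr_storage self;
    socklen_t   alen;
    int         sock = -1;
    int         rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;    /* may yield both IPv6 and IPv4 entries */
    hints.ai_socktype = SOCK_DGRAM;

    /*
     * Assumption: a NULL node without AI_PASSIVE requests the loopback
     * address(es); service "0" lets the kernel pick the port at bind().
     */
    rc = getaddrinfo(NULL, "0", &hints, &addrs);
    if (rc != 0)
    {
        fprintf(stderr, "getaddrinfo failed: %s\n", gai_strerror(rc));
        return -1;
    }

    for (addr = addrs; addr != NULL; addr = addr->ai_next)
    {
        /* Any step may fail for a bogus address; log it and try the next. */
        sock = socket(addr->ai_family, addr->ai_socktype, addr->ai_protocol);
        if (sock < 0)
        {
            perror("socket");
            continue;
        }

        if (bind(sock, addr->ai_addr, addr->ai_addrlen) < 0)
        {
            perror("bind");
            close(sock);
            sock = -1;
            continue;
        }

        /* Find out which port the kernel assigned. */
        alen = sizeof(self);
        if (getsockname(sock, (struct sockaddr *) &self, &alen) < 0)
        {
            perror("getsockname");
            close(sock);
            sock = -1;
            continue;
        }

        /* Connect the socket to its own address. */
        if (connect(sock, (struct sockaddr *) &self, alen) < 0)
        {
            perror("connect");
            close(sock);
            sock = -1;
            continue;
        }

        /* If we get here, we have a working socket. */
        break;
    }

    freeaddrinfo(addrs);
    return sock;                    /* -1 if every address failed */
}
```

A caller would treat a negative return value the way the patch treats the "lack of working socket" case: log once and carry on without the collector. Every failure path closes the socket and resets it to -1 before moving to the next address, so a half-configured descriptor never survives the loop.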