Thread: [HACKERS] Upgrading postmaster's log messages about bind/listen errors

[HACKERS] Upgrading postmaster's log messages about bind/listen errors

From
Tom Lane
Date:
Over in
https://www.postgresql.org/message-id/flat/201703072317.01345.john.iliffe%40iliffe.ca
we spent quite a lot of effort to diagnose what turned out to be a simple
networking misconfiguration.  It would probably have taken a lot less
effort if the postmaster were more forthcoming about exactly what address
it's trying to bind to.  I seem to recall having wanted to include that
info in the messages many years ago, but at the time we lacked any
reasonably-portable way to decode a struct addrinfo.  Now we have
pg_getnameinfo_all(), so PFA a patch to include the specific address in
any complaint about failures in the socket/bind/listen sequence.

For good measure I also added a DEBUG1 log message reporting successful
binding to a port.  I'm not sure if there's an argument for putting this
out at LOG level (i.e. by default) --- any thoughts about that?

There are probably a couple of example messages in the SGML docs that
would need to be updated, but I've not trawled for them yet.

            regards, tom lane

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 7939b1f..3f3b2f2 100644
*** a/src/backend/libpq/pqcomm.c
--- b/src/backend/libpq/pqcomm.c
*************** StreamServerPort(int family, char *hostN
*** 319,324 ****
--- 319,326 ----
      char        portNumberStr[32];
      const char *familyDesc;
      char        familyDescBuf[64];
+     const char *addrDesc;
+     char        addrBuf[NI_MAXHOST];
      char       *service;
      struct addrinfo *addrs = NULL,
                 *addr;
*************** StreamServerPort(int family, char *hostN
*** 407,413 ****
              break;
          }

!         /* set up family name for possible error messages */
          switch (addr->ai_family)
          {
              case AF_INET:
--- 409,415 ----
              break;
          }

!         /* set up address family name for log messages */
          switch (addr->ai_family)
          {
              case AF_INET:
*************** StreamServerPort(int family, char *hostN
*** 431,443 ****
                  break;
          }

          if ((fd = socket(addr->ai_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
          {
              ereport(LOG,
                      (errcode_for_socket_access(),
!             /* translator: %s is IPv4, IPv6, or Unix */
!                      errmsg("could not create %s socket: %m",
!                             familyDesc)));
              continue;
          }

--- 433,460 ----
                  break;
          }

+         /* set up text form of address for log messages */
+ #ifdef HAVE_UNIX_SOCKETS
+         if (addr->ai_family == AF_UNIX)
+             addrDesc = unixSocketPath;
+         else
+ #endif
+         {
+             pg_getnameinfo_all((const struct sockaddr_storage *) addr->ai_addr,
+                                addr->ai_addrlen,
+                                addrBuf, sizeof(addrBuf),
+                                NULL, 0,
+                                NI_NUMERICHOST);
+             addrDesc = addrBuf;
+         }
+
          if ((fd = socket(addr->ai_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
          {
              ereport(LOG,
                      (errcode_for_socket_access(),
!             /* translator: first %s is IPv4, IPv6, or Unix */
!                   errmsg("could not create %s socket for address \"%s\": %m",
!                          familyDesc, addrDesc)));
              continue;
          }

*************** StreamServerPort(int family, char *hostN
*** 461,467 ****
              {
                  ereport(LOG,
                          (errcode_for_socket_access(),
!                          errmsg("setsockopt(SO_REUSEADDR) failed: %m")));
                  closesocket(fd);
                  continue;
              }
--- 478,486 ----
              {
                  ereport(LOG,
                          (errcode_for_socket_access(),
!                 /* translator: first %s is IPv4, IPv6, or Unix */
!                          errmsg("setsockopt(SO_REUSEADDR) failed for %s address \"%s\": %m",
!                                 familyDesc, addrDesc)));
                  closesocket(fd);
                  continue;
              }
*************** StreamServerPort(int family, char *hostN
*** 476,482 ****
              {
                  ereport(LOG,
                          (errcode_for_socket_access(),
!                          errmsg("setsockopt(IPV6_V6ONLY) failed: %m")));
                  closesocket(fd);
                  continue;
              }
--- 495,503 ----
              {
                  ereport(LOG,
                          (errcode_for_socket_access(),
!                 /* translator: first %s is IPv4, IPv6, or Unix */
!                          errmsg("setsockopt(IPV6_V6ONLY) failed for %s address \"%s\": %m",
!                                 familyDesc, addrDesc)));
                  closesocket(fd);
                  continue;
              }
*************** StreamServerPort(int family, char *hostN
*** 494,502 ****
          {
              ereport(LOG,
                      (errcode_for_socket_access(),
!             /* translator: %s is IPv4, IPv6, or Unix */
!                      errmsg("could not bind %s socket: %m",
!                             familyDesc),
                       (IS_AF_UNIX(addr->ai_family)) ?
                    errhint("Is another postmaster already running on port %d?"
                            " If not, remove socket file \"%s\" and retry.",
--- 515,523 ----
          {
              ereport(LOG,
                      (errcode_for_socket_access(),
!             /* translator: first %s is IPv4, IPv6, or Unix */
!                      errmsg("could not bind %s address \"%s\": %m",
!                             familyDesc, addrDesc),
                       (IS_AF_UNIX(addr->ai_family)) ?
                    errhint("Is another postmaster already running on port %d?"
                            " If not, remove socket file \"%s\" and retry.",
*************** StreamServerPort(int family, char *hostN
*** 533,544 ****
          {
              ereport(LOG,
                      (errcode_for_socket_access(),
!             /* translator: %s is IPv4, IPv6, or Unix */
!                      errmsg("could not listen on %s socket: %m",
!                             familyDesc)));
              closesocket(fd);
              continue;
          }
          ListenSocket[listen_index] = fd;
          added++;
      }
--- 554,571 ----
          {
              ereport(LOG,
                      (errcode_for_socket_access(),
!             /* translator: first %s is IPv4, IPv6, or Unix */
!                      errmsg("could not listen on %s address \"%s\": %m",
!                             familyDesc, addrDesc)));
              closesocket(fd);
              continue;
          }
+
+         ereport(DEBUG1,
+         /* translator: first %s is IPv4, IPv6, or Unix */
+                 (errmsg("listening on %s address \"%s\"",
+                         familyDesc, addrDesc)));
+
          ListenSocket[listen_index] = fd;
          added++;
      }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Upgrading postmaster's log messages about bind/listenerrors

From
Joe Conway
Date:
On 03/09/2017 12:27 PM, Tom Lane wrote:
> Over in
> https://www.postgresql.org/message-id/flat/201703072317.01345.john.iliffe%40iliffe.ca
> we spent quite a lot of effort to diagnose what turned out to be a simple
> networking misconfiguration.  It would probably have taken a lot less
> effort if the postmaster were more forthcoming about exactly what address
> it's trying to bind to.  I seem to recall having wanted to include that
> info in the messages many years ago, but at the time we lacked any
> reasonably-portable way to decode a struct addrinfo.  Now we have
> pg_getnameinfo_all(), so PFA a patch to include the specific address in
> any complaint about failures in the socket/bind/listen sequence.
>
> For good measure I also added a DEBUG1 log message reporting successful
> binding to a port.  I'm not sure if there's an argument for putting this
> out at LOG level (i.e. by default) --- any thoughts about that?

+1 for making it LOG instead of DEBUG1

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Re: [HACKERS] Upgrading postmaster's log messages about bind/listen errors

From
Robert Haas
Date:
On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway <mail@joeconway.com> wrote:
> On 03/09/2017 12:27 PM, Tom Lane wrote:
>> Over in
>> https://www.postgresql.org/message-id/flat/201703072317.01345.john.iliffe%40iliffe.ca
>> we spent quite a lot of effort to diagnose what turned out to be a simple
>> networking misconfiguration.  It would probably have taken a lot less
>> effort if the postmaster were more forthcoming about exactly what address
>> it's trying to bind to.  I seem to recall having wanted to include that
>> info in the messages many years ago, but at the time we lacked any
>> reasonably-portable way to decode a struct addrinfo.  Now we have
>> pg_getnameinfo_all(), so PFA a patch to include the specific address in
>> any complaint about failures in the socket/bind/listen sequence.
>>
>> For good measure I also added a DEBUG1 log message reporting successful
>> binding to a port.  I'm not sure if there's an argument for putting this
>> out at LOG level (i.e. by default) --- any thoughts about that?
>
> +1 for making it LOG instead of DEBUG1

I would tend to vote against that, because startup is getting
gradually chattier and chattier, and I think this isn't likely to be
of interest to very many people most of the time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway <mail@joeconway.com> wrote:
>> On 03/09/2017 12:27 PM, Tom Lane wrote:
>>> For good measure I also added a DEBUG1 log message reporting successful
>>> binding to a port.  I'm not sure if there's an argument for putting this
>>> out at LOG level (i.e. by default) --- any thoughts about that?

>> +1 for making it LOG instead of DEBUG1

> I would tend to vote against that, because startup is getting
> gradually chattier and chattier, and I think this isn't likely to be
> of interest to very many people most of the time.

Yeah, my thought was that if we've gotten along without this for 20 years,
it's probably not of interest to most people most of the time.

However, if we're measuring this on a scale of usefulness to the average
DBA, I would argue that it's of more interest than any of these messages
that currently appear by default:

2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound protections are now enabled
2017-03-09 23:40:12.335 EST [19339] LOG:  autovacuum launcher started
2017-03-09 23:40:12.336 EST [19341] LOG:  logical replication launcher started

The first of those is surely past its sell-by date.  As for the other two,
we should log *failure* to start, but not the normal case.
        regards, tom lane



Moin,

On Thu, March 9, 2017 11:43 pm, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway <mail@joeconway.com> wrote:
>>> On 03/09/2017 12:27 PM, Tom Lane wrote:
>>>> For good measure I also added a DEBUG1 log message reporting
>>>> successful
>>>> binding to a port.  I'm not sure if there's an argument for putting
>>>> this
>>>> out at LOG level (i.e. by default) --- any thoughts about that?
>
>>> +1 for making it LOG instead of DEBUG1
>
>> I would tend to vote against that, because startup is getting
>> gradually chattier and chattier, and I think this isn't likely to be
>> of interest to very many people most of the time.
>
> Yeah, my thought was that if we've gotten along without this for 20 years,
> it's probably not of interest to most people most of the time.
>
> However, if we're measuring this on a scale of usefulness to the average
> DBA, I would argue that it's of more interest than any of these messages
> that currently appear by default:

My 0.02$:

I'd argue that from a security standpoint it is important to log at
startup what addresses the service binds to, just so it is visible,
explicit and logged.

Especially on machines with multiple interfaces to multiple networks it
can be confusing, see ipv6 vs ipv4, or bound interfaces with multiple
hosts and switches.

Granted, there should be firewall rules preventing access, but
misconfigurations, or simple changes can happen and go unnoticed. If later
the postmaster bind address changes, maybe due to an update or human
error,  you got the stars aligned just right for an unauthorized access.

OTOH, that the "logical replication launcher started" isn't really useful
to know to me as a user, I'd rather know when it failed to launch.

Best regards,

Tels



Re: [HACKERS] Upgrading postmaster's log messages about bind/listenerrors

From
Stephen Frost
Date:
Greetings,

* Tels (nospam-pg-abuse@bloodgate.com) wrote:
> On Thu, March 9, 2017 11:43 pm, Tom Lane wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway <mail@joeconway.com> wrote:
> >>> +1 for making it LOG instead of DEBUG1
> >
> >> I would tend to vote against that, because startup is getting
> >> gradually chattier and chattier, and I think this isn't likely to be
> >> of interest to very many people most of the time.
> >
> > Yeah, my thought was that if we've gotten along without this for 20 years,
> > it's probably not of interest to most people most of the time.
> >
> > However, if we're measuring this on a scale of usefulness to the average
> > DBA, I would argue that it's of more interest than any of these messages
> > that currently appear by default:
>
> My 0.02$:
>
> I'd argue that from a security standpoint it is important to log at
> startup what addresses the service binds to, just so it is visible,
> explicit and logged.

It's also terribly useful for realizing there's an issue.  I'd
definitely like to see what address we're binding to on startup in the
log and I agree that the other messages mentioned aren't nearly as
useful to the DBA.

Perhaps we could compromise by simply including the bind information in
the 'ready to accept connections' message, like so:

database system is ready to accept connections on (1.2.3.4)

Or something along those lines?

> Granted, there should be firewall rules preventing access, but
> misconfigurations, or simple changes can happen and go unnoticed. If later
> the postmaster bind address changes, maybe due to an update or human
> error,  you got the stars aligned just right for an unauthorized access.

I was helping someone with a "connection refused" message just this week
where it would have been quite helpful if the log had said which IP we
were binding on.

> OTOH, that the "logical replication launcher started" isn't really useful
> to know to me as a user, I'd rather know when it failed to launch.

Agreed.

Thanks!

Stephen

Stephen Frost <sfrost@snowman.net> writes:
> * Tels (nospam-pg-abuse@bloodgate.com) wrote:
>> I'd argue that from a security standpoint it is important to log at
>> startup what addresses the service binds to, just so it is visible,
>> explicit and logged.

> It's also terribly useful for realizing there's an issue.

Good points both.

> Perhaps we could compromise by simply including the bind information in
> the 'ready to accept connections' message, like so:

> database system is ready to accept connections on (1.2.3.4)

That would be a problem from a couple of directions.  First, it wouldn't
be unusual for there to be half a dozen addresses to list, not just one.
Even a default configuration would probably read like

database system is ready to accept connections on (127.0.0.1, ::1, /tmp/.s.PGSQL.5432)

which doesn't seem very appetizing to me.  Second, it would be
considerably messier to implement because the "ready to accept
connections" message comes out physically distant from the
StreamServerPort function, and we don't save the struct addrinfo list
past the end of that function.  So I think the logging setup I had in
my patch is pretty much the only sane way to do it, and we just have
to decide whether it's worth exposing at default log level or not.
        regards, tom lane



Re: [HACKERS] Upgrading postmaster's log messages about bind/listen errors

From
Euler Taveira
Date:
2017-03-10 1:43 GMT-03:00 Tom Lane <tgl@sss.pgh.pa.us>:
Yeah, my thought was that if we've gotten along without this for 20 years,
it's probably not of interest to most people most of the time.

+1 for DEBUG1.
 
2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound protections are now enabled
It should be DEBUG1 as soon as 9.3 is deprecated.
 
2017-03-09 23:40:12.335 EST [19339] LOG:  autovacuum launcher started
2017-03-09 23:40:12.336 EST [19341] LOG:  logical replication launcher started

+1 for DEBUG1.


--
   Euler Taveira                                   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: [HACKERS] Upgrading postmaster's log messages about bind/listenerrors

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > database system is ready to accept connections on (1.2.3.4)
>
> That would be a problem from a couple of directions.  First, it wouldn't
> be unusual for there to be half a dozen addresses to list, not just one.
> Even a default configuration would probably read like
>
> database system is ready to accept connections on (127.0.0.1, ::1, /tmp/.s.PGSQL.5432)

Yeah, that's probably a bit much to have all on one line.

> which doesn't seem very appetizing to me.  Second, it would be
> considerably messier to implement because the "ready to accept
> connections" message comes out physically distant from the
> StreamServerPort function, and we don't save the struct addrinfo list
> past the end of that function.  So I think the logging setup I had in
> my patch is pretty much the only sane way to do it, and we just have
> to decide whether it's worth exposing at default log level or not.

I definitely think we should include it at the default log level.  We
certainly wouldn't be the first daemon process to do so (bind9 comes to
mind, but I notice ntpd, nrpe, and strongswan do also, and probably some
others).

Thanks!

Stephen

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> ... So I think the logging setup I had in
>> my patch is pretty much the only sane way to do it, and we just have
>> to decide whether it's worth exposing at default log level or not.

> I definitely think we should include it at the default log level.  We
> certainly wouldn't be the first daemon process to do so (bind9 comes to
> mind, but I notice ntpd, nrpe, and strongswan do also, and probably some
> others).

I'm leaning in that direction as well now.  I think we could address
Robert's concern about startup chattiness by downgrading the other
mentioned messages to DEBUG1.  I will check, but I'm pretty sure that
there is already adequate logging for subprocess startup failure ---
and if there is not, that would be a bug in itself.
        regards, tom lane



Re: [HACKERS] Upgrading postmaster's log messages about bind/listenerrors

From
Peter Eisentraut
Date:
On 3/9/17 23:43, Tom Lane wrote:
> However, if we're measuring this on a scale of usefulness to the average
> DBA, I would argue that it's of more interest than any of these messages
> that currently appear by default:
> 
> 2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound protections are now enabled
> 2017-03-09 23:40:12.335 EST [19339] LOG:  autovacuum launcher started
> 2017-03-09 23:40:12.336 EST [19341] LOG:  logical replication launcher started
> 
> The first of those is surely past its sell-by date.  As for the other two,
> we should log *failure* to start, but not the normal case.

I'm OK with removing these.  No news is good news.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Euler Taveira <euler@timbira.com.br> writes:
> 2017-03-10 1:43 GMT-03:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> 2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound
>> protections are now enabled

> It should be DEBUG1 as soon as 9.3 is deprecated.

Uh, what's that got to do with it?  I'm not proposing to downgrade this
message in 9.3, or indeed any current release branch.  But it's hard
to believe that a 9.3 installation that had the problem would manage
to make it all the way to a v10 upgrade without the problem having been
fixed.  Even if there's one or two incompetent DBAs out there whom
that would apply to, I don't think the rest of the world needs to keep
seeing this message by default.
        regards, tom lane



Re: [HACKERS] Upgrading postmaster's log messages about bind/listenerrors

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Euler Taveira <euler@timbira.com.br> writes:
> > 2017-03-10 1:43 GMT-03:00 Tom Lane <tgl@sss.pgh.pa.us>:
> >> 2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound
> >> protections are now enabled
> 
> > It should be DEBUG1 as soon as 9.3 is deprecated.
> 
> Uh, what's that got to do with it?  I'm not proposing to downgrade this
> message in 9.3, or indeed any current release branch.  But it's hard
> to believe that a 9.3 installation that had the problem would manage
> to make it all the way to a v10 upgrade without the problem having been
> fixed.  Even if there's one or two incompetent DBAs out there whom
> that would apply to, I don't think the rest of the world needs to keep
> seeing this message by default.

Well, you could take a broken 9.3 installation and pg_upgrade it to
pg10.  It wouldn't be any less broken.

There's still people running buggy 9.3 releases, as we see in these
lists every now and then.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Uh, what's that got to do with it?  I'm not proposing to downgrade this
>> message in 9.3, or indeed any current release branch.  But it's hard
>> to believe that a 9.3 installation that had the problem would manage
>> to make it all the way to a v10 upgrade without the problem having been
>> fixed.  Even if there's one or two incompetent DBAs out there whom
>> that would apply to, I don't think the rest of the world needs to keep
>> seeing this message by default.

> Well, you could take a broken 9.3 installation and pg_upgrade it to
> pg10.  It wouldn't be any less broken.

Sure, but does the rest of the world need to keep looking at this message?
Anybody who's actually in that situation probably never looks at the
postmaster log at all.

I think that what would actually be of some use nowadays is a LOG-level
message emitted if the wraparound *isn't* activated immediately at start.
But otherwise, we should follow the rule that silence is golden.
        regards, tom lane



I wrote:
> I think that what would actually be of some use nowadays is a LOG-level
> message emitted if the wraparound *isn't* activated immediately at start.
> But otherwise, we should follow the rule that silence is golden.

Concretely, how about the attached?  It preserves the original
"protections are now enabled" message at LOG level, but emits it only
when oldestOffsetKnown becomes true *after* startup.  Meanwhile, if
oldestOffsetKnown is still not true at the conclusion of TrimMultiXact,
then it emits a new LOG message about "protections are not active".
In this way we have LOG messages but they're only emitted in "interesting"
cases.

I dropped the IsUnderPostmaster test because I see no good reason not
to warn in standalone backends as well.

I think this might actually be a reasonable candidate to back-patch,
because a deficiency of the existing code is that it fails to warn
you when something's wrong.  But in any case I'd like to put it in HEAD.

            regards, tom lane

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 59d1252..3f66cdb 100644
*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
*************** static void ExtendMultiXactOffset(MultiX
*** 363,369 ****
  static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
  static bool MultiXactOffsetWouldWrap(MultiXactOffset boundary,
                           MultiXactOffset start, uint32 distance);
! static bool SetOffsetVacuumLimit(void);
  static bool find_multixact_start(MultiXactId multi, MultiXactOffset *result);
  static void WriteMZeroPageXlogRec(int pageno, uint8 info);
  static void WriteMTruncateXlogRec(Oid oldestMultiDB,
--- 363,369 ----
  static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
  static bool MultiXactOffsetWouldWrap(MultiXactOffset boundary,
                           MultiXactOffset start, uint32 distance);
! static bool SetOffsetVacuumLimit(bool is_startup);
  static bool find_multixact_start(MultiXactId multi, MultiXactOffset *result);
  static void WriteMZeroPageXlogRec(int pageno, uint8 info);
  static void WriteMTruncateXlogRec(Oid oldestMultiDB,
*************** TrimMultiXact(void)
*** 2095,2101 ****
      LWLockRelease(MultiXactGenLock);

      /* Now compute how far away the next members wraparound is. */
!     SetMultiXactIdLimit(oldestMXact, oldestMXactDB);
  }

  /*
--- 2095,2101 ----
      LWLockRelease(MultiXactGenLock);

      /* Now compute how far away the next members wraparound is. */
!     SetMultiXactIdLimit(oldestMXact, oldestMXactDB, true);
  }

  /*
*************** MultiXactSetNextMXact(MultiXactId nextMu
*** 2186,2194 ****
   * Determine the last safe MultiXactId to allocate given the currently oldest
   * datminmxid (ie, the oldest MultiXactId that might exist in any database
   * of our cluster), and the OID of the (or a) database with that value.
   */
  void
! SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid)
  {
      MultiXactId multiVacLimit;
      MultiXactId multiWarnLimit;
--- 2186,2198 ----
   * Determine the last safe MultiXactId to allocate given the currently oldest
   * datminmxid (ie, the oldest MultiXactId that might exist in any database
   * of our cluster), and the OID of the (or a) database with that value.
+  *
+  * is_startup is true when we are just starting the cluster, false when we
+  * are updating state in a running cluster.  This only affects log messages.
   */
  void
! SetMultiXactIdLimit(MultiXactId oldest_datminmxid, Oid oldest_datoid,
!                     bool is_startup)
  {
      MultiXactId multiVacLimit;
      MultiXactId multiWarnLimit;
*************** SetMultiXactIdLimit(MultiXactId oldest_d
*** 2277,2283 ****
      Assert(!InRecovery);

      /* Set limits for offset vacuum. */
!     needs_offset_vacuum = SetOffsetVacuumLimit();

      /*
       * If past the autovacuum force point, immediately signal an autovac
--- 2281,2287 ----
      Assert(!InRecovery);

      /* Set limits for offset vacuum. */
!     needs_offset_vacuum = SetOffsetVacuumLimit(is_startup);

      /*
       * If past the autovacuum force point, immediately signal an autovac
*************** MultiXactAdvanceOldest(MultiXactId oldes
*** 2370,2376 ****
      Assert(InRecovery);

      if (MultiXactIdPrecedes(MultiXactState->oldestMultiXactId, oldestMulti))
!         SetMultiXactIdLimit(oldestMulti, oldestMultiDB);
  }

  /*
--- 2374,2380 ----
      Assert(InRecovery);

      if (MultiXactIdPrecedes(MultiXactState->oldestMultiXactId, oldestMulti))
!         SetMultiXactIdLimit(oldestMulti, oldestMultiDB, false);
  }

  /*
*************** GetOldestMultiXactId(void)
*** 2537,2543 ****
   * otherwise.
   */
  static bool
! SetOffsetVacuumLimit(void)
  {
      MultiXactId oldestMultiXactId;
      MultiXactId nextMXact;
--- 2541,2547 ----
   * otherwise.
   */
  static bool
! SetOffsetVacuumLimit(bool is_startup)
  {
      MultiXactId oldestMultiXactId;
      MultiXactId nextMXact;
*************** SetOffsetVacuumLimit(void)
*** 2619,2627 ****
          /* always leave one segment before the wraparound point */
          offsetStopLimit -= (MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT);

!         if (!prevOldestOffsetKnown && IsUnderPostmaster)
              ereport(LOG,
                      (errmsg("MultiXact member wraparound protections are now enabled")));
          ereport(DEBUG1,
          (errmsg("MultiXact member stop limit is now %u based on MultiXact %u",
                  offsetStopLimit, oldestMultiXactId)));
--- 2623,2632 ----
          /* always leave one segment before the wraparound point */
          offsetStopLimit -= (MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT);

!         if (!prevOldestOffsetKnown && !is_startup)
              ereport(LOG,
                      (errmsg("MultiXact member wraparound protections are now enabled")));
+
          ereport(DEBUG1,
          (errmsg("MultiXact member stop limit is now %u based on MultiXact %u",
                  offsetStopLimit, oldestMultiXactId)));
*************** SetOffsetVacuumLimit(void)
*** 2639,2644 ****
--- 2644,2654 ----
          offsetStopLimit = prevOffsetStopLimit;
      }

+     /* Warn at startup if wraparound protection isn't active */
+     if (is_startup && !oldestOffsetKnown)
+         ereport(LOG,
+          (errmsg("MultiXact member wraparound protections are not active")));
+
      /* Install the computed values */
      LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
      MultiXactState->oldestOffset = oldestOffset;
*************** multixact_redo(XLogReaderState *record)
*** 3312,3318 ****
           * Advance the horizon values, so they're current at the end of
           * recovery.
           */
!         SetMultiXactIdLimit(xlrec.endTruncOff, xlrec.oldestMultiDB);

          PerformMembersTruncation(xlrec.startTruncMemb, xlrec.endTruncMemb);

--- 3322,3328 ----
           * Advance the horizon values, so they're current at the end of
           * recovery.
           */
!         SetMultiXactIdLimit(xlrec.endTruncOff, xlrec.oldestMultiDB, false);

          PerformMembersTruncation(xlrec.startTruncMemb, xlrec.endTruncMemb);

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 744360c..299ce30 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*************** BootStrapXLOG(void)
*** 4995,5001 ****
      ShmemVariableCache->oidCount = 0;
      MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
      SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
!     SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB);
      SetCommitTsLimit(InvalidTransactionId, InvalidTransactionId);

      /* Set up the XLOG page header */
--- 4995,5001 ----
      ShmemVariableCache->oidCount = 0;
      MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
      SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
!     SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
      SetCommitTsLimit(InvalidTransactionId, InvalidTransactionId);

      /* Set up the XLOG page header */
*************** StartupXLOG(void)
*** 6597,6603 ****
      ShmemVariableCache->oidCount = 0;
      MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
      SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
!     SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB);
      SetCommitTsLimit(checkPoint.oldestCommitTsXid,
                       checkPoint.newestCommitTsXid);
      XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
--- 6597,6603 ----
      ShmemVariableCache->oidCount = 0;
      MultiXactSetNextMXact(checkPoint.nextMulti, checkPoint.nextMultiOffset);
      SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
!     SetMultiXactIdLimit(checkPoint.oldestMulti, checkPoint.oldestMultiDB, true);
      SetCommitTsLimit(checkPoint.oldestCommitTsXid,
                       checkPoint.newestCommitTsXid);
      XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 3a9b965..3b3dfee 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** vac_truncate_clog(TransactionId frozenXI
*** 1205,1211 ****
       * signalling twice?
       */
      SetTransactionIdLimit(frozenXID, oldestxid_datoid);
!     SetMultiXactIdLimit(minMulti, minmulti_datoid);
  }


--- 1205,1211 ----
       * signalling twice?
       */
      SetTransactionIdLimit(frozenXID, oldestxid_datoid);
!     SetMultiXactIdLimit(minMulti, minmulti_datoid, false);
  }


diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 1d01988..85997a4 100644
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
*************** extern void StartupMultiXact(void);
*** 127,133 ****
  extern void TrimMultiXact(void);
  extern void ShutdownMultiXact(void);
  extern void SetMultiXactIdLimit(MultiXactId oldest_datminmxid,
!                     Oid oldest_datoid);
  extern void MultiXactGetCheckptMulti(bool is_shutdown,
                           MultiXactId *nextMulti,
                           MultiXactOffset *nextMultiOffset,
--- 127,134 ----
  extern void TrimMultiXact(void);
  extern void ShutdownMultiXact(void);
  extern void SetMultiXactIdLimit(MultiXactId oldest_datminmxid,
!                     Oid oldest_datoid,
!                     bool is_startup);
  extern void MultiXactGetCheckptMulti(bool is_shutdown,
                           MultiXactId *nextMulti,
                           MultiXactOffset *nextMultiOffset,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Upgrading postmaster's log messages about bind/listen errors

From
Craig Ringer
Date:
On 11 March 2017 at 03:24, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 3/9/17 23:43, Tom Lane wrote:
>> However, if we're measuring this on a scale of usefulness to the average
>> DBA, I would argue that it's of more interest than any of these messages
>> that currently appear by default:
>>
>> 2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound protections are now enabled
>> 2017-03-09 23:40:12.335 EST [19339] LOG:  autovacuum launcher started
>> 2017-03-09 23:40:12.336 EST [19341] LOG:  logical replication launcher started
>>
>> The first of those is surely past its sell-by date.  As for the other two,
>> we should log *failure* to start, but not the normal case.
>
> I'm OK with removing these.  No news is good news.

Right, and it doesn't tell us that autovacuum will actually do
anything useful. The user could've set ridiculous thresholds/delays,
per-table overrides, etc.

So it's not much use for remote support/diagnostics, and might as well go away.


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



I wrote:
>> I think that what would actually be of some use nowadays is a LOG-level
>> message emitted if the wraparound *isn't* activated immediately at start.
>> But otherwise, we should follow the rule that silence is golden.

> Concretely, how about the attached?  It preserves the original
> "protections are now enabled" message at LOG level, but emits it only
> when oldestOffsetKnown becomes true *after* startup.  Meanwhile, if
> oldestOffsetKnown is still not true at the conclusion of TrimMultiXact,
> then it emits a new LOG message about "protections are not active".

I realized that the second of these is not necessary because it's
redundant with the message about "MultiXact member wraparound protections
are disabled because oldest checkpointed MultiXact %u does not exist on
disk".  Pushed without that.
        regards, tom lane



Re: [HACKERS] Upgrading postmaster's log messages about bind/listenerrors

From
Bruce Momjian
Date:
On Tue, Mar 14, 2017 at 12:50:09PM -0400, Tom Lane wrote:
> I wrote:
> >> I think that what would actually be of some use nowadays is a LOG-level
> >> message emitted if the wraparound *isn't* activated immediately at start.
> >> But otherwise, we should follow the rule that silence is golden.
> 
> > Concretely, how about the attached?  It preserves the original
> > "protections are now enabled" message at LOG level, but emits it only
> > when oldestOffsetKnown becomes true *after* startup.  Meanwhile, if
> > oldestOffsetKnown is still not true at the conclusion of TrimMultiXact,
> > then it emits a new LOG message about "protections are not active".
> 
> I realized that the second of these is not necessary because it's
> redundant with the message about "MultiXact member wraparound protections
> are disabled because oldest checkpointed MultiXact %u does not exist on
> disk".  Pushed without that.

Gee, I kind of like the new messages:
LOG:  listening on IPv4 address "127.0.0.1", port 5432LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +