Thread: Re: [HACKERS] Stats processor not restarting

Re: [HACKERS] Stats processor not restarting

From
Bruce Momjian
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > Bah, sorry about the noise. It was the effect of
> > PGSTAT_RESTART_INTERVAL.
> > Do we want to add some logging when we don't restart it due to repeated
> > failures?
>
> Not really, but maybe it would be sensible to reset last_pgstat_start_time
> when doing a database-wide restart?  The motivation for the timeout was
> to reduce cycle wastage if pgstat crashed by itself, but when you've
> deliberately SIGQUITed it, that hardly seems to apply ...

You mean like this, attached?

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.149
diff -c -c -r1.149 pgstat.c
*** src/backend/postmaster/pgstat.c    16 Mar 2007 17:57:36 -0000    1.149
--- src/backend/postmaster/pgstat.c    22 Mar 2007 02:03:31 -0000
***************
*** 1929,1934 ****
--- 1929,1937 ----
  static void
  pgstat_exit(SIGNAL_ARGS)
  {
+     /* allow stats to restart immediately after SIGQUIT */
+     last_pgstat_start_time = 0;
+
      need_exit = true;
  }


Re: [HACKERS] Stats processor not restarting

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Not really, but maybe it would be sensible to reset last_pgstat_start_time
>> when doing a database-wide restart?

> You mean like this, attached?

I'd be fairly surprised if resetting the variable in the child process
had any effect on the postmaster...

I think a correct fix would involve exposing either the variable or a
routine to zero it from pgstat.c, and having postmaster.c do that at
the points where it intentionally SIGQUITs the stats collector.

            regards, tom lane

Re: [HACKERS] Stats processor not restarting

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> Not really, but maybe it would be sensible to reset last_pgstat_start_time
> >> when doing a database-wide restart?
>
> > You mean like this, attached?
>
> I'd be fairly surprised if resetting the variable in the child process
> had any effect on the postmaster...

Yep, me too.  ;-)

> I think a correct fix would involve exposing either the variable or a
> routine to zero it from pgstat.c, and having postmaster.c do that at
> the points where it intentionally SIGQUITs the stats collector.

OK, patch attached.  I just reset the value in all places where we were
killing the pgstat process and not immediately exiting ourselves.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.149
diff -c -c -r1.149 pgstat.c
*** src/backend/postmaster/pgstat.c    16 Mar 2007 17:57:36 -0000    1.149
--- src/backend/postmaster/pgstat.c    22 Mar 2007 14:40:36 -0000
***************
*** 572,577 ****
--- 572,581 ----
      return 0;
  }

+ void allow_immediate_pgstat_restart(void)
+ {
+         last_pgstat_start_time = 0;
+ }

  /* ------------------------------------------------------------
   * Public functions used by backends follow
Index: src/backend/postmaster/postmaster.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.526
diff -c -c -r1.526 postmaster.c
*** src/backend/postmaster/postmaster.c    7 Mar 2007 13:35:02 -0000    1.526
--- src/backend/postmaster/postmaster.c    22 Mar 2007 14:40:37 -0000
***************
*** 1896,1902 ****
--- 1896,1905 ----
                  signal_child(PgArchPID, SIGQUIT);
              /* Tell pgstat to shut down too; nothing left for it to do */
              if (PgStatPID != 0)
+             {
                  signal_child(PgStatPID, SIGQUIT);
+                 allow_immediate_pgstat_restart();
+             }
              /* Tell autovac launcher to shut down too */
              if (AutoVacPID != 0)
                  signal_child(AutoVacPID, SIGTERM);
***************
*** 1952,1958 ****
--- 1955,1964 ----
                  signal_child(PgArchPID, SIGQUIT);
              /* Tell pgstat to shut down too; nothing left for it to do */
              if (PgStatPID != 0)
+             {
                  signal_child(PgStatPID, SIGQUIT);
+                 allow_immediate_pgstat_restart();
+             }
              /* Tell autovac launcher to shut down too */
              if (AutoVacPID != 0)
                  signal_child(AutoVacPID, SIGTERM);
***************
*** 2241,2247 ****
--- 2247,2256 ----
              signal_child(PgArchPID, SIGQUIT);
          /* Tell pgstat to shut down too; nothing left for it to do */
          if (PgStatPID != 0)
+         {
              signal_child(PgStatPID, SIGQUIT);
+             allow_immediate_pgstat_restart();
+         }
          /* Tell autovac launcher to shut down too */
          if (AutoVacPID != 0)
              signal_child(AutoVacPID, SIGTERM);
***************
*** 2404,2409 ****
--- 2413,2419 ----
                                   "SIGQUIT",
                                   (int) PgStatPID)));
          signal_child(PgStatPID, SIGQUIT);
+         allow_immediate_pgstat_restart();
      }

      /* We do NOT restart the syslogger */
Index: src/include/pgstat.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/pgstat.h,v
retrieving revision 1.55
diff -c -c -r1.55 pgstat.h
*** src/include/pgstat.h    16 Mar 2007 17:57:36 -0000    1.55
--- src/include/pgstat.h    22 Mar 2007 14:40:38 -0000
***************
*** 369,375 ****
  extern void pgstat_init(void);
  extern int    pgstat_start(void);
  extern void pgstat_reset_all(void);
!
  #ifdef EXEC_BACKEND
  extern void PgstatCollectorMain(int argc, char *argv[]);
  #endif
--- 369,375 ----
  extern void pgstat_init(void);
  extern int    pgstat_start(void);
  extern void pgstat_reset_all(void);
! extern void allow_immediate_pgstat_restart(void);
  #ifdef EXEC_BACKEND
  extern void PgstatCollectorMain(int argc, char *argv[]);
  #endif

Re: [HACKERS] Stats processor not restarting

From
Bruce Momjian
Date:
Applied.  I did the case where we exit right away too, just for
consistency.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > Tom Lane wrote:
> > >> Not really, but maybe it would be sensible to reset last_pgstat_start_time
> > >> when doing a database-wide restart?
> >
> > > You mean like this, attached?
> >
> > I'd be fairly surprised if resetting the variable in the child process
> > had any effect on the postmaster...
>
> Yep, me too.  ;-)
>
> > I think a correct fix would involve exposing either the variable or a
> > routine to zero it from pgstat.c, and having postmaster.c do that at
> > the points where it intentionally SIGQUITs the stats collector.
>
> OK, patch attached.  I just reset the value in all places where we were
> killing the pgstat process and not immediately exiting ourselves.
>
> --
>   Bruce Momjian  <bruce@momjian.us>          http://momjian.us
>   EnterpriseDB                               http://www.enterprisedb.com
>
>   + If your life is a hard drive, Christ can be your backup. +

> Index: src/backend/postmaster/pgstat.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
> retrieving revision 1.149
> diff -c -c -r1.149 pgstat.c
> *** src/backend/postmaster/pgstat.c    16 Mar 2007 17:57:36 -0000    1.149
> --- src/backend/postmaster/pgstat.c    22 Mar 2007 14:40:36 -0000
> ***************
> *** 572,577 ****
> --- 572,581 ----
>       return 0;
>   }
>
> + void allow_immediate_pgstat_restart(void)
> + {
> +         last_pgstat_start_time = 0;
> + }
>
>   /* ------------------------------------------------------------
>    * Public functions used by backends follow
> Index: src/backend/postmaster/postmaster.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
> retrieving revision 1.526
> diff -c -c -r1.526 postmaster.c
> *** src/backend/postmaster/postmaster.c    7 Mar 2007 13:35:02 -0000    1.526
> --- src/backend/postmaster/postmaster.c    22 Mar 2007 14:40:37 -0000
> ***************
> *** 1896,1902 ****
> --- 1896,1905 ----
>                   signal_child(PgArchPID, SIGQUIT);
>               /* Tell pgstat to shut down too; nothing left for it to do */
>               if (PgStatPID != 0)
> +             {
>                   signal_child(PgStatPID, SIGQUIT);
> +                 allow_immediate_pgstat_restart();
> +             }
>               /* Tell autovac launcher to shut down too */
>               if (AutoVacPID != 0)
>                   signal_child(AutoVacPID, SIGTERM);
> ***************
> *** 1952,1958 ****
> --- 1955,1964 ----
>                   signal_child(PgArchPID, SIGQUIT);
>               /* Tell pgstat to shut down too; nothing left for it to do */
>               if (PgStatPID != 0)
> +             {
>                   signal_child(PgStatPID, SIGQUIT);
> +                 allow_immediate_pgstat_restart();
> +             }
>               /* Tell autovac launcher to shut down too */
>               if (AutoVacPID != 0)
>                   signal_child(AutoVacPID, SIGTERM);
> ***************
> *** 2241,2247 ****
> --- 2247,2256 ----
>               signal_child(PgArchPID, SIGQUIT);
>           /* Tell pgstat to shut down too; nothing left for it to do */
>           if (PgStatPID != 0)
> +         {
>               signal_child(PgStatPID, SIGQUIT);
> +             allow_immediate_pgstat_restart();
> +         }
>           /* Tell autovac launcher to shut down too */
>           if (AutoVacPID != 0)
>               signal_child(AutoVacPID, SIGTERM);
> ***************
> *** 2404,2409 ****
> --- 2413,2419 ----
>                                    "SIGQUIT",
>                                    (int) PgStatPID)));
>           signal_child(PgStatPID, SIGQUIT);
> +         allow_immediate_pgstat_restart();
>       }
>
>       /* We do NOT restart the syslogger */
> Index: src/include/pgstat.h
> ===================================================================
> RCS file: /cvsroot/pgsql/src/include/pgstat.h,v
> retrieving revision 1.55
> diff -c -c -r1.55 pgstat.h
> *** src/include/pgstat.h    16 Mar 2007 17:57:36 -0000    1.55
> --- src/include/pgstat.h    22 Mar 2007 14:40:38 -0000
> ***************
> *** 369,375 ****
>   extern void pgstat_init(void);
>   extern int    pgstat_start(void);
>   extern void pgstat_reset_all(void);
> !
>   #ifdef EXEC_BACKEND
>   extern void PgstatCollectorMain(int argc, char *argv[]);
>   #endif
> --- 369,375 ----
>   extern void pgstat_init(void);
>   extern int    pgstat_start(void);
>   extern void pgstat_reset_all(void);
> ! extern void allow_immediate_pgstat_restart(void);
>   #ifdef EXEC_BACKEND
>   extern void PgstatCollectorMain(int argc, char *argv[]);
>   #endif

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +