Thread: Checkpoint logging, revised patch

Checkpoint logging, revised patch

From

Heikki Linnakangas

Date:

29 June 2007, 17:32:27

Here's a reworked version Greg Smith's patch for logging checkpoints. It
adds a GUC variable, log_checkpoints, along the lines of log_connections
and log_autovacuum. Enabling it causes two lines to be printed to the
system log per checkpoint, like this:

LOG:  checkpoint starting
LOG:  checkpoint complete; buffers written=5869 (35.8%); write=2.081 s,
sync=4.851 s, total=7.066 s

The purpose of the first message is that by inspecting the logs, you can
easily see when a checkpoint is/was active. And if you experience
mysterious slowdowns etc., you can see if they coincide with a
checkpoint if you have log_min_duration enabled. The completion message
prints stats conveniently in a single line so it can be easily grepped
and parsed from the logs.

The implementation uses a global CheckpointStats-struct where data is
collected during the checkpoint. At the end, it's spit out to the log as
a nicely formatted message. I refactored PreallocXlogFiles and
RemoveXlogFiles to use that struct as well for reporting the # of WAL
segments removed, added and recycled. Global variables are generally not
considered good style, but IMHO in this case it's the cleanest approach.
Another options would be to pass down a reference to the struct to all
functions involved in checkpointing, or return individual numbers from
all functions and collect them together in CreateCheckPoint. Both
alternatives would complicate the function signatures significantly.
There's precedent for this approach in BgWriterStats and the buffer
usage counters (BufferHitCount et al).

One thing that's missing, that I originally hoped to achieve with this,
is logging the cause of a checkpoint. Doing that would have required
either sprinkling elogs to all callers of RequestCheckpoint, or adding a
"reason" parameter to it and decoding that to a log message inside
RequestCheckpoint, which seems a bit ugly just for logging purposes.
Also, that would have meant printing one more log message per
checkpoint, making it more chatty than I'd like it to be. Instead, I'm
just printing a different log message for immediate checkpoint requests.
Differentiating between immediate checkpoints triggered by a command
like CHECKPOINT or CREATE DATABASE and normal checkpoints should be
enough in practice.

I'm done with this.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.129
diff -c -r1.129 config.sgml
*** doc/src/sgml/config.sgml    28 Jun 2007 00:02:37 -0000    1.129
--- doc/src/sgml/config.sgml    28 Jun 2007 17:15:41 -0000
***************
*** 1547,1552 ****
--- 1547,1570 ----
        </listitem>
       </varlistentry>

+      <varlistentry id="guc-log-checkpoints" xreflabel="log_checkpoints">
+       <term><varname>log_checkpoints</varname> (<type>boolean</type>)</term>
+       <indexterm>
+        <primary><varname>log_checkpoints</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Logs the start and end of each checkpoint in the server log. Some
+         statistics of the checkpoint are included in the log message,
+         including the number of dirty buffers in the buffer cache when the
+         checkpoint started, and time spent writing them.
+
+         This parameter can only be set in the <filename>postgresql.conf</>
+         file or on the server command line. The default is off.
+        </para>
+       </listitem>
+      </varlistentry>
+
       </variablelist>
       </sect2>
       <sect2 id="runtime-config-wal-archiving">
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.273
diff -c -r1.273 xlog.c
*** src/backend/access/transam/xlog.c    28 Jun 2007 00:02:37 -0000    1.273
--- src/backend/access/transam/xlog.c    28 Jun 2007 17:16:37 -0000
***************
*** 66,71 ****
--- 66,72 ----
  char       *XLOG_sync_method = NULL;
  const char    XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
  bool        fullPageWrites = true;
+ bool        log_checkpoints = false;

  #ifdef WAL_DEBUG
  bool        XLOG_DEBUG = false;
***************
*** 93,98 ****
--- 94,106 ----


  /*
+  * Statistics from current checkpoint are collected to this global struct.
+  * Because only the background writer or a stand-alone backend can perform
+  * checkpoints, this will be unused in normal backends.
+  */
+ CheckpointStatsData CheckpointStats;
+
+ /*
   * ThisTimeLineID will be same in all backends --- it identifies current
   * WAL timeline for the database system.
   */
***************
*** 414,422 ****
  static void XLogFileClose(void);
  static bool RestoreArchivedFile(char *path, const char *xlogfname,
                      const char *recovername, off_t expectedSize);
! static int    PreallocXlogFiles(XLogRecPtr endptr);
! static void RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr,
!                 int *nsegsremoved, int *nsegsrecycled);
  static void CleanupBackupHistory(void);
  static XLogRecord *ReadRecord(XLogRecPtr *RecPtr, int emode);
  static bool ValidXLOGHeader(XLogPageHeader hdr, int emode);
--- 422,429 ----
  static void XLogFileClose(void);
  static bool RestoreArchivedFile(char *path, const char *xlogfname,
                      const char *recovername, off_t expectedSize);
! static void PreallocXlogFiles(XLogRecPtr endptr);
! static void RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr);
  static void CleanupBackupHistory(void);
  static XLogRecord *ReadRecord(XLogRecPtr *RecPtr, int emode);
  static bool ValidXLOGHeader(XLogPageHeader hdr, int emode);
***************
*** 2525,2534 ****
   * recycled log segments, but the startup transient is likely to include
   * a lot of segment creations by foreground processes, which is not so good.
   */
! static int
  PreallocXlogFiles(XLogRecPtr endptr)
  {
-     int            nsegsadded = 0;
      uint32        _logId;
      uint32        _logSeg;
      int            lf;
--- 2532,2540 ----
   * recycled log segments, but the startup transient is likely to include
   * a lot of segment creations by foreground processes, which is not so good.
   */
! static void
  PreallocXlogFiles(XLogRecPtr endptr)
  {
      uint32        _logId;
      uint32        _logSeg;
      int            lf;
***************
*** 2543,2551 ****
          lf = XLogFileInit(_logId, _logSeg, &use_existent, true);
          close(lf);
          if (!use_existent)
!             nsegsadded++;
      }
-     return nsegsadded;
  }

  /*
--- 2549,2556 ----
          lf = XLogFileInit(_logId, _logSeg, &use_existent, true);
          close(lf);
          if (!use_existent)
!             CheckpointStats.ckpt_segs_added++;
      }
  }

  /*
***************
*** 2555,2562 ****
   * whether we want to recycle rather than delete no-longer-wanted log files.
   */
  static void
! RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr,
!                    int *nsegsremoved, int *nsegsrecycled)
  {
      uint32        endlogId;
      uint32        endlogSeg;
--- 2560,2566 ----
   * whether we want to recycle rather than delete no-longer-wanted log files.
   */
  static void
! RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr)
  {
      uint32        endlogId;
      uint32        endlogSeg;
***************
*** 2566,2574 ****
      char        lastoff[MAXFNAMELEN];
      char        path[MAXPGPATH];

-     *nsegsremoved = 0;
-     *nsegsrecycled = 0;
-
      /*
       * Initialize info about where to try to recycle to.  We allow recycling
       * segments up to XLOGfileslop segments beyond the current XLOG location.
--- 2570,2575 ----
***************
*** 2617,2623 ****
                      ereport(DEBUG2,
                              (errmsg("recycled transaction log file \"%s\"",
                                      xlde->d_name)));
!                     (*nsegsrecycled)++;
                      /* Needn't recheck that slot on future iterations */
                      if (max_advance > 0)
                      {
--- 2618,2624 ----
                      ereport(DEBUG2,
                              (errmsg("recycled transaction log file \"%s\"",
                                      xlde->d_name)));
!                     CheckpointStats.ckpt_segs_recycled++;
                      /* Needn't recheck that slot on future iterations */
                      if (max_advance > 0)
                      {
***************
*** 2632,2638 ****
                              (errmsg("removing transaction log file \"%s\"",
                                      xlde->d_name)));
                      unlink(path);
!                     (*nsegsremoved)++;
                  }

                  XLogArchiveCleanup(xlde->d_name);
--- 2633,2639 ----
                              (errmsg("removing transaction log file \"%s\"",
                                      xlde->d_name)));
                      unlink(path);
!                     CheckpointStats.ckpt_segs_removed++;
                  }

                  XLogArchiveCleanup(xlde->d_name);
***************
*** 5421,5426 ****
--- 5422,5474 ----
              (errmsg("database system is shut down")));
  }

+ /*
+  * Reset checkpoint statistics data.
+  */
+ static void
+ ResetCheckpointStats()
+ {
+     memset(&CheckpointStats, 0, sizeof(CheckpointStats));
+ }
+
+ /*
+  * Prints a summary of a checkpoint that was just finished.
+  */
+ static void
+ PrintCheckpointStats()
+ {
+     if (log_checkpoints)
+     {
+         long    write_secs, sync_secs, total_secs;
+         int        write_usecs, sync_usecs, total_usecs;
+
+         TimestampDifference(CheckpointStats.ckpt_start_t,
+                             CheckpointStats.ckpt_end_t,
+                             &total_secs, &total_usecs);
+
+         TimestampDifference(CheckpointStats.ckpt_write_t,
+                             CheckpointStats.ckpt_sync_t,
+                             &write_secs, &write_usecs);
+
+         TimestampDifference(CheckpointStats.ckpt_sync_t,
+                             CheckpointStats.ckpt_sync_end_t,
+                             &sync_secs, &sync_usecs);
+
+         elog(LOG, "checkpoint complete; buffers written=%d (%.1f%%); write=%ld.%03d s, sync=%ld.%03d s,
total=%ld.%03ds", 
+              CheckpointStats.ckpt_bufs_written,
+              (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
+              write_secs, write_usecs/1000,
+              sync_secs, sync_usecs/1000,
+              total_secs, total_usecs/1000);
+
+         /* Print some more details in DEBUG2 level */
+         elog(DEBUG2, "%d transaction log file(s) added, %d removed, %d recycled",
+              CheckpointStats.ckpt_segs_added,
+              CheckpointStats.ckpt_segs_removed,
+              CheckpointStats.ckpt_segs_recycled);
+     }
+ }
+
  /*
   * Perform a checkpoint --- either during shutdown, or on-the-fly
   *
***************
*** 5446,5457 ****
      uint32        freespace;
      uint32        _logId;
      uint32        _logSeg;
-     int            nsegsadded = 0;
-     int            nsegsremoved = 0;
-     int            nsegsrecycled = 0;
      TransactionId *inCommitXids;
      int            nInCommit;

      /*
       * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
       * (This is just pro forma, since in the present system structure there is
--- 5494,5505 ----
      uint32        freespace;
      uint32        _logId;
      uint32        _logSeg;
      TransactionId *inCommitXids;
      int            nInCommit;

+     ResetCheckpointStats();
+     CheckpointStats.ckpt_start_t = GetCurrentTimestamp();
+
      /*
       * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
       * (This is just pro forma, since in the present system structure there is
***************
*** 5561,5568 ****
      LWLockRelease(WALInsertLock);

      if (!shutdown)
!         ereport(DEBUG2,
!                 (errmsg("checkpoint starting")));

      /*
       * Before flushing data, we must wait for any transactions that are
--- 5609,5624 ----
      LWLockRelease(WALInsertLock);

      if (!shutdown)
!     {
!         /*
!          * We don't print this until now, because we might have decided
!          * above to skip this checkpoint altogether.
!          */
!         if (flags & CHECKPOINT_IMMEDIATE)
!             elog(LOG, "checkpoint starting (immediate)");
!         else
!             elog(LOG, "checkpoint starting");
!     }

      /*
       * Before flushing data, we must wait for any transactions that are
***************
*** 5699,5706 ****
      if (_logId || _logSeg)
      {
          PrevLogSeg(_logId, _logSeg);
!         RemoveOldXlogFiles(_logId, _logSeg, recptr,
!                            &nsegsremoved, &nsegsrecycled);
      }

      /*
--- 5755,5761 ----
      if (_logId || _logSeg)
      {
          PrevLogSeg(_logId, _logSeg);
!         RemoveOldXlogFiles(_logId, _logSeg, recptr);
      }

      /*
***************
*** 5708,5714 ****
       * segments, since that may supply some of the needed files.)
       */
      if (!shutdown)
!         nsegsadded = PreallocXlogFiles(recptr);

      /*
       * Truncate pg_subtrans if possible.  We can throw away all data before
--- 5763,5769 ----
       * segments, since that may supply some of the needed files.)
       */
      if (!shutdown)
!         PreallocXlogFiles(recptr);

      /*
       * Truncate pg_subtrans if possible.  We can throw away all data before
***************
*** 5720,5731 ****
      if (!InRecovery)
          TruncateSUBTRANS(GetOldestXmin(true, false));

      if (!shutdown)
!         ereport(DEBUG2,
!                 (errmsg("checkpoint complete; %d transaction log file(s) added, %d removed, %d recycled",
!                         nsegsadded, nsegsremoved, nsegsrecycled)));

-     LWLockRelease(CheckpointLock);
  }

  /*
--- 5775,5788 ----
      if (!InRecovery)
          TruncateSUBTRANS(GetOldestXmin(true, false));

+     LWLockRelease(CheckpointLock);
+
+     /* All real work is done. */
+     CheckpointStats.ckpt_end_t = GetCurrentTimestamp();
+
      if (!shutdown)
!         PrintCheckpointStats();

  }

  /*
Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.222
diff -c -r1.222 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c    28 Jun 2007 00:02:38 -0000    1.222
--- src/backend/storage/buffer/bufmgr.c    28 Jun 2007 17:23:29 -0000
***************
*** 1096,1101 ****
--- 1096,1107 ----
          if (++buf_id >= NBuffers)
              buf_id = 0;
      }
+
+     /*
+      * Update checkpoint statistics. As noted above, this doesn't
+      * include buffers written by other backends or bgwriter scan.
+      */
+     CheckpointStats.ckpt_bufs_written = num_written;
  }

  /*
***************
*** 1362,1371 ****
  void
  CheckPointBuffers(int flags)
  {
      BufferSync(flags);
      smgrsync();
- }


  /*
   * Do whatever is needed to prepare for commit at the bufmgr and smgr levels
--- 1368,1381 ----
  void
  CheckPointBuffers(int flags)
  {
+     CheckpointStats.ckpt_write_t = GetCurrentTimestamp();
      BufferSync(flags);
+
+     CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
      smgrsync();

+     CheckpointStats.ckpt_sync_end_t = GetCurrentTimestamp();
+ }

  /*
   * Do whatever is needed to prepare for commit at the bufmgr and smgr levels
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.403
diff -c -r1.403 guc.c
*** src/backend/utils/misc/guc.c    28 Jun 2007 00:02:39 -0000    1.403
--- src/backend/utils/misc/guc.c    28 Jun 2007 15:51:32 -0000
***************
*** 587,592 ****
--- 587,600 ----
          false, NULL, NULL
      },
      {
+         {"log_checkpoints", PGC_BACKEND, LOGGING_WHAT,
+             gettext_noop("Logs each checkpoint."),
+             NULL
+         },
+         &log_checkpoints,
+         false, NULL, NULL
+     },
+     {
          {"log_connections", PGC_BACKEND, LOGGING_WHAT,
              gettext_noop("Logs each successful connection."),
              NULL
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.217
diff -c -r1.217 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample    28 Jun 2007 00:02:39 -0000    1.217
--- src/backend/utils/misc/postgresql.conf.sample    28 Jun 2007 15:39:35 -0000
***************
*** 168,173 ****
--- 168,174 ----
  #checkpoint_timeout = 5min        # range 30s-1h
  #checkpoint_completion_target = 0.5    # checkpoint target duration, 0.0 - 1.0
  #checkpoint_warning = 30s        # 0 is off
+ #log_checkpoints = off            # logs checkpoints to server log

  # - Archiving -

Index: src/include/access/xlog.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/xlog.h,v
retrieving revision 1.79
diff -c -r1.79 xlog.h
*** src/include/access/xlog.h    28 Jun 2007 00:02:39 -0000    1.79
--- src/include/access/xlog.h    28 Jun 2007 17:22:04 -0000
***************
*** 16,21 ****
--- 16,22 ----
  #include "lib/stringinfo.h"
  #include "storage/buf.h"
  #include "utils/pg_crc.h"
+ #include "utils/timestamp.h"


  /*
***************
*** 150,155 ****
--- 151,157 ----
  extern int    XLogArchiveTimeout;
  extern char *XLOG_sync_method;
  extern const char XLOG_sync_method_default[];
+ extern bool log_checkpoints;

  #define XLogArchivingActive()    (XLogArchiveCommand[0] != '\0')

***************
*** 164,169 ****
--- 166,189 ----
  #define CHECKPOINT_WARNONTIME    0x0008        /* Enable CheckPointWarning */
  #define CHECKPOINT_WAIT            0x0010        /* Wait for completion */

+ /*
+  * Checkpoint statistics.
+  */
+ typedef struct CheckpointStatsData {
+     TimestampTz ckpt_start_t;        /* start of checkpoint */
+     TimestampTz ckpt_write_t;        /* start of flushing buffers */
+     TimestampTz ckpt_sync_t;        /* start of fsyncs */
+     TimestampTz ckpt_sync_end_t;    /* end of fsyncs */
+     TimestampTz ckpt_end_t;            /* end of checkpoint */
+
+     int ckpt_bufs_written;         /* # of buffers written */
+
+     int ckpt_segs_added;        /* # of new xlog segments created */
+     int ckpt_segs_removed;        /* # of xlog segments deleted */
+     int ckpt_segs_recycled;        /* # of xlog segments recycled */
+ } CheckpointStatsData;
+
+ extern CheckpointStatsData CheckpointStats;

  extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata);
  extern void XLogFlush(XLogRecPtr RecPtr);

Re: Checkpoint logging, revised patch

From

Tom Lane

Date:

29 June 2007, 21:05:41

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> One thing that's missing, that I originally hoped to achieve with this,
> is logging the cause of a checkpoint. Doing that would have required
> either sprinkling elogs to all callers of RequestCheckpoint, or adding a
> "reason" parameter to it and decoding that to a log message inside
> RequestCheckpoint, which seems a bit ugly just for logging purposes.

It strikes me that we could get most of the mileage needed here just by
logging all of the "flags" bits.  In particular, it would cost almost
nothing to define an extra flag bit that's set by the bgwriter when a
time-based checkpoint is triggered, and then we would have flag bits
telling us:
    * time-based checkpoint
    * xlog-based checkpoint
    * immediate checkpoint
    * shutdown checkpoint
and that would really cover pretty much everything you'd need to know.
Does anyone particularly care whether an immediate checkpoint is
triggered by CHECKPOINT, CREATE DATABASE or DROP DATABASE?

If anyone *does* care, there are plenty of leftover flag bits that we
could use to pass that info, but it seems like gilding the lily.

> Also, that would have meant printing one more log message per
> checkpoint, making it more chatty than I'd like it to be.

I'm envisioning something like

    LOG: checkpoint starting: immediate force wait

where we're just adding a word to the message for each flag bit.  No
more lines than you have.

            regards, tom lane

Re: Checkpoint logging, revised patch

From

Greg Smith

Date:

29 June 2007, 22:31:02

On Fri, 29 Jun 2007, Heikki Linnakangas wrote:

> LOG:  checkpoint complete; buffers written=5869 (35.8%); write=2.081 s,
> sync=4.851 s, total=7.066 s

My original patch converted the buffers written to MB. Easier to estimate
MB/s by eye; I really came to hate multiplying by 8K. And people who have
multiple boxes with different BLCKZ could merge their logs together and
not have to worry about adjusting for that.

I found this simpler to present the results to non-PostgreSQL people
without having to explain the buffer size.  It's easy for any IT person to
follow; the idea I was working toward was "see, this log entry shows the
long pauses are because it's writing 700MB of data all at once here, right
next to that statment that took 10 seconds we found with
log_min_duration_statement".

The current equivilant of what I had would be
CheckpointStats.ckpt_bufs_written * BLCKSZ / (1024*1024) formatted as
"%.1f MB"

> One thing that's missing, that I originally hoped to achieve with this,
> is logging the cause of a checkpoint.

The way pg_stat_bgwriter keeps two separate counts for requested (which
includes timeouts) and required (out of segments) checkpoints now
satisfies the main reason I originally collected that info.  I felt it was
critical when I wrote that patch for it to be possible to distinguish
between which of the two was happening more, it's only in the nice to have
category now.

Now, onto some slightly different things here, which were all aimed at
developer-level troubleshooting.  With the nice reworking for logging
checkpoints better, there were three tiny things my original patch did
that got lost along the way that I'd suggest Tom might want to consider
putting back during the final apply.  I'll include mini pseudo-diffs here
for those so no one has to find them in my original patch, they're all
one-liners:

1) Log every time a new WAL file was created, which ties into the recent
discussion here that being a possible performance issue.  At least you can
look for it happening this way:

src/backend/access/transam/xlog.c
--- 1856,1863 ----
                                 (errcode_for_file_access(),
                                  errmsg("could not create file \"%s\":
%m", tmppath)));

+       ereport(DEBUG2, (errmsg("WAL creating and filling new file on
disk")));

         /*
          * Zero-fill the file.  We have to do this the hard way to ensure
that all

2) Add a lower-level DEBUG statement when autovaccum was finished, which
helped me in several causes figure out if that was the cause of a problem
(when really doing low-level testing, I would see a vacuum start, not know
if it was done, and then wonder if that was the cause of a slow
statement):

*** src/backend/postmaster/autovacuum.c
--- 811,814 ----
                 do_autovacuum();
+               ereport(DEBUG2,
+                               (errmsg("autovacuum: processing database
\"%s\" complete", dbname)));
         }

3) I fixed a line in postmaster.c so it formatted fork PIDs the same way
most other log statements do; most statements report it as (PID %d) and
the difference in this form seemed undesirable (I spent a lot of time at
DEBUG2 and these little things started to bug me):

*** src/backend/postmaster/postmaster.c
*** 2630,2636 ****
         /* in parent, successful fork */
         ereport(DEBUG2,
!                       (errmsg_internal("forked new backend, pid=%d
socket=%d",
                                                          (int) pid,
port->sock)));
--- 2630,2636 ----
         /* in parent, successful fork */
         ereport(DEBUG2,
!                       (errmsg_internal("forked new backend (PID %d)
socket=%d",
                                                          (int) pid,
port->sock)));

Little stuff, but all things I've found valuable on several occasions,
which suggests eventually someone else may appreciate them as well.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Checkpoint logging, revised patch

From

Heikki Linnakangas

Date:

30 June 2007, 11:25:52

Greg Smith wrote:
> On Fri, 29 Jun 2007, Heikki Linnakangas wrote:
>
>> LOG:  checkpoint complete; buffers written=5869 (35.8%); write=2.081
>> s, sync=4.851 s, total=7.066 s
>
> My original patch converted the buffers written to MB. Easier to
> estimate MB/s by eye; I really came to hate multiplying by 8K. And
> people who have multiple boxes with different BLCKZ could merge their
> logs together and not have to worry about adjusting for that.
>
> I found this simpler to present the results to non-PostgreSQL people
> without having to explain the buffer size.  It's easy for any IT person
> to follow; the idea I was working toward was "see, this log entry shows
> the long pauses are because it's writing 700MB of data all at once here,
> right next to that statment that took 10 seconds we found with
> log_min_duration_statement".
>
> The current equivilant of what I had would be
> CheckpointStats.ckpt_bufs_written * BLCKSZ / (1024*1024) formatted as
> "%.1f MB"

I don't think we currently use MB in any other log messages. If we go
down that route, we need to consider switching to MB everywhere.

> 1) Log every time a new WAL file was created, which ties into the recent
> discussion here that being a possible performance issue.  At least you
> can look for it happening this way:
>
> src/backend/access/transam/xlog.c
> --- 1856,1863 ----
>                                 (errcode_for_file_access(),
>                                  errmsg("could not create file \"%s\":
> %m", tmppath)));
>
> +       ereport(DEBUG2, (errmsg("WAL creating and filling new file on
> disk")));
>
>         /*
>          * Zero-fill the file.  We have to do this the hard way to
> ensure that all

This could be useful.

> 2) Add a lower-level DEBUG statement when autovaccum was finished, which
> helped me in several causes figure out if that was the cause of a
> problem (when really doing low-level testing, I would see a vacuum
> start, not know if it was done, and then wonder if that was the cause of
> a slow statement):
>
> *** src/backend/postmaster/autovacuum.c
> --- 811,814 ----
>                 do_autovacuum();
> +               ereport(DEBUG2,
> +                               (errmsg("autovacuum: processing database
> \"%s\" complete", dbname)));
>         }

Did you check out log_autovacuum? Doesn't it do what you need?

> 3) I fixed a line in postmaster.c so it formatted fork PIDs the same way
> most other log statements do; most statements report it as (PID %d) and
> the difference in this form seemed undesirable (I spent a lot of time at
> DEBUG2 and these little things started to bug me):
>
> *** src/backend/postmaster/postmaster.c
> *** 2630,2636 ****
>         /* in parent, successful fork */
>         ereport(DEBUG2,
> !                       (errmsg_internal("forked new backend, pid=%d
> socket=%d",
>                                                          (int) pid,
> port->sock)));
> --- 2630,2636 ----
>         /* in parent, successful fork */
>         ereport(DEBUG2,
> !                       (errmsg_internal("forked new backend (PID %d)
> socket=%d",
>                                                          (int) pid,
> port->sock)));

Hmm. Since it's DEBUG2 I don't care much either way. The changed message
looks inconsistent to me, since socket is printed differently.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Checkpoint logging, revised patch

From

Greg Smith

Date:

30 June 2007, 15:34:43

On Sat, 30 Jun 2007, Heikki Linnakangas wrote:

> I don't think we currently use MB in any other log messages. If we go down
> that route, we need to consider switching to MB everywhere.

Are there a lot of messages that speak in terms of buffer pages though?
Other than the debug-level stuff, which this checkpoint message has now
been elevated out of, I don't recall seeing enough things that spoke in
those terms to consider it a standard.  Someone feel free to correct me if
I'm wrong here.

> Did you check out log_autovacuum? Doesn't it do what you need?

I have not (most of this patch was done against 8.2 and then ported
forward).  If that already shows clearly the start and end of each
autovacuum section, ignore that I even brought this up.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Checkpoint logging, revised patch

From

Tom Lane

Date:

30 June 2007, 16:12:56

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Here's a reworked version Greg Smith's patch for logging checkpoints.

Applied with minor editorialization.

            regards, tom lane

Re: Checkpoint logging, revised patch

From

Tom Lane

Date:

30 June 2007, 16:20:16

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Greg Smith wrote:
>> My original patch converted the buffers written to MB.

> I don't think we currently use MB in any other log messages. If we go
> down that route, we need to consider switching to MB everywhere.

I left this as submitted by Heikki, mainly because the percentage-of-
shared-buffers bit seemed useful to me and it would make less sense IMHO
if attached to a number of megabytes instead of a number of buffers.
But it's a judgment call for sure.  Any other opinions out there?

>> 1) Log every time a new WAL file was created, which ties into the recent
>> discussion here that being a possible performance issue.

> This could be useful.

Done; I put in two DEBUG2 messages, one at start and one at completion
of the file-creation.

>> 2) Add a lower-level DEBUG statement when autovaccum was finished,

> Did you check out log_autovacuum? Doesn't it do what you need?

I concur that log_autovacuum seems to cover this already.

>> 3) I fixed a line in postmaster.c so it formatted fork PIDsthe same way
>> most other log statements do; most statements report it as (PID %d) and
>> the difference in this form seemed undesirable

> Hmm. Since it's DEBUG2 I don't care much either way. The changed message
> looks inconsistent to me, since socket is printed differently.

No strong opinion, but I left it as-is for the moment in case Alvaro is
about to commit something in postmaster.c (I suspect the double-shutdown
business is a bug in there).  Don't want to cause merge problems for him
just for cosmetic message cleanup.

            regards, tom lane