Re: Logging checkpoints and other slowdown causes - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: Logging checkpoints and other slowdown causes
Date
Msg-id 200704030123.l331NHj11400@momjian.us
Whole thread Raw
In response to Logging checkpoints and other slowdown causes  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-patches
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Greg Smith wrote:
> This patch puts entries in the log file around some backend events that
> regularly cause client pauses, most of which are visible only when running
> with log_min_messages = debug2.  The main focus is regularly occuring
> checkpoints, from either running out of segments or hitting a timeout.
> This is an example of what gets written to the logs at checkpoint time:
>
> DEBUG2:  checkpoint required (timeout passed)
> DEBUG2:  checkpoint starting
> DEBUG2:  checkpoint flushing buffer pool
> DEBUG2:  checkpoint database fsync starting
> DEBUG2:  checkpoint database fsync complete
> DEBUG1:  checkpoint buffers dirty=16.7 MB (13.0%) write=174.8 ms
> sync=3716.5 ms
> DEBUG2:  checkpoint complete; 0 transaction log file(s) added, 0 removed,
> 0 recycled
>
> The idea here is that if you want to really watch what your system is
> doing, either to track down a problem or for benchmarking, you log at
> DEBUG2 and timestamp the logs with something like log_line_prefix =
> '[%m]:'.  Then line up your timestamped transactions with the logs to
> figure out what happened during bad periods.  This is very easy to do with
> the timestamped pgbench latency logs for example, so you can put them in
> time sequence order and see something like "oh, those long transactions
> all came in between the fsync start and end".  The summary line with the
> write/sync timing information has a broader use and suggests when you need
> to go deeper, which is why I put that one at DEBUG1.
>
> I also adjust some other log messages to make it easier to run the system
> at DEBUG2 without going completely crazy, like moving individual WAL
> segment recycles to DEBUG3 and standardizing the format of the fork/exit
> messages so they're easier to filter out.
>
> My main concern about this code is how it adds several gettimeofday calls
> and complicated string building inside FlushBufferPool, all of which is
> wasted if this data isn't actually going to get logged.  I'd like to have
> something that tells me whether I should even bother, but I didn't want to
> poke into the elog variables for fear of adding a dependency on its
> current structure to here.  Ideally, it would be nice to call something
> like WillLog(DEBUG2) and get a boolean back saying whether something at
> that log level will be output anywhere; I don't know enough about the
> logging code to add such a thing to it myself.
>
> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Content-Description:

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

pgsql-patches by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Add usage counts to pg_buffercache
Next
From: Bruce Momjian
Date:
Subject: Re: COPY-able sql log outputs