Re: Checkpoint logging, revised patch - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: Checkpoint logging, revised patch
Date
Msg-id 468667B8.9030805@enterprisedb.com
Whole thread Raw
In response to Re: Checkpoint logging, revised patch  (Greg Smith <gsmith@gregsmith.com>)
Responses Re: Checkpoint logging, revised patch
Re: Checkpoint logging, revised patch
List pgsql-patches
Greg Smith wrote:
> On Fri, 29 Jun 2007, Heikki Linnakangas wrote:
>
>> LOG:  checkpoint complete; buffers written=5869 (35.8%); write=2.081
>> s, sync=4.851 s, total=7.066 s
>
> My original patch converted the buffers written to MB. Easier to
> estimate MB/s by eye; I really came to hate multiplying by 8K. And
> people who have multiple boxes with different BLCKZ could merge their
> logs together and not have to worry about adjusting for that.
>
> I found this simpler to present the results to non-PostgreSQL people
> without having to explain the buffer size.  It's easy for any IT person
> to follow; the idea I was working toward was "see, this log entry shows
> the long pauses are because it's writing 700MB of data all at once here,
> right next to that statment that took 10 seconds we found with
> log_min_duration_statement".
>
> The current equivilant of what I had would be
> CheckpointStats.ckpt_bufs_written * BLCKSZ / (1024*1024) formatted as
> "%.1f MB"

I don't think we currently use MB in any other log messages. If we go
down that route, we need to consider switching to MB everywhere.

> 1) Log every time a new WAL file was created, which ties into the recent
> discussion here that being a possible performance issue.  At least you
> can look for it happening this way:
>
> src/backend/access/transam/xlog.c
> --- 1856,1863 ----
>                                 (errcode_for_file_access(),
>                                  errmsg("could not create file \"%s\":
> %m", tmppath)));
>
> +       ereport(DEBUG2, (errmsg("WAL creating and filling new file on
> disk")));
>
>         /*
>          * Zero-fill the file.  We have to do this the hard way to
> ensure that all

This could be useful.

> 2) Add a lower-level DEBUG statement when autovaccum was finished, which
> helped me in several causes figure out if that was the cause of a
> problem (when really doing low-level testing, I would see a vacuum
> start, not know if it was done, and then wonder if that was the cause of
> a slow statement):
>
> *** src/backend/postmaster/autovacuum.c
> --- 811,814 ----
>                 do_autovacuum();
> +               ereport(DEBUG2,
> +                               (errmsg("autovacuum: processing database
> \"%s\" complete", dbname)));
>         }

Did you check out log_autovacuum? Doesn't it do what you need?

> 3) I fixed a line in postmaster.c so it formatted fork PIDs the same way
> most other log statements do; most statements report it as (PID %d) and
> the difference in this form seemed undesirable (I spent a lot of time at
> DEBUG2 and these little things started to bug me):
>
> *** src/backend/postmaster/postmaster.c
> *** 2630,2636 ****
>         /* in parent, successful fork */
>         ereport(DEBUG2,
> !                       (errmsg_internal("forked new backend, pid=%d
> socket=%d",
>                                                          (int) pid,
> port->sock)));
> --- 2630,2636 ----
>         /* in parent, successful fork */
>         ereport(DEBUG2,
> !                       (errmsg_internal("forked new backend (PID %d)
> socket=%d",
>                                                          (int) pid,
> port->sock)));

Hmm. Since it's DEBUG2 I don't care much either way. The changed message
looks inconsistent to me, since socket is printed differently.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-patches by date:

Previous
From: Greg Smith
Date:
Subject: Re: Checkpoint logging, revised patch
Next
From: Greg Smith
Date:
Subject: Re: Checkpoint logging, revised patch