Re: Fix checkpoint skip logic on idle systems by tracking LSN progress

From Stephen Frost
Subject Re: Fix checkpoint skip logic on idle systems by tracking LSN progress
Msg-id 20161110152812.GI13284@tamriel.snowman.net
In response to Re: Fix checkpoint skip logic on idle systems by tracking LSN progress  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: Fix checkpoint skip logic on idle systems by tracking LSN progress
List pgsql-hackers
Michael,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> Thanks for the review! Waiting for a couple of days more is fine for
> me. This won't change much. Attached is v15 with the fixes you
> mentioned.

I figured I'd go ahead and start looking into this (and it's pretty easy
for me to discuss it with David, given he works in the same office ;).

A couple of initial comments:

> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index adab2f8..38c2385 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -2826,12 +2826,9 @@ include_dir 'conf.d'
>          parameter is greater than zero, the server will switch to a new
>          segment file whenever this many seconds have elapsed since the last
>          segment file switch, and there has been any database activity,
> -        including a single checkpoint.  (Increasing
> -        <varname>checkpoint_timeout</> will reduce unnecessary
> -        checkpoints on an idle system.)
> -        Note that archived files that are closed early
> -        due to a forced switch are still the same length as completely full
> -        files.  Therefore, it is unwise to use a very short
> +        including a single checkpoint.  Note that archived files that are
> +        closed early due to a forced switch are still the same length as
> +        completely full files.  Therefore, it is unwise to use a very short
>          <varname>archive_timeout</> — it will bloat your archive
>          storage.  <varname>archive_timeout</> settings of a minute or so are
>          usually reasonable.  You should consider using streaming replication,

We should probably mention here that we may skip a checkpoint if no
activity has happened, meaning that this setting is safe to use in
environments which are idle for long periods (I'm thinking embedded
systems here).

> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
[...]
> +            if (log_checkpoints)
> +                ereport(LOG, (errmsg("checkpoint skipped")));

Do we really need to log that we're skipping a checkpoint..?  As the
point of this is to avoid write activity on a system which is idle, it
doesn't make sense to me to add a new cause for writes to happen when
we're idle.
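
To make the concern concrete, here is a rough standalone sketch of the
kind of skip check I have in mind. It is not code from the patch: WalPos,
CheckpointState and maybe_checkpoint() are hypothetical names made up for
illustration, and it ignores the distinction the patch draws between
ordinary and "important" WAL progress. The skip is only counted in
memory, so an idle system does not pick up a new reason to write.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position (not PostgreSQL's type). */
typedef uint64_t WalPos;

/* Hypothetical bookkeeping kept between checkpoint attempts. */
typedef struct CheckpointState
{
    WalPos   last_checkpoint_lsn;  /* WAL position seen at the last checkpoint */
    unsigned skipped;              /* in-memory counter instead of a log line */
} CheckpointState;

/*
 * Return true if a checkpoint should be performed, false if it is skipped
 * because the WAL position has not moved since the previous checkpoint.
 * Skipping only bumps a counter; nothing is written anywhere.
 */
static bool
maybe_checkpoint(CheckpointState *state, WalPos current_lsn)
{
    if (current_lsn == state->last_checkpoint_lsn)
    {
        state->skipped++;
        return false;
    }

    /* WAL has advanced: remember the new position and do the checkpoint. */
    state->last_checkpoint_lsn = current_lsn;
    return true;
}
```

If the skip count is ever interesting, it could be exposed some other way
than the server log; that part is purely a suggestion.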

Thanks!

Stephen
