Re: add checkpoint stats of snapshot and mapping files of pg_logical dir - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Date
Msg-id CALj2ACWHZ-anwFMB918Hg24bihw+h5Hm8hyBJq3_OUaLyD=jFw@mail.gmail.com
Whole thread Raw
In response to Re: add checkpoint stats of snapshot and mapping files of pg_logical dir  (Andres Freund <andres@anarazel.de>)
Responses Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
List pgsql-hackers
On Tue, Mar 22, 2022 at 12:20 AM Andres Freund <andres@anarazel.de> wrote:
> > Something like attached
> > v6-0001-Deduplicate-checkpoint-restartpoint-complete-mess.patch?
>
> Mostly. I don't see a reason for the use of the stringinfo. And I think
> LogCheckpointStart() should be dealt with similarly.
>
> I'd just make it a restartpoint ? _("restartpoint") : _("checkpoint") or such
> in the argument? Then translators don't need to translate the two messages
> separately.

Do you mean like this?
    ereport(LOG,
    /* translator: the placeholders show checkpoint options */
            (errmsg("%s starting:%s%s%s%s%s%s%s%s",
                    restartpoint ? _("restartpoint") : _("checkpoint"),
                    (flags & CHECKPOINT_IS_SHUTDOWN) ? " shutdown" : "",
                    (flags & CHECKPOINT_END_OF_RECOVERY) ? "
end-of-recovery" : "",
                    (flags & CHECKPOINT_IMMEDIATE) ? " immediate" : "",
                    (flags & CHECKPOINT_FORCE) ? " force" : "",
                    (flags & CHECKPOINT_WAIT) ? " wait" : "",
                    (flags & CHECKPOINT_CAUSE_XLOG) ? " wal" : "",
                    (flags & CHECKPOINT_CAUSE_TIME) ? " time" : "",
                    (flags & CHECKPOINT_FLUSH_ALL) ? " flush-all" : "")));

Are we allowed to use _("foo") with errmsg? Isn't "foo" going to get
translated twice that way?

> Or we could just not translate restartpoint/checkpoint - after all we don't
> translate the flags in LogCheckpointStart() either. But on balance I'd lean
> towards the above.

I'm not sure if it's correct to translate some parts of the message
and leave others. What exactly is the reason for this? I think the
reason in this case might be that some flag names with hyphens and
spaces before words may not have the right/matching words in all
languages. What happens if we choose to translate/not translate the
entire message?

How about just doing this for LogCheckpointStart?
    StringInfoData logmsg;
    initStringInfo(&logmsg);
    appendStringInfo(&logmsg,
    /* translator: the placeholders show checkpoint options */
                     restartpoint ? _("restartpoint
starting:%s%s%s%s%s%s%s%s") :
                                    _("checkpoint starting:%s%s%s%s%s%s%s%s"),
                     (flags & CHECKPOINT_IS_SHUTDOWN) ? " shutdown" : "",
                     (flags & CHECKPOINT_END_OF_RECOVERY) ? "
end-of-recovery" : "",
                     (flags & CHECKPOINT_IMMEDIATE) ? " immediate" : "",
                     (flags & CHECKPOINT_FORCE) ? " force" : "",
                     (flags & CHECKPOINT_WAIT) ? " wait" : "",
                     (flags & CHECKPOINT_CAUSE_XLOG) ? " wal" : "",
                     (flags & CHECKPOINT_CAUSE_TIME) ? " time" : "",
                     (flags & CHECKPOINT_FLUSH_ALL) ? " flush-all" : "");
    ereport(LOG, errmsg_internal("%s", logmsg.data));
    pfree(logmsg.data);

Similarly, for LogCheckpointEnd:
    StringInfoData logmsg;
    initStringInfo(&logmsg);
    appendStringInfo(&logmsg,
                    restartpoint ? _("restartpoint complete: ") :
                                   _("checkpoint complete: "));
    appendStringInfo(&logmsg,
    /* translator: the placeholders show restartpoint/checkpoint stats */
                     _("wrote %d buffers (%.1f%%); "
                      "%d WAL file(s) added, %d removed, %d recycled; "
                      "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
                      "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
                      "distance=%d kB, estimate=%d kB"),
                       CheckpointStats.ckpt_bufs_written,
                       (double) CheckpointStats.ckpt_bufs_written *
100 / NBuffers,
                       CheckpointStats.ckpt_segs_added,
                       CheckpointStats.ckpt_segs_removed,
                       CheckpointStats.ckpt_segs_recycled,
                       write_msecs / 1000, (int) (write_msecs % 1000),
                       sync_msecs / 1000, (int) (sync_msecs % 1000),
                       total_msecs / 1000, (int) (total_msecs % 1000),
                       CheckpointStats.ckpt_sync_rels,
                       longest_msecs / 1000, (int) (longest_msecs % 1000),
                       average_msecs / 1000, (int) (average_msecs % 1000),
                       (int) (PrevCheckPointDistance / 1024.0),
                       (int) (CheckPointDistanceEstimate / 1024.0));
    if (CheckpointStats.repl_snap_files_rmvd_cnt > 0)
    {
        long t_msecs;

        t_msecs =
TimestampDifferenceMilliseconds(CheckpointStats.repl_snap_start_t,

CheckpointStats.repl_snap_end_t);

        appendStringInfo(&logmsg,
                        _("; logical decoding snapshot file(s)
removed=%lu, time=%ld.%03d s, cutoff LSN=%X/%X"),
                          CheckpointStats.repl_snap_files_rmvd_cnt,
                          t_msecs / 1000, (int) (t_msecs % 1000),

LSN_FORMAT_ARGS(CheckpointStats.repl_snap_cutoff_lsn));
    }
    if (CheckpointStats.repl_map_files_rmvd_cnt ||
        CheckpointStats.repl_map_files_syncd_cnt > 0)
    {
        long t_msecs;

        t_msecs =
TimestampDifferenceMilliseconds(CheckpointStats.repl_snap_start_t,

CheckpointStats.repl_snap_end_t);

        appendStringInfo(&logmsg,
                        _("; logical decoding rewrite mapping file(s)
removed=%lu, synced=%lu, time=%ld.%03d s, cutoff LSN=%X/%X"),
                          CheckpointStats.repl_map_files_rmvd_cnt,
                          CheckpointStats.repl_map_files_syncd_cnt,
                          t_msecs / 1000, (int) (t_msecs % 1000),
                          LSN_FORMAT_ARGS(CheckpointStats.repl_map_cutoff_lsn));
    }
    ereport(LOG, errmsg_internal("%s", logmsg.data));
    pfree(logmsg.data);

> > Do you suggest something like below? Or some other better wording like
> > "logical decoding rewrite map files" and "logical decoding snapshot
> > files" or "logical rewrite map files" and "logical snapshot files" or
> > just "rewrite mapping files" or "snapshot files" .... ?
>
> Both seem still very long. I still am doubtful this level of detail is
> appropriate. Seems more like a thing for a tracepoint or such. How about just
> printing the time for the logical decoding operations in aggregate, without
> breaking down into files, adding LSNs etc?

The distinction that the patch makes right now is for snapshot and
rewrite mapping files and it makes sense to have them separately. The
cutoff LSNs might be needed some times but I'm okay to ignore them
from the message. Therefore, I'm planning to have like this:
    if (CheckpointStats.repl_snap_files_rmvd_cnt > 0)
    {
        long t_msecs;

        t_msecs =
TimestampDifferenceMilliseconds(CheckpointStats.repl_snap_start_t,

CheckpointStats.repl_snap_end_t);

        appendStringInfo(&logmsg,
                        _("; logical decoding snapshot file(s)
removed=%lu, time=%ld.%03d s"),
                          CheckpointStats.repl_snap_files_rmvd_cnt,
                          t_msecs / 1000, (int) (t_msecs % 1000));
    }
    if (CheckpointStats.repl_map_files_rmvd_cnt ||
        CheckpointStats.repl_map_files_syncd_cnt > 0)
    {
        long t_msecs;

        t_msecs =
TimestampDifferenceMilliseconds(CheckpointStats.repl_snap_start_t,

CheckpointStats.repl_snap_end_t);

        appendStringInfo(&logmsg,
                        _("; logical decoding rewrite mapping file(s)
removed=%lu, synced=%lu, time=%ld.%03d s"),
                          CheckpointStats.repl_map_files_rmvd_cnt,
                          CheckpointStats.repl_map_files_syncd_cnt,
                          t_msecs / 1000, (int) (t_msecs % 1000));
    }

PS: I'm sorry for posting code snippets embedded in the message here,
but please bear with me until a final conclusion on the formatting is
reached, then I will post all of them as a single patch.

Regards,
Bharath Rupireddy.



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: [PATCH] Add native windows on arm64 support
Next
From: "Gunnar \"Nick\" Bluth"
Date:
Subject: Re: [PATCH] pg_stat_toast v9