Thread: Use "WAL segment" instead of "log segment" consistently in user-facing messages

Use "WAL segment" instead of "log segment" consistently in user-facing messages

From
Bharath Rupireddy
Date:
Hi,

It looks like we use "log segment" in various user-facing messages.
The term "log" can mean server logs as well. The "WAL segment" suits
well here and it is consistently used across the other user-facing
messages [1].

Here's a small patch attempting to consistently use the "WAL segment".

Thoughts?

[1]
pg_log_error("could not fetch WAL segment size: got %d rows and %d
fields, expected %d rows and %d or more fields",
pg_log_error("WAL segment size could not be parsed");
pg_log_error(ngettext("WAL segment size must be a power of two between
1 MB and 1 GB, but the remote server reported a value of %d byte",
printf(_("WARNING: invalid WAL segment size\n"));
printf(_("Bytes per WAL segment:                %u\n"),
fatal_error(ngettext("WAL segment size must be a power of two between
1 MB and 1 GB, but the WAL file \"%s\" header specifies %d byte",
errmsg("requested WAL segment %s has already been removed",
elog(DEBUG2, "removed temporary WAL segment \"%s\"", path);

Regards,
Bharath Rupireddy.

Attachment

Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

From
Kyotaro Horiguchi
Date:
At Mon, 28 Feb 2022 21:03:07 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> Hi,
> 
> It looks like we use "log segment" in various user-facing messages.
> The term "log" can mean server logs as well. The "WAL segment" suits
> well here and it is consistently used across the other user-facing
> messages [1].
> 
> Here's a small patch attempting to consistently use the "WAL segment".
> 
> Thoughts?

I tend to agree to this.  I also see "log record(s)" (without prefixed
by "write-ahead") in many places especially in the documentation.  I'm
not sure how we should treat "WAL log", though.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

From
Bharath Rupireddy
Date:
On Tue, Mar 1, 2022 at 6:50 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Mon, 28 Feb 2022 21:03:07 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > Hi,
> >
> > It looks like we use "log segment" in various user-facing messages.
> > The term "log" can mean server logs as well. The "WAL segment" suits
> > well here and it is consistently used across the other user-facing
> > messages [1].
> >
> > Here's a small patch attempting to consistently use the "WAL segment".
> >
> > Thoughts?
>
> I tend to agree to this.

Thanks for taking a look at it. Here's the CF entry -
https://commitfest.postgresql.org/38/3584/

> I also see "log record(s)" (without prefixed
> by "write-ahead") in many places especially in the documentation.  I'm
> not sure how we should treat "WAL log", though.

Yes, but the docs have a glossary term for 'Log record" [1]. FWIW
attaching docs change as v2-0002 patch. I found another place where
"log records" is being used in pg_waldump.c, I changed that and
attached v2-0001 patch.

Please review the v2 patch set.

[1]
 <glossentry id="glossary-log-record">
   <glossterm>Log record</glossterm>
    <glossdef>
     <para>
      Archaic term for a <glossterm linkend="glossary-wal-record">WAL
record</glossterm>.
     </para>
    </glossdef>
  </glossentry>

Regards,
Bharath Rupireddy.

Attachment

Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

From
Bharath Rupireddy
Date:
On Wed, Mar 2, 2022 at 11:41 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Please review the v2 patch set.

Had to rebase. Attaching v3 patch set.

Regards,
Bharath Rupireddy.

Attachment
    At all times, <productname>PostgreSQL</productname> maintains a
    <firstterm>write ahead log</firstterm> (WAL) in the <filename>pg_wal/</filename>
-   subdirectory of the cluster's data directory. The log records
-   every change made to the database's data files.  This log exists
+   subdirectory of the cluster's data directory. The WAL records
+   capture every change made to the database's data files.  This log exists

I don't think this change really adds anything.  The preceding sentence
makes it clear that we are discussing the write-ahead log, and IMO the
change in phrasing ("the log records every change" is changed to "the
records capture every change") subtly changes the meaning of the sentence.

The rest looks good to me. 

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages

From
Kyotaro Horiguchi
Date:
At Thu, 31 Mar 2022 08:45:56 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in 
>     At all times, <productname>PostgreSQL</productname> maintains a
>     <firstterm>write ahead log</firstterm> (WAL) in the <filename>pg_wal/</filename>
> -   subdirectory of the cluster's data directory. The log records
> -   every change made to the database's data files.  This log exists
> +   subdirectory of the cluster's data directory. The WAL records
> +   capture every change made to the database's data files.  This log exists
> 
> I don't think this change really adds anything.  The preceding sentence
> makes it clear that we are discussing the write-ahead log, and IMO the
> change in phrasing ("the log records every change" is changed to "the
> records capture every change") subtly changes the meaning of the sentence.
> 
> The rest looks good to me. 

+1.  It is not a composite noun "log records".

The original sentence is "S(The log) V(records) O(every change that is
made to .. files)".  The proposed change looks like changing it to
"S(The log records) V(capture) O(every ..files)".  In that sense, the
original one seem rather correct to me, since "capture" seems to have
the implication of "write after log..", to me.


I looked though the document and found other use of "log
record|segment".  What do you think about the attached?

There're some uncertain point in the change.

      you should at least save the contents of the cluster's <filename>pg_wal</filename>
-     subdirectory, as it might contain logs which
+     subdirectory, as it might contain WAL files which
      were not archived before the system went down.

The "logs" means acutally "WAL segment (files)" but the concept of
"segment" is out of focus in the context.  So just "file" is used
there.  The same change is applied on dezon of places.


-   disk-space requirements for the <acronym>WAL</acronym> logs are met,
+   disk-space requirements for the <acronym>WAL</acronym> are met,

This might be better be "WAL files" instead of just "WAL".


-   <acronym>WAL</acronym> logs are stored in the directory
+   <acronym>WAL</acronym> is stored in the directory
    <filename>pg_wal</filename> under the data directory, as a set of

I'm not sure which is better, use "WAL" as a collective noun, or "WAL
files" as the cocrete objects.


-   The aim of <acronym>WAL</acronym> is to ensure that the log is
+   The aim of <acronym>WAL</acronym> is to ensure that the WAL record is
    written before database records are altered, but this can be subverted by

This is not a mechanical change.  But I think this is correct.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment
On Fri, Apr 01, 2022 at 10:31:10AM +0900, Kyotaro Horiguchi wrote:
>       you should at least save the contents of the cluster's <filename>pg_wal</filename>
> -     subdirectory, as it might contain logs which
> +     subdirectory, as it might contain WAL files which
>       were not archived before the system went down.
> 
> The "logs" means acutally "WAL segment (files)" but the concept of
> "segment" is out of focus in the context.  So just "file" is used
> there.  The same change is applied on dezon of places.

This change seems reasonable to me.

> -   disk-space requirements for the <acronym>WAL</acronym> logs are met,
> +   disk-space requirements for the <acronym>WAL</acronym> are met,
> 
> This might be better be "WAL files" instead of just "WAL".

+1 for "WAL files"

> -   <acronym>WAL</acronym> logs are stored in the directory
> +   <acronym>WAL</acronym> is stored in the directory
>     <filename>pg_wal</filename> under the data directory, as a set of
> 
> I'm not sure which is better, use "WAL" as a collective noun, or "WAL
> files" as the cocrete objects.

My vote is for "WAL files" because it was previously "WAL logs."

> -   The aim of <acronym>WAL</acronym> is to ensure that the log is
> +   The aim of <acronym>WAL</acronym> is to ensure that the WAL record is
>     written before database records are altered, but this can be subverted by
> 
> This is not a mechanical change.  But I think this is correct.

IMO the original wording is fine.  I think it is sufficiently clear that
"log" refers to "write-ahead log," and this sentence seems intended to
convey the basic rule of "log before data."  However, the rest of the
sentence is a little weird.  It's basically saying "the aim of the log is
to ensure that the log is written..."  Isn't the aim of the log to record
the database activity?  Perhaps we should rewrite it to something like the
following:

    A basic rule of WAL is that the log must be written before the database
    files are altered, but this can be...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



It's been a few weeks, so I'm marking the commitfest entry as
waiting-on-author.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



On Tue, Apr 26, 2022 at 1:24 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> It's been a few weeks, so I'm marking the commitfest entry as
> waiting-on-author.

Thanks. I'm attaching the updated v4 patches (also subsumed Kyotaro
San's patch at [1]). Please review it further.

[1] https://www.postgresql.org/message-id/20220401.103110.1103213854487561781.horikyota.ntt%40gmail.com

Regards,
Bharath Rupireddy.

Attachment
Overall, these patches look reasonable.

On Mon, Jul 18, 2022 at 04:24:12PM +0530, Bharath Rupireddy wrote:
>     record.  Because the entire content of data pages is saved in the
> -   log on the first page modification after a checkpoint (assuming
> +   WAL record on the first page modification after a checkpoint (assuming
>     <xref linkend="guc-full-page-writes"/> is not disabled), all pages
>     changed since the checkpoint will be restored to a consistent
>     state.

nitpick: I would remove the word "record" in this change.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



On Tue, Jul 19, 2022 at 12:00 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> Overall, these patches look reasonable.
>
> On Mon, Jul 18, 2022 at 04:24:12PM +0530, Bharath Rupireddy wrote:
> >     record.  Because the entire content of data pages is saved in the
> > -   log on the first page modification after a checkpoint (assuming
> > +   WAL record on the first page modification after a checkpoint (assuming
> >     <xref linkend="guc-full-page-writes"/> is not disabled), all pages
> >     changed since the checkpoint will be restored to a consistent
> >     state.
>
> nitpick: I would remove the word "record" in this change.

Done. PSA v5 patch set.

Regards,
Bharath Rupireddy.

Attachment
On Tue, Jul 19, 2022 at 02:43:59PM +0530, Bharath Rupireddy wrote:
> Done. PSA v5 patch set.

LGTM.  I've marked this as ready-for-committer.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



At Tue, 19 Jul 2022 09:58:28 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in 
> On Tue, Jul 19, 2022 at 02:43:59PM +0530, Bharath Rupireddy wrote:
> > Done. PSA v5 patch set.
> 
> LGTM.  I've marked this as ready-for-committer.

I find the following sentense in config.sgml. "Specifies the minimum
size of past log file segments kept in the pg_wal directory"

postgresql.conf.sample contains "logfile segment" in a few lines.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Wed, Jul 20, 2022 at 6:55 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Tue, 19 Jul 2022 09:58:28 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in
> > On Tue, Jul 19, 2022 at 02:43:59PM +0530, Bharath Rupireddy wrote:
> > > Done. PSA v5 patch set.
> >
> > LGTM.  I've marked this as ready-for-committer.
>
> I find the following sentense in config.sgml. "Specifies the minimum
> size of past log file segments kept in the pg_wal directory"
>
> postgresql.conf.sample contains "logfile segment" in a few lines.

Done. PSA v6 patch set.

Regards,
Bharath Rupireddy.

Attachment
At Wed, 20 Jul 2022 10:02:22 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> Done. PSA v6 patch set.

Thanks!

-        Specifies the minimum size of past log file segments kept in the
+        Specifies the minimum size of past WAL files kept in the

-        log file by reducing the value of this parameter. On a system with
+        file by reducing the value of this parameter. On a system with

Looks fine. And postgresq.conf.sample has the following lines:

#archive_library = ''        # library to use to archive a logfile segment

#archive_command = ''        # command to use to archive a logfile segment

#archive_timeout = 0        # force a logfile segment switch after this

#restore_command = ''        # command to use to restore an archived logfile segment

Aren't they need the same fix?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Wed, 20 Jul 2022 10:02:22 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > Done. PSA v6 patch set.
>
> Thanks!
>
> -        Specifies the minimum size of past log file segments kept in the
> +        Specifies the minimum size of past WAL files kept in the
>
> -        log file by reducing the value of this parameter. On a system with
> +        file by reducing the value of this parameter. On a system with
>
> Looks fine. And postgresq.conf.sample has the following lines:
>
> #archive_library = ''           # library to use to archive a logfile segment
>
> #archive_command = ''           # command to use to archive a logfile segment
>
> #archive_timeout = 0            # force a logfile segment switch after this
>
> #restore_command = ''           # command to use to restore an archived logfile segment
>
> Aren't they need the same fix?

Indeed. Thanks. Now, they are in sync with their peers in .conf.sample
file as well as description in guc.c.

PSA v7 patch set.

Regards,
Bharath Rupireddy.

Attachment
At Wed, 20 Jul 2022 17:25:33 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> PSA v7 patch set.

Thanks. Looks perfect, but (sorry..) in the final checking, I found
"log archive" in the doc.  If you agree to it please merge the
attached (or refined one) and I'd call it a day.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment
On Thu, Jul 21, 2022 at 9:50 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Wed, 20 Jul 2022 17:25:33 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
> > On Wed, Jul 20, 2022 at 12:55 PM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > PSA v7 patch set.
>
> Thanks. Looks perfect, but (sorry..) in the final checking, I found
> "log archive" in the doc.  If you agree to it please merge the
> attached (or refined one) and I'd call it a day.

Merged. PSA v8 patch set.

Regards,
Bharath Rupireddy.

Attachment
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> Merged. PSA v8 patch set.

Pushed, thanks.

            regards, tom lane