Thread: Duplicat-word typos in code comments

Duplicat-word typos in code comments

From

Dagfinn Ilmari Mannsåker

Date:

04 October 2021, 12:56:23

Hi hackers,

I noticed a duplicate-word typo in a comments recently, and cooked up
the following ripgrep command to find some more.

  rg --multiline --pcre2 --type=c '(?<!struct )(?<!union )\b((?!long\b|endif\b|that\b)\w+)\s+(^\s*[*#]\s*)?\b\1\b'

PFA a patch with the result of that.

- ilmari


From 7ed1ce16a37e7a63e7015cc4ef5b2b70ba915498 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 4 Oct 2021 13:44:51 +0100
Subject: [PATCH] Fix some duplicate words in comments

---
 src/backend/access/transam/parallel.c | 2 +-
 src/backend/catalog/heap.c            | 5 ++---
 src/backend/commands/copyfromparse.c  | 2 +-
 src/backend/partitioning/partdesc.c   | 4 ++--
 src/backend/storage/buffer/bufmgr.c   | 2 +-
 src/backend/storage/ipc/standby.c     | 2 +-
 src/include/access/tableam.h          | 2 +-
 7 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 5fdcff2a3b..bb1881f573 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1534,7 +1534,7 @@ ParallelWorkerReportLastRecEnd(XLogRecPtr last_xlog_end)
  *
  * Also explicitly detach from dsm segment so that subsystems using
  * on_dsm_detach() have a chance to send stats before the stats subsystem is
- * shut down as as part of a before_shmem_exit() hook.
+ * shut down as part of a before_shmem_exit() hook.
  *
  * One might think this could instead be solved by carefully ordering the
  * attaching to dsm segments, so that the pgstats segments get detached from
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 83746d3fd9..41d1f3b1c3 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -3547,9 +3547,8 @@ heap_truncate_find_FKs(List *relationIds)
         /*
          * If this constraint has a parent constraint which we have not seen
          * yet, keep track of it for the second loop, below.  Tracking parent
-         * constraints allows us to climb up to the top-level level constraint
-         * and look for all possible relations referencing the partitioned
-         * table.
+         * constraints allows us to climb up to the top-level constraint and
+         * look for all possible relations referencing the partitioned table.
          */
         if (OidIsValid(con->conparentid) &&
             !list_member_oid(parent_cons, con->conparentid))
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index fdf57f1556..aac10165ec 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -37,7 +37,7 @@
  * the data is valid in the current encoding.
  *
  * In binary mode, the pipeline is much simpler.  Input is loaded into
- * into 'raw_buf', and encoding conversion is done in the datatype-specific
+ * 'raw_buf', and encoding conversion is done in the datatype-specific
  * receive functions, if required.  'input_buf' and 'line_buf' are not used,
  * but 'attribute_buf' is used as a temporary buffer to hold one attribute's
  * data when it's passed the receive function.
diff --git a/src/backend/partitioning/partdesc.c b/src/backend/partitioning/partdesc.c
index 9a9d6a9643..3220d4808d 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -91,8 +91,8 @@ RelationGetPartitionDesc(Relation rel, bool omit_detached)
      * cached descriptor too.  We determine that based on the pg_inherits.xmin
      * that was saved alongside that descriptor: if the xmin that was not in
      * progress for that active snapshot is also not in progress for the
-     * current active snapshot, then we can use use it.  Otherwise build one
-     * from scratch.
+     * current active snapshot, then we can use it.  Otherwise build one from
+     * scratch.
      */
     if (omit_detached &&
         rel->rd_partdesc_nodetached &&
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e88e4e918b..08ebabfe96 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -666,7 +666,7 @@ ReadRecentBuffer(RelFileNode rnode, ForkNumber forkNum, BlockNumber blockNum,
         {
             /*
              * It's now safe to pin the buffer.  We can't pin first and ask
-             * questions later, because because it might confuse code paths
+             * questions later, because it might confuse code paths
              * like InvalidateBuffer() if we pinned a random non-matching
              * buffer.
              */
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 077251c1a6..b17326bc20 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -130,7 +130,7 @@ InitRecoveryTransactionEnvironment(void)
  *
  * This must be called even in shutdown of startup process if transaction
  * tracking has been initialized. Otherwise some locks the tracked
- * transactions were holding will not be released and and may interfere with
+ * transactions were holding will not be released and may interfere with
  * the processes still running (but will exit soon later) at the exit of
  * startup process.
  */
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 9f1e4a1ac9..4aff18215b 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -157,7 +157,7 @@ typedef struct TM_FailureData
  * work, too.  This is a little like bottom-up deletion, but not too much.
  * The tableam will only perform speculative work when it's practically free
  * to do so in passing for simple deletion caller (while always performing
- * whatever work is is needed to enable knowndeletable/LP_DEAD index tuples to
+ * whatever work is needed to enable knowndeletable/LP_DEAD index tuples to
  * be deleted within index AM).  This is the real reason why it's possible for
  * simple index deletion caller to specify knowndeletable = false up front
  * (this means "check if it's possible for me to delete corresponding index
-- 
2.30.2

Re: Duplicat-word typos in code comments

From

Daniel Gustafsson

Date:

04 October 2021, 13:15:29

> On 4 Oct 2021, at 14:56, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:

> I noticed a duplicate-word typo in a comments recently, and cooked up
> the following ripgrep command to find some more.

Pushed to master, thanks!  I avoided the reflow of the comments though to make
it the minimal change.

--
Daniel Gustafsson        https://vmware.com/

Re: Duplicat-word typos in code comments

From

Dagfinn Ilmari Mannsåker

Date:

04 October 2021, 13:30:53

Daniel Gustafsson <daniel@yesql.se> writes:

>> On 4 Oct 2021, at 14:56, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:
>
>> I noticed a duplicate-word typo in a comments recently, and cooked up
>> the following ripgrep command to find some more.
>
> Pushed to master, thanks!

Thanks!

> I avoided the reflow of the comments though to make it the minimal
> change.

Fair enough. I wasn't sure myself whether to do it or not.

- ilmari

Re: Duplicat-word typos in code comments

From

Tom Lane

Date:

04 October 2021, 13:56:53

=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:
> Daniel Gustafsson <daniel@yesql.se> writes:
>> I avoided the reflow of the comments though to make it the minimal
>> change.

> Fair enough. I wasn't sure myself whether to do it or not.

The next pgindent run will do it anyway (except in comment blocks
starting in column 1).

I used to think it was better to go ahead and manually reflow, if you
use an editor that makes that easy.  That way there are fewer commits
touching any one line of code, which is good when trying to review
code history.  However, now that we've got the ability to make "git
blame" ignore pgindent commits, maybe it's better to leave that sort
of mechanical cleanup to pgindent, so that the substantive patch is
easier to review.

(But I'm not sure how well the ignore-these-commits behavior actually
works for cases like this.)

            regards, tom lane

Re: Duplicat-word typos in code comments

From

Daniel Gustafsson

Date:

04 October 2021, 19:19:50

> On 4 Oct 2021, at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I used to think it was better to go ahead and manually reflow, if you
> use an editor that makes that easy.  That way there are fewer commits
> touching any one line of code, which is good when trying to review
> code history.  However, now that we've got the ability to make "git
> blame" ignore pgindent commits, maybe it's better to leave that sort
> of mechanical cleanup to pgindent, so that the substantive patch is
> easier to review.

Yeah, that's precisely why I did it.  Since we can skip over pgindent sweeps it
makes sense to try and minimize such changes to make code archaeology easier.
There are of course cases when the result will be such an eyesore that we'd
prefer to have it done sooner, but in cases like these where line just got one
word shorter it seemed an easy choice.

--
Daniel Gustafsson        https://vmware.com/

Re: Duplicat-word typos in code comments

From

Tom Lane

Date:

04 October 2021, 19:54:00

Daniel Gustafsson <daniel@yesql.se> writes:
>> On 4 Oct 2021, at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I used to think it was better to go ahead and manually reflow, if you
>> use an editor that makes that easy.  That way there are fewer commits
>> touching any one line of code, which is good when trying to review
>> code history.  However, now that we've got the ability to make "git
>> blame" ignore pgindent commits, maybe it's better to leave that sort
>> of mechanical cleanup to pgindent, so that the substantive patch is
>> easier to review.

> Yeah, that's precisely why I did it.  Since we can skip over pgindent sweeps it
> makes sense to try and minimize such changes to make code archaeology easier.
> There are of course cases when the result will be such an eyesore that we'd
> prefer to have it done sooner, but in cases like these where line just got one
> word shorter it seemed an easy choice.

Actually though, there's another consideration: if you leave
not-correctly-pgindented code laying around, it causes problems
for the next hacker who modifies that file and wishes to neaten
up their own work by pgindenting it.  They can either tediously
reverse out part of the delta, or commit a patch that includes
entirely-unrelated cosmetic changes, neither of which is
pleasant.

            regards, tom lane

Re: Duplicat-word typos in code comments

From

Daniel Gustafsson

Date:

05 October 2021, 08:53:12

> On 4 Oct 2021, at 21:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>>> On 4 Oct 2021, at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I used to think it was better to go ahead and manually reflow, if you
>>> use an editor that makes that easy.  That way there are fewer commits
>>> touching any one line of code, which is good when trying to review
>>> code history.  However, now that we've got the ability to make "git
>>> blame" ignore pgindent commits, maybe it's better to leave that sort
>>> of mechanical cleanup to pgindent, so that the substantive patch is
>>> easier to review.
>
>> Yeah, that's precisely why I did it.  Since we can skip over pgindent sweeps it
>> makes sense to try and minimize such changes to make code archaeology easier.
>> There are of course cases when the result will be such an eyesore that we'd
>> prefer to have it done sooner, but in cases like these where line just got one
>> word shorter it seemed an easy choice.
>
> Actually though, there's another consideration: if you leave
> not-correctly-pgindented code laying around, it causes problems
> for the next hacker who modifies that file and wishes to neaten
> up their own work by pgindenting it.  They can either tediously
> reverse out part of the delta, or commit a patch that includes
> entirely-unrelated cosmetic changes, neither of which is
> pleasant.

Right, this is mainly targeting comments where changing a word on the first
line in an N line long comment can have the knock-on effect of changing N-1
lines just due to reflowing.  This is analogous to wrapping existing code in a
new block, causing a re-indentation to happen, except that for comments it can
sometimes be Ok to leave (as in this particular case).  At the end of the day,
it's all a case-by-case basis trade-off call.

--
Daniel Gustafsson        https://vmware.com/