Thread: Using the return value of strlcpy() and strlcat()

Using the return value of strlcpy() and strlcat()

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Hi hackers,

Over in the "Include all columns in default names for foreign key
constraints" thread[1], I noticed the patch added the following:

+        strlcpy(buf + buflen, name, NAMEDATALEN);
+        buflen += strlen(buf + buflen);

Seeing as strlcpy() returns the copied length, this seems rather
redundant.  A quick bit of grepping shows that this pattern occurs in
several places, including the ChooseIndexNameAddition and
ChooseExtendedStatisticNameAddition functions this was no doubt inspired
by.

Attached is a patch that instead uses the return value of strlcpy() and
strlcat().  I left some strlen() calls alone in places where it wasn't
convenient (e.g. pg_open_tzfile(), where it would need an extra
variable).

- ilmari

[1] https://postgr.es/m/CAF+2_SHzBU0tWKvJMZAXfcmrnCwJUeCrAohga0awDf9uDBptnw@mail.gmail.com
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen

From 077bf231aff1b6b5078328f86cc5c190732e1860 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Wed, 13 Mar 2019 15:07:47 +0000
Subject: [PATCH] Use the return value of strlcpy() and strlcat()

These functions return the copied/total string length, respectively,
so use that instead of doing immediately doing strlen() on the target
afterwards.
---
 src/backend/access/rmgrdesc/xactdesc.c | 6 ++----
 src/backend/commands/indexcmds.c       | 3 +--
 src/backend/commands/statscmds.c       | 3 +--
 src/backend/commands/tablecmds.c       | 3 +--
 src/backend/postmaster/pgarch.c        | 3 +--
 src/backend/utils/adt/pg_locale.c      | 3 +--
 src/backend/utils/misc/ps_status.c     | 6 +++---
 src/timezone/pgtz.c                    | 5 ++---
 8 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index a61f38dd19..b258a14808 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -104,8 +104,7 @@ ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *pars
 
         if (parsed->xinfo & XACT_XINFO_HAS_GID)
         {
-            strlcpy(parsed->twophase_gid, data, sizeof(parsed->twophase_gid));
-            data += strlen(data) + 1;
+            data += strlcpy(parsed->twophase_gid, data, sizeof(parsed->twophase_gid)) + 1;
         }
     }
 
@@ -188,8 +187,7 @@ ParseAbortRecord(uint8 info, xl_xact_abort *xlrec, xl_xact_parsed_abort *parsed)
 
         if (parsed->xinfo & XACT_XINFO_HAS_GID)
         {
-            strlcpy(parsed->twophase_gid, data, sizeof(parsed->twophase_gid));
-            data += strlen(data) + 1;
+            data += strlcpy(parsed->twophase_gid, data, sizeof(parsed->twophase_gid)) + 1;
         }
     }
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index c3a53d81aa..8f1dc4ce5b 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2131,8 +2131,7 @@ ChooseIndexNameAddition(List *colnames)
          * At this point we have buflen <= NAMEDATALEN.  name should be less
          * than NAMEDATALEN already, but use strlcpy for paranoia.
          */
-        strlcpy(buf + buflen, name, NAMEDATALEN);
-        buflen += strlen(buf + buflen);
+        buflen += strlcpy(buf + buflen, name, NAMEDATALEN);
         if (buflen >= NAMEDATALEN)
             break;
     }
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 8274792a77..344c37f66f 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -527,8 +527,7 @@ ChooseExtendedStatisticNameAddition(List *exprs)
          * At this point we have buflen <= NAMEDATALEN.  name should be less
          * than NAMEDATALEN already, but use strlcpy for paranoia.
          */
-        strlcpy(buf + buflen, name, NAMEDATALEN);
-        buflen += strlen(buf + buflen);
+        buflen += strlcpy(buf + buflen, name, NAMEDATALEN);
         if (buflen >= NAMEDATALEN)
             break;
     }
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 515c29072c..1728f3670a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7242,8 +7242,7 @@ ChooseForeignKeyConstraintNameAddition(List *colnames)
          * At this point we have buflen <= NAMEDATALEN.  name should be less
          * than NAMEDATALEN already, but use strlcpy for paranoia.
          */
-        strlcpy(buf + buflen, name, NAMEDATALEN);
-        buflen += strlen(buf + buflen);
+        buflen += strlcpy(buf + buflen, name, NAMEDATALEN);
         if (buflen >= NAMEDATALEN)
             break;
     }
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index f84f882c4c..9ec9bf3fff 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -588,8 +588,7 @@ pgarch_archiveXlog(char *xlog)
                 case 'f':
                     /* %f: filename of source file */
                     sp++;
-                    strlcpy(dp, xlog, endp - dp);
-                    dp += strlen(dp);
+                    dp += strlcpy(dp, xlog, endp - dp);
                     break;
                 case '%':
                     /* convert %% to a single % */
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 50b8b31645..7c5d756fe3 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -741,8 +741,7 @@ strftime_win32(char *dst, size_t dstlen,
 
         if (convstr != dst)
         {
-            strlcpy(dst, convstr, dstlen);
-            len = strlen(dst);
+            len = strlcpy(dst, convstr, dstlen);
             pfree(convstr);
         }
     }
diff --git a/src/backend/utils/misc/ps_status.c b/src/backend/utils/misc/ps_status.c
index 67ba95c5f7..0fafd4c782 100644
--- a/src/backend/utils/misc/ps_status.c
+++ b/src/backend/utils/misc/ps_status.c
@@ -346,9 +346,9 @@ set_ps_display(const char *activity, bool force)
 #endif
 
     /* Update ps_buffer to contain both fixed part and activity */
-    strlcpy(ps_buffer + ps_buffer_fixed_size, activity,
-            ps_buffer_size - ps_buffer_fixed_size);
-    ps_buffer_cur_len = strlen(ps_buffer);
+    ps_buffer_cur_len = ps_buffer_fixed_size +
+        strlcpy(ps_buffer + ps_buffer_fixed_size, activity,
+                ps_buffer_size - ps_buffer_fixed_size);
 
     /* Transmit new setting to kernel, if necessary */
 
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index ee111a80d4..01c82872f3 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -51,7 +51,7 @@ pg_TZDIR(void)
         return tzdir;
 
     get_share_path(my_exec_path, tzdir);
-    strlcpy(tzdir + strlen(tzdir), "/timezone", MAXPGPATH - strlen(tzdir));
+    strlcat(tzdir, "/timezone", sizeof(tzdir));
 
     done_tzdir = true;
     return tzdir;
@@ -81,8 +81,7 @@ pg_open_tzfile(const char *name, char *canonname)
     int            orignamelen;
 
     /* Initialize fullname with base name of tzdata directory */
-    strlcpy(fullname, pg_TZDIR(), sizeof(fullname));
-    orignamelen = fullnamelen = strlen(fullname);
+    orignamelen = fullnamelen = strlcpy(fullname, pg_TZDIR(), sizeof(fullname));
 
     if (fullnamelen + 1 + strlen(name) >= MAXPGPATH)
         return -1;                /* not gonna fit */
-- 
2.20.1


Re: Using the return value of strlcpy() and strlcat()

From
Tom Lane
Date:
ilmari@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:
> [ let's convert
> +        strlcpy(buf + buflen, name, NAMEDATALEN);
> +        buflen += strlen(buf + buflen);
> to
> +        buflen += strlcpy(buf + buflen, name, NAMEDATALEN);
> ]

I don't think that's a safe transformation: what strlcpy returns is
strlen(src), which might be different from what it was actually
able to fit into the destination.

Sure, they're equivalent if no truncation occurred; but if we were
100.00% sure of no truncation, we'd likely not bother with strlcpy.

            regards, tom lane


Re: Using the return value of strlcpy() and strlcat()

From
Ashwin Agrawal
Date:

On Wed, Mar 13, 2019 at 9:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:
> [ let's convert
> +             strlcpy(buf + buflen, name, NAMEDATALEN);
> +             buflen += strlen(buf + buflen);
> to
> +             buflen += strlcpy(buf + buflen, name, NAMEDATALEN);
> ]

I don't think that's a safe transformation: what strlcpy returns is
strlen(src), which might be different from what it was actually
able to fit into the destination.

Sure, they're equivalent if no truncation occurred; but if we were
100.00% sure of no truncation, we'd likely not bother with strlcpy.

So, if return value < length (3rd argument) we should be able to use the return value and avoid the strlen, else do the strlen ?

Re: Using the return value of strlcpy() and strlcat()

From
Tom Lane
Date:
Ashwin Agrawal <aagrawal@pivotal.io> writes:
> On Wed, Mar 13, 2019 at 9:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think that's a safe transformation: what strlcpy returns is
>> strlen(src), which might be different from what it was actually
>> able to fit into the destination.
>> Sure, they're equivalent if no truncation occurred; but if we were
>> 100.00% sure of no truncation, we'd likely not bother with strlcpy.

> So, if return value < length (3rd argument) we should be able to use the
> return value and avoid the strlen, else do the strlen ?

Mmm ... if there's a way to do it that's not messy and typo-prone,
maybe.  But I'm dubious that the potential gain is worth complicating
the code.  The strings involved aren't usually all that long.

            regards, tom lane


Re: Using the return value of strlcpy() and strlcat()

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Ashwin Agrawal <aagrawal@pivotal.io> writes:
>> On Wed, Mar 13, 2019 at 9:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I don't think that's a safe transformation: what strlcpy returns is
>>> strlen(src), which might be different from what it was actually
>>> able to fit into the destination.

Yeah, Andrew Gierth pointed this out on IRC as well.

>>> Sure, they're equivalent if no truncation occurred; but if we were
>>> 100.00% sure of no truncation, we'd likely not bother with strlcpy.
>
>> So, if return value < length (3rd argument) we should be able to use the
>> return value and avoid the strlen, else do the strlen ?
>
> Mmm ... if there's a way to do it that's not messy and typo-prone,
> maybe.  But I'm dubious that the potential gain is worth complicating
> the code.  The strings involved aren't usually all that long.

Please consider this patch withdrawn.

- ilmari
-- 
"I use RMS as a guide in the same way that a boat captain would use
 a lighthouse.  It's good to know where it is, but you generally
 don't want to find yourself in the same spot." - Tollef Fog Heen


Re: Using the return value of strlcpy() and strlcat()

From
David Rowley
Date:
On Fri, 15 Mar 2019 at 00:11, Dagfinn Ilmari Mannsåker
<ilmari@ilmari.org> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > Mmm ... if there's a way to do it that's not messy and typo-prone,
> > maybe.  But I'm dubious that the potential gain is worth complicating
> > the code.  The strings involved aren't usually all that long.
>
> Please consider this patch withdrawn.

Amusingly it seems the strlcpy() return value of the chars it would
have copied if it didn't run out of space is not all that useful for
us. I see exactly 2 matches of git grep "= strlcpy".  The
isolation_init() one looks genuine, but the SerializeLibraryState()
looks a bit bogus. Looking at EstimateLibraryStateSpace() it seems it
estimates the exact space, so the strlcpy should never cut short, but
it does seem like a bad example to leave laying around.

We should have maybe thought a bit harder when we put that strlcpy
code into the codebase and considered if we might have been better off
inventing our own function that just returns what it did copy instead
of what it would have.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Using the return value of strlcpy() and strlcat()

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> We should have maybe thought a bit harder when we put that strlcpy
> code into the codebase and considered if we might have been better off
> inventing our own function that just returns what it did copy instead
> of what it would have.

Well, strlcpy is (somewhat) standardized; we didn't just invent it
off the cuff.  I thought a little bit about whether it would be worth
having a variant version with a different return value, but concluded
that having YA strcpy variant would more likely be a dangerous source
of thinkos than something that was actually helpful.  Otherwise I'd
have given Ashwin a more positive reaction...

            regards, tom lane