Thread: Cleaning up and speeding up string functions

Cleaning up and speeding up string functions

From
David Rowley
Date:
Here's a small patch series aimed to both clean up a few misuses of
string functions and also to optimise a few things along the way.

0001: Converts various call that use appendPQExpBuffer() that really
should use appendPQExrBufferStr().  If there's no formatting then
using the former function is a waste of effort.

0002: Similar to 0001 but replaces various appendStringInfo calls with
appendStringInfoString calls.

0003: Adds a new function named appendStringInfoStringInfo() which
appends one StringInfo onto another.  Various places did this using
appendStringInfoString(), but that required a needless strlen() call.
The length is already known and stored in the StringInfo's len field.
Not sure if this is the best name for this function, but can't think
of a better one right now.

0004: inlines appendStringInfoString so that any callers that pass in
a string constant (most of them) can have the strlen() call optimised
out.

I don't have any benchmarks to show workloads that this improves,
Likely the chances that it'll slow anything down are pretty remote.

I'll park this here until July.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Cleaning up and speeding up string functions

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> Here's a small patch series aimed to both clean up a few misuses of
> string functions and also to optimise a few things along the way.

> 0001: Converts various call that use appendPQExpBuffer() that really
> should use appendPQExrBufferStr().  If there's no formatting then
> using the former function is a waste of effort.

> 0002: Similar to 0001 but replaces various appendStringInfo calls with
> appendStringInfoString calls.

Agreed on these; we've applied such transformations before.

> 0003: Adds a new function named appendStringInfoStringInfo() which
> appends one StringInfo onto another.  Various places did this using
> appendStringInfoString(), but that required a needless strlen() call.

I can't get excited about this one unless you can point to places
where the savings is meaningful.  Otherwise it's just adding mental
burden.

> 0004: inlines appendStringInfoString so that any callers that pass in
> a string constant (most of them) can have the strlen() call optimised
> out.

Here the cost is code space rather than programmer-visible complexity,
but I still doubt that it's worth it.

            regards, tom lane



Re: Cleaning up and speeding up string functions

From
David Rowley
Date:
On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > 0003: Adds a new function named appendStringInfoStringInfo() which
> > appends one StringInfo onto another.  Various places did this using
> > appendStringInfoString(), but that required a needless strlen() call.
>
> I can't get excited about this one unless you can point to places
> where the savings is meaningful.  Otherwise it's just adding mental
> burden.

The original idea was just to use appendBinaryStringInfo and make use
of the StringInfo's len field. Peter mentioned he'd rather seen a
wrapper function here [1].

> > 0004: inlines appendStringInfoString so that any callers that pass in
> > a string constant (most of them) can have the strlen() call optimised
> > out.
>
> Here the cost is code space rather than programmer-visible complexity,
> but I still doubt that it's worth it.

I see on today's master the postgres binary did grow from 8633960
bytes to 8642504 on my machine using GCC 8.3, so you might be right.
pg_receivewal grew from 96376 to 96424 bytes.

[1] https://www.postgresql.org/message-id/5567B7F5.7050705%40gmx.net


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Cleaning up and speeding up string functions

From
Alvaro Herrera
Date:
On 2019-May-26, David Rowley wrote:

> On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > Here the cost is code space rather than programmer-visible complexity,
> > but I still doubt that it's worth it.
> 
> I see on today's master the postgres binary did grow from 8633960
> bytes to 8642504 on my machine using GCC 8.3, so you might be right.
> pg_receivewal grew from 96376 to 96424 bytes.

I suppose one place that could be affected visibly is JSON object
construction (json.c, jsonfuncs.c) that could potentially deal with
millions of stringinfo manipulations, but most of those calls don't
actually use appendStringInfoString with constant values, so it's
probably not worth bothering with.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Cleaning up and speeding up string functions

From
David Rowley
Date:
On Thu, 6 Jun 2019 at 08:54, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-May-26, David Rowley wrote:
>
> > On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > > Here the cost is code space rather than programmer-visible complexity,
> > > but I still doubt that it's worth it.
> >
> > I see on today's master the postgres binary did grow from 8633960
> > bytes to 8642504 on my machine using GCC 8.3, so you might be right.
> > pg_receivewal grew from 96376 to 96424 bytes.
>
> I suppose one place that could be affected visibly is JSON object
> construction (json.c, jsonfuncs.c) that could potentially deal with
> millions of stringinfo manipulations, but most of those calls don't
> actually use appendStringInfoString with constant values, so it's
> probably not worth bothering with.

We could probably get the best of both worlds by using a macro and
__builtin_constant_p() to detect if the string is a const, but I won't
be pushing for that unless I find something to make it worthwhile.

For patch 0004, I think it's likely worth revising so instead of
adding a new function, make use of appendBinaryStringInfo() and pass
in the known length. Likely mostly for the xml.c calls.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Cleaning up and speeding up string functions

From
David Rowley
Date:
On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > Here's a small patch series aimed to both clean up a few misuses of
> > string functions and also to optimise a few things along the way.
>
> > 0001: Converts various call that use appendPQExpBuffer() that really
> > should use appendPQExrBufferStr().  If there's no formatting then
> > using the former function is a waste of effort.
>
> > 0002: Similar to 0001 but replaces various appendStringInfo calls with
> > appendStringInfoString calls.
>
> Agreed on these; we've applied such transformations before.

I've pushed 0001 and 0002.

Instead of having 0004, how about the attached?

Most of the calls won't improve much performance-wise since they're so
cheap anyway, but there is xmlconcat(), I imagine that should see some
speedup.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Cleaning up and speeding up string functions

From
David Rowley
Date:
On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote:
> Instead of having 0004, how about the attached?
>
> Most of the calls won't improve much performance-wise since they're so
> cheap anyway, but there is xmlconcat(), I imagine that should see some
> speedup.

I've pushed this after having found a couple more places where the
length is known.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Cleaning up and speeding up string functions

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:

> On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> Instead of having 0004, how about the attached?
>>
>> Most of the calls won't improve much performance-wise since they're so
>> cheap anyway, but there is xmlconcat(), I imagine that should see some
>> speedup.
>
> I've pushed this after having found a couple more places where the
> length is known.

I noticed a lot of these are appending one StringInfo onto another;
would it make sense to introduce a helper funciton
appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
`str.data, str2.len` repetition?

- ilmari
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen



Re: Cleaning up and speeding up string functions

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

> David Rowley <david.rowley@2ndquadrant.com> writes:
>
>> On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote:
>>> Instead of having 0004, how about the attached?
>>>
>>> Most of the calls won't improve much performance-wise since they're so
>>> cheap anyway, but there is xmlconcat(), I imagine that should see some
>>> speedup.
>>
>> I've pushed this after having found a couple more places where the
>> length is known.
>
> I noticed a lot of these are appending one StringInfo onto another;
> would it make sense to introduce a helper funciton
> appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
> `str.data, str2.len` repetition?

A bit of grepping only turned up 18 uses, but I was bored and whipped up
the attached anyway, in case we decide it's worth it.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."                              -- Skud's Meta-Law

From 1e68adae513425470cad10cd2a44f66bca61a5fd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Mon, 22 Jul 2019 15:01:03 +0100
Subject: [PATCH] Add appendStringInfoStringInfo() function

This simplifies appending one StringInfo to another
---
 contrib/hstore/hstore_io.c          |  2 +-
 contrib/postgres_fdw/deparse.c      |  4 ++--
 src/backend/access/transam/xlog.c   |  2 +-
 src/backend/executor/execMain.c     |  2 +-
 src/backend/lib/stringinfo.c        | 12 ++++++++++++
 src/backend/storage/lmgr/deadlock.c |  2 +-
 src/backend/utils/adt/ri_triggers.c |  4 ++--
 src/backend/utils/adt/ruleutils.c   | 10 +++++-----
 src/backend/utils/adt/xml.c         | 12 +++++-------
 src/include/lib/stringinfo.h        |  7 +++++++
 10 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
index 745497c76f..58415ccaec 100644
--- a/contrib/hstore/hstore_io.c
+++ b/contrib/hstore/hstore_io.c
@@ -1309,7 +1309,7 @@ hstore_to_json_loose(PG_FUNCTION_ARGS)
             appendBinaryStringInfo(&tmp, HSTORE_VAL(entries, base, i),
                                    HSTORE_VALLEN(entries, i));
             if (IsValidJsonNumber(tmp.data, tmp.len))
-                appendBinaryStringInfo(&dst, tmp.data, tmp.len);
+                appendStringInfoStringInfo(&dst, &tmp);
             else
                 escape_json(&dst, tmp.data);
         }
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 19f928ec59..510e034e8e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1531,7 +1531,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
             {
                 Assert(fpinfo->jointype == JOIN_INNER);
                 Assert(fpinfo->joinclauses == NIL);
-                appendBinaryStringInfo(buf, join_sql_o.data, join_sql_o.len);
+                appendStringInfoStringInfo(buf, &join_sql_o);
                 return;
             }
         }
@@ -1552,7 +1552,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
             {
                 Assert(fpinfo->jointype == JOIN_INNER);
                 Assert(fpinfo->joinclauses == NIL);
-                appendBinaryStringInfo(buf, join_sql_i.data, join_sql_i.len);
+                appendStringInfoStringInfo(buf, &join_sql_i);
                 return;
             }
         }
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index da3d250986..e674c63056 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1192,7 +1192,7 @@ XLogInsertRecord(XLogRecData *rdata,
          */
         initStringInfo(&recordBuf);
         for (; rdata != NULL; rdata = rdata->next)
-            appendBinaryStringInfo(&recordBuf, rdata->data, rdata->len);
+            appendStringInfoStringInfo(&recordBuf, rdata);
 
         if (!debug_reader)
             debug_reader = XLogReaderAllocate(wal_segment_size, NULL, NULL);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbd7dd9bcd..1969b36891 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2309,7 +2309,7 @@ ExecBuildSlotValueDescription(Oid reloid,
     if (!table_perm)
     {
         appendStringInfoString(&collist, ") = ");
-        appendBinaryStringInfo(&collist, buf.data, buf.len);
+        appendStringInfoStringInfo(&collist, &buf);
 
         return collist.data;
     }
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 99c83c1549..d25d122289 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -165,6 +165,18 @@ appendStringInfoString(StringInfo str, const char *s)
     appendBinaryStringInfo(str, s, strlen(s));
 }
 
+/*
+ * appendStringInfoStringInfo
+ *
+ * Append another StringInfo to str.
+ * Like appendBinaryStringInfo(str, str2->data, str2->len)
+ */
+void
+appendStringInfoStringInfo(StringInfo str, const StringInfo str2)
+{
+    appendBinaryStringInfo(str, str2->data, str2->len);
+}
+
 /*
  * appendStringInfoChar
  *
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 990d48377d..14a47b9e66 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -1115,7 +1115,7 @@ DeadLockReport(void)
     }
 
     /* Duplicate all the above for the server ... */
-    appendBinaryStringInfo(&logbuf, clientbuf.data, clientbuf.len);
+    appendStringInfoStringInfo(&logbuf, &clientbuf);
 
     /* ... and add info about query strings */
     for (i = 0; i < nDeadlockDetails; i++)
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 8c895459c3..5b4872db09 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -927,7 +927,7 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
             queryoids[i] = pk_type;
             queryoids[j] = pk_type;
         }
-        appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len);
+        appendStringInfoStringInfo(&querybuf, &qualbuf);
 
         /* Prepare and save the plan */
         qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys * 2, queryoids,
@@ -1106,7 +1106,7 @@ ri_set(TriggerData *trigdata, bool is_set_null)
             qualsep = "AND";
             queryoids[i] = pk_type;
         }
-        appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len);
+        appendStringInfoStringInfo(&querybuf, &qualbuf);
 
         /* Prepare and save the plan */
         qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 0c58f1f109..1801899ec1 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -2804,9 +2804,9 @@ pg_get_functiondef(PG_FUNCTION_ARGS)
         appendStringInfoChar(&dq, 'x');
     appendStringInfoChar(&dq, '$');
 
-    appendBinaryStringInfo(&buf, dq.data, dq.len);
+    appendStringInfoStringInfo(&buf, &dq);
     appendStringInfoString(&buf, prosrc);
-    appendBinaryStringInfo(&buf, dq.data, dq.len);
+    appendStringInfoStringInfo(&buf, &dq);
 
     appendStringInfoChar(&buf, '\n');
 
@@ -2930,7 +2930,7 @@ print_function_rettype(StringInfo buf, HeapTuple proctup)
         appendStringInfoString(&rbuf, format_type_be(proc->prorettype));
     }
 
-    appendBinaryStringInfo(buf, rbuf.data, rbuf.len);
+    appendStringInfoStringInfo(buf, &rbuf);
 }
 
 /*
@@ -5682,7 +5682,7 @@ get_target_list(List *targetList, deparse_context *context,
         }
 
         /* Add the new field */
-        appendBinaryStringInfo(buf, targetbuf.data, targetbuf.len);
+        appendStringInfoStringInfo(buf, &targetbuf);
     }
 
     /* clean up */
@@ -9987,7 +9987,7 @@ get_from_clause(Query *query, const char *prefix, deparse_context *context)
             }
 
             /* Add the new item */
-            appendBinaryStringInfo(buf, itembuf.data, itembuf.len);
+            appendStringInfoStringInfo(buf, &itembuf);
 
             /* clean up */
             pfree(itembuf.data);
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 5e629d29ea..2a3f5403cf 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -559,7 +559,7 @@ xmlconcat(List *args)
                        0,
                        global_standalone);
 
-        appendBinaryStringInfo(&buf2, buf.data, buf.len);
+        appendStringInfoStringInfo(&buf2, &buf);
         buf = buf2;
     }
 
@@ -1879,8 +1879,7 @@ xml_errorHandler(void *data, xmlErrorPtr error)
     if (xmlerrcxt->strictness == PG_XML_STRICTNESS_LEGACY)
     {
         appendStringInfoLineSeparator(&xmlerrcxt->err_buf);
-        appendBinaryStringInfo(&xmlerrcxt->err_buf, errorBuf->data,
-                               errorBuf->len);
+        appendStringInfoStringInfo(&xmlerrcxt->err_buf, errorBuf);
 
         pfree(errorBuf->data);
         pfree(errorBuf);
@@ -1898,8 +1897,7 @@ xml_errorHandler(void *data, xmlErrorPtr error)
     if (level >= XML_ERR_ERROR)
     {
         appendStringInfoLineSeparator(&xmlerrcxt->err_buf);
-        appendBinaryStringInfo(&xmlerrcxt->err_buf, errorBuf->data,
-                               errorBuf->len);
+        appendStringInfoStringInfo(&xmlerrcxt->err_buf, errorBuf);
 
         xmlerrcxt->err_occurred = true;
     }
@@ -2876,7 +2874,7 @@ schema_to_xml_internal(Oid nspid, const char *xmlschema, bool nulls,
         subres = table_to_xml_internal(relid, NULL, nulls, tableforest,
                                        targetns, false);
 
-        appendBinaryStringInfo(result, subres->data, subres->len);
+        appendStringInfoStringInfo(result, subres);
         appendStringInfoChar(result, '\n');
     }
 
@@ -3051,7 +3049,7 @@ database_to_xml_internal(const char *xmlschema, bool nulls,
         subres = schema_to_xml_internal(nspid, NULL, nulls,
                                         tableforest, targetns, false);
 
-        appendBinaryStringInfo(result, subres->data, subres->len);
+        appendStringInfoStringInfo(result, subres);
         appendStringInfoChar(result, '\n');
     }
 
diff --git a/src/include/lib/stringinfo.h b/src/include/lib/stringinfo.h
index c4842778c5..8a6ffd7ce8 100644
--- a/src/include/lib/stringinfo.h
+++ b/src/include/lib/stringinfo.h
@@ -112,6 +112,13 @@ extern int    appendStringInfoVA(StringInfo str, const char *fmt, va_list args) pg_
  */
 extern void appendStringInfoString(StringInfo str, const char *s);
 
+/*------------------------
+ * appendStringInfoString
+ * Append another StringInfo to str.
+ * Like appendBinaryStringInfo(str, str2->data, str2->len)
+ */
+extern void appendStringInfoStringInfo(StringInfo str, const StringInfo str2);
+
 /*------------------------
  * appendStringInfoChar
  * Append a single byte to str.
-- 
2.20.1


Re: Cleaning up and speeding up string functions

From
Alvaro Herrera
Date:
On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote:

> ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:
> 
> > I noticed a lot of these are appending one StringInfo onto another;
> > would it make sense to introduce a helper funciton
> > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
> > `str.data, str2.len` repetition?
> 
> A bit of grepping only turned up 18 uses, but I was bored and whipped up
> the attached anyway, in case we decide it's worth it.

David had already submitted the same thing upthread, and it was rejected
on the grounds that it increases the code space.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Cleaning up and speeding up string functions

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:

> On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote:
>
>> ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:
>> 
>> > I noticed a lot of these are appending one StringInfo onto another;
>> > would it make sense to introduce a helper funciton
>> > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the
>> > `str.data, str2.len` repetition?
>> 
>> A bit of grepping only turned up 18 uses, but I was bored and whipped up
>> the attached anyway, in case we decide it's worth it.
>
> David had already submitted the same thing upthread, and it was rejected
> on the grounds that it increases the code space.

Oops, sorry, I missed that. Never mind then.

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."                              -- Skud's Meta-Law