Thread: Cleaning up and speeding up string functions
Here's a small patch series aimed to both clean up a few misuses of string functions and also to optimise a few things along the way. 0001: Converts various call that use appendPQExpBuffer() that really should use appendPQExrBufferStr(). If there's no formatting then using the former function is a waste of effort. 0002: Similar to 0001 but replaces various appendStringInfo calls with appendStringInfoString calls. 0003: Adds a new function named appendStringInfoStringInfo() which appends one StringInfo onto another. Various places did this using appendStringInfoString(), but that required a needless strlen() call. The length is already known and stored in the StringInfo's len field. Not sure if this is the best name for this function, but can't think of a better one right now. 0004: inlines appendStringInfoString so that any callers that pass in a string constant (most of them) can have the strlen() call optimised out. I don't have any benchmarks to show workloads that this improves, Likely the chances that it'll slow anything down are pretty remote. I'll park this here until July. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley <david.rowley@2ndquadrant.com> writes: > Here's a small patch series aimed to both clean up a few misuses of > string functions and also to optimise a few things along the way. > 0001: Converts various call that use appendPQExpBuffer() that really > should use appendPQExrBufferStr(). If there's no formatting then > using the former function is a waste of effort. > 0002: Similar to 0001 but replaces various appendStringInfo calls with > appendStringInfoString calls. Agreed on these; we've applied such transformations before. > 0003: Adds a new function named appendStringInfoStringInfo() which > appends one StringInfo onto another. Various places did this using > appendStringInfoString(), but that required a needless strlen() call. I can't get excited about this one unless you can point to places where the savings is meaningful. Otherwise it's just adding mental burden. > 0004: inlines appendStringInfoString so that any callers that pass in > a string constant (most of them) can have the strlen() call optimised > out. Here the cost is code space rather than programmer-visible complexity, but I still doubt that it's worth it. regards, tom lane
On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <david.rowley@2ndquadrant.com> writes: > > 0003: Adds a new function named appendStringInfoStringInfo() which > > appends one StringInfo onto another. Various places did this using > > appendStringInfoString(), but that required a needless strlen() call. > > I can't get excited about this one unless you can point to places > where the savings is meaningful. Otherwise it's just adding mental > burden. The original idea was just to use appendBinaryStringInfo and make use of the StringInfo's len field. Peter mentioned he'd rather seen a wrapper function here [1]. > > 0004: inlines appendStringInfoString so that any callers that pass in > > a string constant (most of them) can have the strlen() call optimised > > out. > > Here the cost is code space rather than programmer-visible complexity, > but I still doubt that it's worth it. I see on today's master the postgres binary did grow from 8633960 bytes to 8642504 on my machine using GCC 8.3, so you might be right. pg_receivewal grew from 96376 to 96424 bytes. [1] https://www.postgresql.org/message-id/5567B7F5.7050705%40gmx.net -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2019-May-26, David Rowley wrote: > On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Here the cost is code space rather than programmer-visible complexity, > > but I still doubt that it's worth it. > > I see on today's master the postgres binary did grow from 8633960 > bytes to 8642504 on my machine using GCC 8.3, so you might be right. > pg_receivewal grew from 96376 to 96424 bytes. I suppose one place that could be affected visibly is JSON object construction (json.c, jsonfuncs.c) that could potentially deal with millions of stringinfo manipulations, but most of those calls don't actually use appendStringInfoString with constant values, so it's probably not worth bothering with. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 6 Jun 2019 at 08:54, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > On 2019-May-26, David Rowley wrote: > > > On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Here the cost is code space rather than programmer-visible complexity, > > > but I still doubt that it's worth it. > > > > I see on today's master the postgres binary did grow from 8633960 > > bytes to 8642504 on my machine using GCC 8.3, so you might be right. > > pg_receivewal grew from 96376 to 96424 bytes. > > I suppose one place that could be affected visibly is JSON object > construction (json.c, jsonfuncs.c) that could potentially deal with > millions of stringinfo manipulations, but most of those calls don't > actually use appendStringInfoString with constant values, so it's > probably not worth bothering with. We could probably get the best of both worlds by using a macro and __builtin_constant_p() to detect if the string is a const, but I won't be pushing for that unless I find something to make it worthwhile. For patch 0004, I think it's likely worth revising so instead of adding a new function, make use of appendBinaryStringInfo() and pass in the known length. Likely mostly for the xml.c calls. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sun, 26 May 2019 at 04:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <david.rowley@2ndquadrant.com> writes: > > Here's a small patch series aimed to both clean up a few misuses of > > string functions and also to optimise a few things along the way. > > > 0001: Converts various call that use appendPQExpBuffer() that really > > should use appendPQExrBufferStr(). If there's no formatting then > > using the former function is a waste of effort. > > > 0002: Similar to 0001 but replaces various appendStringInfo calls with > > appendStringInfoString calls. > > Agreed on these; we've applied such transformations before. I've pushed 0001 and 0002. Instead of having 0004, how about the attached? Most of the calls won't improve much performance-wise since they're so cheap anyway, but there is xmlconcat(), I imagine that should see some speedup. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote: > Instead of having 0004, how about the attached? > > Most of the calls won't improve much performance-wise since they're so > cheap anyway, but there is xmlconcat(), I imagine that should see some > speedup. I've pushed this after having found a couple more places where the length is known. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Cleaning up and speeding up string functions
From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
David Rowley <david.rowley@2ndquadrant.com> writes: > On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote: >> Instead of having 0004, how about the attached? >> >> Most of the calls won't improve much performance-wise since they're so >> cheap anyway, but there is xmlconcat(), I imagine that should see some >> speedup. > > I've pushed this after having found a couple more places where the > length is known. I noticed a lot of these are appending one StringInfo onto another; would it make sense to introduce a helper funciton appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the `str.data, str2.len` repetition? - ilmari -- "A disappointingly low fraction of the human race is, at any given time, on fire." - Stig Sandbeck Mathisen
Re: Cleaning up and speeding up string functions
From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes: > David Rowley <david.rowley@2ndquadrant.com> writes: > >> On Thu, 4 Jul 2019 at 13:51, David Rowley <david.rowley@2ndquadrant.com> wrote: >>> Instead of having 0004, how about the attached? >>> >>> Most of the calls won't improve much performance-wise since they're so >>> cheap anyway, but there is xmlconcat(), I imagine that should see some >>> speedup. >> >> I've pushed this after having found a couple more places where the >> length is known. > > I noticed a lot of these are appending one StringInfo onto another; > would it make sense to introduce a helper funciton > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the > `str.data, str2.len` repetition? A bit of grepping only turned up 18 uses, but I was bored and whipped up the attached anyway, in case we decide it's worth it. - ilmari -- "The surreality of the universe tends towards a maximum" -- Skud's Law "Never formulate a law or axiom that you're not prepared to live with the consequences of." -- Skud's Meta-Law From 1e68adae513425470cad10cd2a44f66bca61a5fd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org> Date: Mon, 22 Jul 2019 15:01:03 +0100 Subject: [PATCH] Add appendStringInfoStringInfo() function This simplifies appending one StringInfo to another --- contrib/hstore/hstore_io.c | 2 +- contrib/postgres_fdw/deparse.c | 4 ++-- src/backend/access/transam/xlog.c | 2 +- src/backend/executor/execMain.c | 2 +- src/backend/lib/stringinfo.c | 12 ++++++++++++ src/backend/storage/lmgr/deadlock.c | 2 +- src/backend/utils/adt/ri_triggers.c | 4 ++-- src/backend/utils/adt/ruleutils.c | 10 +++++----- src/backend/utils/adt/xml.c | 12 +++++------- src/include/lib/stringinfo.h | 7 +++++++ 10 files changed, 37 insertions(+), 20 deletions(-) diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c index 745497c76f..58415ccaec 100644 --- a/contrib/hstore/hstore_io.c +++ b/contrib/hstore/hstore_io.c @@ -1309,7 +1309,7 @@ hstore_to_json_loose(PG_FUNCTION_ARGS) appendBinaryStringInfo(&tmp, HSTORE_VAL(entries, base, i), HSTORE_VALLEN(entries, i)); if (IsValidJsonNumber(tmp.data, tmp.len)) - appendBinaryStringInfo(&dst, tmp.data, tmp.len); + appendStringInfoStringInfo(&dst, &tmp); else escape_json(&dst, tmp.data); } diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 19f928ec59..510e034e8e 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1531,7 +1531,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, { Assert(fpinfo->jointype == JOIN_INNER); Assert(fpinfo->joinclauses == NIL); - appendBinaryStringInfo(buf, join_sql_o.data, join_sql_o.len); + appendStringInfoStringInfo(buf, &join_sql_o); return; } } @@ -1552,7 +1552,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, { Assert(fpinfo->jointype == JOIN_INNER); Assert(fpinfo->joinclauses == NIL); - appendBinaryStringInfo(buf, join_sql_i.data, join_sql_i.len); + appendStringInfoStringInfo(buf, &join_sql_i); return; } } diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index da3d250986..e674c63056 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1192,7 +1192,7 @@ XLogInsertRecord(XLogRecData *rdata, */ initStringInfo(&recordBuf); for (; rdata != NULL; rdata = rdata->next) - appendBinaryStringInfo(&recordBuf, rdata->data, rdata->len); + appendStringInfoStringInfo(&recordBuf, rdata); if (!debug_reader) debug_reader = XLogReaderAllocate(wal_segment_size, NULL, NULL); diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index dbd7dd9bcd..1969b36891 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -2309,7 +2309,7 @@ ExecBuildSlotValueDescription(Oid reloid, if (!table_perm) { appendStringInfoString(&collist, ") = "); - appendBinaryStringInfo(&collist, buf.data, buf.len); + appendStringInfoStringInfo(&collist, &buf); return collist.data; } diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c index 99c83c1549..d25d122289 100644 --- a/src/backend/lib/stringinfo.c +++ b/src/backend/lib/stringinfo.c @@ -165,6 +165,18 @@ appendStringInfoString(StringInfo str, const char *s) appendBinaryStringInfo(str, s, strlen(s)); } +/* + * appendStringInfoStringInfo + * + * Append another StringInfo to str. + * Like appendBinaryStringInfo(str, str2->data, str2->len) + */ +void +appendStringInfoStringInfo(StringInfo str, const StringInfo str2) +{ + appendBinaryStringInfo(str, str2->data, str2->len); +} + /* * appendStringInfoChar * diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c index 990d48377d..14a47b9e66 100644 --- a/src/backend/storage/lmgr/deadlock.c +++ b/src/backend/storage/lmgr/deadlock.c @@ -1115,7 +1115,7 @@ DeadLockReport(void) } /* Duplicate all the above for the server ... */ - appendBinaryStringInfo(&logbuf, clientbuf.data, clientbuf.len); + appendStringInfoStringInfo(&logbuf, &clientbuf); /* ... and add info about query strings */ for (i = 0; i < nDeadlockDetails; i++) diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c index 8c895459c3..5b4872db09 100644 --- a/src/backend/utils/adt/ri_triggers.c +++ b/src/backend/utils/adt/ri_triggers.c @@ -927,7 +927,7 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS) queryoids[i] = pk_type; queryoids[j] = pk_type; } - appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len); + appendStringInfoStringInfo(&querybuf, &qualbuf); /* Prepare and save the plan */ qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys * 2, queryoids, @@ -1106,7 +1106,7 @@ ri_set(TriggerData *trigdata, bool is_set_null) qualsep = "AND"; queryoids[i] = pk_type; } - appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len); + appendStringInfoStringInfo(&querybuf, &qualbuf); /* Prepare and save the plan */ qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids, diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 0c58f1f109..1801899ec1 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -2804,9 +2804,9 @@ pg_get_functiondef(PG_FUNCTION_ARGS) appendStringInfoChar(&dq, 'x'); appendStringInfoChar(&dq, '$'); - appendBinaryStringInfo(&buf, dq.data, dq.len); + appendStringInfoStringInfo(&buf, &dq); appendStringInfoString(&buf, prosrc); - appendBinaryStringInfo(&buf, dq.data, dq.len); + appendStringInfoStringInfo(&buf, &dq); appendStringInfoChar(&buf, '\n'); @@ -2930,7 +2930,7 @@ print_function_rettype(StringInfo buf, HeapTuple proctup) appendStringInfoString(&rbuf, format_type_be(proc->prorettype)); } - appendBinaryStringInfo(buf, rbuf.data, rbuf.len); + appendStringInfoStringInfo(buf, &rbuf); } /* @@ -5682,7 +5682,7 @@ get_target_list(List *targetList, deparse_context *context, } /* Add the new field */ - appendBinaryStringInfo(buf, targetbuf.data, targetbuf.len); + appendStringInfoStringInfo(buf, &targetbuf); } /* clean up */ @@ -9987,7 +9987,7 @@ get_from_clause(Query *query, const char *prefix, deparse_context *context) } /* Add the new item */ - appendBinaryStringInfo(buf, itembuf.data, itembuf.len); + appendStringInfoStringInfo(buf, &itembuf); /* clean up */ pfree(itembuf.data); diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c index 5e629d29ea..2a3f5403cf 100644 --- a/src/backend/utils/adt/xml.c +++ b/src/backend/utils/adt/xml.c @@ -559,7 +559,7 @@ xmlconcat(List *args) 0, global_standalone); - appendBinaryStringInfo(&buf2, buf.data, buf.len); + appendStringInfoStringInfo(&buf2, &buf); buf = buf2; } @@ -1879,8 +1879,7 @@ xml_errorHandler(void *data, xmlErrorPtr error) if (xmlerrcxt->strictness == PG_XML_STRICTNESS_LEGACY) { appendStringInfoLineSeparator(&xmlerrcxt->err_buf); - appendBinaryStringInfo(&xmlerrcxt->err_buf, errorBuf->data, - errorBuf->len); + appendStringInfoStringInfo(&xmlerrcxt->err_buf, errorBuf); pfree(errorBuf->data); pfree(errorBuf); @@ -1898,8 +1897,7 @@ xml_errorHandler(void *data, xmlErrorPtr error) if (level >= XML_ERR_ERROR) { appendStringInfoLineSeparator(&xmlerrcxt->err_buf); - appendBinaryStringInfo(&xmlerrcxt->err_buf, errorBuf->data, - errorBuf->len); + appendStringInfoStringInfo(&xmlerrcxt->err_buf, errorBuf); xmlerrcxt->err_occurred = true; } @@ -2876,7 +2874,7 @@ schema_to_xml_internal(Oid nspid, const char *xmlschema, bool nulls, subres = table_to_xml_internal(relid, NULL, nulls, tableforest, targetns, false); - appendBinaryStringInfo(result, subres->data, subres->len); + appendStringInfoStringInfo(result, subres); appendStringInfoChar(result, '\n'); } @@ -3051,7 +3049,7 @@ database_to_xml_internal(const char *xmlschema, bool nulls, subres = schema_to_xml_internal(nspid, NULL, nulls, tableforest, targetns, false); - appendBinaryStringInfo(result, subres->data, subres->len); + appendStringInfoStringInfo(result, subres); appendStringInfoChar(result, '\n'); } diff --git a/src/include/lib/stringinfo.h b/src/include/lib/stringinfo.h index c4842778c5..8a6ffd7ce8 100644 --- a/src/include/lib/stringinfo.h +++ b/src/include/lib/stringinfo.h @@ -112,6 +112,13 @@ extern int appendStringInfoVA(StringInfo str, const char *fmt, va_list args) pg_ */ extern void appendStringInfoString(StringInfo str, const char *s); +/*------------------------ + * appendStringInfoString + * Append another StringInfo to str. + * Like appendBinaryStringInfo(str, str2->data, str2->len) + */ +extern void appendStringInfoStringInfo(StringInfo str, const StringInfo str2); + /*------------------------ * appendStringInfoChar * Append a single byte to str. -- 2.20.1
On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote: > ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes: > > > I noticed a lot of these are appending one StringInfo onto another; > > would it make sense to introduce a helper funciton > > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the > > `str.data, str2.len` repetition? > > A bit of grepping only turned up 18 uses, but I was bored and whipped up > the attached anyway, in case we decide it's worth it. David had already submitted the same thing upthread, and it was rejected on the grounds that it increases the code space. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Cleaning up and speeding up string functions
From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-Jul-22, Dagfinn Ilmari Mannsåker wrote: > >> ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes: >> >> > I noticed a lot of these are appending one StringInfo onto another; >> > would it make sense to introduce a helper funciton >> > appendStringInfoStringInfo(StringInfo str, StringInfo str2) to avoid the >> > `str.data, str2.len` repetition? >> >> A bit of grepping only turned up 18 uses, but I was bored and whipped up >> the attached anyway, in case we decide it's worth it. > > David had already submitted the same thing upthread, and it was rejected > on the grounds that it increases the code space. Oops, sorry, I missed that. Never mind then. - ilmari -- "The surreality of the universe tends towards a maximum" -- Skud's Law "Never formulate a law or axiom that you're not prepared to live with the consequences of." -- Skud's Meta-Law