Thread: Proposal: add new API to stringinfo

Proposal: add new API to stringinfo

From
Tatsuo Ishii
Date:
Currently the StringInfo package provides StringInfo object creation
API with fixed length initial allocation size (1024 bytes). However,
if we want to allocate much smaller size of initial allocation, this
is waste of space.

Background: While working on this:
https://www.postgresql.org/message-id/20241219.151950.488757175470671324.ishii%40postgresql.org
I need to create lots of StringInfo many (like 100k) objects if a
Window frame has that number of rows.  In this case postgres allocates
1024 * 100k = 97.7MB of memory, which is too much because I usually
only need 10 or so bytes for each StringInfo data buffer.

To solve the problem I need to hack the internal of StringInfo object
like this.

len = 10;
str = makeStringInfo();
pfree(encoded_str->data);
str->data = (char *)palloc0(len);
str->maxlen = len;

This is not only ugly but breaks interface boundary. If we have
another API like makeStringInfoWithSize(), I could do that in much
better and simple way:

len = 10;
str = makeStringInfoWithSize(len);

Attached is a patch to implement it. In the patch I add two new APIs.

extern StringInfo makeStringInfoWithSize(int size);
extern void initStringInfoWithSize(StringInfo str, int size);

Maybe I could re-invent the wheel by copying stringinfo.c, but I think
there are some uses cases like me, and it could justify in adding more
code to stringinfo.c.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/common/stringinfo.c b/src/common/stringinfo.c
index 6192e65477..e740b499fa 100644
--- a/src/common/stringinfo.c
+++ b/src/common/stringinfo.c
@@ -46,6 +46,24 @@ makeStringInfo(void)
     return res;
 }
 
+/*
+ * makeStringInfoWithSize
+ *
+ * Create an empty 'StringInfoData' & return a pointer to it.
+ * The initial memory allocation size is specified by 'size'.
+ */
+StringInfo
+makeStringInfoWithSize(int size)
+{
+    StringInfo    res;
+
+    res = (StringInfo) palloc(sizeof(StringInfoData));
+
+    initStringInfoWithSize(res, size);
+
+    return res;
+}
+
 /*
  * initStringInfo
  *
@@ -57,6 +75,20 @@ initStringInfo(StringInfo str)
 {
     int            size = 1024;    /* initial default buffer size */
 
+    initStringInfoWithSize(str, size);
+}
+
+/*
+ * initStringInfoWithSize
+ *
+ * Same as initStringInfo except accepting additional 'size' argument for the
+ * initial memory allocation.
+ */
+void
+initStringInfoWithSize(StringInfo str, int size)
+{
+    Assert(size > 0);
+
     str->data = (char *) palloc(size);
     str->maxlen = size;
     resetStringInfo(str);
diff --git a/src/include/lib/stringinfo.h b/src/include/lib/stringinfo.h
index cd9632e3fc..5d603aa0cc 100644
--- a/src/include/lib/stringinfo.h
+++ b/src/include/lib/stringinfo.h
@@ -60,6 +60,10 @@ typedef StringInfoData *StringInfo;
  * StringInfo stringptr = makeStringInfo();
  *        Both the StringInfoData and the data buffer are palloc'd.
  *
+ * StringInfo stringptr = makeStringInfoWithSize(size);
+ *        Both the StringInfoData and the data buffer are palloc'd.
+ *        The data buffer is allocated with size 'size'.
+ *
  * StringInfoData string;
  * initStringInfo(&string);
  *        The data buffer is palloc'd but the StringInfoData is just local.
@@ -106,6 +110,13 @@ typedef StringInfoData *StringInfo;
  */
 extern StringInfo makeStringInfo(void);
 
+/*------------------------
+ * makeStringInfo
+ * Create an empty 'StringInfoData' & return a pointer to it.
+ * The data buffer is allocated with size 'size'.
+ */
+extern StringInfo makeStringInfoWithSize(int size);
+
 /*------------------------
  * initStringInfo
  * Initialize a StringInfoData struct (with previously undefined contents)
@@ -113,6 +124,14 @@ extern StringInfo makeStringInfo(void);
  */
 extern void initStringInfo(StringInfo str);
 
+/*------------------------
+ * initStringInfoWithSize
+ * Initialize a StringInfoData struct (with previously undefined contents)
+ * to describe an empty string.
+ * The data buffer is allocated with size 'size'.
+ */
+extern void initStringInfoWithSize(StringInfo str, int size);
+
 /*------------------------
  * initReadOnlyStringInfo
  * Initialize a StringInfoData struct from an existing string without copying

Re: Proposal: add new API to stringinfo

From
Tatsuo Ishii
Date:
> On Wed, Dec 25, 2024 at 12:37:04PM +0900, Tatsuo Ishii wrote:
>> Attached is a patch to implement it. In the patch I add two new APIs.
>> 
>> extern StringInfo makeStringInfoWithSize(int size);
>> extern void initStringInfoWithSize(StringInfo str, int size);
>> 
>> Maybe I could re-invent the wheel by copying stringinfo.c, but I think
>> there are some uses cases like me, and it could justify in adding more
>> code to stringinfo.c.
> 
> Not sure how other feel about that, but I am not really convinced that
> we need two APIs.

We could make initStringInfoWithSize() as a static function if we
don't want to add two new APIs. The reason it is exported in the patch
is that I thought it maybe useful for callers who use stringinfo in
this pattern.

StringInfoData string;
initStringInfoWithSize(&string);

> Saying that, having more control over the initial
> size used for a StringInfo could provide better practices in some
> cases.  This reminds me of the ALLOCSET_SMALL_* fields in memutils.h,
> hence an idea would be an initStringInfoExtended() that you could
> combine with two #define two: one for the "default" of 1024 and a
> second one for "small", like 32 or 64 (?), that can be used at will
> with the new API call.  Then you could switch initStringInfo() to
> become a macro of the new "extended" call.  Just an idea.

But then the extensions that use stringinfo.c need to be recompiled?

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Proposal: add new API to stringinfo

From
David Rowley
Date:
On Wed, 25 Dec 2024 at 16:37, Tatsuo Ishii <ishii@postgresql.org> wrote:
> Currently the StringInfo package provides StringInfo object creation
> API with fixed length initial allocation size (1024 bytes). However,
> if we want to allocate much smaller size of initial allocation, this
> is waste of space.

I think it would be good to have some method to allow callers to
specify the initial size. I'm personally surprise that the 1024
setting has been ok with all callers so far. StringInfo has always
reminded me of C#'s StringBuilder class, which has an overloaded
constructor to allow callers to specify the initial size. It seems
they realised that the default size isn't always ideal, so I don't
find it surprising that we've realised that too.

If you have trouble convincing people we need this for some new
reason, then maybe you could review the existing callers to see if
some of the existing call sites make it any more convincing.

A very quick review, I found:

send_message_to_frontend() initStringInfo(&buf) 1024 is probably too big
HandleParallelMessages() seems to know exactly how many bytes are
needed. Can this use a read-only StringInfo?
send_feedback() seems to know what size of the buffer it needs
buf_init() knows the exact size.

I didn't review the patch in detail, but I think "initsize" would be a
better parameter name than "size".

David