Re: Making aggregate deserialization (and WAL receive) functions slightly faster - Mailing list pgsql-hackers

From David Rowley
Subject Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Date
Msg-id CAApHDvorfO3iBZ=xpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ@mail.gmail.com
Whole thread Raw
In response to Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Re: Making aggregate deserialization (and WAL receive) functions slightly faster
List pgsql-hackers
On Mon, 9 Oct 2023 at 21:17, David Rowley <dgrowleyml@gmail.com> wrote:
> Here are some more thoughts on how we could improve this:
>
> 1. Adjust the definition of StringInfoData.maxlen to define that -1
> means the StringInfoData's buffer is externally managed.
> 2. Adjust enlargeStringInfo() to add a check for maxlen = -1 and have
> it palloc, say, pg_next_pow2(str->len * 2) bytes and memcpy the
> existing (externally managed string) into the newly palloc'd buffer.
> 3. Add a new function along the lines of what I originally proposed to
> allow init of a StringInfoData using an existing allocated string
> which sets maxlen = -1.
> 4. Update all the existing places, including the ones I just committed
> (plus the ones you committed in ba1e066e4) to make use of the function
> added in #3.

I just spent the past few hours playing around with the attached WIP
patch to try to clean up the various places where we manually build
StringInfoDatas around the tree.

While working on this, I added an Assert in the new
initStringInfoFromStringWithLen function to ensure that data[len] ==
'\0' per the "There is guaranteed to be a terminating '\0' at
data[len]" comment in stringinfo.h.  It looks like we have some
existing breakers of this rule.

If you apply the attached patch to 608fd198de~1 and ignore the
rejected hunks from the deserial functions, you'll see an Assert
failure during 023_twophase_stream.pl

023_twophase_stream_subscriber.log indicates:
TRAP: failed Assert("data[len] == '\0'"), File:
"../../../../src/include/lib/stringinfo.h", Line: 97, PID: 1073141
postgres: subscriber: logical replication parallel apply worker for
subscription 16396 (ExceptionalCondition+0x70)[0x56160451e9d0]
postgres: subscriber: logical replication parallel apply worker for
subscription 16396 (ParallelApplyWorkerMain+0x53c)[0x5616043618cc]
postgres: subscriber: logical replication parallel apply worker for
subscription 16396 (StartBackgroundWorker+0x20b)[0x56160434452b]

So it seems like we have some existing issues with
LogicalParallelApplyLoop(). The code there does not properly NUL
terminate the StringInfoData.data field. There are some examples in
exec_bind_message() of how that could be fixed. I've CC'd Amit to let
him know about this.

I'll also need to revert 608fd198 as this also highlights that setting
the StringInfoData.data to point to a bytea Datum can't be done either
as those aren't NUL terminated strings.

If people think it's worthwhile having something like the attached to
try to eliminate our need to manually build StringInfoDatas then I can
spend more time on it once LogicalParallelApplyLoop() is fixed.

David

Attachment

pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: Amit Kapila
Date:
Subject: Re: typo in couple of places