Re: Logical Replica ReorderBuffer Size Accounting Issues - Mailing list pgsql-bugs

From Alex Richman
Subject Re: Logical Replica ReorderBuffer Size Accounting Issues
Date
Msg-id CAMnUB3pwknqoe5s-bGuRD8nX1bWkZRbFF=jWNLTWbm_etFigkA@mail.gmail.com
In response to RE: Logical Replica ReorderBuffer Size Accounting Issues  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Responses RE: Logical Replica ReorderBuffer Size Accounting Issues  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
List pgsql-bugs
On Thu, 12 Jan 2023 at 10:44, wangw.fnst@fujitsu.com <wangw.fnst@fujitsu.com> wrote:
> I think parallelism doesn't affect this problem. Because for a walsender, I
> think it will always read the wal serially in order. Please let me know if I'm
> missing something.
I suspect it's more about getting changes into the WAL quickly enough that the walsender never spends any time idle. I suppose you could stack the deck towards this by first disabling the subscription, doing the updates to spool a bunch of changes into the WAL, then enabling the subscription again. Perhaps the interleaving of WAL records from the concurrent updates also has some impact, making more work for the reorder buffer.
The servers I am testing on are quite beefy, so it might be a little harder to generate sufficient load if you're testing locally on a laptop or something.
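
The disable / update / enable sequence above could be written down roughly like this. This is only a sketch: the helper name `backlog_test_plan`, the subscription name and the workload statement are hypothetical placeholders, and it only builds the list of statements (it does not connect to anything). `ALTER SUBSCRIPTION ... DISABLE` / `ENABLE` are run on the subscriber, the workload on the publisher.

```python
def backlog_test_plan(subscription, workload_sql):
    """Build an ordered list of (server, statement) steps.

    Hypothetical helper for illustration only: it returns the statements
    so they can be run by hand (e.g. via psql) on the right server.
    """
    # Quote the subscription name as an SQL identifier.
    quoted = '"' + subscription.replace('"', '""') + '"'

    # 1. Stop the subscriber from consuming changes.
    steps = [("subscriber", f"ALTER SUBSCRIPTION {quoted} DISABLE;")]
    # 2. Run the workload so its changes pile up in the publisher's WAL.
    steps += [("publisher", stmt) for stmt in workload_sql]
    # 3. Re-enable, so the walsender has a backlog to decode without idling.
    steps.append(("subscriber", f"ALTER SUBSCRIPTION {quoted} ENABLE;"))
    return steps


if __name__ == "__main__":
    # "my_sub" and the UPDATE below are placeholders, not the real test case.
    for server, stmt in backlog_test_plan("my_sub", ["UPDATE t SET x = x + 1;"]):
        print(f"[{server}] {stmt}")
```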
 
> And I tried to use the table structure and UPDATE statement you said. But
> unfortunately I didn't catch 1GB or unexpected (I mean a lot size beyond 256MB)
> usage in rb->tup_context. Could you please help me to confirm my test? Here is
> my test details:
Here are test scripts that reproduce it for me: [1]
This is on 15.1, installed on Debian 11, running on GCP n2-highmem-80 (IceLake) with 24x Local SSD in RAID 0.
 
> BTW, I'm not sure, what is the operator '@-' at the end of the UPDATE statement
> you mentioned? Do you mean '#-'? I think JSONB seem not to have operator '@-'.
> So I used '#-' instead of '@-' when testing.
Not sure tbh. I pulled this out of our source control, at the call site generating the prod traffic I'm seeing. It seems to do the same thing as '-' though (deleting elements from the column matching keys in the array). Separately, @ is "absolute value" [2], so maybe this is just @ and - in a weird mix that happens to work :shrug:.
 
> Could you share one thing with me: When you print rb->size and call the
> function MemoryContextStats(rb->context), which line of code is being executed
> by the program?
When I was checking rb->size against the memory context, I was logging in and around ReorderBufferGetTuple.


To support the theory that GenerationAlloc is the issue, I have a patch that replaces ReorderBuffer's usage of it with simple malloc/free: [3]. Naturally this would not be sensible as a long-term fix, but it does demonstrate the issue.
With the patch, the walsender process RSS never exceeds 297MB, even under heavier load. If I have the opportunity I will also test this against our production traffic to double-check that the issue does not reproduce there.
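
For anyone following along, here is a toy model of one way a generation-style allocator could hold far more memory than a per-chunk counter like rb->size reports. It is only a sketch under assumptions: `ToyGenerationContext` and `simulate` are made-up names, this is not PostgreSQL's generation.c, and the sizes are arbitrary. The assumption being modelled is that chunks are carved out of fixed-size blocks and a block is only released once every chunk in it has been freed, so a single long-lived chunk pins its whole block.

```python
class ToyGenerationContext:
    """Toy generation-style allocator (illustration only, not generation.c)."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.blocks = []      # each block: {"used": bytes handed out, "live": live chunks}
        self.live_bytes = 0   # what a per-chunk accounting counter would report

    def alloc(self, size):
        if size > self.block_size:
            raise ValueError("toy model does not handle oversized chunks")
        # Start a new block when the current one cannot fit the request.
        if not self.blocks or self.blocks[-1]["used"] + size > self.block_size:
            self.blocks.append({"used": 0, "live": 0})
        block = self.blocks[-1]
        block["used"] += size
        block["live"] += 1
        self.live_bytes += size
        return (block, size)

    def free(self, chunk):
        block, size = chunk
        block["live"] -= 1
        self.live_bytes -= size
        # Space inside a block is never reused; the block is released
        # only when its last live chunk is freed.
        if block["live"] == 0:
            self.blocks = [b for b in self.blocks if b is not block]

    @property
    def held_bytes(self):
        # Memory actually held: whole blocks, however few chunks are live.
        return len(self.blocks) * self.block_size


def simulate(n_chunks=1000, chunk_size=1024, block_size=8192, keep_every=8):
    """Allocate n_chunks, then free all but every keep_every-th one."""
    ctx = ToyGenerationContext(block_size)
    chunks = [ctx.alloc(chunk_size) for _ in range(n_chunks)]
    for i, chunk in enumerate(chunks):
        if i % keep_every != 0:
            ctx.free(chunk)
    return ctx.live_bytes, ctx.held_bytes


if __name__ == "__main__":
    live, held = simulate()
    print(f"accounted (live chunks): {live} bytes, held in blocks: {held} bytes")
```

With these made-up parameters each block keeps exactly one live chunk, so the blocks hold 8x what the live-chunk counter says. Whether interleaved changes from concurrent transactions produce this pattern in the real walsender is exactly the open question here; a malloc/free-per-chunk scheme like the one in the patch does not have this block-pinning behaviour.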

Thanks,
- Alex.
