On Sat, Jan 7, 2023 at 0:36 AM Alex Richman <alexrichman@onesignal.com> wrote:
> > Do you have any sample data to confirm
> > this? If you can't share sample data, can you let us know the average
> > tuple size?
>
> I suspect the writes are updates to jsonb columns. I can correlate bursts of
> writes of this form to the memory spikes:
> UPDATE suspect_table
> SET jsonb_column = jsonb_column || $1,
> updated_at = $2
> WHERE ...
> The update being added in $1 is typically a single new field. The jsonb column is
> flat string key/value pairs, e.g. lots of {"key": "value", ...}.
>
> The average size of the whole tuple in the suspect table is ~800 bytes (based on
> 10000 random samples), of which the jsonb column is 80%.
>
> I have been trying to break into a walsender to inspect some tuple bufs directly
> and compare the ChangeSize vs GetTupleBuf size as you suggest, but it's proving
> a little tricky - I'll let you know if I have any luck here.
Hi,
Thanks for your report and Amit's analysis.
I did some investigation with gdb, and I think the adjustment of the 'size'
parameter in GenerationAlloc() can make the memory requested for each change
larger.
I tried to reproduce the problem with the table structure you mentioned, but
rb->size did not get close to 5GB with a 256MB limit.
I think that, for the same logical_decoding_work_mem, the more changes there
are, the more extra space is allocated due to the adjustment in
GenerationAlloc(). So I reduced the size of my test tuple, and rb->size only
slightly exceeded the configured logical_decoding_work_mem. (For every
additional 1MB configured, an additional 40+KB of space is allocated.)
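
To illustrate the reasoning above (more changes means more per-allocation
overhead), here is a minimal standalone sketch. It is not PostgreSQL code: the
8-byte alignment and the 16-byte per-chunk header are assumed values for
illustration only, and the function names are hypothetical.

```python
# Illustrative model only; ALIGN and HEADER are assumptions, not values
# taken from the PostgreSQL source.
ALIGN = 8     # assumed alignment boundary on a typical 64-bit build
HEADER = 16   # hypothetical per-chunk header size


def maxalign(size: int) -> int:
    """Round size up to the next multiple of ALIGN."""
    return (size + ALIGN - 1) & ~(ALIGN - 1)


def allocated(size: int) -> int:
    """Bytes consumed by one allocation request of `size` bytes."""
    return maxalign(size) + HEADER


def extra_for_budget(change_size: int, budget: int) -> int:
    """Extra bytes allocated when `budget` bytes are filled with changes
    of `change_size` bytes each, beyond the bytes actually requested."""
    n_changes = budget // change_size
    return n_changes * (allocated(change_size) - change_size)


# Same 1MB budget, two different change sizes: the smaller the change,
# the more changes fit, and the larger the total overhead.
large = extra_for_budget(800, 1 << 20)
small = extra_for_budget(100, 1 << 20)
print(large, small)
```

Under these assumptions the overhead for a fixed budget grows as the changes
get smaller, which is why I reduced the tuple size in my test.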
I'm not sure whether there is a problem with my reproduction approach; could you
please help to confirm? Here are my test details:
[Table info]
create table tab(a jsonb, b text, c int);
[Tuple info]
I used pg_column_size() to check the size of the data.
The size of column 'a' in my test tuple is 27 bytes. (Before resizing it was 620 bytes.)
The size of column 'b' is 5 bytes. (Before resizing it was 164 bytes.)
[Reproduce SQL]
UPDATE tab SET a = (a || '{"key0":"values0"}'), c = c*3 WHERE mod(c,2) = 1;
If you have a case that successfully reproduces the problem, could you please
provide more detailed reproduction steps?
Regards,
Wang Wei