Thread: Make reorder buffer max_changes_in_memory adjustable?
Hi hackers.
Recently I ran into an issue when logically replicating very big
transactions. Since we already have logical_decoding_work_mem to cap
the memory usage, there is no risk of OOM during decoding. However, the
memory usage still goes out of control in the 'Tuples' memory context of
the reorder buffer. It seems that when restoring spilled transactions
from disk, the memory usage is limited only by max_changes_in_memory (a
change count, not a byte count), which is hard-coded to 4096, the same
behavior decoding had before v13.
For big transactions we have supported streaming mode since v14, which
should solve this issue, but using streaming mode relies on the
subscriber's support. There are still a lot of PostgreSQL instances
running v12/13 in production, or even v11 and older despite being EOL.
Also, many CDC tools that logically replicate from PostgreSQL do not
seem to support streaming either.
Would it be possible to make max_changes_in_memory a GUC so it can be
adjusted dynamically? The default would stay at the current value of
4096. When a big transaction hits a memory-constrained machine, we could
at least lower max_changes_in_memory so the logical WAL sender can get
through the transaction. Otherwise the WAL sender may be killed (OOM,
kill -9) and recovery is needed. After recovery, the WAL sender has to
restart from a point before the transaction started, and it repeats this
loop without making any progress. It would never get past the
transaction except by adding more memory to the machine, which is
usually not practical in reality.
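For concreteness, the GUC itself would be a small change; a rough sketch
of what the entry might look like, modeled on existing integer GUCs in
src/backend/utils/misc/guc_tables.c (the group, limits, and wording here
are my assumptions, not a patch):

```c
/* Hypothetical sketch of a guc_tables.c entry; names and limits are
 * assumptions, modeled on existing integer GUC declarations. */
{
    {"max_changes_in_memory", PGC_SIGHUP, REPLICATION_SENDING,
        gettext_noop("Maximum number of spilled changes restored into "
                     "memory at once by the reorder buffer."),
        NULL
    },
    &max_changes_in_memory,
    4096, 1, INT_MAX,
    NULL, NULL, NULL
},
```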
Sincerely, Jingtang
On 7/21/24 07:19, Jingtang Zhang wrote:
> [...]

Theoretically, yes, we could make max_changes_in_memory a GUC, but it's
not clear to me how that would help 12/13, because there's ~0% chance
we'd backpatch that ...

But even for master, is a GUC really the appropriate solution? It's
still a manual action, so if things go wrong some human has to connect,
try setting a lower value, which might or might not work, etc.

Wouldn't it be better to adjust the value automatically, somehow?
For example, before restoring the changes, we could count the number of
transactions and set it to 4096/ntransactions or something like that.
Or do something smarter by estimating tuple size, to count it against
the logical_decoding_work_mem budget.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Thanks, Tomas.
> Theoretically, yes, we could make max_changes_in_memory a GUC, but it's
> not clear to me how that would help 12/13, because there's ~0% chance
> we'd backpatch that ...
What I mean is not back-patching; the change would take effect on the
publisher side.
Consider a publisher running v14+~master (with streaming support) and a
subscriber on 12/13 where streaming is not supported: the publisher
would still risk OOM. The same applies when we use a v14+~master
publisher with whatever open-source CDC as the subscriber.
> Wouldn't it be better to adjust the value automatically, somehow?
> For example, before restoring the changes, we could count the number of
> transactions and set it to 4096/ntransactions or something like that.
> Or do something smarter by estimating tuple size, to count it against
> the logical_decoding_work_mem budget.
Yes, I think this issue should have been solved when
logical_decoding_work_mem was first introduced, but it wasn't. There
could be reasons, like the sub-transaction handling discussed in the
header comment of reorderbuffer.c.
regards, Jingtang
On 7/22/24 05:28, Jingtang Zhang wrote:
> What I mean is not back-patching; the change would take effect on the
> publisher side.
> [...]

Yes, you're right - if we're talking about mixed setups with just the
subscriber running the old version, then it would benefit from this
improvement even without backpatching.

> Yes, I think this issue should have been solved when
> logical_decoding_work_mem was first introduced, but it wasn't.
> [...]

True, but few patches are perfect/complete from V1. There's often stuff
that's unlikely to happen, left for future improvements. And this is one
of those cases, I believe. The fact that the comment even mentions this
is a sign the developers considered it, and chose to ignore it.

That being said, I think it'd be nice to improve this, and I'm willing
to take a look if someone prepares a patch. But I don't think making
this a GUC is the right approach - it's the simplest patch, but every
new GUC just makes the database harder to manage.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company