On Dec 30, 2024, at 7:05 PM, James Hunter <james.hunter.pg@gmail.com> wrote:
>
> On Sat, Dec 28, 2024 at 11:24 PM Jim Nasby <jnasby@upgrade.com> wrote:
>>
>> IMHO none of this will be very sane until we actually have cluster-level limits. One sudden burst in active
>> connections and you still OOM the instance.
>
> Fwiw, PG does support "max_connections" GUC, so a backend/connection -
> level limit, times "max_connections", yields a cluster-level limit.

max_connections is useless here, for two reasons:
1. Changing it requires a restart. That’s at *best* a real PITA in production. [1]
2. It still doesn’t solve the actual problem. Unless your workload *and* your data are extremely homogeneous, you can’t simply limit the number of connections and call it a day. A slight change in incoming queries, OR in the data that the queries are looking at, and you go from running fine to meltdown. You don’t even need a plan flip for this to happen, just the same plan run at the same rate but now accessing more data than before.

Most of what I’ve seen on this thread is discussing ways to *optimize* how much memory the set of running backends can consume. Adjusting how you slice the memory pie across backends, or even within a single backend, is optimization. While that’s a great goal that I do support, it will never fully fix the problem. At some point you either have to throw your hands in the air and start tossing memory errors, because you don’t have control over how much work is being thrown at the engine, or the engine has to exert that control itself, and the only way it can do that is to hold new transactions from starting when the system is under duress (ie, workload management). While workload managers can be quite sophisticated (aka, complex), the nice thing about limiting this scope to work_mem, and only as a means to prevent complete overload, is that the problem becomes a lot simpler, since you’re only looking at one metric and not trying to support any kind of priority system. The only fanciness I think an MVP would need is a GUC to control how long a transaction can sit waiting before it throws an error (rough sketch below). Frankly, that sounds a lot less complex and much easier for DBAs to adjust than trying to teach the planner how to apportion out per-node work_mem limits.
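
To make that concrete, here is a very rough sketch of what the admission gate could look like around transaction start. To be clear, everything in it is made up for illustration: the two GUCs, the total_work_mem_reserved() accessor over a shared-memory counter, and the polling loop (a real patch would presumably use a condition variable and a proper wait event instead of pg_usleep()):

#include "postgres.h"
#include "miscadmin.h"
#include "utils/timestamp.h"

/* hypothetical GUCs */
int		max_total_work_mem = 8 * 1024 * 1024;	/* cluster-wide cap, in kB */
int		transaction_memory_wait_timeout = 5000; /* how long to wait, in ms */

/* imaginary accessor over a shared-memory counter that backends would
 * bump whenever they reserve work_mem */
extern Size total_work_mem_reserved(void);

static void
WaitForWorkMemHeadroom(void)
{
	TimestampTz start = GetCurrentTimestamp();

	while (total_work_mem_reserved() >= (Size) max_total_work_mem * 1024)
	{
		if (TimestampDifferenceExceeds(start, GetCurrentTimestamp(),
									   transaction_memory_wait_timeout))
			ereport(ERROR,
					(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
					 errmsg("timed out waiting for work_mem headroom"),
					 errhint("Consider raising max_total_work_mem or reducing concurrent load.")));

		pg_usleep(100 * 1000L);	/* 100ms; placeholder for a CV sleep */
	}
}

That's really the whole workload manager: one counter, one timeout, no priorities.
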
As I said, I’m not opposed to optimizations, I just think they’re very much cart-before-the-horse.

1: While it’d be a lot of work to make max_connections dynamic, one thing we could do fairly easily would be to introduce another GUC (max_backends?) that actually controls the total number of allowed backends for everything. The sum of max_connections + autovac workers + background workers + whatever else I’m forgetting would have to be less than that. The idea here is that you’d normally run with max_connections set significantly lower than max_backends. That means that if you need to adjust any of these GUCs (other than max_backends) you don’t need to restart - the new limits would just apply to new connection requests.
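
Rough sketch of the invariant I mean, where max_backends is the only new (and restart-only) GUC, the other limits are the existing ones, and the +1 mirrors the autovacuum launcher the way InitializeMaxBackends() counts it today:

#include "postgres.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker.h"
#include "replication/walsender.h"

/* hypothetical: sizes shared memory once at startup, requires restart */
int		max_backends = 500;

static void
CheckBackendLimits(void)
{
	int		reserved = MaxConnections +
					   autovacuum_max_workers + 1 +
					   max_worker_processes +
					   max_wal_senders;

	if (reserved > max_backends)
		ereport(ERROR,
				(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
				 errmsg("connection and worker limits (%d) would exceed max_backends (%d)",
						reserved, max_backends)));
}

Run that check whenever one of the underlying GUCs changes and the new limits simply apply to new connection requests, no restart needed.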