I think this discussion is getting away from a manageable scope of work...
On Thu, Jan 2, 2025 at 1:09 PM Jim Nasby <jnasby@upgrade.com> wrote:
> That’s basically my argument for having workload management. If a system becomes loaded enough for the global limit
tostart kicking in it’s likely that query response time is increasing, which means you will soon have more and more
activebackends trying to run queries. That’s just going to make the situation even worse. You’d either have to start
tryingto “take memory away” from already running backends or backends that are just starting would have such a low
limitas to cause them to spill very quickly, creating further load on the system.
I think this is backward. There's a fixed number of backends, limited
by "max_connections." A given backend becoming slow to respond doesn't
increase this limit. The load on the system is limited by
"max_connections." This makes the system stable.
If, instead, the system starts cancelling queries, then a naive client
will typically just retry the query. If the reason the query was
cancelled is that the system is overloaded, then the query is likely
to be cancelled again, leading to a retry storm, even though the
number of backends is limited by "max_connections," since a given
backend will be repeatedly cancelled.
Long-term, the solution to, "I don't have enough resources to run my
workload efficiency," is either:
* Buy more resources; or
* Reduce your workload.
We can't help with either of those solutions!
However, the (short-term) problem I'd like to solve is: how do we
expose efficient use of resources, in those cases where we *do* have
enough resources to run a workload efficiently, but the existing
"work_mem" and "hash_mem_multiplier" GUCs are insufficient?
Thanks,
James