On 2023-05-12 17:41:37 +0200, Marc Millas wrote:
> On Fri, May 12, 2023 at 8:31 AM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
> > My guess is that the amount of parallelism is the problem.
> >
> > work_mem is a per-node limit. Even a single process can use a multiple of
> > work_mem if the query contains nested nodes (which almost every query
> > does, but most nodes don't need much memory). With 5 parallel workers,
> > the total consumption will be 5 times that. So to prevent the OOM
> > condition you would need to reduce work_mem or max_parallel_workers (at
> > least for this query).
> >
>
> we have more than 100GB RAM and only 1 user, with one request running.
> work_mem is set to 10MB.
I don't remember you mentioning that before, so I had to guess.
> For OOM to kill due to work_mem, it would mean that for one
> request with 2 left joins, postgres needs more than 10,000 work_mem buffers.
> To me, that seems difficult to believe.
Yes. work_mem is unlikely to be the culprit here. It must be something
else that is eating the RAM. But I think it will be hard to reproduce
that with the information you've given us so far.
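Even with a dozen memory-hungry nodes and six processes, 10 MB of
work_mem only adds up to well under 1 GB, nowhere near 100 GB. If you
are on v14 or later, one way to see where the memory actually goes is
to log the memory contexts of the backend while the query is running.
A rough sketch (the pid is just a placeholder, and
pg_log_backend_memory_contexts() needs superuser rights):

    -- Find the backend that is running the big query.
    SELECT pid, query
    FROM pg_stat_activity
    WHERE state = 'active';

    -- Dump its memory contexts to the server log (PostgreSQL 14+).
    SELECT pg_log_backend_memory_contexts(12345);  -- replace 12345 with that pid

The log output shows which memory context is growing, which should
narrow down what is actually eating the RAM.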
> But that postgres may need that RAM space for hashing or some
> other similar purpose seems more probable, no?
That should adhere to the work_mem limit, too.
I'm not a core developer (actually not a Postgres developer at all, just
a developer of other stuff which happens to use Postgres), but I
remember that there was an issue some years ago where a hash table could
grow much larger than estimated by the planner. That issue was fixed,
but maybe you've run into a similar case?
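If you can afford to run the query once more (perhaps on a test copy,
or with most of the data filtered out), EXPLAIN ANALYZE would show how
much memory each Hash node really used and how many batches it spilled
into. A sketch with made-up table names standing in for yours:

    -- Hypothetical stand-in for the real query with the two LEFT JOINs.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
    FROM t1
    LEFT JOIN t2 ON t2.id = t1.t2_id
    LEFT JOIN t3 ON t3.id = t1.t3_id;

Look for lines like "Buckets: ... Batches: ... Memory Usage: ... kB"
under the Hash nodes. If the memory usage reported there is far above
work_mem, that would point in this direction.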
> > The description of temp_file_limit says "...the maximum amount of disk
> > space that *a process* can use...". So with 5 workers that's 210*5 =
> > 1050 GB total. Again, you may want to reduce either temp_file_limit or
> > max_parallel_workers.
>
> Yes, but if so, we may have had a request canceled due to temp_file limit,
> which was not the case.
Well, you've said that the disk space grew by "1.1 TB". I understand
that to mean "something between 1.05 TB and 1.15 TB", not exactly
1100000000000 Bytes (or 1209462790554 Bytes or 1100213452800 Bytes,
depending on how you define "TB"). 1050 GB is within that range, so the
query may have come in just under that limit, which would explain why
it wasn't cancelled.
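If you want to rule the parallel workers out (or at least bound them),
both knobs can be lowered for just that session before running the
query. The values are only an illustration, not a recommendation, and
changing temp_file_limit needs superuser rights (or the SET privilege
on it):

    -- Run the query without parallel workers; the per-process
    -- temp_file_limit then effectively caps the whole query.
    SET max_parallel_workers_per_gather = 0;

    -- Or keep the workers and shrink the per-process cap instead
    -- (illustrative value, roughly 210 GB / 5).
    SET temp_file_limit = '42GB';

Either way the total temp file space the query can write stays in the
region of the single limit you intended.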
hp
--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"