On 2023-05-11 21:27:57 +0200, Marc Millas wrote:
> the 750000 lines in each tables are not NULLs but '' empty varchar, which,
> obviously is not the same thing.
> and which perfectly generates 500 billions lines for the left join.
> So, no planner or statistics pbs. apologies for the time wasted.
No problem. Glad to have solved that puzzle.
> Back to the initial pb: if, with temp_file_limit positioned to 210 GB,
> I try to run the select * from table_a left join table_b on the col_a
> (which contains the 750000 '' on both tables)
> then postgres do crash, killed by oom, after having taken 1.1 TB of additional
> disk space.
My guess is that the amount of parallelism is the problem.
work_mem is a per-node limit. Even a single process can use a multiple of
work_mem if the query contains nested nodes (which almost every query
does, but most nodes don't need much memory). With 5 parallel workers,
the total consumption will be 5 times that. So to prevent the OOM
condition you would need to reduce work_mem or max_parallel_workers (at
least for this query).
The description temp_file_limit says "...the maximum amount of disk
space that *a process* can use...". So with 5 workers that's 210*5 =
1050 GB total. Again, you may want to reduce either temp_file_limit or
max_parallel_workers.
> to my understanding, before postgres 13, hash aggregate did eat RAM limitless
> in such circumstances.
> but in 14.2 ??
> (I know, 14.8 is up...)
Maybe the older version of postgres didn't use as many workers for that
query (or maybe not parallelize it at all)?
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"