On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I'm testing with a customer's data on pg13dev and got output for which Peak
> Memory doesn't look right/useful. I reproduced it on 565f16902.
Likely the sanity of those results depends on whether you think that
the Memory Usage reported outside of the workers is meant to be the
sum of all processes or the memory usage for the leader backend.
All that's going on here is that the Parallel Append is using some
parallel safe paths and giving one to each worker. The 2 workers take
the first 2 subpaths and the leader takes the third. The memory usage
reported helps confirm that's the case.
Can you explain what you'd want to see changed about this? Or do you
want to see the non-parallel worker memory be the sum of all workers?
Sort does not seem to do that, so I'm not sure if we should consider
hash agg as an exception to that.
One thing I did notice from playing with this table is that Sort does
not show the memory used by the leader process when it didn't do any
of the work itself.
postgres=# set parallel_leader_participation =off;
SET
postgres=# explain analyze select i from p group by i;
-> Sort (cost=59436.92..60686.92 rows=500000 width=4)
(actual time=246.836..280.985 rows=500000 loops=2)
Sort Key: p.i
Worker 0: Sort Method: quicksort Memory: 27898kB
Worker 1: Sort Method: quicksort Memory: 55842kB
Whereas with the leader helping out we get:
-> Sort (cost=51284.39..52326.05 rows=416666 width=4) (actual
time=191.814..213.418 rows=333333 loops=3)
Sort Key: p.i
Sort Method: quicksort Memory: 33009kB
Worker 0: Sort Method: quicksort Memory: 25287kB
Worker 1: Sort Method: quicksort Memory: 25445kB
Maybe we should do the same for hash agg when the leader didn't assist?
David