Commit 896ddf9b added preallocation to logtape.c to avoid excessive
fragmentation in the context of hash aggs that spill and have many
batches/tapes. Apparently the preallocation doesn't actually perform
any filesystem operations, so the new mechanism should be zero
overhead when "preallocated" blocks aren't actually used after all
(right?). However, I notice that this breaks the statistics shown by
things like trace_sort, and even EXPLAIN ANALYZE.
LogicalTapeSetBlocks() didn't get the memo about preallocation.

The easiest way to spot the issue is to compare trace_sort output on
v13 with output for the same case in v12 -- the "%u disk blocks used"
statistics are consistently higher on v13, especially for cases with
many tapes. I spotted the bug when I noticed that the disk space
reported for v13 external sorts varies with the number of tapes
involved (again, this came from trace_sort). That doesn't make sense
-- the total amount of space used for external sort temp files should
practically be fixed, aside from insignificant rounding effects.
Reducing the amount of memory by orders of magnitude in a Postgres 12
tuplesort will hardly affect the "%u disk blocks used" trace_sort
output at all. That's what we need to get back to.

This bug probably won't be difficult to fix. Actually, we have had
similar problems in the past. The fix could be as simple as teaching
LogicalTapeSetBlocks() about this new variety of "sparse allocation".
Although maybe the preallocation stuff should somehow be rolled into
the much older nHoleBlocks stuff. Unsure.
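
To make the discrepancy concrete, here is a toy model in C. It is only
a sketch and not the logtape.c code: ToyTapeSet, toy_spill() and the two
toy_blocks_* functions are hypothetical names, and the fixed
preallocation batch size is a simplifying assumption. It contrasts a
count based on blocks handed out with a count based on blocks actually
written.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not the real logtape.c structs. */
typedef struct ToyTapeSet
{
    long nBlocksAllocated;  /* blocks handed out, incl. preallocated ones */
    long nBlocksWritten;    /* blocks that actually received data */
    long nHoleBlocks;       /* stays 0 here; kept to mirror the older idea */
} ToyTapeSet;

/*
 * Spread total_blocks of data as evenly as possible over ntapes tapes.
 * Each tape reserves blocks in batches of prealloc_size (assumed fixed),
 * so the tail of a tape's last batch may never be written.
 */
static ToyTapeSet
toy_spill(long total_blocks, int ntapes, long prealloc_size)
{
    ToyTapeSet ts = {0, 0, 0};

    assert(total_blocks >= 0 && ntapes > 0 && prealloc_size > 0);

    for (int i = 0; i < ntapes; i++)
    {
        /* this tape's share of the data; early tapes take the remainder */
        long share = total_blocks / ntapes +
                     (i < total_blocks % ntapes ? 1 : 0);
        /* round the share up to a whole number of batches */
        long batches = (share + prealloc_size - 1) / prealloc_size;

        ts.nBlocksAllocated += batches * prealloc_size;
        ts.nBlocksWritten += share;
    }
    return ts;
}

/* Count that ignores preallocation: grows with the number of tapes. */
static long
toy_blocks_by_allocation(const ToyTapeSet *ts)
{
    return ts->nBlocksAllocated - ts->nHoleBlocks;
}

/* Count based on what was written: independent of the number of tapes. */
static long
toy_blocks_by_writes(const ToyTapeSet *ts)
{
    return ts->nBlocksWritten - ts->nHoleBlocks;
}
```

With 1000 blocks of data and a batch size of 16, the allocation-based
count in this model is 1008 for 7 tapes and 1600 for 100 tapes, while
the write-based count is 1000 in both cases. That is the kind of
tape-count-dependent inflation described above, though the real numbers
depend on how logtape.c actually sizes its preallocation batches.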
--
Peter Geoghegan