On Sat, Nov 9, 2019 at 6:14 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Sun, Nov 10, 2019 at 7:27 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> > Hmmm, but the expected row width is only 16B, and with 6M rows that's
> > only about 90GB. So how come this needs 1TB temporary files? I'm sure
> > there's a bit of overhead, but 10X seems a bit much.
>
> (s/6M/6B/) Yeah, that comes out to only ~90GB but ... PHJ doesn't
> immediately unlink files from the previous generation when it
> repartitions. You need two generations' worth of files (old and
> new) while repartitioning, but you don't need the grand-parent
> generation. I didn't think this was a problem because I didn't expect
> to have to repartition many times (and there is a similar but
> different kind of amplification in the non-parallel code). If this
> problem is due to the 100% extreme skew threshold causing us to go
> berserk, then that 10X multiplier is of the right order, if you
> imagine this thing started out with ~512 batches and got up to ~1M
> batches before it blew a gasket.
Are you saying that it also doesn't unlink the grand-parent until the end?
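If so, the 10X figure above can be sanity-checked with a quick back-of-the-envelope
calculation. This is only a sketch of the arithmetic, assuming batch counts double on
each repartitioning round and that every generation's files linger on disk until the end
(which is the behaviour being asked about, not confirmed):

```python
import math

data_gb = 90             # ~6B rows * 16B per row ~= 90 GB of spilled tuples
start_batches = 512      # assumed starting batch count from the quoted estimate
end_batches = 1_000_000  # ~1M batches "before it blew a gasket"

# Each repartitioning round doubles the batch count.
rounds = math.ceil(math.log2(end_batches / start_batches))  # ~11 doublings

# If files from every generation stayed on disk, temp usage approaches
# one full copy of the spilled data per generation (initial + each round).
worst_case_gb = data_gb * (rounds + 1)

print(rounds, worst_case_gb)  # 11 rounds, 1080 GB, i.e. roughly 1 TB
```

That lands in the right ballpark for the ~1TB of temporary files reported, which is
why the question of when the grand-parent generation gets unlinked matters.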