Re: [GENERAL] Query Using Massive Temp Space - Mailing list pgsql-general

From	Tom Lane
Subject	Re: [GENERAL] Query Using Massive Temp Space
Date	November 22, 2017 04:37:23
Msg-id	4346.1511303843@sss.pgh.pa.us
In response to	Re: [GENERAL] Query Using Massive Temp Space (Thomas Munro <thomas.munro@enterprisedb.com>)
List	pgsql-general

Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Wed, Nov 22, 2017 at 7:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Now, there's definitely something busted here; it should not have gone as
>> far as 2 million batches before giving up on splitting.

> I had been meaning to discuss this.  We only give up when we reach the
> point where a batch is entirely kept or entirely sent to a new batch
> (i.e., splitting the batch resulted in one batch with the whole contents
> and another empty batch).  If you have about 2 million evenly
> distributed keys and an ideal hash function, and then you also have 42
> billion keys that are the same (and exceed work_mem), we won't detect
> extreme skew until the 2 million well behaved keys have been spread so
> thin that the 42 billion keys are isolated in a batch on their own,
> which we should expect to happen somewhere around 2 million batches.
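
The behaviour described above can be illustrated with a small simulation.
This is only a simplified model, not PostgreSQL code: the function name
batches_until_skew_detected is hypothetical, the hashes are random 32-bit
values, and the batch number is assumed to be the hash modulo the
(power-of-two) batch count. It doubles the batch count until a split moves
nothing away from the heavy key, which is the give-up condition quoted above.

```python
import random


def batches_until_skew_detected(n_light_keys, seed=0):
    """Return the batch count at which a split first separates nothing.

    Simplified model (an assumption, not PostgreSQL's implementation):
    one heavily duplicated key plus n_light_keys distinct keys with
    random 32-bit hashes; a tuple's batch is hash % nbatch.
    """
    rng = random.Random(seed)
    heavy_hash = rng.getrandbits(32)
    # Light keys that currently share a batch with the heavy key.
    sharing = [rng.getrandbits(32) for _ in range(n_light_keys)]
    nbatch = 1
    while True:
        nbatch *= 2
        # Light keys that land in the same new batch as the heavy key.
        stay = [h for h in sharing if h % nbatch == heavy_hash % nbatch]
        if len(stay) == len(sharing):
            # Everything stayed with the heavy key: the split was useless,
            # so this is where the "give up" rule would fire.
            return nbatch
        sharing = stay


if __name__ == "__main__":
    for n in (1_000, 100_000):
        print(n, batches_until_skew_detected(n))
```

In this model the returned batch count tends to be of the same order as the
number of well-behaved keys, which matches the estimate of about 2 million
batches for about 2 million such keys.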

Yeah, I suspected it was something like that, but hadn't dug into the
code yet.

> I have wondered if our extreme skew detector needs to go off sooner.
> I don't have a specific suggestion, but it could just be something
> like 'you threw out or kept more than X% of the tuples'.

Doing this, with some threshold like 95% or 99%, sounds plausible to me.
I'd like to reproduce Cory's disk-space issue before we monkey with
related logic, though; fixing the part we understand might obscure
the part we still don't.
        regards, tom lane
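
For illustration, here is a minimal sketch of how the two give-up rules
discussed in this message differ. Both function names are hypothetical, and
the threshold rule is only the idea floated in the thread ("more than X% of
the tuples", with X around 95 or 99), not an implemented PostgreSQL feature.
The inputs are the number of tuples kept in the old batch and the number
moved to the new batch by one split.

```python
def split_useless_current(kept, moved):
    """Rule described in the thread: give up only when a split leaves
    one side completely empty."""
    return kept == 0 or moved == 0


def split_useless_threshold(kept, moved, threshold=0.95):
    """Proposed variant (an assumption): give up when one side holds at
    least `threshold` of the tuples."""
    total = kept + moved
    if total == 0:
        return True  # nothing to split at all
    return max(kept, moved) >= threshold * total


if __name__ == "__main__":
    # A split dominated by one duplicated key: 1,000,000 tuples stay put
    # and only 500 well-behaved tuples move.
    kept, moved = 1_000_000, 500
    print(split_useless_current(kept, moved))    # False: keeps splitting
    print(split_useless_threshold(kept, moved))  # True: skew detected
```

With the threshold rule, a split like the one above would stop further
batch doubling immediately instead of waiting for the few remaining
well-behaved tuples to be spread out of the batch.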
