Re: Spilling hashed SetOps and aggregates to disk - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Spilling hashed SetOps and aggregates to disk
Date	June 5, 2018 20:52:09
Msg-id	20180605175209.vavuqe4idovcpeie@alap3.anarazel.de
In response to	Re: Spilling hashed SetOps and aggregates to disk (Jeff Davis <pgsql@j-davis.com>)
List	pgsql-hackers

Hi,

On 2018-06-05 10:47:49 -0700, Jeff Davis wrote:
> The thing I don't like about it is that it requires running two memory-
> hungry operations at once. How much of work_mem do we use for sorted
> runs, and how much do we use for the hash table?

Is that necessarily true? I'd assume that we'd use a small amount of
memory for the tuplesort, just enough to avoid a disk write for every
single tuple. A few kB should be enough. I think it's fine to spill to
disk aggressively; after all, we have already handled the case of a
smaller number of input rows. Then at the end of the run, we empty
out the hashtable and free it. Only then do we do the sort.
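
A minimal sketch of the flow described above, for illustration only. It is
not PostgreSQL code: the function name hybrid_count, the max_groups limit
(standing in for the hash table's share of work_mem), and the in-memory
spilled list (standing in for a tuplesort that writes to disk almost
immediately) are all hypothetical.

```python
from itertools import groupby


def hybrid_count(keys, max_groups):
    """Count occurrences per key with a bounded hash table.

    Keys that already have a hash table entry are aggregated in place.
    Once the table holds max_groups entries, rows for any new key are
    spilled instead of creating more groups.
    """
    table = {}
    spilled = []  # assumption: stands in for a tuplesort with a tiny buffer

    for key in keys:
        if key in table:
            table[key] += 1
        elif len(table) < max_groups:
            table[key] = 1
        else:
            spilled.append(key)

    # End of the run: emit the hashed groups and free the hash table first.
    results = list(table.items())
    table.clear()

    # Only now sort the spilled rows and aggregate them group by group.
    spilled.sort()
    for key, group in groupby(spilled):
        results.append((key, sum(1 for _ in group)))
    return results
```

Because the table never evicts a group during the run, a key is either
aggregated entirely in the hash table or entirely in the sorted pass, so
the two result sets never overlap. The hash table and the sort are also
never large at the same time, which is the point being made above.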

One thing this wouldn't handle is datatypes that support hashing but
not sorting. Those aren't exactly common.

Greetings,

Andres Freund
