Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE
Date	June 2, 2017 00:11:28
Msg-id	20170601181128.k5ciuloorhyytr66@alap3.anarazel.de
In response to	Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE (Rafia Sabih <rafia.sabih@enterprisedb.com>)
List	pgsql-hackers

On 2017-06-01 18:41:20 +0530, Rafia Sabih wrote:
> As per my understanding it looks like this increase in tuple queue
> size is helping only gather-merge. Particularly, in the case where it
> is enough stalling by master in gather-merge because it is maintaining
> the sort-order. Like in q12 the index is unclustered and gather-merge
> is just above parallel index scan, thus, it is likely that to maintain
> the order the workers have to wait long for the in-sequence tuple is
> attained by the master.

I wonder if there's some way we could make this problem a bit less bad.
One underlying problem is that we don't know what the current boundary
on each worker is, unless it returns a tuple. I.e. even if some worker
is guaranteed to not return any further tuples below another worker's
last tuple, gather-merge won't know about that until it finds another
matching tuple.  Perhaps, for some subsets, we could make the workers
update that boundary without producing a tuple that gather will actually
return?  In the, probably reasonably common, case of having merge-joins
below the gather, it shouldn't be very hard to do so.  Imagine e.g. that
every worker gets a "slot" in a dsm where it can point to a tuple
(managed by dsa.c to deal with variable-length keys) that contains the
current boundary.  For a merge-join it'd not be troublesome to
occasionally put a new boundary tuple there, even if it doesn't find a
match - although deciding what "occasionally" means isn't easy, perhaps
the master signals the worker?  It's probably harder for cases where
most of the filtering happens far below the top-level worker node.
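
To make the boundary idea concrete, here is a toy model in Python.  It
is not PostgreSQL code and does not use dsm or dsa.c; the names
next_tuple, queues, boundaries and finished are made up for
illustration, and keys are plain integers.  It only shows the decision
the leader would make: the smallest queued tuple can be emitted without
waiting if every worker with an empty queue has either finished or
published a boundary at or above that tuple.

```python
from collections import deque
from typing import Deque, List, Optional


def next_tuple(queues: List[Deque[int]],
               boundaries: List[Optional[int]],
               finished: List[bool]) -> Optional[int]:
    """Return the next key the leader can safely emit, or None if it must wait.

    queues[w]     -- keys already received from worker w, in sorted order
    boundaries[w] -- hypothetical per-worker slot: worker w promises not to
                     return any key below this value (None = nothing published)
    finished[w]   -- True once worker w will send no more tuples
    """
    heads = [(q[0], w) for w, q in enumerate(queues) if q]
    if not heads:
        return None  # nothing queued at all

    key, owner = min(heads)  # smallest tuple currently visible to the leader

    for w, q in enumerate(queues):
        if q or finished[w]:
            continue  # head is known, or the worker cannot send anything more
        b = boundaries[w]
        if b is None or b < key:
            return None  # worker w might still produce a smaller key: wait

    return queues[owner].popleft()
```

Without a published boundary (None) the function behaves like the
situation described above: one worker with an empty queue stalls the
merge.  How often a worker refreshes its slot is left open here, as it
is in the text.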

- Andres
