On Tue, May 17, 2016 at 6:40 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Fundamentally, since temporary_files_size enforcement simply
>> piggy-backs on low-level fd.c file management, without any
>> consideration of what the temp files contain, it'll be hard to be sure
>> that parallel workers will not have issues. I think it'll be far
>> easier to fix the problem than it would be to figure out if it's
>> possible to get away with it.
>
> I'll write a patch to fix the issue, if there is a consensus on a solution.
I think for 9.6 we just have to document this issue. In the next release, we could (and might well want to) try to do something more clever.
What I'm tempted to do is document that, as a point of policy, parallel query in 9.6 uses up to (workers + 1) times the resources that a single session might use. That includes not only CPU but also things like work_mem and temp file space. This obviously isn't ideal, but it's what can be done by the ship date.
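To make the "(workers + 1) times" point concrete, here's a hedged sketch (the table names are made up purely for illustration, and max_parallel_degree is the 9.6-beta name for the per-Gather worker limit): with a parallel hash join, the leader and each worker build their own private copy of the hash table, so the hash memory for one join alone can approach work_mem * (workers + 1).

    SET work_mem = '64MB';
    SET max_parallel_degree = 4;   -- 9.6-beta name; allows up to 4 workers

    -- If the planner chooses a parallel hash join, the leader and each of
    -- the 4 workers build a private copy of the hash table, so the hash
    -- memory for this one join can approach 64MB * (4 + 1) = 320MB, not
    -- the 64MB a serial plan would use.
    EXPLAIN (COSTS OFF)
    SELECT count(*)
    FROM   big_table t               -- hypothetical tables
    JOIN   other_table o ON o.id = t.id;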
I was asked (internally, I believe) about abuse of work_mem during my work on parallel aggregates; at the time I didn't really feel I was abusing it any more than parallel hash join already did. My thought was that one day it would be nice if work_mem could be granted to a query as a whole, with some query-marshalling system ensuring that the total grants never exceed the server-wide memory dedicated to work_mem. Of course that's a lot of work, and there's at least one node (HashAgg) which can still blow out work_mem given bad estimates. For this release, I assumed it wouldn't be too big an issue since we're shipping with max_parallel_degree = 0, and we could decorate the docs with warnings that work_mem applies per node and per worker, to caution users before they set it any higher. That might be enough to give us wriggle room for the future, where we can make improvements, so I agree with Robert: the docs seem like the best solution for 9.6.
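As a rough back-of-the-envelope illustration of why that per-node / per-worker warning matters (nothing the server enforces, just arithmetic under the stated assumptions):

    -- Rough worst-case work_mem consumption for one query, assuming the
    -- limit applies per sort/hash node and per process (leader + workers):
    --   worst_case ~= work_mem * n_memory_nodes * (n_workers + 1)
    -- For example, work_mem = 32MB, 3 sort/hash nodes, 4 workers:
    SELECT pg_size_pretty(
             (32::bigint * 1024 * 1024)   -- work_mem of 32MB, in bytes
             * 3                          -- sort/hash nodes in the plan
             * (4 + 1)                    -- four workers plus the leader
           ) AS worst_case;               -- => 480 MB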