
From Robert Haas
Subject Re: [HACKERS] [POC] Faster processing at Gather node
Msg-id CA+TgmoYPPp5r=X4c33wwgN7ZS-BHWWe5Yt2+2NNgNPXMKmvR7g@mail.gmail.com
In response to Re: [HACKERS] [POC] Faster processing at Gather node  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] [POC] Faster processing at Gather node  (Ants Aasma <ants.aasma@eesti.ee>)
List pgsql-hackers
On Wed, Nov 15, 2017 at 9:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> The main advantage of the local queue idea is that it won't consume any
> memory by default for running parallel queries.  It would consume
> memory only when required and accordingly help speed up those cases.
> However, increasing the size of the shared queues by default will
> increase memory usage for cases where it is not even required.  Even
> if we provide a GUC to tune the amount of shared memory, I am not sure
> how convenient it will be for users, as it needs different values for
> different workloads and it is not easy to make a general
> recommendation.  I am not saying we can't work around this with the
> help of a GUC, but it seems like it would be better to have some
> autotune mechanism, and I think Rafia's patch is one way to achieve it.

It's true this might save memory in some cases.  If we never generate
very many tuples, then we won't allocate the local queue and we'll
save memory.  That's mildly nice.
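
To make that lazy-allocation idea concrete, here is a minimal
standalone sketch (not the actual patch; the names and sizes here are
invented) in which the local queue is only allocated the first time
the shared queue refuses a tuple:

/* Sketch only: a producer that allocates its local overflow buffer
 * lazily, so queries that never fill the shared queue pay no extra
 * memory. */
#include <stdbool.h>
#include <stdlib.h>

#define SHARED_QUEUE_SLOTS 4    /* stand-in for the fixed shared queue size */
#define LOCAL_QUEUE_SLOTS  64   /* stand-in for the larger local queue */

typedef struct
{
    int  shared[SHARED_QUEUE_SLOTS];
    int  nshared;
    int *local;                 /* NULL until actually needed */
    int  nlocal;
} TupleQueue;

/* Returns false only when both the shared and local queues are full. */
static bool
queue_put(TupleQueue *q, int tuple)
{
    if (q->nshared < SHARED_QUEUE_SLOTS)
    {
        q->shared[q->nshared++] = tuple;    /* fast path: shared queue has room */
        return true;
    }

    /* Shared queue is full: allocate the local queue lazily, on first use. */
    if (q->local == NULL)
    {
        q->local = malloc(sizeof(int) * LOCAL_QUEUE_SLOTS);
        if (q->local == NULL)
            return false;       /* treat allocation failure as "queue full" */
    }

    if (q->nlocal < LOCAL_QUEUE_SLOTS)
    {
        q->local[q->nlocal++] = tuple;
        return true;
    }

    return false;               /* both full: caller must wait for the leader */
}

int
main(void)
{
    TupleQueue q = {0};

    for (int i = 0; i < 10; i++)
        (void) queue_put(&q, i);    /* first 4 go shared, the rest spill locally */

    free(q.local);
    return 0;
}

Queries whose workers never outrun the leader never reach the malloc,
which is exactly the case where this saves memory.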

On the other hand, the local queue may also use a bunch of memory
without improving performance, as in the case of Rafia's test where
she raised the queue size 10x and it didn't help.
Alternatively, it may improve performance by a lot, but use more
memory than necessary to do so.  In Rafia's test results, a 100x
increase in the queue size got the runtime down to 7s; if she'd done
200x instead, I don't think it would have helped further, but 200x
would have been necessary to get the full benefit if the data had been
twice as big.
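
For scale, assuming the default per-worker tuple queue is 64kB
(PARALLEL_TUPLE_QUEUE_SIZE), the back-of-the-envelope cost of those
multipliers looks like this, with a hypothetical worker count:

#include <stdio.h>

int
main(void)
{
    const double base_kb = 64.0;    /* assumed default queue size per worker */
    const int    nworkers = 4;      /* hypothetical worker count */
    const int    multipliers[] = {1, 10, 100, 200};

    for (int i = 0; i < 4; i++)
        printf("%4dx queue: %.1f MB total across %d workers\n",
               multipliers[i],
               base_kb * multipliers[i] * nworkers / 1024.0,
               nworkers);
    return 0;
}

With 4 workers that works out to roughly 25MB per Gather at 100x and
50MB at 200x, which is the "more memory than necessary" risk.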

The problem here is that we have no idea how big the queue needs to
be.  The workers will always be happy to generate tuples faster than
the leader can read them, if that's possible, but it will only
sometimes help performance to let them do so.   I think in most cases
we'll end up allocating the local queue - because the workers can
generate faster than the leader can read - but only occasionally will
it make anything faster.

If what we really want to do is allow the workers to get arbitrarily
far ahead of the leader, we could ditch shm_mq altogether here and use
Thomas's shared tuplestore stuff.  Then you never run out of memory
because you spill to disk.  I'm not sure that's the way to go, though.
It still has the problem that you may let the workers get very far
ahead not just when it helps, but also when it's possible but not
helpful.
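
For what it's worth, the spill-to-disk shape of that approach looks
roughly like the sketch below (a generic illustration, not the
sharedtuplestore API): tuples beyond an in-memory budget go to a
temporary file, so the producer can run arbitrarily far ahead without
growing memory.

/* Generic spill-to-disk sketch: once the in-memory budget is exhausted,
 * further tuples are appended to a temporary file. */
#include <stdio.h>
#include <stdlib.h>

#define MEM_BUDGET_TUPLES 1000

typedef struct
{
    int   mem[MEM_BUDGET_TUPLES];
    int   nmem;
    FILE *spill;                /* created only if we actually overflow */
} SpillStore;

static void
store_put(SpillStore *s, int tuple)
{
    if (s->nmem < MEM_BUDGET_TUPLES)
    {
        s->mem[s->nmem++] = tuple;
        return;
    }

    if (s->spill == NULL)
    {
        s->spill = tmpfile();   /* spill file only exists once we overflow */
        if (s->spill == NULL)
        {
            perror("tmpfile");
            exit(1);
        }
    }

    fwrite(&tuple, sizeof(tuple), 1, s->spill);
}

int
main(void)
{
    SpillStore s = {0};

    for (int i = 0; i < 100000; i++)
        store_put(&s, i);       /* first 1000 stay in memory, the rest go to disk */

    if (s.spill)
        fclose(s.spill);
    return 0;
}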

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

