Re: [HACKERS] [POC] Faster processing at Gather node - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] [POC] Faster processing at Gather node
Date
Msg-id CAA4eK1JExaqaRgT=UwiXqBCj8NhROJefDVO2BVwoNp6w4APYCA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [POC] Faster processing at Gather node  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, May 19, 2017 at 5:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 19, 2017 at 7:55 AM, Rafia Sabih
> <rafia.sabih@enterprisedb.com> wrote:
>> While analysing the performance of TPC-H queries for the newly developed
>> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed
>> that the time taken by gather node is significant. On investigation, as per
>> the current method it copies each tuple to the shared queue and notifies the
>> receiver. Since, this copying is done in shared queue, a lot of locking and
>> latching overhead is there.
>>
>> So, in this POC patch I tried to copy all the tuples in a local queue thus
>> avoiding all the locks and latches. Once, the local queue is filled as per
>> it's capacity, tuples are transferred to the shared queue. Once, all the
>> tuples are transferred the receiver is sent the notification about the same.
>
> What if, instead of doing this, we switched the shm_mq stuff to use atomics?
>

That is one of the very first things we have tried, but it didn't show
us any improvement, probably because sending tuple-by-tuple over
shm_mq is not cheap.  Also, independently, we have tried to reduce the
frequency of SetLatch (used to notify receiver), but that also didn't
result in improving the results. Now, I think one thing that can be
tried is to use atomics in shm_mq and reduce the frequency to notify
receiver, but not sure if that can give us better results than with
this idea. There are a couple of other ideas which has been tried to
improve the speed of Gather like avoiding an extra copy of tuple which
we need to do before sending tuple
(tqueueReceiveSlot->ExecMaterializeSlot) and increasing the size of
tuple queue length, but none of those has shown any noticeable
improvement.  I am aware of all this because I and Dilip were offlist
involved in brainstorming ideas with Rafia to improve the speed of
Gather.  I think it might have been better to show the results of
ideas that didn't work out, but I guess Rafia hasn't shared those with
the intuition that nobody would be interested in hearing what didn't
work out.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] Regression in join selectivity estimations when using foreign keys
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Removal of plaintext password type references