Re: using custom scan nodes to prototype parallel sequential scan - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: using custom scan nodes to prototype parallel sequential scan
Date
Msg-id CAJrrPGca43oS9kKKaFsunbpf3QH-PHdxiqjgB0t5oJqUxCEttQ@mail.gmail.com
In response to Re: using custom scan nodes to prototype parallel sequential scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: using custom scan nodes to prototype parallel sequential scan  (Amit Kapila <amit.kapila16@gmail.com>)
Re: using custom scan nodes to prototype parallel sequential scan  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
On Tue, Nov 11, 2014 at 2:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Nov 11, 2014 at 5:30 AM, Haribabu Kommi <kommi.haribabu@gmail.com>
> wrote:
>>
>> On Tue, Nov 11, 2014 at 10:21 AM, Andres Freund <andres@2ndquadrant.com>
>> wrote:
>> > On 2014-11-10 10:57:16 -0500, Robert Haas wrote:
>> >> Does parallelism help at all?
>> >
>> > I'm pretty damn sure. We can't even make mildly powerful storage
>> > fully busy right now. Heck, I can't make my workstation's storage with a
>> > raid 10 out of four spinning disks fully busy.
>> >
>> > I think some of that benefit also could be reaped by being better at
>> > hinting the OS...
>>
>> Yes, it definitely helps, and not only for I/O-bound operations.
>> It gives a good gain for queries with CPU-intensive WHERE
>> conditions.
>>
>> One more point we may need to consider: is there any overhead in passing
>> the result rows from the workers to the backend?
>
> I am not sure that overhead will be very visible if we improve the
> use of the I/O subsystem by making parallel tasks work on it.

I feel there may be an overhead because the workers need to put the result
data into shared memory and the backend has to read it from there to process
it further. If the cost of transferring the data from a worker to the backend
is higher than the cost of fetching a tuple from the scan, that overhead
becomes visible when the selectivity is high.
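
As an illustration, the kind of transfer I have in mind could look roughly
like the sketch below, built on the shm_mq queues (this is only a sketch:
the queue handles are assumed to be already attached, and DSM setup, resource
handling and exact signatures are glossed over):

    /* worker side: copy one result tuple into the shared queue */
    static void
    worker_send_tuple(shm_mq_handle *mqh, HeapTuple tup)
    {
        shm_mq_result res;

        res = shm_mq_send(mqh, tup->t_len, tup->t_data, false);
        if (res != SHM_MQ_SUCCESS)
            ereport(ERROR,
                    (errmsg("could not send tuple to master backend")));
    }

    /* master side: read the next tuple out of the queue, if any */
    static bool
    master_receive_tuple(shm_mq_handle *mqh, HeapTupleData *htup)
    {
        Size            len;
        void           *data;
        shm_mq_result   res;

        res = shm_mq_receive(mqh, &len, &data, false);
        if (res != SHM_MQ_SUCCESS)
            return false;

        htup->t_len = len;
        htup->t_data = (HeapTupleHeader) data;  /* points into queue memory */
        return true;
    }

Every result tuple is copied into the queue by the worker and read back out
by the master before it can be processed further, which is the cost I am
referring to.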

> However,
> another idea here could be that instead of passing tuple data, we just
> pass the tuple id, but in that case we have to retain the pin on the buffer
> that contains the tuple until the master backend reads from it, which might
> have its own kind of problems.

Transferring only the tuple id doesn't help in scenarios where the node
needs to do any projection.
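
To make the projection problem concrete, a rough sketch of the tuple-id
approach could look like this (again only a sketch; rel and snapshot are
assumed to be available in the master, and the heap_fetch signature may
differ between versions):

    /*
     * worker side: ship only the item pointer of a qualifying tuple; the
     * buffer holding the tuple has to stay pinned until the master reads it.
     */
    static void
    worker_send_tid(shm_mq_handle *mqh, HeapTuple tup)
    {
        ItemPointerData tid = tup->t_self;

        (void) shm_mq_send(mqh, sizeof(tid), &tid, false);
    }

    /*
     * master side: re-fetch the tuple from the heap using the received TID;
     * only after that can the target list be evaluated.
     */
    static void
    master_fetch_and_project(Relation rel, Snapshot snapshot, ItemPointer tid)
    {
        HeapTupleData   tup;
        Buffer          buf;

        ItemPointerCopy(tid, &tup.t_self);
        if (heap_fetch(rel, snapshot, &tup, &buf, false, NULL))
        {
            /* projection still has to happen here, in the master only */
            ReleaseBuffer(buf);
        }
    }

The master still has to visit the heap page and evaluate the target list
itself, so whenever the node needs any projection that work stays in the
master.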

Regards,
Hari Babu
Fujitsu Australia


