Re: using custom scan nodes to prototype parallel sequential scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: using custom scan nodes to prototype parallel sequential scan
Date
Msg-id CAA4eK1KGuNut4H5K3_j52xbzJs+YqbKNYUQ0PtftBN1MH8Nd1g@mail.gmail.com
In response to Re: using custom scan nodes to prototype parallel sequential scan  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Tue, Nov 11, 2014 at 9:42 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Tue, Nov 11, 2014 at 2:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Tue, Nov 11, 2014 at 5:30 AM, Haribabu Kommi <kommi.haribabu@gmail.com>
> > wrote:
> >>
> >> On Tue, Nov 11, 2014 at 10:21 AM, Andres Freund <andres@2ndquadrant.com>
> >> wrote:
> >> > On 2014-11-10 10:57:16 -0500, Robert Haas wrote:
> >> >> Does parallelism help at all?
> >> >
> >> > I'm pretty damn sure. We can't even make a mildly powerful storage
> >> > fully busy right now. Heck, I can't make my workstation's storage with a
> >> > raid 10 out of four spinning disks fully busy.
> >> >
> >> > I think some of that benefit also could be reaped by being better at
> >> > hinting the OS...
> >>
> >> Yes, it definitely helps, and not only for I/O-bound operations.
> >> It also gives a good gain for queries with CPU-intensive WHERE
> >> conditions.
> >>
> >> One more point we may need to consider: is there any overhead in passing
> >> the data rows from the workers to the backend?
> >
> > I am not sure that overhead will be very visible if we improve the
> > use of the I/O subsystem by having parallel tasks work on it.
>
> I feel there may be an overhead because the workers need to put the result
> data in shared memory and the backend has to read it from there to process
> it further. If the cost of transferring data from worker to backend is more than
> the cost of fetching a tuple from the scan, then the overhead becomes visible
> when the selectivity is high.
>
> > However,
> > another idea here could be that instead of passing tuple data, we just
> > pass the tuple id, but in that case we have to retain the pin on the buffer
> > that contains the tuple until the master backend reads from it, which might
> > have its own kind of problems.
>
> Transferring the tuple id doesn't help in scenarios where the node needs any
> projection.

Hmm, that's why I said that we need to retain the buffer pin, so that we can
get the tuple data.
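
To make the idea concrete, here is a rough toy model in Python. It is only a
sketch of the protocol, not PostgreSQL code: ToyBufferPool, worker_scan and
master_consume are hypothetical names, pins are plain counters, and the
"queue" is an ordinary list rather than shared memory. The worker evaluates
the qual and passes only (block, offset) tuple ids, taking one pin per id it
emits; the master fetches the tuple data from the still-pinned buffer,
applies the projection, and only then releases the pin.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical tuple id: (block number, offset within the block).
TupleId = Tuple[int, int]


@dataclass
class ToyBufferPool:
    """Toy stand-in for shared buffers: block -> rows, plus pin counts."""
    blocks: Dict[int, List[tuple]]
    pins: Dict[int, int] = field(default_factory=dict)

    def pin(self, block: int) -> None:
        self.pins[block] = self.pins.get(block, 0) + 1

    def unpin(self, block: int) -> None:
        if self.pins.get(block, 0) <= 0:
            raise RuntimeError(f"block {block} is not pinned")
        self.pins[block] -= 1
        if self.pins[block] == 0:
            del self.pins[block]

    def fetch(self, tid: TupleId) -> tuple:
        block, offset = tid
        # Without a pin the buffer could have been evicted in a real system.
        if self.pins.get(block, 0) <= 0:
            raise RuntimeError(f"block {block} is not pinned")
        return self.blocks[block][offset]


def worker_scan(pool: ToyBufferPool,
                qual: Callable[[tuple], bool]) -> List[TupleId]:
    """Worker side: filter rows, emit tuple ids only, keep a pin per id."""
    queue: List[TupleId] = []
    for block, rows in pool.blocks.items():
        for offset, row in enumerate(rows):
            if qual(row):
                pool.pin(block)          # retained until the master reads it
                queue.append((block, offset))
    return queue


def master_consume(pool: ToyBufferPool, queue: List[TupleId],
                   project: Callable[[tuple], object]) -> list:
    """Master side: fetch tuple data via the id, project, then drop the pin."""
    result = []
    for tid in queue:
        row = pool.fetch(tid)            # tuple data is still reachable
        result.append(project(row))
        pool.unpin(tid[0])
    return result
```

The "own kind of problems" show up directly in the model: every id that the
master has not consumed yet keeps its block pinned, so a slow master holds
pins for as long as the queue is non-empty.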


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
