Re: Parallel Sequence Scan doubts - Mailing list pgsql-hackers
From: Haribabu Kommi
Subject: Re: Parallel Sequence Scan doubts
Date:
Msg-id: CAJrrPGd98mCCQiET0ceThjYg_ogdKs1dM3adPs4AyEAi18fY3w@mail.gmail.com
In response to: Re: Parallel Sequence Scan doubts (Craig Ringer <craig@2ndquadrant.com>)
Responses: Re: Parallel Sequence Scan doubts
List: pgsql-hackers
On Sun, Aug 24, 2014 at 12:34 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 08/24/2014 09:40 AM, Haribabu Kommi wrote:
>
>> Any suggestions?
>
> Another point I didn't raise first time around, but that's IMO quite
> significant, is that you haven't addressed why this approach to fully
> parallel seqscans is useful and solves real problems in effective ways.
>
> It might seem obvious - "of course they're useful". But I see two things
> they'd address:
>
> - CPU-limited sequential scans, where expensive predicates are filtering
> the scan; and

Yes, we are mainly targeting CPU-limited sequential scans. That is exactly
why I want the workers to evaluate the predicates as well, not just read
the tuples from disk.

> - I/O limited sequential scans, where the predicates already execute
> fast enough on one CPU, so most time is spent waiting for more disk I/O.
>
> The problem I see with your design is that it's not going to be useful
> for a large class of CPU-limited scans where the predicate isn't
> composed entirely of immutable functions and operators. Especially since
> immutable-only predicates are the best candidates for expression indexes
> anyway.
>
> While it'd likely be useful for I/O limited scans, it's going to
> increase contention on shared_buffers locking and page management. More
> importantly, is it the most efficient way to solve the problem with I/O
> limited scans?
>
> I would seriously suggest looking at first adding support for
> asynchronous I/O across ranges of extents during a sequential scan. You
> might not need multiple worker backends at all.
>
> I'm sure using async I/O to implement effective_io_concurrency for
> seqscans has been discussed and explored before, so again I think
> some time in the list archives might make sense.

Thanks for your inputs. I will check it.
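To make the intended division of labour concrete (workers evaluate the predicate themselves, instead of only reading blocks for the leader), here is a small structural sketch. It is only an illustration, not executor code: the table, chunking, and predicate are made-up stand-ins, and threads are used purely to keep it self-contained, whereas the real workers would be separate backend processes.

```python
# Toy model of the proposed design: each worker scans a disjoint chunk
# of the "heap" and applies the (CPU-heavy) predicate itself, so the
# leader only merges rows that already passed the filter.
from concurrent.futures import ThreadPoolExecutor

HEAP = list(range(1000))            # stand-in for the table's tuples

def expensive_predicate(row):       # stand-in for a costly immutable filter
    return row % 7 == 0

def scan_chunk(bounds):
    lo, hi = bounds
    return [row for row in HEAP[lo:hi] if expensive_predicate(row)]

def parallel_seqscan(nworkers=4):
    step = (len(HEAP) + nworkers - 1) // nworkers
    chunks = [(i, min(i + step, len(HEAP))) for i in range(0, len(HEAP), step)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        parts = pool.map(scan_chunk, chunks)    # workers filter in parallel
    return [row for part in parts for row in part]  # leader merges in order
```

The point of the sketch is only that the filtering cost moves into the workers; everything that makes this hard in PostgreSQL (shared buffers, function volatility, data sharing) is deliberately absent.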
> I don't know if it makes sense to do something as complex as parallel
> multi-process seqscans without having a path forward for supporting
> non-immutable functions - probably with fmgr API enhancements,
> additional function labels ("PARALLEL"), etc, depending on what you find
> is needed.

Thanks for your inputs. I will look for a way to extend the support to
non-immutable functions as well.

>>>> 3. In the executor Init phase, try to copy the necessary data required
>>>> by the workers and start the workers.
>>>
>>> Copy how?
>>>
>>> Back-ends can only communicate with each other over shared memory,
>>> signals, and using sockets.
>>
>> Sorry for not being clear; I mean copying those data structures into
>> dynamic shared memory, from where the workers can access them.
>
> That'll probably work with read-only data, but it's not viable for
> read/write data unless you use a big lock to protect it, in which case
> you lose the parallelism you want to achieve.

For now I am planning to share only read-only data with the workers. If
read/write data needs to be shared, we will need another approach to
handle it.

> You'd have to classify what may be modified during scan execution
> carefully and determine if you need to feed any of the resulting
> modifications back to the original backend - and how to merge
> modifications by multiple workers, if it's even possible.
>
> That's going to involve a detailed structure-by-structure analysis and
> seems likely to be error prone and buggy.

Thanks for your inputs. I will check it properly.

> I think you should probably talk to Robert Haas about what he's been
> doing over the last couple of years on parallel query.

Sure, I will check with him.

>>>> 4. In the executor run phase, just get the tuples which are sent by
>>>> the workers and process them further in the plan node execution.
>>>
>>> Again, how do you propose to copy these back to the main bgworker?
>> With the help of message queues created in dynamic shared memory:
>> the workers send tuples to the queue, and on the other side the main
>> backend receives the tuples from the queue.
>
> OK, so you plan to implement shmem queues.
>
> That'd be a useful starting point, as it'd be something that would be
> useful in its own right.

Shmem queues are already possible with dynamic shared memory; I just want
to use them here.

> You'd have to be able to handle individual values that're larger than
> the ring buffer or whatever you're using for transfers, in case you're
> dealing with already-detoasted tuples or in-memory tuples.
>
> Again, chatting with Robert and others who've worked on dynamic shmem,
> parallel query, etc would be wise here.
>
>> Yes, you are correct. For that reason I am thinking of supporting only
>> functions that depend solely on their input variables and do not modify
>> any global data.
>
> You'll want to be careful with that. Nothing stops an immutable function
> referencing a cache in a C global that it initializes once and then
> treats as read only, for example.
>
> I suspect you'll need a per-function whitelist. I'd love to be wrong.

Yes, we need per-function level details. Once we have a better solution
for handling non-immutable functions as well, these may not be required.

Regards,
Hari Babu
Fujitsu Australia
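For the worker-to-leader tuple transfer discussed in the thread, a minimal model of the message flow might look like the sketch below. It only mirrors the shape of the exchange; the per-worker end-of-stream sentinel is my own assumption, and PostgreSQL's actual ring buffers in dynamic shared memory involve much more (framing, flow control, and the oversized detoasted values Craig mentions).

```python
# Toy model of worker-to-leader tuple queues: each worker pushes the
# tuples it produced onto a shared queue, and the leader drains the
# queue until every worker has sent a "done" sentinel.  Threads stand
# in for the separate worker backends.
import queue
import threading

DONE = object()                     # per-worker end-of-stream marker

def worker(out_q, rows):
    for row in rows:
        if row % 2 == 0:            # stand-in predicate applied in the worker
            out_q.put(row)
    out_q.put(DONE)                 # tell the leader this worker is finished

def leader(nworkers, all_rows):
    out_q = queue.Queue()
    step = (len(all_rows) + nworkers - 1) // nworkers
    threads = [threading.Thread(target=worker,
                                args=(out_q, all_rows[i:i + step]))
               for i in range(0, len(all_rows), step)]
    for t in threads:
        t.start()
    results, finished = [], 0
    while finished < len(threads):  # drain until every worker signalled DONE
        item = out_q.get()
        if item is DONE:
            finished += 1
        else:
            results.append(item)
    for t in threads:
        t.join()
    return sorted(results)          # arrival order is nondeterministic
```

The sentinel counting is one simple way to answer "when has every worker finished?"; a real implementation would detach the queues instead.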