Re: Parallel Sequence Scan doubts - Mailing list pgsql-hackers
From: Haribabu Kommi
Subject: Re: Parallel Sequence Scan doubts
Date:
Msg-id: CAJrrPGd98mCCQiET0ceThjYg_ogdKs1dM3adPs4AyEAi18fY3w@mail.gmail.com
In response to: Re: Parallel Sequence Scan doubts (Craig Ringer <craig@2ndquadrant.com>)
Responses: Re: Parallel Sequence Scan doubts
List: pgsql-hackers
On Sun, Aug 24, 2014 at 12:34 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 08/24/2014 09:40 AM, Haribabu Kommi wrote:
>
>> Any suggestions?
>
> Another point I didn't raise first time around, but that's IMO quite
> significant, is that you haven't addressed why this approach to fully
> parallel seqscans is useful and solves real problems in effective ways.
>
> It might seem obvious - "of course they're useful". But I see two things
> they'd address:
>
> - CPU-limited sequential scans, where expensive predicates are filtering
> the scan; and

Yes, we are mainly targeting CPU-limited sequential scans. That is exactly
why I want the workers to evaluate the predicates as well, not just read
the tuples from disk.

> - I/O limited sequential scans, where the predicates already execute
> fast enough on one CPU, so most time is spent waiting for more disk I/O.
>
> The problem I see with your design is that it's not going to be useful
> for a large class of CPU-limited scans where the predicate isn't
> composed entirely of immutable functions and operators. Especially since
> immutable-only predicates are the best candidates for expression indexes
> anyway.
>
> While it'd likely be useful for I/O limited scans, it's going to
> increase contention on shared_buffers locking and page management. More
> importantly, is it the most efficient way to solve the problem with I/O
> limited scans?
>
> I would seriously suggest looking at first adding support for
> asynchronous I/O across ranges of extents during a sequential scan. You
> might not need multiple worker backends at all.
>
> I'm sure using async I/O to implement effective_io_concurrency for
> seqscans has been discussed and explored before, so again I think
> some time in the list archives might make sense.

Thanks for your inputs. I will check it.
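To make the intended division of labour concrete (workers evaluate the predicate themselves, instead of only reading blocks for the leader), here is a small structural sketch. It is only an illustration, not executor code: the table, chunking, and predicate are made-up stand-ins, and threads are used purely to keep it self-contained, whereas the real workers would be separate backend processes.

```python
# Toy model of the proposed design: each worker scans a disjoint chunk
# of the "heap" and applies the (CPU-heavy) predicate itself, so the
# leader only merges rows that already passed the filter.
from concurrent.futures import ThreadPoolExecutor

HEAP = list(range(1000))            # stand-in for the table's tuples

def expensive_predicate(row):       # stand-in for a costly immutable filter
    return row % 7 == 0

def scan_chunk(bounds):
    lo, hi = bounds
    return [row for row in HEAP[lo:hi] if expensive_predicate(row)]

def parallel_seqscan(nworkers=4):
    step = (len(HEAP) + nworkers - 1) // nworkers
    chunks = [(i, min(i + step, len(HEAP))) for i in range(0, len(HEAP), step)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        parts = pool.map(scan_chunk, chunks)    # workers filter in parallel
    return [row for part in parts for row in part]  # leader merges in order
```

The point of the sketch is only that the filtering cost moves into the workers; everything that makes this hard in PostgreSQL (shared buffers, function volatility, data sharing) is deliberately absent.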
> I don't know if it makes sense to do something as complex as parallel
> multi-process seqscans without having a path forward for supporting
> non-immutable functions - probably with fmgr API enhancements,
> additional function labels ("PARALLEL"), etc, depending on what you find
> is needed.

Thanks for your inputs. I will look for a way to extend the support to
non-immutable functions as well.

>>>> 3. In the executor Init phase, try to copy the necessary data required
>>>> by the workers and start the workers.
>>>
>>> Copy how?
>>>
>>> Back-ends can only communicate with each other over shared memory,
>>> signals, and using sockets.
>>
>> Sorry for not being clear; I mean copying those data structures into
>> dynamic shared memory, from where the workers can access them.
>
> That'll probably work with read-only data, but it's not viable for
> read/write data unless you use a big lock to protect it, in which case
> you lose the parallelism you want to achieve.

For now I am planning to share only read-only data with the workers. If
read/write data needs to be shared, we will need another approach to
handle it.

> You'd have to classify what may be modified during scan execution
> carefully and determine if you need to feed any of the resulting
> modifications back to the original backend - and how to merge
> modifications by multiple workers, if it's even possible.
>
> That's going to involve a detailed structure-by-structure analysis and
> seems likely to be error prone and buggy.

Thanks for your inputs. I will check it properly.

> I think you should probably talk to Robert Haas about what he's been
> doing over the last couple of years on parallel query.

Sure, I will check with him.

>>>> 4. In the executor run phase, just get the tuples which are sent by
>>>> the workers and process them further in the plan node execution.
>>>
>>> Again, how do you propose to copy these back to the main bgworker?
>> With the help of message queues created in dynamic shared memory:
>> the workers send tuples to the queue, and on the other side the main
>> backend receives the tuples from the queue.
>
> OK, so you plan to implement shmem queues.
>
> That'd be a useful starting point, as it'd be something that would be
> useful in its own right.

Shmem queues are already possible with dynamic shared memory; I just want
to use them here.

> You'd have to be able to handle individual values that're larger than
> the ring buffer or whatever you're using for transfers, in case you're
> dealing with already-detoasted tuples or in-memory tuples.
>
> Again, chatting with Robert and others who've worked on dynamic shmem,
> parallel query, etc would be wise here.
>
>> Yes, you are correct. For that reason I am thinking of supporting only
>> functions that depend solely on their input variables and do not modify
>> any global data.
>
> You'll want to be careful with that. Nothing stops an immutable function
> referencing a cache in a C global that it initializes once and then
> treats as read only, for example.
>
> I suspect you'll need a per-function whitelist. I'd love to be wrong.

Yes, we need per-function level details. Once we have a better solution
for handling non-immutable functions as well, these may not be required.

Regards,
Hari Babu
Fujitsu Australia
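For the worker-to-leader tuple transfer discussed in the thread, a minimal model of the message flow might look like the sketch below. It only mirrors the shape of the exchange; the per-worker end-of-stream sentinel is my own assumption, and PostgreSQL's actual ring buffers in dynamic shared memory involve much more (framing, flow control, and the oversized detoasted values Craig mentions).

```python
# Toy model of worker-to-leader tuple queues: each worker pushes the
# tuples it produced onto a shared queue, and the leader drains the
# queue until every worker has sent a "done" sentinel.  Threads stand
# in for the separate worker backends.
import queue
import threading

DONE = object()                     # per-worker end-of-stream marker

def worker(out_q, rows):
    for row in rows:
        if row % 2 == 0:            # stand-in predicate applied in the worker
            out_q.put(row)
    out_q.put(DONE)                 # tell the leader this worker is finished

def leader(nworkers, all_rows):
    out_q = queue.Queue()
    step = (len(all_rows) + nworkers - 1) // nworkers
    threads = [threading.Thread(target=worker,
                                args=(out_q, all_rows[i:i + step]))
               for i in range(0, len(all_rows), step)]
    for t in threads:
        t.start()
    results, finished = [], 0
    while finished < len(threads):  # drain until every worker signalled DONE
        item = out_q.get()
        if item is DONE:
            finished += 1
        else:
            results.append(item)
    for t in threads:
        t.join()
    return sorted(results)          # arrival order is nondeterministic
```

The sentinel counting is one simple way to answer "when has every worker finished?"; a real implementation would detach the queues instead.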