Re: Parallel Sequence Scan doubts - Mailing list pgsql-hackers

From: Craig Ringer
Subject: Re: Parallel Sequence Scan doubts
Msg-id: 53F94F42.4040702@2ndquadrant.com
In response to: Re: Parallel Sequence Scan doubts  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses: Re: Parallel Sequence Scan doubts
List: pgsql-hackers

On 08/24/2014 09:40 AM, Haribabu Kommi wrote:

> Any suggestions?

Another point I didn't raise first time around, but that's IMO quite
significant, is that you haven't addressed why this approach to fully
parallel seqscans is useful and solves real problems in effective ways.

It might seem obvious - "of course they're useful". But I see two things
they'd address:

- CPU-limited sequential scans, where expensive predicates are filtering
the scan; and

- I/O limited sequential scans, where the predicates already execute
fast enough on one CPU, so most time is spent waiting for more disk I/O.

The problem I see with your design is that it's not going to be useful
for a large class of CPU-limited scans where the predicate isn't
composed entirely of immutable functions and operators. Especially since
immutable-only predicates are the best candidates for expression indexes
anyway.

While it'd likely be useful for I/O limited scans, it's going to
increase contention on shared_buffers locking and page management. More
importantly, is it the most efficient way to solve the problem with I/O
limited scans?

I would seriously suggest looking at first adding support for
asynchronous I/O across ranges of extents during a sequential scan. You
might not need multiple worker backends at all.

I'm sure using async I/O to implement effective_io_concurrency for
seqscans has been discussed and explored before, so again I think
some time in the list archives might make sense.
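
To make the single-backend alternative concrete, here's a minimal sketch of
readahead during a sequential scan. It assumes Linux/POSIX and uses
posix_fadvise(POSIX_FADV_WILLNEED), which is only a readahead hint to the
kernel, not true asynchronous I/O. BLOCK_SIZE, PREFETCH_DISTANCE and
scan_with_readahead() are made-up names for illustration; none of this is
PostgreSQL code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192        /* assumption: same as the default page size */
#define PREFETCH_DISTANCE 32   /* hypothetical tuning knob, in blocks */

/*
 * Scan a file block by block from a single process, hinting the kernel to
 * start reading PREFETCH_DISTANCE blocks ahead of the block being processed.
 * Stores the sum of all bytes in *byte_sum (a stand-in for real per-tuple
 * work). Returns the number of blocks processed, or -1 on error.
 */
long scan_with_readahead(const char *path, unsigned long *byte_sum)
{
    char buf[BLOCK_SIZE];
    long nblocks = 0;
    ssize_t n;
    int fd;

    assert(path != NULL && byte_sum != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    *byte_sum = 0;

    /* Prime the window. The call is advisory, so its result is ignored. */
    (void) posix_fadvise(fd, 0, (off_t) PREFETCH_DISTANCE * BLOCK_SIZE,
                         POSIX_FADV_WILLNEED);

    while ((n = pread(fd, buf, BLOCK_SIZE, (off_t) nblocks * BLOCK_SIZE)) > 0)
    {
        /* Hint the one block that has just entered the prefetch window. */
        off_t ahead = (off_t) (nblocks + PREFETCH_DISTANCE) * BLOCK_SIZE;
        (void) posix_fadvise(fd, ahead, BLOCK_SIZE, POSIX_FADV_WILLNEED);

        /* "Process" the current block while the kernel reads ahead. */
        for (ssize_t i = 0; i < n; i++)
            *byte_sum += (unsigned char) buf[i];
        nblocks++;
    }

    close(fd);
    return n < 0 ? -1 : nblocks;
}
```

Whether a hint like this is enough, or whether real async I/O is needed,
depends on the storage and the kernel, so it'd have to be measured.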

I don't know if it makes sense to do something as complex as parallel
multi-process seqscans without having a path forward for supporting
non-immutable functions - probably with fmgr API enhancements,
additional function labels ("PARALLEL"), etc, depending on what you find
is needed.

Do you have specific workloads where you see this as useful, and where
doing async I/O and readahead within a single back-end wouldn't solve
the same problem?


>>> 3. In the executor Init phase, Try to copy the necessary data required
>>> by the workers and start the workers.
>>
>> Copy how?
>>
>> Back-ends can only communicate with each other over shared memory,
>> signals, and using sockets.
> 
> Sorry for not being clear, copying those data structures into dynamic
> shared memory only.
> From there the workers can access.

That'll probably work with read-only data, but it's not viable for
read/write data unless you use a big lock to protect it, in which case
you lose the parallelism you want to achieve.

You'd have to classify what may be modified during scan execution
carefully and determine if you need to feed any of the resulting
modifications back to the original backend - and how to merge
modifications by multiple workers, if it's even possible.

That's going to involve a detailed structure-by-structure analysis and
seems likely to be error prone and buggy.

I think you should probably talk to Robert Haas about what he's been
doing over the last couple of years on parallel query.

>>> 4. In the executor run phase, just get the tuples which are sent by
>>> the workers and process them further in the plan node execution.
>>
>> Again, how do you propose to copy these back to the main bgworker?
> 
> With the help of message queues that are created in the dynamic shared memory,
> the workers can send the data to the queue. On other side the main
> backend receives the tuples from the queue.

OK, so you plan to implement shmem queues.

That'd be a good starting point, as it'd be useful in its own right.

You'd have to be able to handle individual values that're larger than the
ring buffer or whatever you're using for transfers, in case you're dealing
with already-detoasted tuples or in-memory tuples.
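
One way to cope with oversized values is to send a length header and then
stream the payload through the ring in whatever pieces fit. Here's a rough
single-process sketch of that idea. Ring, ring_write(), ring_read() and
transfer_value() are all hypothetical names, the ring is plain memory rather
than dynamic shared memory, and the sender and receiver simply alternate in
a loop where real backends would sleep and wake each other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 16   /* deliberately tiny so that values overflow it */

typedef struct
{
    unsigned char data[RING_SIZE];
    size_t head;       /* total bytes ever written */
    size_t tail;       /* total bytes ever read */
} Ring;

/* Copy in as many bytes as currently fit; returns the number accepted. */
size_t ring_write(Ring *r, const unsigned char *src, size_t len)
{
    size_t used = r->head - r->tail;
    size_t n;

    assert(used <= RING_SIZE);
    n = len < RING_SIZE - used ? len : RING_SIZE - used;
    for (size_t i = 0; i < n; i++)
        r->data[(r->head + i) % RING_SIZE] = src[i];
    r->head += n;
    return n;
}

/* Copy out up to len available bytes; returns the number delivered. */
size_t ring_read(Ring *r, unsigned char *dst, size_t len)
{
    size_t avail = r->head - r->tail;
    size_t n = len < avail ? len : avail;

    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(r->tail + i) % RING_SIZE];
    r->tail += n;
    return n;
}

/*
 * Move one value of arbitrary length through the ring in chunks.
 * Returns the payload length received, or -1 if it exceeds out_cap.
 */
long transfer_value(Ring *r, const unsigned char *value, size_t len,
                    unsigned char *out, size_t out_cap)
{
    unsigned char hdr[sizeof(size_t)], hdr_in[sizeof(size_t)];
    size_t hdr_sent = 0, sent = 0, hdr_got = 0, got = 0, expected = 0;

    memcpy(hdr, &len, sizeof(len));
    do
    {
        /* Sender side: header first, then as much payload as fits. */
        if (hdr_sent < sizeof(hdr))
            hdr_sent += ring_write(r, hdr + hdr_sent, sizeof(hdr) - hdr_sent);
        if (hdr_sent == sizeof(hdr))
            sent += ring_write(r, value + sent, len - sent);

        /* Receiver side: learn the length, then drain the payload. */
        if (hdr_got < sizeof(hdr_in))
        {
            hdr_got += ring_read(r, hdr_in + hdr_got, sizeof(hdr_in) - hdr_got);
            if (hdr_got == sizeof(hdr_in))
            {
                memcpy(&expected, hdr_in, sizeof(expected));
                if (expected > out_cap)
                    return -1;      /* receiver cannot hold the value */
            }
        }
        if (hdr_got == sizeof(hdr_in))
            got += ring_read(r, out + got, expected - got);
    } while (hdr_got < sizeof(hdr_in) || got < expected);

    return (long) got;
}
```

The receiver needs somewhere to reassemble the value, so the memory cost of
a huge tuple doesn't go away; it just moves out of the ring.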

Again, chatting with Robert and others who've worked on dynamic shmem,
parallel query, etc would be wise here.

> Yes you are correct. For that reason only I am thinking of Supporting
> of functions
> that only dependent on input variables and are not modifying any global data.

You'll want to be careful with that. Nothing stops an immutable function
referencing a cache in a C global that it initializes once and then
treats as read only, for example.
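
A contrived illustration of what I mean, not taken from any real extension:
crc32_of() below returns a result that depends only on its input, so
labelling it immutable would be honest, yet the first call in each process
writes to C globals. The names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Process-local state: every backend or worker has its own copy. */
static uint32_t crc_table[256];
static bool crc_table_ready = false;

/* Build the standard reflected CRC-32 table (polynomial 0xEDB88320). */
static void build_crc_table(void)
{
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t c = i;

        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[i] = c;
    }
}

/*
 * Pure in the sense that matters to the planner: same input, same output.
 * Not pure in the sense that matters here: the first call mutates globals.
 */
uint32_t crc32_of(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    if (!crc_table_ready)
    {
        build_crc_table();          /* one-time write to global state */
        crc_table_ready = true;
    }
    for (size_t i = 0; i < len; i++)
        crc = crc_table[(crc ^ buf[i]) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

This particular cache happens to be harmless when each worker is a separate
process, since every process just rebuilds the same table. The point is that
nothing in the function's declared volatility tells you that.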

I suspect you'll need a per-function whitelist. I'd love to be wrong.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
