Re: Parallel Seq Scan - Mailing list pgsql-hackers
From | Robert Haas
---|---
Subject | Re: Parallel Seq Scan
Date |
Msg-id | CA+TgmoY_grYf9S3zf6bBsRK_8UudtKrhZdrkDzsEtAALZVHkbw@mail.gmail.com
In response to | Re: Parallel Seq Scan (Andres Freund <andres@2ndquadrant.com>)
Responses | Re: Parallel Seq Scan (Amit Kapila <amit.kapila16@gmail.com>), Re: Parallel Seq Scan (Andres Freund <andres@2ndquadrant.com>)
List | pgsql-hackers
On Sat, Feb 7, 2015 at 4:30 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2015-02-06 22:57:43 -0500, Robert Haas wrote:
>> On Fri, Feb 6, 2015 at 2:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > My first comment here is that I think we should actually teach
>> > heapam.c about parallelism.
>>
>> I coded this up; see attached. I'm also attaching an updated version
>> of the parallel count code revised to use this API. It's now called
>> "parallel_count" rather than "parallel_dummy" and I removed some
>> stupid stuff from it. I'm curious to see what other people think, but
>> this seems much cleaner to me. With the old approach, the
>> parallel-count code was duplicating some of the guts of heapam.c and
>> dropping the rest on the floor; now it just asks for a parallel scan
>> and away it goes. Similarly, if your parallel-seqscan patch wanted to
>> scan block-by-block rather than splitting the relation into equal
>> parts, or if it wanted to participate in the synchronized-seqscan
>> stuff, there was no clean way to do that. With this approach, those
>> decisions are - as they quite properly should be - isolated within
>> heapam.c, rather than creeping into the executor.
>
> I'm not convinced that that reasoning is generally valid. While it may
> work out nicely for seqscans - which might be useful enough on its own -
> the more stuff we parallelize the *more* the executor will have to know
> about it to make it sane. To actually scale nicely e.g. a parallel sort
> will have to execute the nodes below it on each backend, instead of
> doing that in one as a separate step, ferrying over all tuples to
> individual backends through queues, and only then parallelizing the
> sort.
>
> Now. None of that is likely to matter immediately, but I think starting
> to build the infrastructure at the points where we'll later need it does
> make some sense.

Well, I agree with you, but I'm not really sure what that has to do with the issue at hand.
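To make the abstraction point concrete, here is a minimal sketch (not the actual patch's API; the names `ParallelScanState` and `parallel_scan_next_block` are invented for illustration) of the idea that heapam.c, rather than the executor, hands out the next block each worker should scan:

```c
#include <stdint.h>

/* Hypothetical, simplified parallel-scan state.  In a real
 * implementation this would live in dynamic shared memory and
 * next_block would be advanced with an atomic fetch-and-add. */
typedef struct ParallelScanState
{
    uint32_t nblocks;    /* total blocks in the relation */
    uint32_t next_block; /* shared cursor over those blocks */
} ParallelScanState;

#define InvalidBlock ((uint32_t) -1)

/* Each cooperating backend calls this to claim its next block.
 * Because the dispatch policy lives here, heapam.c is free to
 * switch to equal-sized chunks or synchronized-scan start points
 * without the executor knowing or caring. */
uint32_t
parallel_scan_next_block(ParallelScanState *state)
{
    if (state->next_block >= state->nblocks)
        return InvalidBlock;    /* scan exhausted */
    return state->next_block++;
}
```

The executor side then just sees a stream of tuples; which backend scanned which block stays an access-method detail.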
I mean, if we were to apply Amit's patch, we'd be in a situation where, for a non-parallel heap scan, heapam.c decides the order in which blocks get scanned, but for a parallel heap scan, nodeParallelSeqscan.c makes that decision. Maybe I'm an old fuddy-duddy[1] but that seems like an abstraction violation to me. I think the executor should see a parallel scan as a stream of tuples that streams into a bunch of backends in parallel, without really knowing how heapam.c is dividing up the work. That's how it's modularized today, and I don't see a reason to change it. Do you?

Regarding tuple flow between backends, I've thought about that before, I agree that we need it, and I don't think I know how to do it. I can see how to have a group of processes executing a single node in parallel, or a single process executing a group of nodes we break off from the query tree and push down to it, but what you're talking about here is a group of processes executing a group of nodes jointly. That seems like an excellent idea, but I don't know how to design it. Actually routing the tuples between whichever backends we want to exchange them between is easy enough, but how do we decide whether to generate such a plan? What does the actual plan tree look like? Maybe we designate nodes as can-generate-multiple-tuple-streams (seq scan, mostly, I would think) and can-absorb-parallel-tuple-streams (sort, hash, materialize), or something like that, but I'm really fuzzy on the details.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Actually, there's not really any "maybe" about this.
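P.S. The can-generate / can-absorb idea above could be sketched as node capability flags plus a legality check on each producer-consumer edge of the plan tree. Everything here is speculative shorthand for the mail's hand-waving, not PostgreSQL API; the enum and helper names are made up:

```c
#include <stdbool.h>

/* Hypothetical per-node capability flags, as speculated above. */
typedef enum NodeParallelCaps
{
    PARCAP_NONE           = 0,
    PARCAP_GENERATE_MULTI = 1 << 0, /* can emit multiple tuple streams,
                                     * e.g. seq scan */
    PARCAP_ABSORB_MULTI   = 1 << 1  /* can consume parallel streams,
                                     * e.g. sort, hash, materialize */
} NodeParallelCaps;

/* An edge of the plan tree is legal for parallel execution only if
 * a producer of multiple streams feeds a consumer able to absorb
 * them; a single stream can always be consumed. */
bool
parallel_edge_ok(int producer_caps, int consumer_caps)
{
    if (producer_caps & PARCAP_GENERATE_MULTI)
        return (consumer_caps & PARCAP_ABSORB_MULTI) != 0;
    return true;
}
```

A planner could walk the tree applying this check to decide whether a jointly-executed parallel fragment is even well-formed, before costing it.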