Re: Parallel Seq Scan - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Parallel Seq Scan
Msg-id: CA+Tgmobf_9hVOToSptM05oEe6cM5ohBt=GyNfb-o+MJt9nQSYw@mail.gmail.com
In response to: Re: Parallel Seq Scan (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Sat, Sep 26, 2015 at 10:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Sep 26, 2015 at 8:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> QueryDesc's totaltime is for instrumentation information for plugins
>>> like pg_stat_statements, and we need only the total buffer usage
>>> of each worker to make it work, as the other information is already
>>> collected in the master backend, so I think that should work as I have
>>> written.
>>
>> I don't think that's right at all.  First, an extension can choose to
>> look at any part of the Instrumentation, not just the buffer usage.
>> Secondly, the buffer usage inside QueryDesc's totaltime isn't the same
>> as the global pgBufferUsage.
>
> Oh... but I'm wrong.  As long as our local pgBufferUsage gets updated
> correctly to incorporate the data from the other workers, the
> InstrStopNode(queryDesc->totaltime) will suck in those statistics.
> And the only other things getting updated are nTuples (which shouldn't
> count anything that the workers did), firsttuple (similarly), and
> counter (where the behavior is a bit more arguable, but just counting
> the master's wall-clock time is at least defensible).  So now I think
> you're right: this should be OK.

OK, so here's a patch extracted from your
parallel_seqscan_partialseqscan_v18.patch with a fairly substantial
amount of rework by me:

- I left out the Funnel node itself; this is just the infrastructure
  portion of the patch.  I also left out the stop-the-executor-early
  stuff and the serialization of PARAM_EXEC values.  I want to have
  those things, but I think they need more thought and study first.

- I reorganized the code a fair amount into a form that I thought was
  clearer, and certainly is closer to what I did previously in
  parallel.c.  Your version had lots of functions with lots of
  parameters, and I found that made the logic difficult to follow, at
  least for me.
  As part of that, I munged the interface a bit so that execParallel.c
  returns a structure with a bunch of pointers in it instead of
  separately returning each one as an out parameter.  I think that's
  cleaner.  If we need to add more stuff in the future, that way we
  don't break existing callers.

- I reworked the interface with instrument.c and tried to preserve
  something of an abstraction boundary there.  I also changed the way
  that stuff accumulates statistics to include more things; I couldn't
  see any reason to make it as narrow as you had it.

- I did a bunch of cosmetic cleanup, including changing function names
  and rewriting comments.

- I replaced your code for serializing and restoring a ParamListInfo
  with my version.

- I fixed the code so that it can handle collecting instrumentation
  data from multiple nodes, bringing all the data back to the leader
  and associating it with the right plan node.  This involved giving
  every plan node a unique ID, as discussed with Tom on another recent
  thread.

After I did all that, I wrote some test code, which is also attached
here, that adds a new GUC, force_parallel_worker.  If you set that
GUC, when you run a query, it'll run the query in a parallel worker
and feed the results back to the master.  I've tested this and it
seems to work, at least on the queries where you'd expect it to work.
It's just test code, so it doesn't have error checking or make any
attempt not to push down queries that will fail in parallel mode.  But
you can use it to see what happens.  You can also run queries under
EXPLAIN ANALYZE this way and, lo and behold, the worker stats show up
attached to the correct plan nodes.

I intend to commit this patch (but not the crappy test code, of
course) pretty soon, and then I'm going to start working on the
portion of the patch that actually adds the Funnel node, which I think
you are working on renaming to Gather.
I think that getting that part committed is likely to be pretty
straightforward; it doesn't need to do a lot more than call this stuff
and tell it to go do its thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company