Re: Custom Scan APIs (Re: Custom Plan node) - Mailing list pgsql-hackers
From | Stephen Frost
Subject | Re: Custom Scan APIs (Re: Custom Plan node)
Date |
Msg-id | 20140226152339.GH2921@tamriel.snowman.net
In response to | Re: Custom Scan APIs (Re: Custom Plan node) (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses | Re: Custom Scan APIs (Re: Custom Plan node); Re: Custom Scan APIs (Re: Custom Plan node)
List | pgsql-hackers

* Kouhei Kaigai (kaigai@ak.jp.nec.com) wrote:
> IIUC, his approach was integration of join-pushdown within FDW APIs,
> however, it does not mean the idea of remote-join is rejected.

For my part, trying to consider doing remote joins *without* going through
FDWs is just nonsensical.  What are you joining remotely if not two foreign
tables?

With regard to the GPU approach, if that model works whereby the normal PG
tuples are read off disk, fed over to the GPU, processed, and then returned
to the user through PG, then I wouldn't consider it really a 'remote' join
but rather simply a new execution node inside of PG which is planned and
costed just like the others.  We've already been over the discussion of
trying to make that a pluggable system, but the very reasonable push-back
has been whether it's really possible, and really makes sense, to be
pluggable.  It certainly doesn't *have* to be- PostgreSQL is written in C,
as we all know, and plenty of C code talks to GPUs and shuffles memory
around- and that's almost exactly what Robert is working on supporting with
regular CPUs and PG backends already.  In many ways, trying to conflate
this idea of using-GPUs-to-do-work with the idea of remote-FDW-joins has
really disillusioned me with regard to the CustomScan approach.

> > Then perhaps they should be exposed more directly?  I can understand
> > generally useful functionality being exposed in a way that anyone can use
> > it, but we need to avoid interfaces which can't be stable due to normal
> > / ongoing changes to the backend code.
> >
> The functions my patches want to expose are:
>  - get_restriction_qual_cost()
>  - fix_expr_common()

I'll try to find time to go look at these in more detail later this week.
I have reservations about exposing the current cost estimates, as we may
want to adjust them in the future- but such adjustments may need to be made
in balance with other changes throughout the system, and an external module
which depends on one result from the qual costing might end up having
problems when the costing changes because the extension author wasn't aware
of changes happening in other areas of the costing.  I'm talking about this
from a "beyond-just-the-GUCs" point of view; I realize that the extension
author could go look at the GUC settings, but it's entirely reasonable to
believe we'll make changes to the default GUC settings, along with how
they're used, in the future.

> And, the functions my patches newly want are:
>  - bms_to_string()
>  - bms_from_string()

Offhand, these look fine, if there's really an external use for them.  Will
try to look at them in more detail later.

> > That's fine, if we can get data to and from those co-processors efficiently
> > enough that it's worth doing so.  If moving the data to the GPU's memory
> > will take longer than running the actual aggregation, then it doesn't make
> > any sense for regular tables because then we'd have to cache the data in
> > the GPU's memory in some way across multiple queries, which isn't something
> > we're set up to do.
> >
> When I made a prototype implementation on top of FDW, using CUDA, it enabled
> to run sequential scan 10 times faster than SeqScan on regular tables, if
> qualifiers are enough complex.
> Library to communicate GPU (OpenCL/CUDA) has asynchronous data transfer
> mode using hardware DMA. It allows to hide the cost of data transfer by
> pipelining, if here is enough number of records to be transferred.
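If I'm following, that's the usual stream-based double buffering: while the
GPU is evaluating one batch of tuples, the host-to-device copy of the next
batch is already in flight on another stream.  Roughly along these lines (a
sketch only- the chunk size, stream count, and kernel are placeholders of
mine, not anything taken from your patch):

    #include <cuda_runtime.h>
    #include <string.h>

    #define NCHUNKS    2                   /* double buffering */
    #define CHUNKSIZE  (4 * 1024 * 1024)   /* bytes of tuples per batch */

    static void
    pipelined_scan(const char *src, size_t total)
    {
        cudaStream_t stream[NCHUNKS];
        char        *hbuf[NCHUNKS];        /* pinned host buffers, DMA-able */
        char        *dbuf[NCHUNKS];        /* device-side buffers */
        int          buf = 0;

        for (int i = 0; i < NCHUNKS; i++)
        {
            cudaStreamCreate(&stream[i]);
            cudaMallocHost((void **) &hbuf[i], CHUNKSIZE);
            cudaMalloc((void **) &dbuf[i], CHUNKSIZE);
        }

        for (size_t off = 0; off < total;
             off += CHUNKSIZE, buf = (buf + 1) % NCHUNKS)
        {
            size_t  len = (total - off < CHUNKSIZE) ? total - off : CHUNKSIZE;

            /* wait for whatever was previously queued on this buffer's stream */
            cudaStreamSynchronize(stream[buf]);

            /* stage the next batch of tuples into pinned memory */
            memcpy(hbuf[buf], src + off, len);

            /* asynchronous DMA to the device; returns immediately */
            cudaMemcpyAsync(dbuf[buf], hbuf[buf], len,
                            cudaMemcpyHostToDevice, stream[buf]);

            /*
             * A qualifier-evaluation kernel would be launched on the same
             * stream here, e.g.
             *     eval_quals<<<grid, block, 0, stream[buf]>>>(dbuf[buf], len);
             * so it starts as soon as its copy completes, while the other
             * stream's copy and kernel are still in flight.
             */
        }

        for (int i = 0; i < NCHUNKS; i++)
            cudaStreamSynchronize(stream[i]);
    }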
That sounds very interesting, and certainly figuring out the costing to
support that model will be tricky.  Shuffling the data around in that way
will also be interesting.  It strikes me that it'll be made more difficult
if we're trying to do it through the limitations of a pre-defined API
between the core code and an extension.

> Also, the recent trend of semiconductor device is GPU integration with CPU,
> that shares a common memory space. See, Haswell of Intel, Kaveri of AMD, or
> Tegra K1 of nvidia. All of them shares same memory, so no need to transfer
> the data to be calculated. This trend is dominated by physical law because
> of energy consumption by semiconductor. So, I'm optimistic for my idea.

And this just makes me wonder why the focus isn't on the background worker
approach instead of trying to do this all in an extension.

> The usage was found by the contrib module that wants to call static
> functions, or feature to translate existing data structure to/from
> cstring. But, anyway, does separated patch make sense?

I haven't had a chance to go back and look into the functions in detail,
but offhand I'd say the bms ones are probably fine, while the others would
need more research as to whether they make sense to expose to an extension.

> Hmm... It seems to me we should follow the existing manner to construct
> join path, rather than special handling. Even if a query contains three or
> more foreign tables managed by same server, it shall be consolidated into
> one remote join as long as its cost is less than local ones.

I'm not convinced that it's going to be that simple, but I'm certainly
interested in the general idea.

> So, I'd like to bed using the new add_join_path_hook to compute possible
> join path. If remote join implemented by custom-scan is cheaper than local
> join, it shall be chosen, then optimizer will try joining with other foreign
> tables with this custom-scan node. If remote-join is still cheap, then it
> shall be consolidated again.

And I'm still unconvinced that trying to make this a hook implemented by an
extension makes sense.

> > Admittedly, getting the costing right isn't easy either, but it's not clear
> > to me how it'd make sense for the local server to be doing costing for remote
> > servers.
> >
> Right now, I ignored the cost to run remote-server, focused on the cost to
> transfer via network. It might be an idea to discount the CPU cost of remote
> execution.

Pretty sure we're going to need to consider the remote processing cost of
the join as well.

	Thanks,

		Stephen
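P.S.  To make the costing concern above a bit more concrete: if
get_restriction_qual_cost() were exported, I'd expect an extension's custom
scan costing to end up looking roughly like the sketch below.  This is
purely illustrative, patterned on what cost_seqscan() does today; the
function name and the hypothetical GPU discount are mine, not anything from
the patch.  Every hard-coded piece of this, and the GUC defaults behind it,
are things we've historically felt free to rebalance between releases,
which is why an extension depending on one piece of it in isolation worries
me.

    #include "postgres.h"
    #include "nodes/relation.h"
    #include "optimizer/cost.h"

    /*
     * Hypothetical costing for a custom GPU scan path, modelled on
     * cost_seqscan() (ignoring per-tablespace page costs for brevity).
     * get_restriction_qual_cost() is static in costsize.c today; this
     * assumes the patch's change to export it.
     */
    static void
    cost_gpuscan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
                 ParamPathInfo *param_info)
    {
        Cost        startup_cost = 0;
        Cost        run_cost = 0;
        QualCost    qpqual_cost;
        Cost        cpu_per_tuple;

        /* disk costs, same as a plain sequential scan */
        run_cost += seq_page_cost * baserel->pages;

        /* the call the patch wants to expose */
        get_restriction_qual_cost(root, baserel, param_info, &qpqual_cost);

        startup_cost += qpqual_cost.startup;
        cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;

        /*
         * A GPU-aware version would presumably discount qpqual_cost.per_tuple
         * here using some knob of its own (a made-up gpu_operator_cost, say),
         * but that discount only means anything relative to the current
         * defaults of cpu_operator_cost and friends.
         */
        run_cost += cpu_per_tuple * baserel->tuples;

        path->rows = baserel->rows;
        path->startup_cost = startup_cost;
        path->total_cost = startup_cost + run_cost;
    }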