Re: pgsql_fdw, FDW for PostgreSQL server - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: pgsql_fdw, FDW for PostgreSQL server |
Date | |
Msg-id | 12181.1331223482@sss.pgh.pa.us Whole thread Raw |
In response to | Re: pgsql_fdw, FDW for PostgreSQL server (Shigeru Hanada <shigeru.hanada@gmail.com>) |
Responses |
Re: pgsql_fdw, FDW for PostgreSQL server
|
List | pgsql-hackers |
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > [ pgsql_fdw_v15.patch ] I've been looking at this patch a little bit over the past day or so. I'm pretty unhappy with deparse.c --- it seems like a real kluge, inefficient and full of corner-case bugs. After some thought I believe that you're ultimately going to have to abandon depending on ruleutils.c for reverse-listing services, and it would be best to bite that bullet now and rewrite this code from scratch. ruleutils.c is serving two masters already (rule-dumping and EXPLAIN) and it's not going to be practical to tweak its behavior further for this usage; yet there are all sorts of clear problems that you are going to run into, boiling down to the fact that names on the remote end aren't necessarily the same as names on the local end. For instance, ruleutils.c is not going to be helpful at schema-qualifying function names in a way that's correct for the foreign server environment. Another issue is that as soon as you try to push down join clauses for parameterized paths, you are going to want Vars of other relations to be printed as parameters ($n, most likely) and ruleutils is not going to know to do that. Seeing that semantic constraints will greatly limit the set of node types that can ever be pushed down anyway, I think it's likely to be easiest to just write your own node-printing code and not even try to use ruleutils.c. There are a couple of other points that make me think we need to revisit the PlanForeignScan API definition some more, too. First, deparse.c is far from cheap. While this doesn't matter greatly as long as there's only one possible path for a foreign table, as soon as you want to create more than one it's going to be annoying to do all that work N times and then throw away N-1 of the results. I also choked on the fact that the pushdown patch thinks it can modify baserel->baserestrictinfo. That might accidentally fail to malfunction right now, but it's never going to scale to multiple paths with potentially different sets of remotely-applied constraints. So I'm thinking we really need to let FDWs in on the Path versus Plan distinction --- that is, a Path just needs to be a cheap summary of a way to do things, and then at createplan.c time you convert the selected Path into a full-fledged Plan. Most of the work done in deparse.c could be postponed to createplan time and done only once, even with multiple paths. The baserestrictinfo hack would be unnecessary too if the FDW had more direct control over generation of the ForeignScan plan node. Another thing I'm thinking we should let FDWs in on is the distinction between rowcount estimation and path generation. When we did the first API design last year it was okay to expect a single call to do both, but as of a couple months ago allpaths.c does those steps in two separate passes over the baserels, and it'd be handy if FDWs would cooperate. So we need to break down what PlanForeignScan currently does into three separate steps. The first idea that comes to mind is to call them GetForeignRelSize, GetForeignPaths, GetForeignPlan; but maybe somebody has a better idea for names? regards, tom lane
pgsql-hackers by date: