On Tue, Sep 1, 2015 at 11:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> It would be a bad idea to cling blindly to the FDW infrastructure if
> it's fundamentally inadequate to do what we want. On the other hand,
> it would also be a bad idea to set about recreating it without a
> really good reason, and - just to take one example - the fact that it
> doesn't currently push down DML operations to the remote side is not a
> really good reason to rewrite the whole thing. On the contrary, it's
> a reason to put some energy into the already-written patch which
> implements that optimization.
The problem with FDW for these purposes as I see it is that too much
intelligence is relegated to the implementer of the API. There needs
to be a mechanism so that the planner can rewrite the remote query and
then do some after the fact processing. This exactly what citus does;
if you send out AVG(foo) it rewrites that to SUM(foo) and COUNT(foo)
so that aggregation can be properly weighted to the result. To do
this (distributed OLAP-type processing) right, the planner needs to
*know* that this table is in fact distributed and also know that it
can make SQL compatible adjustments to the query.
This strikes me as a bit of a conflict of interest with FDW which
seems to want to hide the fact that it's foreign; the FDW
implementation makes it's own optimization decisions which might make
sense for single table queries but breaks down in the face of joins.
merlin