Re: FDW for PostgreSQL - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: FDW for PostgreSQL |
Date | |
Msg-id | 3290.1361442627@sss.pgh.pa.us Whole thread Raw |
In response to | Re: FDW for PostgreSQL (Shigeru Hanada <shigeru.hanada@gmail.com>) |
Responses |
Re: FDW for PostgreSQL
(Albe Laurenz <laurenz.albe@wien.gv.at>)
Re: FDW for PostgreSQL (Thom Brown <thom@linux.com>) Re: FDW for PostgreSQL (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
Shigeru Hanada <shigeru.hanada@gmail.com> writes: > [ postgres_fdw.v5.patch ] Applied with a lot of revisions. There are still a number of loose ends and things that need to be discussed: I'm not at all happy with the planner support functions --- as designed, that code is basically incapable of thinking about more than one remote access path. It needs to be restructured and then extended to be able to generate parameterized remote paths. The "local estimation" mode is pretty bogus as well. I thought the patch could be committed anyway, but I'm going to go back and work on that part later. I doubt that deparseFuncExpr is doing the right thing by printing casts as their underlying function calls. The comment claims that this way is more robust but I rather think it's less so, because it's effectively assuming that the remote server implements casts exactly like the local one, which might be incorrect if the remote is a different Postgres version. I think we should probably change that, but would like to know what the argument was for coding it like this. Also, if this is to be the approach to printing casts, why is RelabelType handled differently? As I mentioned earlier, I think it would be better to force the remote session's search_path setting to just "pg_catalog" and then reduce the amount of explicit schema naming in the queries --- any opinions about that? I took out the checks on collations of operators because I thought they were thoroughly broken. In the first place, looking at operator names to deduce semantics is unsafe (if we were to try to distinguish equality, looking up btree opclass membership would be the way to do that). In the second place, restricting only collation-sensitive operators and not collation-sensitive functions seems just about useless for guaranteeing safety. But we don't have any very good handle on which functions might be safe to send despite having collatable input types, so taking that approach would greatly restrict our ability to send function calls at all. The bigger picture here though is that we're already relying on the user to make sure that remote tables have column data types matching the local definition, so why can't we say that they've got to make sure collations match too? So I think this is largely a documentation issue and we don't need any automated enforcement mechanism, or at least it's silly to try to enforce this when we're not enforcing column type matching (yet?). What might make sense is to try to determine whether a WHERE clause uses any collations different from those of the contained foreign-column Vars, and send it over only if not. That would prevent us from sending clauses that explicitly use collations that might not exist on the remote server. I didn't try to code this though. Another thing that I find fairly suspicious in this connection is that deparse.c isn't bothering to print collations attached to Const nodes. That may be a good idea to avoid needing the assumption that the remote server uses the same collation names we do, but if we're going to do it like this, those Const collations need to be considered when deciding if the expression is safe to send at all. A more general idea that follows on from that is that if we're relying on the user to be sure the semantics are the same, maybe we don't need to be quite so tight about what we'll send over. In particular, if the user has declared a foreign-table column of a non-built-in type, the current code will never send any constraints for that column at all, which seems overly conservative if you believe he's matched the type correctly. I'm not sure exactly how to act on that thought, but I think there's room for improvement there. A related issue is that as coded, is_builtin() is pretty flaky, because what's built-in on our server might not exist at all on the remote side, if it's a different major Postgres version. So what we've got is that the code is overly conservative about what it will send and yet still perfectly capable of sending remote queries that will fail outright, which is not a happy combination. I have no very good idea how to deal with that though. Another thing I was wondering about, but did not change, is that if we're having the remote transaction inherit the local transaction's isolation level, shouldn't it inherit the READ ONLY property as well? regards, tom lane
pgsql-hackers by date: