Re: FDW for PostgreSQL - Mailing list pgsql-hackers

From Albe Laurenz
Subject Re: FDW for PostgreSQL
Date
Msg-id A737B7A37273E048B164557ADEF4A58B057B6672@ntex2010a.host.magwien.gv.at
Whole thread Raw
In response to Re: FDW for PostgreSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
>> I think that the question what to push down is a different question
>> from checking column data types, because there we can rely on the
>> type input functions to reject bad values.
>
> Unfortunately, that's a very myopic view of the situation: there
> are many cases where datatype semantics can vary without the I/O
> functions having any idea that anything is wrong.  To take one example,
> what if the underlying column is type citext but the user wrote "text"
> in the foreign table definition?  postgres_fdw would see no reason not
> to push "col = 'foo'" across, but that clause would behave quite
> differently on the remote side.  Another example is that float8 and
> numeric will have different opinions about the truth of
> "1.000000000000000000001 = 1.000000000000000000002", so you're going
> to get into trouble if you declare an FT column as one when the
> underlying column is the other, even though the I/O functions for these
> types will happily take each other's output.

You are right.

> So I think (and have written in the committed docs) that users had
> better be careful to ensure that FT columns are declared as the same
> type as the underlying columns, even though we can't readily enforce
> that, at least not for non-builtin types.
>
> And there's really no difference between that situation and the
> collation situation, though I agree with you that the latter is a lot
> more likely to bite careless users.

That's what I am worried about.

>> What about the following design principle:
>> Only push down conditions which are sure to return the correct
>> result, provided that the PostgreSQL system objects have not
>> been tampered with.
>
> That's a nice, simple, and useless-in-practice design principle,
> because it will toss out many situations that users will want to work;
> situations that in fact *would* work as long as the users adhere to
> safe coding conventions.  I do not believe that when people ask "why
> does performance of LIKE suck on my foreign table", they will accept an
> answer of "we don't allow that to be pushed across because we think
> you're too stupid to make the remote collation match".

I think that it will be pretty hard to get both reliability
and performance to an optimum.

I'd rather hear complaints about bad performance than
about bad results, but that's just my opinion.

>> Would it be reasonable to push down operators and functions
>> only if
>> a) they are in the pg_catalog schema
>> b) they have been around for a couple of releases
>> c) they are not collation sensitive?
>
> We don't track (b) nor (c), so this suggestion is entirely
> unimplementable in the 9.3 time frame; nor is it provably safe,
> unless by (b) you mean "must have been around since 7.4 or so".

My idea was to have a (hand picked) list of functions and operators
that are considered safe, but I understand that that is rather
ugly.  There could of course be no generic method because,
as you say, b) and c) are not tracked.

> On a longer time horizon this might be doable, but it would be a lot
> of work to implement a solution that most people will find far too
> restrictive.  I'd rather see the long-term focus be on doing
> type/collation matching, so that we can expand not restrict the set
> of things we can push across.

I like that vision, and of course my above idea does not
go well with it.

Yours,
Laurenz Albe



pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: Materialized views WIP patch
Next
From: Tom Lane
Date:
Subject: Re: Materialized views WIP patch