Thread: [Proposal] Arbitrary queries in postgres_fdw

[Proposal] Arbitrary queries in postgres_fdw

From
rtorre@carto.com
Date:
Dear all,

We stumbled upon a few cases in which retrieving information from the
foreign server may turn pretty useful before creating any foreign
table, especially info related to the catalog. E.g: a list of schemas
or tables the user has access to.

I thought of using dblink for it, but that requires duplication of
server and user mapping details and it adds its own management of
connections.

Then I thought a better approach may be a mix of both: a function to
issue arbitrary queries to the foreign server reusing all the details
encapsulated in the server and user mapping. It would use the same
pool of connections.

E.g:

CREATE FUNCTION postgres_fdw_query(server name, sql text)
RETURNS SETOF record

SELECT * FROM postgres_fdw_query('foreign_server', $$SELECT table_name, table_type
    FROM information_schema.tables
    WHERE table_schema = 'public'
    ORDER BY table_name$$
) AS schemas(table_name text, table_type text);

Find attached a patch with a working PoC (with some code from
dblink). It is not meant to be perfect yet.

Is this something you may be interested in having as part of
postgres_fdw? Thoughts?

Thanks
-Rafa de la Torre
Attachment

Re: [Proposal] Arbitrary queries in postgres_fdw

From
Tom Lane
Date:
rtorre@carto.com writes:
> We stumbled upon a few cases in which retrieving information from the
> foreign server may turn pretty useful before creating any foreign
> table, especially info related to the catalog. E.g: a list of schemas
> or tables the user has access to.

> I thought of using dblink for it, but that requires duplication of
> server and user mapping details and it adds its own management of
> connections.

> Then I thought a better approach may be a mix of both: a function to
> issue arbitrary queries to the foreign server reusing all the details
> encapsulated in the server and user mapping. It would use the same
> pool of connections.

dblink can already reference a postgres_fdw "server" for connection
details, so I think this problem is already solved from the usability
end of things.  And allowing arbitrary queries to go over a postgres_fdw
connection would be absolutely disastrous from a debuggability and
maintainability standpoint, because they might change the remote
session's state in ways that postgres_fdw isn't prepared to handle.
(In a dblink connection, the remote session's state is the user's
responsibility to manage, but this isn't the case for postgres_fdw.)
So I think this proposal has to be firmly rejected.

            regards, tom lane



Re: [Proposal] Arbitrary queries in postgres_fdw

From
David Fetter
Date:
On Fri, Oct 25, 2019 at 05:17:18PM +0200, rtorre@carto.com wrote:
> Dear all,
> 
> We stumbled upon a few cases in which retrieving information from the
> foreign server may turn pretty useful before creating any foreign
> table, especially info related to the catalog. E.g: a list of schemas
> or tables the user has access to.
> 
> I thought of using dblink for it, but that requires duplication of
> server and user mapping details and it adds its own management of
> connections.
> 
> Then I thought a better approach may be a mix of both: a function to
> issue arbitrary queries to the foreign server reusing all the details
> encapsulated in the server and user mapping. It would use the same
> pool of connections.

There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
does something similar to this.  Might it be possible to incorporate
it into the previous patch that implemented that feature?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [Proposal] Arbitrary queries in postgres_fdw

From
rtorre@carto.com
Date:
On Sun, Oct 27, 2019 at 7:07 PM David Fetter <david@fetter.org> wrote:
>
> There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> does something similar to this.  Might it be possible to incorporate
> it into the previous patch that implemented that feature?

Thanks for the idea, David. I'll investigate it and hopefully
come up with a more standard proposal.

Best regards
-Rafa


Re: [Proposal] Arbitrary queries in postgres_fdw

From
Robert Haas
Date:
On Fri, Oct 25, 2019 at 12:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> end of things.  And allowing arbitrary queries to go over a postgres_fdw
> connection would be absolutely disastrous from a debuggability and
> maintainability standpoint, because they might change the remote
> session's state in ways that postgres_fdw isn't prepared to handle.
> (In a dblink connection, the remote session's state is the user's
> responsibility to manage, but this isn't the case for postgres_fdw.)
> So I think this proposal has to be firmly rejected.

I think the reduction in debuggability and maintainability has to be
balanced against a possible significant gain in usability.  I mean,
you could document that if the values of certain GUCs are changed, or
if you create and drop prepared statements with certain names, it
might cause queries in the same session issued through the regular
foreign table API to produce wrong answers. That would still leave an
enormous number of queries that users could issue with absolutely no
problems. I really don't see a bona fide maintainability problem here.
When someone produces a reproducible test case showing that they did
one of the things we told them not to do, then we'll tell them to read
the fine manual and move on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [Proposal] Arbitrary queries in postgres_fdw

From
rtorre@carto.com
Date:
On Sun, Oct 27, 2019 at 7:07 PM David Fetter <david@fetter.org> wrote:
> There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> does something similar to this.  Might it be possible to incorporate
> it into the previous patch that implemented that feature?

Supporting CREATE ROUTINE MAPPING goes a level beyond
postgres_fdw. It'd require adding new DDL syntax elements to the
parser, catalog and callbacks for the FDW's to support them.

For the record, there's a very interesting thread on this topic (that
you participated in):
https://www.postgresql.org/message-id/flat/CADkLM%3DdK0dmkzLhaLPpnjN2wBF5GRpvzOr%3DeW0EWdCnG-OHnpQ%40mail.gmail.com

I know they have different semantics and may turn more limiting, but
for certain use cases, the `extensions` parameter of postgres_fdw may
come in handy (shipping function calls to the foreign end from
extensions present in both local and foreign).

For my use case, which is retrieving catalog info before any CREATE
FOREIGN TABLE, CREATE ROUTINE MAPPING is not really a good fit.

Thank you for pointing out anyway.
-Rafa

Re: [Proposal] Arbitrary queries in postgres_fdw

From
rtorre@carto.com
Date:
> On Fri, Oct 25, 2019 at 12:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > end of things.  And allowing arbitrary queries to go over a postgres_fdw
> > connection would be absolutely disastrous from a debuggability and
> > maintainability standpoint, because they might change the remote
> > session's state in ways that postgres_fdw isn't prepared to handle.
> > (In a dblink connection, the remote session's state is the user's
> > responsibility to manage, but this isn't the case for postgres_fdw.)
> > So I think this proposal has to be firmly rejected.

On Mon, Oct 28, 2019 at 1:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I think the reduction in debuggability and maintainability has to be
> balanced against a possible significant gain in usability.  I mean,
> you could document that if the values of certain GUCs are changed, or
> if you create and drop prepared statements with certain names, it
> might cause queries in the same session issued through the regular
> foreign table API to produce wrong answers. That would still leave an
> enormous number of queries that users could issue with absolutely no
> problems.

I understand both points, the alternatives and the tradeoffs.

My motivations not use dblink are twofold: to purposefully reuse the
connection pool in postgres_fdw, and to avoid installing another
extension. I cannot speak to whether this can be advantageous to
others to accept the tradeoffs.

If you are still interested, I'm willing to listen to the feedback and
continue improving the patch. Otherwise we can settle it here and (of
course!) I won't take any offense because of that.

Find attached v2 of the patch with the following changes:
- added support for commands, as it failed upon PGRES_COMMAND_OK (with
  tests with prepared statements)
- documentation for the new function, with the mentioned caveats
- removed the test with the `SELECT current_user`, because it produced
  different results depending on the execution environment.

Regards
-Rafa
Attachment

Re: [Proposal] Arbitrary queries in postgres_fdw

From
David Fetter
Date:
On Tue, Nov 05, 2019 at 11:09:34AM +0100, rtorre@carto.com wrote:
> On Sun, Oct 27, 2019 at 7:07 PM David Fetter <david@fetter.org> wrote:
> > There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> > does something similar to this.  Might it be possible to incorporate
> > it into the previous patch that implemented that feature?
> 
> Supporting CREATE ROUTINE MAPPING goes a level beyond
> postgres_fdw. It'd require adding new DDL syntax elements to the
> parser, catalog and callbacks for the FDW's to support them.
> 
> For the record, there's a very interesting thread on this topic (that
> you participated in):
> https://www.postgresql.org/message-id/flat/CADkLM%3DdK0dmkzLhaLPpnjN2wBF5GRpvzOr%3DeW0EWdCnG-OHnpQ%40mail.gmail.com
> 
> I know they have different semantics and may turn more limiting, but
> for certain use cases, the `extensions` parameter of postgres_fdw may
> come in handy (shipping function calls to the foreign end from
> extensions present in both local and foreign).
> 
> For my use case, which is retrieving catalog info before any CREATE
> FOREIGN TABLE,

Could you use IMPORT FOREIGN SCHEMA for that? I seem to recall that
I've managed to import information_schema successfully.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [Proposal] Arbitrary queries in postgres_fdw

From
rtorre@carto.com
Date:
On Tue, Nov 5, 2019 at 7:41 PM David Fetter <david@fetter.org> wrote:
> Could you use IMPORT FOREIGN SCHEMA for that? I seem to recall that
> I've managed to import information_schema successfully.

Yes, I tried it and I can import and operate on the
information_schema, which actually covers part of my needs. It does so
at the expense of polluting the catalog with foreign tables, but I can
live with it. Thanks for pointing out.

There are other cases that can be covered with either this proposal or
CREATE ROUTINE MAPPING, but not with the current state of things (as
far as I know). E.g: calling version() or postgis_version() on the
foreign end.

It's largely a matter of convenience vs development effort. That said,
I understand this may not make the design quality cut.

Regards
-Rafa