Thread: [Proposal] Arbitrary queries in postgres_fdw
Dear all,
We stumbled upon a few cases in which retrieving information from the
foreign server may turn pretty useful before creating any foreign
table, especially info related to the catalog. E.g: a list of schemas
or tables the user has access to.
I thought of using dblink for it, but that requires duplication of
server and user mapping details and it adds its own management of
connections.
Then I thought a better approach may be a mix of both: a function to
issue arbitrary queries to the foreign server reusing all the details
encapsulated in the server and user mapping. It would use the same
pool of connections.
E.g:
CREATE FUNCTION postgres_fdw_query(server name, sql text)
RETURNS SETOF record
SELECT * FROM postgres_fdw_query('foreign_server', $$SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name$$
) AS schemas(table_name text, table_type text);
Is this something you may be interested in having as part of
postgres_fdw? Thoughts?
Thanks
-Rafa de la Torre
We stumbled upon a few cases in which retrieving information from the
foreign server may turn pretty useful before creating any foreign
table, especially info related to the catalog. E.g: a list of schemas
or tables the user has access to.
I thought of using dblink for it, but that requires duplication of
server and user mapping details and it adds its own management of
connections.
Then I thought a better approach may be a mix of both: a function to
issue arbitrary queries to the foreign server reusing all the details
encapsulated in the server and user mapping. It would use the same
pool of connections.
E.g:
CREATE FUNCTION postgres_fdw_query(server name, sql text)
RETURNS SETOF record
SELECT * FROM postgres_fdw_query('foreign_server', $$SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name$$
) AS schemas(table_name text, table_type text);
Find attached a patch with a working PoC (with some code from
dblink). It is not meant to be perfect yet.
dblink). It is not meant to be perfect yet.
Is this something you may be interested in having as part of
postgres_fdw? Thoughts?
Thanks
-Rafa de la Torre
Attachment
rtorre@carto.com writes: > We stumbled upon a few cases in which retrieving information from the > foreign server may turn pretty useful before creating any foreign > table, especially info related to the catalog. E.g: a list of schemas > or tables the user has access to. > I thought of using dblink for it, but that requires duplication of > server and user mapping details and it adds its own management of > connections. > Then I thought a better approach may be a mix of both: a function to > issue arbitrary queries to the foreign server reusing all the details > encapsulated in the server and user mapping. It would use the same > pool of connections. dblink can already reference a postgres_fdw "server" for connection details, so I think this problem is already solved from the usability end of things. And allowing arbitrary queries to go over a postgres_fdw connection would be absolutely disastrous from a debuggability and maintainability standpoint, because they might change the remote session's state in ways that postgres_fdw isn't prepared to handle. (In a dblink connection, the remote session's state is the user's responsibility to manage, but this isn't the case for postgres_fdw.) So I think this proposal has to be firmly rejected. regards, tom lane
On Fri, Oct 25, 2019 at 05:17:18PM +0200, rtorre@carto.com wrote: > Dear all, > > We stumbled upon a few cases in which retrieving information from the > foreign server may turn pretty useful before creating any foreign > table, especially info related to the catalog. E.g: a list of schemas > or tables the user has access to. > > I thought of using dblink for it, but that requires duplication of > server and user mapping details and it adds its own management of > connections. > > Then I thought a better approach may be a mix of both: a function to > issue arbitrary queries to the foreign server reusing all the details > encapsulated in the server and user mapping. It would use the same > pool of connections. There's a SQL MED standard feature for CREATE ROUTINE MAPPING that does something similar to this. Might it be possible to incorporate it into the previous patch that implemented that feature? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, Oct 27, 2019 at 7:07 PM David Fetter <david@fetter.org> wrote:
>
> There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> does something similar to this. Might it be possible to incorporate
> it into the previous patch that implemented that feature?
Thanks for the idea, David. I'll investigate it and hopefully
come up with a more standard proposal.
Best regards
-Rafa
>
> There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> does something similar to this. Might it be possible to incorporate
> it into the previous patch that implemented that feature?
Thanks for the idea, David. I'll investigate it and hopefully
come up with a more standard proposal.
Best regards
-Rafa
On Fri, Oct 25, 2019 at 12:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > end of things. And allowing arbitrary queries to go over a postgres_fdw > connection would be absolutely disastrous from a debuggability and > maintainability standpoint, because they might change the remote > session's state in ways that postgres_fdw isn't prepared to handle. > (In a dblink connection, the remote session's state is the user's > responsibility to manage, but this isn't the case for postgres_fdw.) > So I think this proposal has to be firmly rejected. I think the reduction in debuggability and maintainability has to be balanced against a possible significant gain in usability. I mean, you could document that if the values of certain GUCs are changed, or if you create and drop prepared statements with certain names, it might cause queries in the same session issued through the regular foreign table API to produce wrong answers. That would still leave an enormous number of queries that users could issue with absolutely no problems. I really don't see a bona fide maintainability problem here. When someone produces a reproducible test case showing that they did one of the things we told them not to do, then we'll tell them to read the fine manual and move on. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Oct 27, 2019 at 7:07 PM David Fetter <david@fetter.org> wrote:
> There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> does something similar to this. Might it be possible to incorporate
> it into the previous patch that implemented that feature?
Supporting CREATE ROUTINE MAPPING goes a level beyond
postgres_fdw. It'd require adding new DDL syntax elements to the
parser, catalog and callbacks for the FDW's to support them.
For the record, there's a very interesting thread on this topic (that
you participated in):
https://www.postgresql.org/message-id/flat/CADkLM%3DdK0dmkzLhaLPpnjN2wBF5GRpvzOr%3DeW0EWdCnG-OHnpQ%40mail.gmail.com
I know they have different semantics and may turn more limiting, but
for certain use cases, the `extensions` parameter of postgres_fdw may
come in handy (shipping function calls to the foreign end from
extensions present in both local and foreign).
For my use case, which is retrieving catalog info before any CREATE
FOREIGN TABLE, CREATE ROUTINE MAPPING is not really a good fit.
> There's a SQL MED standard feature for CREATE ROUTINE MAPPING that
> does something similar to this. Might it be possible to incorporate
> it into the previous patch that implemented that feature?
Supporting CREATE ROUTINE MAPPING goes a level beyond
postgres_fdw. It'd require adding new DDL syntax elements to the
parser, catalog and callbacks for the FDW's to support them.
For the record, there's a very interesting thread on this topic (that
you participated in):
https://www.postgresql.org/message-id/flat/CADkLM%3DdK0dmkzLhaLPpnjN2wBF5GRpvzOr%3DeW0EWdCnG-OHnpQ%40mail.gmail.com
I know they have different semantics and may turn more limiting, but
for certain use cases, the `extensions` parameter of postgres_fdw may
come in handy (shipping function calls to the foreign end from
extensions present in both local and foreign).
For my use case, which is retrieving catalog info before any CREATE
FOREIGN TABLE, CREATE ROUTINE MAPPING is not really a good fit.
Thank you for pointing out anyway.
-Rafa
> On Fri, Oct 25, 2019 at 12:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > end of things. And allowing arbitrary queries to go over a postgres_fdw
> > connection would be absolutely disastrous from a debuggability and
> > maintainability standpoint, because they might change the remote
> > session's state in ways that postgres_fdw isn't prepared to handle.
> > (In a dblink connection, the remote session's state is the user's
> > responsibility to manage, but this isn't the case for postgres_fdw.)
> > So I think this proposal has to be firmly rejected.
On Mon, Oct 28, 2019 at 1:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I think the reduction in debuggability and maintainability has to be
> balanced against a possible significant gain in usability. I mean,
> you could document that if the values of certain GUCs are changed, or
> if you create and drop prepared statements with certain names, it
> might cause queries in the same session issued through the regular
> foreign table API to produce wrong answers. That would still leave an
> enormous number of queries that users could issue with absolutely no
> problems.
I understand both points, the alternatives and the tradeoffs.
My motivations not use dblink are twofold: to purposefully reuse the
connection pool in postgres_fdw, and to avoid installing another
extension. I cannot speak to whether this can be advantageous to
others to accept the tradeoffs.
If you are still interested, I'm willing to listen to the feedback and
continue improving the patch. Otherwise we can settle it here and (of
course!) I won't take any offense because of that.
Find attached v2 of the patch with the following changes:
- added support for commands, as it failed upon PGRES_COMMAND_OK (with
tests with prepared statements)
- documentation for the new function, with the mentioned caveats
- removed the test with the `SELECT current_user`, because it produced
different results depending on the execution environment.
Regards
-Rafa
> > end of things. And allowing arbitrary queries to go over a postgres_fdw
> > connection would be absolutely disastrous from a debuggability and
> > maintainability standpoint, because they might change the remote
> > session's state in ways that postgres_fdw isn't prepared to handle.
> > (In a dblink connection, the remote session's state is the user's
> > responsibility to manage, but this isn't the case for postgres_fdw.)
> > So I think this proposal has to be firmly rejected.
On Mon, Oct 28, 2019 at 1:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I think the reduction in debuggability and maintainability has to be
> balanced against a possible significant gain in usability. I mean,
> you could document that if the values of certain GUCs are changed, or
> if you create and drop prepared statements with certain names, it
> might cause queries in the same session issued through the regular
> foreign table API to produce wrong answers. That would still leave an
> enormous number of queries that users could issue with absolutely no
> problems.
I understand both points, the alternatives and the tradeoffs.
My motivations not use dblink are twofold: to purposefully reuse the
connection pool in postgres_fdw, and to avoid installing another
extension. I cannot speak to whether this can be advantageous to
others to accept the tradeoffs.
If you are still interested, I'm willing to listen to the feedback and
continue improving the patch. Otherwise we can settle it here and (of
course!) I won't take any offense because of that.
Find attached v2 of the patch with the following changes:
- added support for commands, as it failed upon PGRES_COMMAND_OK (with
tests with prepared statements)
- documentation for the new function, with the mentioned caveats
- removed the test with the `SELECT current_user`, because it produced
different results depending on the execution environment.
Regards
-Rafa
Attachment
On Tue, Nov 05, 2019 at 11:09:34AM +0100, rtorre@carto.com wrote: > On Sun, Oct 27, 2019 at 7:07 PM David Fetter <david@fetter.org> wrote: > > There's a SQL MED standard feature for CREATE ROUTINE MAPPING that > > does something similar to this. Might it be possible to incorporate > > it into the previous patch that implemented that feature? > > Supporting CREATE ROUTINE MAPPING goes a level beyond > postgres_fdw. It'd require adding new DDL syntax elements to the > parser, catalog and callbacks for the FDW's to support them. > > For the record, there's a very interesting thread on this topic (that > you participated in): > https://www.postgresql.org/message-id/flat/CADkLM%3DdK0dmkzLhaLPpnjN2wBF5GRpvzOr%3DeW0EWdCnG-OHnpQ%40mail.gmail.com > > I know they have different semantics and may turn more limiting, but > for certain use cases, the `extensions` parameter of postgres_fdw may > come in handy (shipping function calls to the foreign end from > extensions present in both local and foreign). > > For my use case, which is retrieving catalog info before any CREATE > FOREIGN TABLE, Could you use IMPORT FOREIGN SCHEMA for that? I seem to recall that I've managed to import information_schema successfully. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Tue, Nov 5, 2019 at 7:41 PM David Fetter <david@fetter.org> wrote:
> Could you use IMPORT FOREIGN SCHEMA for that? I seem to recall that
> I've managed to import information_schema successfully.
Yes, I tried it and I can import and operate on the
information_schema, which actually covers part of my needs. It does so
at the expense of polluting the catalog with foreign tables, but I can
live with it. Thanks for pointing out.
There are other cases that can be covered with either this proposal or
CREATE ROUTINE MAPPING, but not with the current state of things (as
far as I know). E.g: calling version() or postgis_version() on the
foreign end.
It's largely a matter of convenience vs development effort. That said,
I understand this may not make the design quality cut.
Regards
-Rafa
> Could you use IMPORT FOREIGN SCHEMA for that? I seem to recall that
> I've managed to import information_schema successfully.
Yes, I tried it and I can import and operate on the
information_schema, which actually covers part of my needs. It does so
at the expense of polluting the catalog with foreign tables, but I can
live with it. Thanks for pointing out.
There are other cases that can be covered with either this proposal or
CREATE ROUTINE MAPPING, but not with the current state of things (as
far as I know). E.g: calling version() or postgis_version() on the
foreign end.
It's largely a matter of convenience vs development effort. That said,
I understand this may not make the design quality cut.
Regards
-Rafa