Re: Option to dump foreign data in pg_dump - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Option to dump foreign data in pg_dump |
Date | |
Msg-id | 8001.1573759651@sss.pgh.pa.us |
In response to | Re: Option to dump foreign data in pg_dump (Alvaro Herrera <alvherre@2ndquadrant.com>) |
List | pgsql-hackers |
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Nov-12, Luis Carril wrote:
>> But, not all foreign tables are necessarily in a remote server like
>> the ones referenced by the postgres_fdw.
>> In FDWs like swarm64da, cstore, citus or timescaledb, the foreign
>> tables are part of your database, and one could expect that a dump of
>> the database includes data from these FDWs.

> BTW these are not FDWs in the "foreign" sense at all; they're just
> abusing the FDW system in order to be able to store data in some
> different way.  The right thing to do IMO is to port these systems to be
> users of the new storage abstraction (table AM).  If we do that, what
> value is there to the feature being proposed here?

That is a pretty valid point.  I'm not sure however that there would
be *no* use-cases for the proposed option if all of those FDWs were
converted to table AMs.  Also, even if the authors of those systems
are all hard at work on such a conversion, it'd probably be years
before the FDW implementations disappear from the wild.

Having said that, I'm ending up -0.5 or so on the patch as it stands,
mainly because it seems like it is bringing way more maintenance
burden than it's realistically worth.  I'm particularly unhappy about
the proposed regression test additions --- the cycles added to
check-world, and the maintenance effort that's inevitably going to be
needed for all that code, seem unwarranted for something that's at
best a very niche use-case.  And, despite the bulk of the test
additions, they're in no sense offering an end-to-end test, because
that would require successfully reloading the data as well.

That objection could be addressed, perhaps, by scaling down the tests
to just have a goal of exercising the new pg_dump option-handling
code, and not to attempt to do meaningful data extraction from a
foreign table.  You could do that with an entirely dummy foreign data
wrapper and server (cf. sql/foreign_data.sql).  I'm imagining perhaps
create two dummy servers, of which only one has a table, and we ask to
dump data from the other one.  This would cover parsing and validation
of the --include-foreign-data option, and make sure that we don't dump
from servers we're not supposed to.  It doesn't actually dump any
data, but that part is a completely trivial aspect of the patch,
really, and almost all of the code relevant to that does get tested
already.

In the department of minor nitpicks ... why bother with joining to
pg_foreign_server in the query that retrieves a foreign table's server
OID?  ft.ftserver is already the answer you seek.  Also, I think it'd
be wise from a performance standpoint to skip doing that query
altogether in the normal case where --include-foreign-data hasn't been
requested.

			regards, tom lane
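
For readers following the thread, the two suggestions above (a dummy-wrapper test setup and dropping the join to pg_foreign_server) might look roughly like the sketch below. The object names dummy, s1, s2 and ft1 are hypothetical and do not come from the patch; the script assumes a superuser session, since creating a foreign data wrapper requires one.

```sql
-- Sketch only: all object names here are hypothetical, not taken from the patch.
-- A wrapper with no handler can be created but its tables cannot be queried,
-- which is enough to exercise option handling without extracting any data.
CREATE FOREIGN DATA WRAPPER dummy;
CREATE SERVER s1 FOREIGN DATA WRAPPER dummy;
CREATE SERVER s2 FOREIGN DATA WRAPPER dummy;

-- Only s1 has a table; s2 deliberately has none.
CREATE FOREIGN TABLE ft1 (a int) SERVER s1;

-- The server OID is stored directly in pg_foreign_table.ftserver,
-- so a single-catalog lookup is sufficient.
SELECT ft.ftserver
FROM pg_catalog.pg_foreign_table ft
WHERE ft.ftrelid = 'ft1'::pg_catalog.regclass;

-- The joined form returns the same value: the join condition itself
-- forces fs.oid to equal ft.ftserver.
SELECT fs.oid
FROM pg_catalog.pg_foreign_table ft
JOIN pg_catalog.pg_foreign_server fs ON fs.oid = ft.ftserver
WHERE ft.ftrelid = 'ft1'::pg_catalog.regclass;
```

A test in the spirit described would then run pg_dump with --include-foreign-data naming s2 and check that no data for ft1 is emitted; the exact option syntax depends on the patch version under review.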