Thread: [HACKERS] postgres_fdw IMPORT SCHEMA and partitioned tables
Greetings, While reviewing Amit Langote's patch to handle partitioned tables properly in various contrib modules (mostly by throwing an error since things like pageinspect aren't going to work on the empty 'parent' table), I went looking through contrib for other modules that do something with relkind and noticed that postgres_fdw's IMPORT SCHEMA would pull in the child tables (relkind = 'r') but would ignore the parent table (relkind = 'P', or soon to be 'p', I guess). I tend to view this as an issue which should be added to the open items list and resolved before PG10 (though perhaps it could be done after feature freeze), but I could see an argument that it should be just a documented limitation of postgres_fdw and that adding such support would be a new feature. In any case, this seems like an issue that should be addressed one way or the other, so I'll add it to the open items list. I'm not planning to work on fixing it myself, but if someone proposes a patch which looks reasonable, I'll try to find time for it. Thanks! Stephen
On Thu, Mar 9, 2017 at 11:15 PM, Stephen Frost <sfrost@snowman.net> wrote: > While reviewing Amit Langote's patch to handle partitioned tables > properly in various contrib modules (mostly by throwing an error since > things like pageinspect aren't going to work on the empty 'parent' > table), I went looking through contrib for other modules that do > something with relkind and noticed that postgres_fdw's IMPORT SCHEMA > would pull in the child tables (relkind = 'r') but would ignore the > parent table (relkind = 'P', or soon to be 'p', I guess). It is not as straight-forward as it seems. A foreign table can be defined as a child (use of PARTITION OF), but not as a parent (use PARTITION BY), and IMPORT SCHEMA has to issue queries to create foreign tables. It seems to me that the correct fix here is to ignore child tables that are part of a partition, and just include the parent in what is imported so as when querying the parent through postgres_fdw all the child partitions are considered automatically. Thoughts? Of course this should be documented. > I tend to view this as an issue which should be added to the open items > list and resolved before PG10 (though perhaps it could be done after > feature freeze), but I could see an argument that it should be just a > documented limitation of postgres_fdw and that adding such support would > be a new feature. Agreed. I think that this is a bug, because any database having one partitioning set of tables would fail to import automatically except by excluding some tables, and that's annoying for the user. > In any case, this seems like an issue that should be addressed one way > or the other, so I'll add it to the open items list. I'm not planning > to work on fixing it myself, but if someone proposes a patch which looks > reasonable, I'll try to find time for it. Well, I worked on IMPORT SCHEMA, so I'm fine to do something. After the CF is done though. -- Michael
On 2017/03/10 10:26, Michael Paquier wrote: > On Thu, Mar 9, 2017 at 11:15 PM, Stephen Frost <sfrost@snowman.net> wrote: >> While reviewing Amit Langote's patch to handle partitioned tables >> properly in various contrib modules (mostly by throwing an error since >> things like pageinspect aren't going to work on the empty 'parent' >> table), I went looking through contrib for other modules that do >> something with relkind and noticed that postgres_fdw's IMPORT SCHEMA >> would pull in the child tables (relkind = 'r') but would ignore the >> parent table (relkind = 'P', or soon to be 'p', I guess). > > It is not as straight-forward as it seems. A foreign table can be > defined as a child (use of PARTITION OF), but not as a parent (use > PARTITION BY), and IMPORT SCHEMA has to issue queries to create > foreign tables. It seems to me that the correct fix here is to ignore > child tables that are part of a partition, and just include the parent > in what is imported so as when querying the parent through > postgres_fdw all the child partitions are considered automatically. > Thoughts? I think that makes sense. The query in postgresImportForeignSchema() that fetches the information about remote tables should be fixed to include relkind = 'P' tables (partitioned tables) but exclude relispartition = true (partitions). Something like below: - "WHERE c.relkind IN ('r', 'v', 'f', 'm') " + "WHERE c.relkind IN ('r', 'v', 'f', 'm', 'P') " + " AND NOT c.relispartition " It means we don't import tables that are supposed to be partitions of some table. If we allow importing the latter, we get access to those partitions anyway. I would like to hear more opinions of course. > Of course this should be documented. +1 >> I tend to view this as an issue which should be added to the open items >> list and resolved before PG10 (though perhaps it could be done after >> feature freeze), but I could see an argument that it should be just a >> documented limitation of postgres_fdw and that adding such support would >> be a new feature. > > Agreed. I think that this is a bug, because any database having one > partitioning set of tables would fail to import automatically except > by excluding some tables, and that's annoying for the user. Agreed too. >> In any case, this seems like an issue that should be addressed one way >> or the other, so I'll add it to the open items list. I'm not planning >> to work on fixing it myself, but if someone proposes a patch which looks >> reasonable, I'll try to find time for it. > > Well, I worked on IMPORT SCHEMA, so I'm fine to do something. After > the CF is done though. Thanks. Regards, Amit
On Thu, Mar 9, 2017 at 8:47 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> It is not as straight-forward as it seems. A foreign table can be >> defined as a child (use of PARTITION OF), but not as a parent (use >> PARTITION BY), and IMPORT SCHEMA has to issue queries to create >> foreign tables. It seems to me that the correct fix here is to ignore >> child tables that are part of a partition, and just include the parent >> in what is imported so as when querying the parent through >> postgres_fdw all the child partitions are considered automatically. >> Thoughts? > > I think that makes sense. The query in postgresImportForeignSchema() that > fetches the information about remote tables should be fixed to include > relkind = 'P' tables (partitioned tables) but exclude relispartition = > true (partitions). Something like below: > > - "WHERE c.relkind IN ('r', 'v', 'f', 'm') " > + "WHERE c.relkind IN ('r', 'v', 'f', 'm', 'P') " > + " AND NOT c.relispartition " > > It means we don't import tables that are supposed to be partitions of some > table. If we allow importing the latter, we get access to those > partitions anyway. > > I would like to hear more opinions of course. For the most part, I'm not very exercised about this either way. I think that the above definition seems reasonably likely to be a useful one, but I certainly wouldn't try to insist on it in the face of opposition; I don't think it's 100% clear what users will want here. However, if we're going to do something about this, I think it should be done soon. Otherwise, I'm going to advocate for reclassifying this issue from "open item" to "possible area for future development". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 29, 2017 at 12:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Mar 9, 2017 at 8:47 PM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> It is not as straight-forward as it seems. A foreign table can be >>> defined as a child (use of PARTITION OF), but not as a parent (use >>> PARTITION BY), and IMPORT SCHEMA has to issue queries to create >>> foreign tables. It seems to me that the correct fix here is to ignore >>> child tables that are part of a partition, and just include the parent >>> in what is imported so as when querying the parent through >>> postgres_fdw all the child partitions are considered automatically. >>> Thoughts? >> >> I think that makes sense. The query in postgresImportForeignSchema() that >> fetches the information about remote tables should be fixed to include >> relkind = 'P' tables (partitioned tables) but exclude relispartition = >> true (partitions). Something like below: >> >> - "WHERE c.relkind IN ('r', 'v', 'f', 'm') " >> + "WHERE c.relkind IN ('r', 'v', 'f', 'm', 'P') " >> + " AND NOT c.relispartition " >> >> It means we don't import tables that are supposed to be partitions of some >> table. If we allow importing the latter, we get access to those >> partitions anyway. >> >> I would like to hear more opinions of course. > > For the most part, I'm not very exercised about this either way. I > think that the above definition seems reasonably likely to be a useful > one, but I certainly wouldn't try to insist on it in the face of > opposition; I don't think it's 100% clear what users will want here. Users like things that are friendly, and we are most likely going to piss them off when using postgres_fdw if they need to list manually each parent table from the IMPORT FOREIGN SCHEMA command. > However, if we're going to do something about this, I think it should > be done soon. Otherwise, I'm going to advocate for reclassifying this > issue from "open item" to "possible area for future development". I was just waiting for the end of the CF before sending in a patch, allocating now some time to look at some patches pending for reviews. -- Michael
On Wed, Mar 29, 2017 at 12:30 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > Users like things that are friendly, and we are most likely going to > piss them off when using postgres_fdw if they need to list manually > each parent table from the IMPORT FOREIGN SCHEMA command. > >> However, if we're going to do something about this, I think it should >> be done soon. Otherwise, I'm going to advocate for reclassifying this >> issue from "open item" to "possible area for future development". > > I was just waiting for the end of the CF before sending in a patch, > allocating now some time to look at some patches pending for reviews. And here is the promised patch to address this open item. -- Michael
Attachment
On 2017/03/31 13:23, Michael Paquier wrote: > On Wed, Mar 29, 2017 at 12:30 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> Users like things that are friendly, and we are most likely going to >> piss them off when using postgres_fdw if they need to list manually >> each parent table from the IMPORT FOREIGN SCHEMA command. >> >>> However, if we're going to do something about this, I think it should >>> be done soon. Otherwise, I'm going to advocate for reclassifying this >>> issue from "open item" to "possible area for future development". >> >> I was just waiting for the end of the CF before sending in a patch, >> allocating now some time to look at some patches pending for reviews. > > And here is the promised patch to address this open item. Looks good to me, except maybe: + + <para> + For partitioned tables, partitions are automatically excluded from the + schema data imported. Only the definition of partitioned tables is included + to give access to the full data set of all partitions present remotely. + </para> + Only the definitions of "root" partitioned tables, because when using multi-level partitioning, there would be partitioned tables that won't be included (because, relispartition=true). If you agree, then this code comment too could use the same terminology: + * Ignore table data for partitions and only include the parent + * definitions to allow access to the complete remote data set + * locally in the schema imported. + * Thanks, Amit
On Fri, Mar 31, 2017 at 1:37 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > + <para> > + For partitioned tables, partitions are automatically excluded from the > + schema data imported. Only the definition of partitioned tables is > included > + to give access to the full data set of all partitions present remotely. > + </para> > + > > Only the definitions of "root" partitioned tables, because when using > multi-level partitioning, there would be partitioned tables that won't be > included (because, relispartition=true). > > If you agree, then this code comment too could use the same terminology: > > + * Ignore table data for partitions and only include the parent > + * definitions to allow access to the complete remote data set > + * locally in the schema imported. > + * That makes sense. Updated this way as I have my hands on it now. -- Michael
Attachment
On 2017/03/31 13:46, Michael Paquier wrote: > On Fri, Mar 31, 2017 at 1:37 PM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> + <para> >> + For partitioned tables, partitions are automatically excluded from the >> + schema data imported. Only the definition of partitioned tables is >> included >> + to give access to the full data set of all partitions present remotely. >> + </para> >> + >> >> Only the definitions of "root" partitioned tables, because when using >> multi-level partitioning, there would be partitioned tables that won't be >> included (because, relispartition=true). >> >> If you agree, then this code comment too could use the same terminology: >> >> + * Ignore table data for partitions and only include the parent >> + * definitions to allow access to the complete remote data set >> + * locally in the schema imported. >> + * > > That makes sense. Updated this way as I have my hands on it now. Thanks, no more comments from my side. Regards, Amit
On Fri, Mar 31, 2017 at 12:51 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Thanks, no more comments from my side. Committed after rewording the documentation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-03-31 15:25:19 -0400, Robert Haas wrote: > On Fri, Mar 31, 2017 at 12:51 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > > Thanks, no more comments from my side. > > Committed after rewording the documentation. Hm. Wonder if something like that shouldn't be backpatched - because otherwise using postgres_fdw from an old server against a newer one will do weird stuff. I don't know what kind of policy we've committed to with postgresImportForeignSchema... Greetings, Andres Freund
On Fri, Mar 31, 2017 at 3:31 PM, Andres Freund <andres@anarazel.de> wrote: >> Committed after rewording the documentation. > > Hm. Wonder if something like that shouldn't be backpatched - because > otherwise using postgres_fdw from an old server against a newer one will > do weird stuff. I don't know what kind of policy we've committed to > with postgresImportForeignSchema... I don't think I'd like to promise that postgres_fdw will always be forward-compatible. Backward-compatibility is hard enough already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Mar 31, 2017 at 3:31 PM, Andres Freund <andres@anarazel.de> wrote: >> Hm. Wonder if something like that shouldn't be backpatched - because >> otherwise using postgres_fdw from an old server against a newer one will >> do weird stuff. I don't know what kind of policy we've committed to >> with postgresImportForeignSchema... > I don't think I'd like to promise that postgres_fdw will always be > forward-compatible. Backward-compatibility is hard enough already. Unless I'm missing something, the behavior will be that an older version will simply ignore remote partitioned tables (they will not pass the relkind filter in the query). Seems pretty fail-soft, so I think it's fine. regards, tom lane
On Sat, Apr 1, 2017 at 4:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Mar 31, 2017 at 3:31 PM, Andres Freund <andres@anarazel.de> wrote: >>> Hm. Wonder if something like that shouldn't be backpatched - because >>> otherwise using postgres_fdw from an old server against a newer one will >>> do weird stuff. I don't know what kind of policy we've committed to >>> with postgresImportForeignSchema... > >> I don't think I'd like to promise that postgres_fdw will always be >> forward-compatible. Backward-compatibility is hard enough already. Thanks for the commit. > Unless I'm missing something, the behavior will be that an older > version will simply ignore remote partitioned tables (they will not > pass the relkind filter in the query). Seems pretty fail-soft, > so I think it's fine. Yeah, I would suggest to revisit that if we get actual complaints, but I would not push much in favor of it. It's not an area where nothing can be done to improve the user experience. -- Michael