Thread: [HACKERS] Add support for tuple routing to foreign partitions
Hi, Here is a patch for $subject. This is based on Amit's original patch (patch '0007-Tuple-routing-for-partitioned-tables-15.patch' in [1]). Main changes are: * I like Amit's idea that for each partition that is a foreign table, the FDW performs query planning in the same way as in non-partition cases, but the changes to make_modifytable() in the original patch that creates a working-copy of Query to pass to PlanForeignModify() seem unacceptable. So, I modified the changes so that we create more-valid-looking copies of Query and ModifyTable with the foreign partition as target, as I said before. In relation to this, I also allowed expand_inherited_rtentry() to build an RTE and AppendRelInfo for each partition of a partitioned table that is an INSERT target, as mentioned by Amit in [1], by modifying transformInsertStmt() so that we set the inh flag for the target table's RTE to true when the target table is a partitioned table. The partition RTEs are not only referenced by FDWs, but used in selecting relation aliases for EXPLAIN (see below) as well as extracting plan dependencies in setref.c so that we force rebuilding of the plan on any change to partitions. (We do replanning on FDW table-option changes to foreign partitions, currently. See regression tests in the attached patch.) * The original patch added tuple routing to foreign partitions in COPY FROM, but I'm not sure the changes to BeginCopy() in the patch are the right way to go. For example, the patch passes a dummy empty Query to PlanForeignModify() to make FDWs perform query planning, but I think FDWs would be surprised by the Query. Currently, we don't support COPY to foreign tables, so ISTM that it's better to revisit this issue when adding support for COPY to foreign tables. So, I dropped the COPY part. * I modified explain.c so that EXPLAIN for an INSERT into a partitioned table displays partitions just below the ModifyTable node, and shows remote queries for foreign partitions in the same way as EXPLAIN for UPDATE/DELETE cases. Here is an example: postgres=# explain verbose insert into pt values (1), (2); QUERY PLAN ------------------------------------------------------------------- Insert on public.pt (cost=0.00..0.03 rows=2 width=4) Foreign Insert on public.fp1 Remote SQL: INSERT INTO public.t1(a) VALUES ($1) Foreign Insert on public.fp2 Remote SQL: INSERT INTO public.t2(a) VALUES ($1) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) Output: "*VALUES*".column1 (7 rows) Comments are welcome! Anyway, I'll add this to the next commitfest. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/aa874eaf-cd3b-0f75-c647-6c0ea823781e%40lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Fujita-san, On 2017/06/29 20:20, Etsuro Fujita wrote: > Hi, > > Here is a patch for $subject. This is based on Amit's original patch > (patch '0007-Tuple-routing-for-partitioned-tables-15.patch' in [1]). Thanks a lot for taking this up. > Main changes are: > > * I like Amit's idea that for each partition that is a foreign table, the > FDW performs query planning in the same way as in non-partition cases, but > the changes to make_modifytable() in the original patch that creates a > working-copy of Query to pass to PlanForeignModify() seem unacceptable. > So, I modified the changes so that we create more-valid-looking copies of > Query and ModifyTable with the foreign partition as target, as I said > before. This sounds good. > In relation to this, I also allowed expand_inherited_rtentry() to > build an RTE and AppendRelInfo for each partition of a partitioned table > that is an INSERT target, as mentioned by Amit in [1], by modifying > transformInsertStmt() so that we set the inh flag for the target table's > RTE to true when the target table is a partitioned table. About this part, Robert sounded a little unsure back then [1]; his words: "Doing more inheritance expansion early is probably not a good idea because it likely sucks for performance, and that's especially unfortunate if it's only needed for foreign tables." But... > The partition > RTEs are not only referenced by FDWs, but used in selecting relation > aliases for EXPLAIN (see below) as well as extracting plan dependencies in > setref.c so that we force rebuilding of the plan on any change to > partitions. (We do replanning on FDW table-option changes to foreign > partitions, currently. See regression tests in the attached patch.) ...this part means we cannot really avoid locking at least the foreign partitions during the planning stage, which we currently don't, as we don't look at partitions at all before ExecSetupPartitionTupleRouting() is called by ExecInitModifyTable(). Since we are locking *all* the partitions anyway, it may be better to shift the cost of locking to the planner by doing the inheritance expansion even in the insert case (for partitioned tables) and avoid calling the lock manager again in the executor when setting up tuple-routing data structures (that is, in ExecSetupPartitionTupleRouting). An idea that Robert recently mentioned on the nearby "UPDATE partition key" thread [2] seems relevant in this regard. By that idea, expand_inherited_rtentry(), instead of reading in the partition OIDs in the old catalog order, will instead call RelationBuildPartitionDispatchInfo(), which locks the partitions and returns their OIDs in the canonical order. If we store the foreign partition RT indexes in that order in ModifyTable.partition_rels introduced by this patch, then the following code added by the patch could be made more efficient (as also explained in aforementioned Robert's email): + foreach(l, node->partition_rels) + { + Index rti = lfirst_int(l); + Oid relid = getrelid(rti, estate->es_range_table); + bool found; + int j; + + /* First, find the ResultRelInfo for the partition */ + found = false; + for (j = 0; j < mtstate->mt_num_partitions; j++) + { + resultRelInfo = partitions + j; + + if (RelationGetRelid(resultRelInfo->ri_RelationDesc) == relid) + { + Assert(mtstate->mt_partition_indexes[i] == 0); + mtstate->mt_partition_indexes[i] = j; + found = true; + break; + } + } + if (!found) + elog(ERROR, "failed to find ResultRelInfo for rangetable index %u", rti); > * The original patch added tuple routing to foreign partitions in COPY > FROM, but I'm not sure the changes to BeginCopy() in the patch are the > right way to go. For example, the patch passes a dummy empty Query to > PlanForeignModify() to make FDWs perform query planning, but I think FDWs > would be surprised by the Query. Currently, we don't support COPY to > foreign tables, so ISTM that it's better to revisit this issue when adding > support for COPY to foreign tables. So, I dropped the COPY part. I agree. Thanks, Amit [1] https://www.postgresql.org/message-id/CA%2BTgmoYvp6DD1jpwb9sNe08E7jSFEFky32TJU%2BsJ8OStHYW3nA%40mail.gmail.com [2] https://www.postgresql.org/message-id/CA%2BTgmoajC0J50%3D2FqnZLvB10roY%2B68HgFWhso%3DV_StkC6PWujQ%40mail.gmail.com
On 2017/07/05 9:13, Amit Langote wrote: > On 2017/06/29 20:20, Etsuro Fujita wrote: >> In relation to this, I also allowed expand_inherited_rtentry() to >> build an RTE and AppendRelInfo for each partition of a partitioned table >> that is an INSERT target, as mentioned by Amit in [1], by modifying >> transformInsertStmt() so that we set the inh flag for the target table's >> RTE to true when the target table is a partitioned table. > > About this part, Robert sounded a little unsure back then [1]; his words: > > "Doing more inheritance expansion early is probably not a good idea > because it likely sucks for performance, and that's especially unfortunate > if it's only needed for foreign tables." > > But... > >> The partition >> RTEs are not only referenced by FDWs, but used in selecting relation >> aliases for EXPLAIN (see below) as well as extracting plan dependencies in >> setref.c so that we force rebuilding of the plan on any change to >> partitions. (We do replanning on FDW table-option changes to foreign >> partitions, currently. See regression tests in the attached patch.) > > ...this part means we cannot really avoid locking at least the foreign > partitions during the planning stage, which we currently don't, as we > don't look at partitions at all before ExecSetupPartitionTupleRouting() is > called by ExecInitModifyTable(). Another case where we need inheritance expansion during the planning stage would probably be INSERT .. ON CONFLICT into a partitioned table, I guess. We don't support that yet, but if implementing that, I guess we would probably need to look at each partition and do something like infer_arbiter_indexes() for each partition during the planner stage. Not sure. > Since we are locking *all* the partitions anyway, it may be better to > shift the cost of locking to the planner by doing the inheritance > expansion even in the insert case (for partitioned tables) and avoid > calling the lock manager again in the executor when setting up > tuple-routing data structures (that is, in ExecSetupPartitionTupleRouting). We need the execution-time lock, anyway. See the comments from Robert in [3]. > An idea that Robert recently mentioned on the nearby "UPDATE partition > key" thread [2] seems relevant in this regard. By that idea, > expand_inherited_rtentry(), instead of reading in the partition OIDs in > the old catalog order, will instead call > RelationBuildPartitionDispatchInfo(), which locks the partitions and > returns their OIDs in the canonical order. If we store the foreign > partition RT indexes in that order in ModifyTable.partition_rels > introduced by this patch, then the following code added by the patch could > be made more efficient (as also explained in aforementioned Robert's email): > > + foreach(l, node->partition_rels) > + { > + Index rti = lfirst_int(l); > + Oid relid = getrelid(rti, estate->es_range_table); > + bool found; > + int j; > + > + /* First, find the ResultRelInfo for the partition */ > + found = false; > + for (j = 0; j < mtstate->mt_num_partitions; j++) > + { > + resultRelInfo = partitions + j; > + > + if (RelationGetRelid(resultRelInfo->ri_RelationDesc) == > relid) > + { > + Assert(mtstate->mt_partition_indexes[i] == 0); > + mtstate->mt_partition_indexes[i] = j; > + found = true; > + break; > + } > + } > + if (!found) > + elog(ERROR, "failed to find ResultRelInfo for rangetable > index %u", rti); Agreed. I really want to improve this based on that idea. Thank you for the comments! Best regards, Etsuro Fujita [3] https://www.postgresql.org/message-id/CA+TgmoYiwviCDRi3Zk+QuXj1r7uMu9T_kDNq+17PCWgzrbzw8A@mail.gmail.com
On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > So, I dropped the COPY part. Ouch. I think we should try to figure out how the COPY part will be handled before we commit to a design. I have to admit that I'm a little bit fuzzy about why foreign insert routing requires all of these changes. I think this patch would benefit from being accompanied by several paragraphs of explanation outlining the rationale for each part of the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/07/11 6:56, Robert Haas wrote: > On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> So, I dropped the COPY part. > > Ouch. I think we should try to figure out how the COPY part will be > handled before we commit to a design. I spent some time on this. To handle that, I'd like to propose doing something similar to \copy (frontend copy): submit a COPY query "COPY ... FROM STDIN" to the remote server and route data from a file to the remote server. For that, I'd like to add new FDW APIs called during CopyFrom that allow us to copy to foreign tables: * BeginForeignCopyIn: this would be called after creating a ResultRelInfo for the target table (or each leaf partition of the target partitioned table) if it's a foreign table, and perform any initialization needed before the remote copy can start. In the postgres_fdw case, I think this function would be a good place to send "COPY ... FROM STDIN" to the remote server. * ExecForeignCopyInOneRow: this would be called instead of heap_insert if the target is a foreign table, and route the tuple read from the file by NextCopyFrom to the remote server. In the postgres_fdw case, I think this function would convert the tuple to text format for portability, and then send the data to the remote server using PQputCopyData. * EndForeignCopyIn: this would be called at the bottom of CopyFrom, and release resources such as connections to the remote server. In the postgres_fdw case, this function would do PQputCopyEnd to terminate data transfer. I think that would be much more efficient than INSERTing tuples into the remote server one by one. What do you think about that? > I have to admit that I'm a little bit fuzzy about why foreign insert > routing requires all of these changes. I think this patch would > benefit from being accompanied by several paragraphs of explanation > outlining the rationale for each part of the patch. Will do. Best regards, Etsuro Fujita
On Thu, Aug 17, 2017 at 1:57 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > On 2017/07/11 6:56, Robert Haas wrote: >> >> On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita >> <fujita.etsuro@lab.ntt.co.jp> wrote: >>> >>> So, I dropped the COPY part. >> >> >> Ouch. I think we should try to figure out how the COPY part will be >> handled before we commit to a design. > > > I spent some time on this. To handle that, I'd like to propose doing > something similar to \copy (frontend copy): submit a COPY query "COPY ... > FROM STDIN" to the remote server and route data from a file to the remote > server. For that, I'd like to add new FDW APIs called during CopyFrom that > allow us to copy to foreign tables: The description seems to support only COPY to a foreign table from a file, but probably we need the support other way round as well. This looks like a feature (support copy to and from a foreign table) to be handled by itself. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On 2017/08/17 20:37, Ashutosh Bapat wrote: > On Thu, Aug 17, 2017 at 1:57 PM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> I spent some time on this. To handle that, I'd like to propose doing >> something similar to \copy (frontend copy): submit a COPY query "COPY ... >> FROM STDIN" to the remote server and route data from a file to the remote >> server. For that, I'd like to add new FDW APIs called during CopyFrom that >> allow us to copy to foreign tables: > > The description seems to support only COPY to a foreign table from a > file, but probably we need the support other way round as well. This > looks like a feature (support copy to and from a foreign table) to be > handled by itself. Agreed. I'll consider how to handle copy-from-a-foreign-table as well. Best regards, Etsuro Fujita
On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote: > On 2017/07/11 6:56, Robert Haas wrote: > >On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita > ><fujita.etsuro@lab.ntt.co.jp> wrote: > >>So, I dropped the COPY part. > > > >Ouch. I think we should try to figure out how the COPY part will be > >handled before we commit to a design. > > I spent some time on this. To handle that, I'd like to propose doing > something similar to \copy (frontend copy): submit a COPY query "COPY ... > FROM STDIN" to the remote server and route data from a file to the remote > server. For that, I'd like to add new FDW APIs called during CopyFrom that > allow us to copy to foreign tables: > > * BeginForeignCopyIn: this would be called after creating a ResultRelInfo > for the target table (or each leaf partition of the target partitioned > table) if it's a foreign table, and perform any initialization needed before > the remote copy can start. In the postgres_fdw case, I think this function > would be a good place to send "COPY ... FROM STDIN" to the remote server. > > * ExecForeignCopyInOneRow: this would be called instead of heap_insert if > the target is a foreign table, and route the tuple read from the file by > NextCopyFrom to the remote server. In the postgres_fdw case, I think this > function would convert the tuple to text format for portability, and then > send the data to the remote server using PQputCopyData. > > * EndForeignCopyIn: this would be called at the bottom of CopyFrom, and > release resources such as connections to the remote server. In the > postgres_fdw case, this function would do PQputCopyEnd to terminate data > transfer. These primitives look good. I know it seems unlikely at first blush, but do we know of bulk load APIs for non-PostgreSQL data stores that this would be unable to serve? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2017/08/17 23:48, David Fetter wrote: > On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote: >> On 2017/07/11 6:56, Robert Haas wrote: >>> On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita >>> <fujita.etsuro@lab.ntt.co.jp> wrote: >>>> So, I dropped the COPY part. >>> >>> Ouch. I think we should try to figure out how the COPY part will be >>> handled before we commit to a design. >> >> I spent some time on this. To handle that, I'd like to propose doing >> something similar to \copy (frontend copy): submit a COPY query "COPY ... >> FROM STDIN" to the remote server and route data from a file to the remote >> server. For that, I'd like to add new FDW APIs called during CopyFrom that >> allow us to copy to foreign tables: >> >> * BeginForeignCopyIn: this would be called after creating a ResultRelInfo >> for the target table (or each leaf partition of the target partitioned >> table) if it's a foreign table, and perform any initialization needed before >> the remote copy can start. In the postgres_fdw case, I think this function >> would be a good place to send "COPY ... FROM STDIN" to the remote server. >> >> * ExecForeignCopyInOneRow: this would be called instead of heap_insert if >> the target is a foreign table, and route the tuple read from the file by >> NextCopyFrom to the remote server. In the postgres_fdw case, I think this >> function would convert the tuple to text format for portability, and then >> send the data to the remote server using PQputCopyData. >> >> * EndForeignCopyIn: this would be called at the bottom of CopyFrom, and >> release resources such as connections to the remote server. In the >> postgres_fdw case, this function would do PQputCopyEnd to terminate data >> transfer. > > These primitives look good. I know it seems unlikely at first blush, > but do we know of bulk load APIs for non-PostgreSQL data stores that > this would be unable to serve? Maybe I'm missing something, but I think these would allow the FDW to do the remote copy the way it would like; in ExecForeignCopyInOneRow, for example, the FDW could buffer tuples in a buffer memory and transmit the buffered data to the remote server if the data size exceeds a threshold. The naming is not so good? Suggestions are welcome. Best regards, Etsuro Fujita
On Fri, Aug 18, 2017 at 05:10:29PM +0900, Etsuro Fujita wrote: > On 2017/08/17 23:48, David Fetter wrote: > >On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote: > >>On 2017/07/11 6:56, Robert Haas wrote: > >>>On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita > >>><fujita.etsuro@lab.ntt.co.jp> wrote: > >>>>So, I dropped the COPY part. > >>> > >>>Ouch. I think we should try to figure out how the COPY part will be > >>>handled before we commit to a design. > >> > >>I spent some time on this. To handle that, I'd like to propose doing > >>something similar to \copy (frontend copy): submit a COPY query "COPY ... > >>FROM STDIN" to the remote server and route data from a file to the remote > >>server. For that, I'd like to add new FDW APIs called during CopyFrom that > >>allow us to copy to foreign tables: > >> > >>* BeginForeignCopyIn: this would be called after creating a ResultRelInfo > >>for the target table (or each leaf partition of the target partitioned > >>table) if it's a foreign table, and perform any initialization needed before > >>the remote copy can start. In the postgres_fdw case, I think this function > >>would be a good place to send "COPY ... FROM STDIN" to the remote server. > >> > >>* ExecForeignCopyInOneRow: this would be called instead of heap_insert if > >>the target is a foreign table, and route the tuple read from the file by > >>NextCopyFrom to the remote server. In the postgres_fdw case, I think this > >>function would convert the tuple to text format for portability, and then > >>send the data to the remote server using PQputCopyData. > >> > >>* EndForeignCopyIn: this would be called at the bottom of CopyFrom, and > >>release resources such as connections to the remote server. In the > >>postgres_fdw case, this function would do PQputCopyEnd to terminate data > >>transfer. > > > >These primitives look good. I know it seems unlikely at first > >blush, but do we know of bulk load APIs for non-PostgreSQL data > >stores that this would be unable to serve? > > Maybe I'm missing something, but I think these would allow the FDW > to do the remote copy the way it would like; in > ExecForeignCopyInOneRow, for example, the FDW could buffer tuples in > a buffer memory and transmit the buffered data to the remote server > if the data size exceeds a threshold. The naming is not so good? > Suggestions are welcome. The naming seems reasonable. I was trying to figure out whether this fits well enough with the bulk load APIs for databases other than PostgreSQL. I'm guessing it's "well enough" based on checking MySQL, Oracle, and MS SQL Server. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Aug 17, 2017 at 7:58 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: >> The description seems to support only COPY to a foreign table from a >> file, but probably we need the support other way round as well. This >> looks like a feature (support copy to and from a foreign table) to be >> handled by itself. > > Agreed. I'll consider how to handle copy-from-a-foreign-table as well. That's a completely different feature which has nothing to do with tuple routing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Aug 17, 2017 at 4:27 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > I think that would be much more efficient than INSERTing tuples into the > remote server one by one. What do you think about that? I agree, but I wonder if we ought to make it work first using the existing APIs and then add these new APIs as an optimization. postgres_fdw isn't the only FDW in the world, and it's better if getting a working implementation doesn't require too many interfaces. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/08/18 22:41, David Fetter wrote: > On Fri, Aug 18, 2017 at 05:10:29PM +0900, Etsuro Fujita wrote: >> On 2017/08/17 23:48, David Fetter wrote: >>> On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote: >>>> On 2017/07/11 6:56, Robert Haas wrote: >>>>> On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita >>>>> <fujita.etsuro@lab.ntt.co.jp> wrote: >>>>>> So, I dropped the COPY part. >>>>> >>>>> Ouch. I think we should try to figure out how the COPY part will be >>>>> handled before we commit to a design. >>>> >>>> I spent some time on this. To handle that, I'd like to propose doing >>>> something similar to \copy (frontend copy): submit a COPY query "COPY ... >>> >FROM STDIN" to the remote server and route data from a file to the remote >>>> server. For that, I'd like to add new FDW APIs called during CopyFrom that >>>> allow us to copy to foreign tables: >>>> >>>> * BeginForeignCopyIn: this would be called after creating a ResultRelInfo >>>> for the target table (or each leaf partition of the target partitioned >>>> table) if it's a foreign table, and perform any initialization needed before >>>> the remote copy can start. In the postgres_fdw case, I think this function >>>> would be a good place to send "COPY ... FROM STDIN" to the remote server. >>>> >>>> * ExecForeignCopyInOneRow: this would be called instead of heap_insert if >>>> the target is a foreign table, and route the tuple read from the file by >>>> NextCopyFrom to the remote server. In the postgres_fdw case, I think this >>>> function would convert the tuple to text format for portability, and then >>>> send the data to the remote server using PQputCopyData. >>>> >>>> * EndForeignCopyIn: this would be called at the bottom of CopyFrom, and >>>> release resources such as connections to the remote server. In the >>>> postgres_fdw case, this function would do PQputCopyEnd to terminate data >>>> transfer. >>> >>> These primitives look good. I know it seems unlikely at first >>> blush, but do we know of bulk load APIs for non-PostgreSQL data >>> stores that this would be unable to serve? >> >> Maybe I'm missing something, but I think these would allow the FDW >> to do the remote copy the way it would like; in >> ExecForeignCopyInOneRow, for example, the FDW could buffer tuples in >> a buffer memory and transmit the buffered data to the remote server >> if the data size exceeds a threshold. The naming is not so good? >> Suggestions are welcome. > > The naming seems reasonable. > > I was trying to figure out whether this fits well enough with the bulk > load APIs for databases other than PostgreSQL. I'm guessing it's > "well enough" based on checking MySQL, Oracle, and MS SQL Server. Good to know. Best regards, Etsuro Fujita
On 2017/08/19 2:12, Robert Haas wrote: > On Thu, Aug 17, 2017 at 4:27 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> I think that would be much more efficient than INSERTing tuples into the >> remote server one by one. What do you think about that? > > I agree, but I wonder if we ought to make it work first using the > existing APIs and then add these new APIs as an optimization. I'm not sure that's a good idea because that once we support INSERT tuple-routing for foreign partitions, we would have a workaround: INSERT INTO partitioned_table SELECT * from data_table where data_table is a foreign table defined for an input file using file_fdw. > postgres_fdw isn't the only FDW in the world, and it's better if > getting a working implementation doesn't require too many interfaces. Agreed. My vote would be to leave the COPY part as-is until we have these new APIs. Best regards, Etsuro Fujita
On Sun, Aug 20, 2017 at 11:25 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: >> I agree, but I wonder if we ought to make it work first using the >> existing APIs and then add these new APIs as an optimization. > > I'm not sure that's a good idea because that once we support INSERT > tuple-routing for foreign partitions, we would have a workaround: INSERT > INTO partitioned_table SELECT * from data_table where data_table is a > foreign table defined for an input file using file_fdw. That's true, but I don't see how it refutes the point I was trying to raise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/08/26 1:43, Robert Haas wrote: > On Sun, Aug 20, 2017 at 11:25 PM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >>> I agree, but I wonder if we ought to make it work first using the >>> existing APIs and then add these new APIs as an optimization. >> >> I'm not sure that's a good idea because that once we support INSERT >> tuple-routing for foreign partitions, we would have a workaround: INSERT >> INTO partitioned_table SELECT * from data_table where data_table is a >> foreign table defined for an input file using file_fdw. > > That's true, but I don't see how it refutes the point I was trying to raise. My concern is: the existing APIs can really work well for any FDW to do COPY tuple-routing? I know the original version of the patch showed the existing APIs would work well for postgres_fdw but nothing beyond. For example: the original version made a dummy Query/Plan only containing a leaf partition as its result relation, and passed it to PlanForeignModify, IIRC. That would work well for postgres_fdw because it doesn't look at the Query/Plan except the result relation, but might not for other FDWs that look at more stuff of the given Query/Plan and do something based on that information. Some FDW might check to see whether the Plan is to do INSERT .. VALUES with a single VALUES sublist or INSERT .. VALUES with multiple VALUES sublists, for optimizing remote INSERT, for example. Best regards, Etsuro Fujita
On 2017/08/17 17:27, Etsuro Fujita wrote: > On 2017/07/11 6:56, Robert Haas wrote: >> I have to admit that I'm a little bit fuzzy about why foreign insert >> routing requires all of these changes. I think this patch would >> benefit from being accompanied by several paragraphs of explanation >> outlining the rationale for each part of the patch. > > Will do. Here is an updated version of the patch. * Query planning: the patch creates copies of Query/Plan with a foreign partition as target from the original Query/Plan for each foreign partition and invokes PlanForeignModify with those copies, to allow the FDW to do query planning for remote INSERT with the existing API. To make such Queries the similar way inheritance_planner does, I modified transformInsertStmt so that the inh flag for the partitioned table's RTE is set to true, which allows (1) expand_inherited_rtentry to build an RTE and AppendRelInfo for each partition in the partitioned table and (2) make_modifytable to build such Queries using adjust_appendrel_attrs and those AppendRelInfos. * explain.c: I modified show_modifytable_info so that we can show remote queries for foreign partitions in EXPLAIN for INSERT into a partitioned table the same way as for inherited UPDATE/DELETE cases. Here is an example: postgres=# explain verbose insert into pt values (1), (2); QUERY PLAN ------------------------------------------------------------------- Insert on public.pt (cost=0.00..0.03 rows=2 width=4) Foreign Insert on public.fp1 Remote SQL: INSERT INTO public.t1(a) VALUES ($1) Foreign Insert on public.fp2 Remote SQL: INSERT INTO public.t2(a) VALUES ($1) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) Output: "*VALUES*".column1 (7 rows) I think I should add more explanation about the patch, but I don't have time today, so I'll write additional explanation in the next email. Sorry about that. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hi Fujita-san! On 11.09.2017 16:01, Etsuro Fujita wrote: > Here is an updated version of the patch. > > * Query planning: the patch creates copies of Query/Plan with a > foreign partition as target from the original Query/Plan for each > foreign partition and invokes PlanForeignModify with those copies, to > allow the FDW to do query planning for remote INSERT with the existing > API. To make such Queries the similar way inheritance_planner does, I > modified transformInsertStmt so that the inh flag for the partitioned > table's RTE is set to true, which allows (1) expand_inherited_rtentry > to build an RTE and AppendRelInfo for each partition in the > partitioned table and (2) make_modifytable to build such Queries using > adjust_appendrel_attrs and those AppendRelInfos. > > * explain.c: I modified show_modifytable_info so that we can show > remote queries for foreign partitions in EXPLAIN for INSERT into a > partitioned table the same way as for inherited UPDATE/DELETE cases. > Here is an example: > > postgres=# explain verbose insert into pt values (1), (2); > QUERY PLAN > ------------------------------------------------------------------- > Insert on public.pt (cost=0.00..0.03 rows=2 width=4) > Foreign Insert on public.fp1 > Remote SQL: INSERT INTO public.t1(a) VALUES ($1) > Foreign Insert on public.fp2 > Remote SQL: INSERT INTO public.t2(a) VALUES ($1) > -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) > Output: "*VALUES*".column1 > (7 rows) > > I think I should add more explanation about the patch, but I don't > have time today, so I'll write additional explanation in the next > email. Sorry about that. Could you update your patch, it isn't applied on the actual state of master. Namely conflict in src/backend/executor/execMain.c -- Regards, Maksim Milyutin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hi Maksim, On 2017/10/02 21:37, Maksim Milyutin wrote: > On 11.09.2017 16:01, Etsuro Fujita wrote: >> * Query planning: the patch creates copies of Query/Plan with a >> foreign partition as target from the original Query/Plan for each >> foreign partition and invokes PlanForeignModify with those copies, to >> allow the FDW to do query planning for remote INSERT with the existing >> API. To make such Queries the similar way inheritance_planner does, I >> modified transformInsertStmt so that the inh flag for the partitioned >> table's RTE is set to true, which allows (1) expand_inherited_rtentry >> to build an RTE and AppendRelInfo for each partition in the >> partitioned table and (2) make_modifytable to build such Queries using >> adjust_appendrel_attrs and those AppendRelInfos. >> >> * explain.c: I modified show_modifytable_info so that we can show >> remote queries for foreign partitions in EXPLAIN for INSERT into a >> partitioned table the same way as for inherited UPDATE/DELETE cases. >> Here is an example: >> >> postgres=# explain verbose insert into pt values (1), (2); >> QUERY PLAN >> ------------------------------------------------------------------- >> Insert on public.pt (cost=0.00..0.03 rows=2 width=4) >> Foreign Insert on public.fp1 >> Remote SQL: INSERT INTO public.t1(a) VALUES ($1) >> Foreign Insert on public.fp2 >> Remote SQL: INSERT INTO public.t2(a) VALUES ($1) >> -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) >> Output: "*VALUES*".column1 >> (7 rows) >> >> I think I should add more explanation about the patch, but I don't >> have time today, so I'll write additional explanation in the next >> email. Sorry about that. > > Could you update your patch, it isn't applied on the actual state of > master. Namely conflict in src/backend/executor/execMain.c Attached is an updated version. * As mentioned in "Query planning", the patch builds an RTE for each partition so that the FDW can make reference to that RTE in eg, PlanForeignModify. set_plan_references also uses such RTEs to record plan dependencies for plancache.c to invalidate cached plans. See an example for that added to the regression tests in postgres_fdw. * As mentioned in "explain.c", the EXPLAIN shows all partitions beneath the ModifyTable node. One merit of that is we can show remote queries for foreign partitions in the output as shown above. Another one I can think of is when reporting execution stats for triggers. Here is an example for that: postgres=# explain analyze insert into list_parted values (1, 'hi there'), (2, 'hi there'); QUERY PLAN -------------------------------------------------------------------------------------------------------------- Insert on list_parted (cost=0.00..0.03 rows=2 width=36) (actual time=0.375..0.375 rows=0 loops=1) Insert on part1 Insert on part2 -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=36) (actual time=0.004..0.007 rows=2 loops=1) Planning time: 0.089 ms Trigger part1brtrig on part1: time=0.059 calls=1 Trigger part2brtrig on part2: time=0.021 calls=1 Execution time: 0.422 ms (8 rows) This would allow the user to understand easily that "part1" and "part2" in the trigger lines are the partitions of list_parted. So, I think it's useful to show partition info in the ModifyTable node even in the case where a partitioned table only contains plain tables. * The patch modifies make_modifytable and ExecInitModifyTable so that the former can allow the FDW to construct private plan data for each foreign partition and accumulate it all into a list, and that the latter can perform BeginForeignModify for each partition using that plan data stored in the list passed from make_modifytable. Other changes I made to the executor are: (1) currently, we set the RT index for the root partitioned table to ri_RangeTableIndex of partitions' ResultRelInfos, but the proposed EXPLAIN requires that the partition's ri_RangeTableIndex is set to the RT index for that partition's RTE, to show that partition info in the output. So, I made that change. Because of that, ExecPartitionCheck, ExecConstraints, and ExecWithCheckOptions are adjusted accordingly. (2) I added a new member to ResultRelInfo (ie, ri_PartitionIsValid), and modified CheckValidResultRel so that it checks a given partition without throwing an error and save the result in that flag so that ExecInsert determines using that flag whether a partition chosen by ExecFindPartition is valid for tuple-routing as proposed before. * copy.c: I still don't think it's a good idea to implement COPY tuple-routing for foreign partitions using PlanForeignModify. (I plan to propose new FDW APIs for bulkload as discussed before, to implement this feature.) So, I kept that as-is. Two things I changed there are: (1) currently, ExecSetupPartitionTupleRouting verifies partitions using CheckValidResultRel, but I don't think we need the CheckValidResultRel check in the COPY case. So, I refactored that function and checked partitions directly. (2) I think it'd be better to distinguish the error message "cannot route inserted tuples to a foreign partition" in the COPY case from the INSERT case, I changed it to "cannot route copied tuples to a foreign partition". * Fixed some bugs, revised comments, added a bit more regression tests, and rebased the patch. Comments are welcome! My apologies for the very late reply. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 2017/10/26 16:40, Etsuro Fujita wrote: > Other changes I made > to the executor are: (1) currently, we set the RT index for the root > partitioned table to ri_RangeTableIndex of partitions' ResultRelInfos, > but the proposed EXPLAIN requires that the partition's > ri_RangeTableIndex is set to the RT index for that partition's RTE, to > show that partition info in the output. So, I made that change. One thing I forgot to mention is: that would be also required to call BeginForeignModify, ExecForeignInsert, and EndForeignModify with the partition's ResultRelInfo. I updated docs in doc/src/sgml/ddl.sgml the same way as [1]. (I used only the ddl.sgml change proposed by [1], not all the changes.) I did some cleanup as well. Please find attached an updated version of the patch. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/b19a8e2b-e000-f592-3e0b-3e90ba0fa816%40lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
(2017/10/27 20:00), Etsuro Fujita wrote: > Please find attached an updated version of the patch. Amit rebased this patch and sent me the rebased version off list. Thanks for that, Amit! One thing I noticed I overlooked is about this change I added to make_modifytable to make a valid-looking plan node to pass to PlanForeignModify to plan remote insert to each foreign partition: + /* + * The column list of the child might have a different column + * order and/or a different set of dropped columns than that + * of its parent, so adjust the subplan's tlist as well. + */ + tlist = preprocess_targetlist(root, + child_parse->targetList); This would be needed because the FDW might reference the tlist. Since preprocess_targetlist references root->parse, it's needed to replace that with the child query before calling that function, but I forgot to do that. So I fixed that. Attached is an updated version of the patch. Best regards, Etsuro Fujita
Attachment
On Fri, Nov 24, 2017 at 10:03 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > (2017/10/27 20:00), Etsuro Fujita wrote: >> >> Please find attached an updated version of the patch. > > > Amit rebased this patch and sent me the rebased version off list. Thanks for > that, Amit! > > One thing I noticed I overlooked is about this change I added to > make_modifytable to make a valid-looking plan node to pass to > PlanForeignModify to plan remote insert to each foreign partition: > > + /* > + * The column list of the child might have a different > column > + * order and/or a different set of dropped columns than that > + * of its parent, so adjust the subplan's tlist as well. > + */ > + tlist = preprocess_targetlist(root, > + child_parse->targetList); > > This would be needed because the FDW might reference the tlist. Since > preprocess_targetlist references root->parse, it's needed to replace that > with the child query before calling that function, but I forgot to do that. > So I fixed that. Attached is an updated version of the patch. Moved to next CF as this got no reviews and the last version is fresh. -- Michael
Hi, Fujita-san! On 24.11.2017 16:03, Etsuro Fujita wrote: > (2017/10/27 20:00), Etsuro Fujita wrote: >> Please find attached an updated version of the patch. > > Amit rebased this patch and sent me the rebased version off list. > Thanks for that, Amit! > > One thing I noticed I overlooked is about this change I added to > make_modifytable to make a valid-looking plan node to pass to > PlanForeignModify to plan remote insert to each foreign partition: > > + /* > + * The column list of the child might have a different > column > + * order and/or a different set of dropped columns > than that > + * of its parent, so adjust the subplan's tlist as well. > + */ > + tlist = preprocess_targetlist(root, > + child_parse->targetList); > > This would be needed because the FDW might reference the tlist. Since > preprocess_targetlist references root->parse, it's needed to replace > that with the child query before calling that function, but I forgot > to do that. So I fixed that. Attached is an updated version of the > patch. Your patch already is not applied on master. Please rebase it. -- Regards, Maksim Milyutin
Hi Maksim, (2017/12/12 17:57), Maksim Milyutin wrote: > Your patch already is not applied on master. Please rebase it. Done. Please find attached an updated version of the patch. Best regards, Etsuro Fujita
Attachment
Fujita-san, On 2017/12/12 21:21, Etsuro Fujita wrote: > Hi Maksim, > > (2017/12/12 17:57), Maksim Milyutin wrote: >> Your patch already is not applied on master. Please rebase it. > > Done. Please find attached an updated version of the patch. Thanks for sending the updated version of the patch. I noticed that this patch introduces a partition_rels field in ModifyTable, which contains the RT indexes of *all* leaf partitions in the partition tree. That means we now expand the partition inheritance tree in the planner even in the INSERT case, simply because some of the leaf partitions might be foreign tables which must be looked at by the planner. I'm somewhat concerned about the performance implications of that. Now, to insert even a single row into a partitioned table, which may not even be routed to any of its foreign partitions, we are going to have to expand the inheritance twice -- once by the planner to handle foreign partitions and then by the executor when setting up the tuple routing information. Is there any reason why we have to to things this way, beside the fact that the PlanForeignModify() API appears to be callable from only within a valid planner context? Thanks, Amit
(2017/12/18 18:14), Amit Langote wrote: > I noticed that this patch introduces a partition_rels field in > ModifyTable, which contains the RT indexes of *all* leaf partitions in the > partition tree. That means we now expand the partition inheritance tree > in the planner even in the INSERT case, simply because some of the leaf > partitions might be foreign tables which must be looked at by the planner. Yeah, the patch expands the inheritance tree at planning time as well to build an RTE for each partition so that the FDW can use that RTE eg, when called from PlanForeignModify. > I'm somewhat concerned about the performance implications of that. Now, > to insert even a single row into a partitioned table, which may not even > be routed to any of its foreign partitions, we are going to have to expand > the inheritance twice -- once by the planner to handle foreign partitions > and then by the executor when setting up the tuple routing information. > > Is there any reason why we have to to things this way, beside the fact > that the PlanForeignModify() API appears to be callable from only within a > valid planner context? Another reason for that is for set_plan_references to get such RTEs to record plan dependencies so that plancache.c invalidates cached plans for foreign partitions. I suspect that we could avoid the planning-time expansion by doing some optimization when inserting a single row into a partitioned table. Best regards, Etsuro Fujita
InitResultRelInfo becomes unintelligible after this patch -- it was straightforward but adding partition_root makes things shaky. Please add a proper comment indicating what each argument is. (I wonder why this function needs a local variable "partition_check" -- seems pointless). -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/12/18 23:25, Alvaro Herrera wrote: > (I wonder why > this function needs a local variable "partition_check" -- seems > pointless). Before 15ce775faa4 [1], there were more than one line where partition_check was being set, but maybe it still didn't have to be a separate variable. Thanks, Amit [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=15ce775faa4
(2017/12/18 23:25), Alvaro Herrera wrote: > InitResultRelInfo becomes unintelligible after this patch -- it was > straightforward but adding partition_root makes things shaky. Please > add a proper comment indicating what each argument is. I was thiking that the comment I added to the definition of the ResultRelInfo struct in execnodes.h would make that function intelligible, but I agree on that point. Please fined attached a new version of the patch adding such comments. Best regards, Etsuro Fujita
Attachment
Etsuro, * Etsuro Fujita (fujita.etsuro@lab.ntt.co.jp) wrote: > (2017/12/18 23:25), Alvaro Herrera wrote: > >InitResultRelInfo becomes unintelligible after this patch -- it was > >straightforward but adding partition_root makes things shaky. Please > >add a proper comment indicating what each argument is. > > I was thiking that the comment I added to the definition of the > ResultRelInfo struct in execnodes.h would make that function intelligible, > but I agree on that point. Please fined attached a new version of the patch > adding such comments. I'm afraid a good bit of this patch is now failing to apply. I don't have much else to say except to echo the performance concern that Amit Langote raised about expanding the inheritence tree twice. Thanks! Stephen
Attachment
(2018/01/25 23:33), Stephen Frost wrote: > I'm afraid a good bit of this patch is now failing to apply. I don't > have much else to say except to echo the performance concern that Amit > Langote raised about expanding the inheritence tree twice. To address that concern, I'm thinking to redesign the patch so that it wouldn't expand the tree at planning time anymore. I don't have any clear solution for that yet, but what I have in mind now is to add new FDW APIs to the executor, instead, so that the FDW could 1) create stuff such as a query for remote INSERT as PlanForeignModify and 2) initialize/end the remote INSERT operation as BeginForeignModify and EndForeignModify, somewhere in the executor. (For #1, I'm thinking to add an API for that to ExecSetupPartitionTupleRouting or ExecInitPartitionResultRelInfo proposed by the patch by Amit Langote [1].) Anyway, I'll work on this after reviewing that patch, so I'll mark this as Returned with feedback. Thanks for the comment! Best regards, Etsuro Fujita [1] https://commitfest.postgresql.org/16/1415/
(2018/02/02 19:33), Etsuro Fujita wrote: > (2018/01/25 23:33), Stephen Frost wrote: >> I'm afraid a good bit of this patch is now failing to apply. I don't >> have much else to say except to echo the performance concern that Amit >> Langote raised about expanding the inheritence tree twice. > > To address that concern, I'm thinking to redesign the patch so that it > wouldn't expand the tree at planning time anymore. I don't have any > clear solution for that yet, but what I have in mind now is to add new > FDW APIs to the executor, instead, so that the FDW could 1) create stuff > such as a query for remote INSERT as PlanForeignModify and 2) > initialize/end the remote INSERT operation as BeginForeignModify and > EndForeignModify, somewhere in the executor. (For #1, I'm thinking to > add an API for that to ExecSetupPartitionTupleRouting or > ExecInitPartitionResultRelInfo proposed by the patch by Amit Langote > [1].) Anyway, I'll work on this after reviewing that patch, so I'll mark > this as Returned with feedback. CORRECTION: I'm planning to submit a new version to the March CF, so I set the status to "Moved to next CF". Best regards, Etsuro Fujita
(2018/02/02 19:33), Etsuro Fujita wrote: > (2018/01/25 23:33), Stephen Frost wrote: >> I'm afraid a good bit of this patch is now failing to apply. I don't >> have much else to say except to echo the performance concern that Amit >> Langote raised about expanding the inheritence tree twice. > > To address that concern, I'm thinking to redesign the patch so that it > wouldn't expand the tree at planning time anymore. I don't have any > clear solution for that yet, but what I have in mind now is to add new > FDW APIs to the executor, instead, so that the FDW could 1) create stuff > such as a query for remote INSERT as PlanForeignModify and 2) > initialize/end the remote INSERT operation as BeginForeignModify and > EndForeignModify, somewhere in the executor. New FDW APIs I would like to propose for that are: void BeginForeignRouting(ModifyTableState *mtstate, ResultRelInfo *resultRelInfo, int partition_index); Prepare for a tuple-routing operation on a foreign table. This is called from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo. TupleTableSlot * ExecForeignRouting(EState *estate, ResultRelInfo *resultRelInfo, TupleTableSlot *slot); Route one tuple to the foreign table. This is called from ExecInsert. void EndForeignRouting(EState *estate, ResultRelInfo *resultRelInfo); End the operation and release resources. This is called from ExecCleanupTupleRouting. Attached are WIP patches for that: Patch postgres-fdw-refactoring-WIP.patch: refactoring patch for postgres_fdw.c to reduce duplicate code. Patch foreign-routing-fdwapi-WIP.patch: main patch to add new FDW APIs, which is created on top of patch postgres-fdw-refactoring-WIP.patch and the lazy-initialization-of-partition-info patch [1]. By this change we don't need to expand the inheritance tree at planning time, so no need to worry about the performance concern. Maybe I'm missing something, though. Early feedback would be greatly appreciated. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5A8BFB31.6030707@lab.ntt.co.jp
Attachment
Fujita-san, On 2018/02/21 20:54, Etsuro Fujita wrote: > (2018/02/02 19:33), Etsuro Fujita wrote: >> (2018/01/25 23:33), Stephen Frost wrote: >>> I'm afraid a good bit of this patch is now failing to apply. I don't >>> have much else to say except to echo the performance concern that Amit >>> Langote raised about expanding the inheritence tree twice. >> >> To address that concern, I'm thinking to redesign the patch so that it >> wouldn't expand the tree at planning time anymore. I don't have any >> clear solution for that yet, but what I have in mind now is to add new >> FDW APIs to the executor, instead, so that the FDW could 1) create stuff >> such as a query for remote INSERT as PlanForeignModify and 2) >> initialize/end the remote INSERT operation as BeginForeignModify and >> EndForeignModify, somewhere in the executor. > > New FDW APIs I would like to propose for that are: Thanks for updating the proposal. > void > BeginForeignRouting(ModifyTableState *mtstate, > ResultRelInfo *resultRelInfo, > int partition_index); > > Prepare for a tuple-routing operation on a foreign table. This is called > from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo. I wonder why partition_index needs to be made part of this API? > TupleTableSlot * > ExecForeignRouting(EState *estate, > ResultRelInfo *resultRelInfo, > TupleTableSlot *slot); > > Route one tuple to the foreign table. This is called from ExecInsert. > > void > EndForeignRouting(EState *estate, > ResultRelInfo *resultRelInfo); > > End the operation and release resources. This is called from > ExecCleanupTupleRouting. > > Attached are WIP patches for that: > > Patch postgres-fdw-refactoring-WIP.patch: refactoring patch for > postgres_fdw.c to reduce duplicate code. > > Patch foreign-routing-fdwapi-WIP.patch: main patch to add new FDW APIs, > which is created on top of patch postgres-fdw-refactoring-WIP.patch and > the lazy-initialization-of-partition-info patch [1]. Noticed a typo in the patch (s/parition/partition/g): + * Also let the FDW init itself if this parition is foreign. + * Also let the FDW init itself if this parition is foreign. > By this change we don't need to expand the inheritance tree at planning > time, so no need to worry about the performance concern. Maybe I'm > missing something, though. Early feedback would be greatly appreciated. Perhaps an independent concern, but one thing I noticed is that it does not seem to play well with the direct modification (update push-down) feature. Now because updates (at least local, let's say) support re-routing, I thought we'd be able move rows across servers via the local server, but with direct modification we'd never get the chance. However, since update tuple routing is triggered by partition constraint failure, which we don't enforce for foreign tables/partitions anyway, I'm not sure if we need to do anything about that, and even if we did, whether it concerns this proposal. That said, I saw in the changes to ExecSetupPartitionTupleRouting() that BeginForeignRouting() is called for a foreign partition even if direct modification might already have been set up. If direct modification is set up, then ExecForeignRouting() will never get called, because we'd never call ExecUpdate() or ExecInsert(). Thanks, Amit
(2018/02/22 11:52), Amit Langote wrote: > On 2018/02/21 20:54, Etsuro Fujita wrote: >> void >> BeginForeignRouting(ModifyTableState *mtstate, >> ResultRelInfo *resultRelInfo, >> int partition_index); >> >> Prepare for a tuple-routing operation on a foreign table. This is called >> from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo. > > I wonder why partition_index needs to be made part of this API? The reason for that is because I think the FDW might want to look at the partition info stored in mtstate->mt_partition_tuple_routing for some reason or other, such as parent_child_tupconv_maps and child_parent_tupconv_maps, which are accessed with the given partition index. >> Patch foreign-routing-fdwapi-WIP.patch: main patch to add new FDW APIs, >> which is created on top of patch postgres-fdw-refactoring-WIP.patch and >> the lazy-initialization-of-partition-info patch [1]. > > Noticed a typo in the patch (s/parition/partition/g): > > + * Also let the FDW init itself if this parition is foreign. > > + * Also let the FDW init itself if this parition is foreign. Good catch! Will fix. > Perhaps an independent concern, but one thing I noticed is that it does > not seem to play well with the direct modification (update push-down) > feature. Now because updates (at least local, let's say) support > re-routing, I thought we'd be able move rows across servers via the local > server, but with direct modification we'd never get the chance. However, > since update tuple routing is triggered by partition constraint failure, > which we don't enforce for foreign tables/partitions anyway, I'm not sure > if we need to do anything about that, and even if we did, whether it > concerns this proposal. Good point! Actually, update tuple routing we have in HEAD doesn't allow re-routing tuples from foreign partitions even without direct modification. Here is an example using postgres_fdw: postgres=# create table pt (a int, b text) partition by list (a); CREATE TABLE postgres=# create table loct (a int check (a in (1)), b text); CREATE TABLE postgres=# create foreign table remp partition of pt for values in (1) server loopback options (table_name 'loct'); CREATE FOREIGN TABLE postgres=# create table locp partition of pt for values in (2); CREATE TABLE postgres=# insert into remp values (1, 'foo'); INSERT 0 1 postgres=# insert into locp values (2, 'bar'); INSERT 0 1 postgres=# select tableoid::regclass, * from pt; tableoid | a | b ----------+---+----- remp | 1 | foo locp | 2 | bar (2 rows) postgres=# create function postgres_fdw_abs(int) returns int as $$begin return abs($1); end$$ language plpgsql immutable; CREATE FUNCTION postgres=# explain verbose update pt set a = postgres_fdw_abs(a) + 1 where b = 'foo'; QUERY PLAN -------------------------------------------------------------------------------- ------------- Update on public.pt (cost=100.00..154.54 rows=12 width=42) Foreign Update on public.remp Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 Update on public.locp -> Foreign Scan on public.remp (cost=100.00..127.15 rows=6 width=42) Output: (postgres_fdw_abs(remp.a) + 1), remp.b, remp.ctid Remote SQL: SELECT a, b, ctid FROM public.loct WHERE ((b = 'foo'::text)) FOR UPDATE -> Seq Scan on public.locp (cost=0.00..27.39 rows=6 width=42) Output: (postgres_fdw_abs(locp.a) + 1), locp.b, locp.ctid Filter: (locp.b = 'foo'::text) (10 rows) postgres=# update pt set a = postgres_fdw_abs(a) + 1 where b = 'foo'; ERROR: new row for relation "loct" violates check constraint "loct_a_check" DETAIL: Failing row contains (2, foo). CONTEXT: Remote SQL command: UPDATE public.loct SET a = $2 WHERE ctid = $1 To be able to move tuples across foreign servers, I think we would first have to do something to allow re-routing tuples from foreign partitions. The patches I proposed hasn't done anything about that, so the patched version would behave the same way as HEAD with/without direct modification. > That said, I saw in the changes to ExecSetupPartitionTupleRouting() that > BeginForeignRouting() is called for a foreign partition even if direct > modification might already have been set up. If direct modification is > set up, then ExecForeignRouting() will never get called, because we'd > never call ExecUpdate() or ExecInsert(). Yeah, but I am thinking to leave the support for re-routing tuples across foreign servers for another patch. One difference between HEAD and the patched version would be: we can re-route tuples from a plain partition to a foreign partition if the foreign partition supports this tuple-routing API. Here is an example using the above data: postgres=# select tableoid::regclass, * from pt; tableoid | a | b ----------+---+----- remp | 1 | foo locp | 2 | bar (2 rows) postgres=# update pt set a = 1 where b = 'bar'; UPDATE 1 postgres=# select tableoid::regclass, * from pt; tableoid | a | b ----------+---+----- remp | 1 | foo remp | 1 | bar (2 rows) This would introduce an asymmetry (we can move tuples from plain partitions to foreign partitions, but the reverse is not true), but I am thinking that it would be probably okay to document about that. Thank you for the review! Best regards, Etsuro Fujita
Fujita-san, On Thu, Feb 22, 2018 at 8:49 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > (2018/02/22 11:52), Amit Langote wrote: >> I wonder why partition_index needs to be made part of this API? > > The reason for that is because I think the FDW might want to look at the > partition info stored in mtstate->mt_partition_tuple_routing for some reason > or other, such as parent_child_tupconv_maps and child_parent_tupconv_maps, > which are accessed with the given partition index. I see. I guess that makes sense. >> Perhaps an independent concern, but one thing I noticed is that it does >> not seem to play well with the direct modification (update push-down) >> feature. Now because updates (at least local, let's say) support >> re-routing, I thought we'd be able move rows across servers via the local >> server, but with direct modification we'd never get the chance. However, >> since update tuple routing is triggered by partition constraint failure, >> which we don't enforce for foreign tables/partitions anyway, I'm not sure >> if we need to do anything about that, and even if we did, whether it >> concerns this proposal. > > Good point! Actually, update tuple routing we have in HEAD doesn't allow > re-routing tuples from foreign partitions even without direct modification. > Here is an example using postgres_fdw: > > postgres=# create table pt (a int, b text) partition by list (a); > CREATE TABLE > postgres=# create table loct (a int check (a in (1)), b text); > CREATE TABLE > postgres=# create foreign table remp partition of pt for values in (1) > server loopback options (table_name 'loct'); > CREATE FOREIGN TABLE > postgres=# create table locp partition of pt for values in (2); > CREATE TABLE > postgres=# insert into remp values (1, 'foo'); > INSERT 0 1 > postgres=# insert into locp values (2, 'bar'); > INSERT 0 1 > postgres=# select tableoid::regclass, * from pt; > tableoid | a | b > ----------+---+----- > remp | 1 | foo > locp | 2 | bar > (2 rows) > > postgres=# create function postgres_fdw_abs(int) returns int as $$begin > return abs($1); end$$ language plpgsql immutable; > CREATE FUNCTION > postgres=# explain verbose update pt set a = postgres_fdw_abs(a) + 1 where b > = 'foo'; > QUERY PLAN > > -------------------------------------------------------------------------------- > ------------- > Update on public.pt (cost=100.00..154.54 rows=12 width=42) > Foreign Update on public.remp > Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 > Update on public.locp > -> Foreign Scan on public.remp (cost=100.00..127.15 rows=6 width=42) > Output: (postgres_fdw_abs(remp.a) + 1), remp.b, remp.ctid > Remote SQL: SELECT a, b, ctid FROM public.loct WHERE ((b = > 'foo'::text)) FOR UPDATE > -> Seq Scan on public.locp (cost=0.00..27.39 rows=6 width=42) > Output: (postgres_fdw_abs(locp.a) + 1), locp.b, locp.ctid > Filter: (locp.b = 'foo'::text) > (10 rows) > > postgres=# update pt set a = postgres_fdw_abs(a) + 1 where b = 'foo'; > ERROR: new row for relation "loct" violates check constraint "loct_a_check" > DETAIL: Failing row contains (2, foo). > CONTEXT: Remote SQL command: UPDATE public.loct SET a = $2 WHERE ctid = $1 > > To be able to move tuples across foreign servers, I think we would first > have to do something to allow re-routing tuples from foreign partitions. > The patches I proposed hasn't done anything about that, so the patched > version would behave the same way as HEAD with/without direct modification. Yes. As I said, since update re-routing is triggered by partition constraint failure for the new row and we don't check (any) constraints for foreign tables, that means re-routing won't occur for foreign partitions. >> That said, I saw in the changes to ExecSetupPartitionTupleRouting() that >> BeginForeignRouting() is called for a foreign partition even if direct >> modification might already have been set up. If direct modification is >> set up, then ExecForeignRouting() will never get called, because we'd >> never call ExecUpdate() or ExecInsert(). > > Yeah, but I am thinking to leave the support for re-routing tuples across > foreign servers for another patch. > > One difference between HEAD and the patched version would be: we can > re-route tuples from a plain partition to a foreign partition if the foreign > partition supports this tuple-routing API. Here is an example using the > above data: > > postgres=# select tableoid::regclass, * from pt; > tableoid | a | b > ----------+---+----- > remp | 1 | foo > locp | 2 | bar > (2 rows) > > postgres=# update pt set a = 1 where b = 'bar'; > UPDATE 1 > postgres=# select tableoid::regclass, * from pt; > tableoid | a | b > ----------+---+----- > remp | 1 | foo > remp | 1 | bar > (2 rows) > > This would introduce an asymmetry (we can move tuples from plain partitions > to foreign partitions, but the reverse is not true), but I am thinking that > it would be probably okay to document about that. I see. Thanks for the explanation. About just documenting the asymmetry you mentioned that's caused by the fact that we don't enforce constraints on foreign tables, I started wondering if we shouldn't change our stance on the matter wrt "partition" constraints? But, admittedly, that's a topic for a different thread. Thanks, Amit
(2018/02/23 16:38), Amit Langote wrote: > On Thu, Feb 22, 2018 at 8:49 PM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> This would introduce an asymmetry (we can move tuples from plain partitions >> to foreign partitions, but the reverse is not true), but I am thinking that >> it would be probably okay to document about that. > About just documenting the asymmetry you mentioned that's caused by > the fact that we don't enforce constraints on foreign tables, I > started wondering if we shouldn't change our stance on the matter wrt > "partition" constraints? I'm not sure that it's a good idea to make an exception in that case. Another concern is triggers on the remote side; those might change the row so that the partition constraint of the containing partition is no longer satisfied. > But, admittedly, that's a topic for a > different thread. OK, I'll leave that for another patch. Will post a new version. Thanks for the comments! Best regards, Etsuro Fujita
(2018/02/21 20:54), Etsuro Fujita wrote: > void > BeginForeignRouting(); > > Prepare for a tuple-routing operation on a foreign table. This is called > from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo. I modified execPartition.c so that this callback routine is called from a single function that I added to execPartition.c and it is called the first time the foreign partition is chose as the target partition to route a tuple to. That removes CheckValidResultRel, the tuple-conversion setup, and the FDW initialization for each UPDATE subplan from ExecSetupPartitionTupleRouting, so it would minimize the possibly-useless overhead in doing that function. Changes other than that are: * Fixed typo and revised code/comments * Added more regression tests * Added docs Attached is a new version of the patch set. Best regards, Etsuro Fujita
Attachment
Fujita-san, Thanks for sending the updated patches. On 2018/02/27 21:01, Etsuro Fujita wrote: > (2018/02/21 20:54), Etsuro Fujita wrote: >> void >> BeginForeignRouting(); >> >> Prepare for a tuple-routing operation on a foreign table. This is called >> from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo. > > I modified execPartition.c so that this callback routine is called from a > single function that I added to execPartition.c and it is called the first > time the foreign partition is chose as the target partition to route a > tuple to. That removes CheckValidResultRel, the tuple-conversion setup, > and the FDW initialization for each UPDATE subplan from > ExecSetupPartitionTupleRouting, so it would minimize the possibly-useless > overhead in doing that function. > > Changes other than that are: > > * Fixed typo and revised code/comments > * Added more regression tests > * Added docs > > Attached is a new version of the patch set. Patches still seem to apply cleanly to HEAD, compile fine, tests pass. * Comments postgres-fdw-refactoring-1.patch: 1. Just a minor nitpick: wouldn't it be better to call it create_foreign_modify_state just like its "finish" counterpart that's named finish_foreign_modify? Other than that, it looks good to me. * Comments on foreign-routing-fdwapi-1.patch: 1. In the following sentence, s/rows/a tuple/g, to be consistent with other text added by the patch + <para> + If the <function>ExecForeignRouting</function> pointer is set to + <literal>NULL</literal>, attempts to route rows to the foreign table will fail + with an error message. + </para> 2. If I understand the description you provided in [1] correctly, the point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to avoid possibly-redundantly performing following two steps in ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos that may not be used for tuple routing after all: - create the parent_child_tupconv_maps[] entry for it - perform FDW tuple routing initialization. If that's true, the following comment could be expanded just a little bit to make that clearer: /* * ExecInitPartitionExtraInfo * Do the remaining initialization work for the given partition 3. You removed the following from ExecInitPartitionInfo: /* - * Verify result relation is a valid target for an INSERT. An UPDATE of a - * partition-key becomes a DELETE+INSERT operation, so this check is still - * required when the operation is CMD_UPDATE. - */ - CheckValidResultRel(leaf_part_rri, CMD_INSERT); but, only added back the following in ExecInsert: + /* + * Verify the specified partition is a valid target for INSERT if we + * didn't yet. + */ + if (!resultRelInfo->ri_PartitionIsValid) + { + CheckValidResultRel(resultRelInfo, CMD_INSERT); Maybe, the new comment should add a "Note: " line the comment saying something about this code being invoked as part of an UPDATE. Tests and documentation added by this patch look quite good. That's all I have for now. Thanks, Amit [1] https://www.postgresql.org/message-id/5A95487E.9050808%40lab.ntt.co.jp
On 2018/03/19 20:25, Amit Langote wrote: > That's all I have for now. While testing this patch, I noticed a crash when performing EXPLAIN on update of a partition tree containing foreign partitions. Crash occurs in postgresEndForeignRouting() due to the following Assert failing: Assert(fmstate != NULL); It seems the problem is that ExecCleanupTupleRouting() invokes the EndForeignRouting() function even if ri_PartitionIsValid is not set. So I suppose we need this: /* - * If this is INSERT/UPDATE, allow any FDWs to shut down + * If this is INSERT/UPDATE, allow any FDWs to shut down if it has + * initialized tuple routing information at all. */ if (node && + resultRelInfo->ri_PartitionIsValid && resultRelInfo->ri_FdwRoutine != NULL && resultRelInfo->ri_FdwRoutine->EndForeignRouting != NULL) resultRelInfo->ri_FdwRoutine->EndForeignRouting(node->ps.state, BTW,patch needs to be rebased because of two commits this morning: 6666ee49f [1] and ee0a1fc84 [2]. Thanks, Amit [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6666ee49f [2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ee0a1fc84
Hi Amit, (2018/03/20 15:57), Amit Langote wrote: > On 2018/03/19 20:25, Amit Langote wrote: >> That's all I have for now. Will reply to your previous email. > While testing this patch, I noticed a crash when performing EXPLAIN on > update of a partition tree containing foreign partitions. Crash occurs in > postgresEndForeignRouting() due to the following Assert failing: > > Assert(fmstate != NULL); > > It seems the problem is that ExecCleanupTupleRouting() invokes the > EndForeignRouting() function even if ri_PartitionIsValid is not set. So I > suppose we need this: > > /* > - * If this is INSERT/UPDATE, allow any FDWs to shut down > + * If this is INSERT/UPDATE, allow any FDWs to shut down if it has > + * initialized tuple routing information at all. > */ > if (node&& > + resultRelInfo->ri_PartitionIsValid&& > resultRelInfo->ri_FdwRoutine != NULL&& > resultRelInfo->ri_FdwRoutine->EndForeignRouting != NULL) > resultRelInfo->ri_FdwRoutine->EndForeignRouting(node->ps.state, Will look into this. > BTW,patch needs to be rebased because of two commits this morning: > 6666ee49f [1] and ee0a1fc84 [2]. Will do. Thanks for reviewing the patch! Best regards, Etsuro Fujita
Hi, Amit Langote wrote: > 2. If I understand the description you provided in [1] correctly, the > point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to > avoid possibly-redundantly performing following two steps in > ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos > that may not be used for tuple routing after all: > > - create the parent_child_tupconv_maps[] entry for it > - perform FDW tuple routing initialization. > > If that's true, the following comment could be expanded just a little bit > to make that clearer: > > /* > * ExecInitPartitionExtraInfo > * Do the remaining initialization work for the given partition Yeah, I think the comments on why this is a separate step should be more extensive. Probably add something to ExecInitPartitionInfo too. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Feb 27, 2018 at 7:01 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > Changes other than that are: > > * Fixed typo and revised code/comments > * Added more regression tests > * Added docs > > Attached is a new version of the patch set. I took a look at this patch set today but I really don't think we should commit something so minimal. It's got at least four issues that I see: 1. It still doesn't work for COPY, because COPY isn't going to have a ModifyTableState. I really think it ought to be possible to come up with an API that can handle both INSERT and COPY; I don't think it should be necessary to have two different APIs for those two problems. Amit managed to do it for regular tables, and I don't really see a good reason why foreign tables need to be different. 2. I previously asked why we couldn't use the existing APIs for this, and you gave me some answer, but I can't help noticing that postgresExecForeignRouting is an exact copy of postgresExecForeignInsert. Apparently, some code reuse is indeed possible here! Why not reuse the same function instead of calling a new one? If the answer is that planSlot might be NULL in this case, or something like that, then let's just document that. A needless proliferation of FDW APIs is really undesirable, as it raises the bar for every FDW author. 3. It looks like we're just doing an INSERT for every row, which is pretty much an anti-pattern for inserting data into a PostgreSQL database. COPY is way faster, and even multi-row inserts are significantly faster. 4. It doesn't do anything about the UPDATE tuple routing problem mentioned upthread. I don't necessarily think that the first patch in this area has to solve all of those problems, and #4 in particular seems like it might be a pretty big can of worms. However, I think that getting INSERT but not COPY, with bad performance, and with duplicated APIs, is moving more in the wrong direction than the right one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
(2018/03/23 4:09), Robert Haas wrote: > On Tue, Feb 27, 2018 at 7:01 AM, Etsuro Fujita >> Attached is a new version of the patch set. > > I took a look at this patch set today but I really don't think we > should commit something so minimal. It's got at least four issues > that I see: > > 1. It still doesn't work for COPY, because COPY isn't going to have a > ModifyTableState. I really think it ought to be possible to come up > with an API that can handle both INSERT and COPY; I don't think it > should be necessary to have two different APIs for those two problems. > Amit managed to do it for regular tables, and I don't really see a > good reason why foreign tables need to be different. Maybe I'm missing something, but I think the proposed FDW API could be used for the COPY case as well with some modifications to core. If so, my question is: should we support COPY into foreign tables as well? I think that if we support COPY tuple routing for foreign partitions, it would be better to support direct COPY into foreign partitions as well. > 2. I previously asked why we couldn't use the existing APIs for this, > and you gave me some answer, but I can't help noticing that > postgresExecForeignRouting is an exact copy of > postgresExecForeignInsert. Apparently, some code reuse is indeed > possible here! Why not reuse the same function instead of calling a > new one? If the answer is that planSlot might be NULL in this case, > or something like that, then let's just document that. A needless > proliferation of FDW APIs is really undesirable, as it raises the bar > for every FDW author. You've got a point! I'll change the patch that way. > 3. It looks like we're just doing an INSERT for every row, which is > pretty much an anti-pattern for inserting data into a PostgreSQL > database. COPY is way faster, and even multi-row inserts are > significantly faster. I planed to work on new FDW API for using COPY for COPY tuple routing [1], but I didn't have time for that in this development cycle, so I'm re-planning to work on that for PG12. I'm not sure we can optimize that insertion using multi-row inserts because tuple routing works row by row, as you know. Anyway, I think these would be beyond the scope of the first version. > 4. It doesn't do anything about the UPDATE tuple routing problem > mentioned upthread. > > I don't necessarily think that the first patch in this area has to > solve all of those problems, and #4 in particular seems like it might > be a pretty big can of worms. OK > However, I think that getting INSERT > but not COPY, with bad performance, and with duplicated APIs, is > moving more in the wrong direction than the right one. Will fix. Thanks for reviewing the patch! Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp
On Fri, Mar 23, 2018 at 7:55 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > Maybe I'm missing something, but I think the proposed FDW API could be used > for the COPY case as well with some modifications to core. If so, my > question is: should we support COPY into foreign tables as well? I think > that if we support COPY tuple routing for foreign partitions, it would be > better to support direct COPY into foreign partitions as well. Yes, I really, really want to be able to support COPY. If you think you can make that work with this API -- let's see it! >> 3. It looks like we're just doing an INSERT for every row, which is >> pretty much an anti-pattern for inserting data into a PostgreSQL >> database. COPY is way faster, and even multi-row inserts are >> significantly faster. > > I planed to work on new FDW API for using COPY for COPY tuple routing [1], > but I didn't have time for that in this development cycle, so I'm > re-planning to work on that for PG12. I'm not sure we can optimize that > insertion using multi-row inserts because tuple routing works row by row, as > you know. Anyway, I think these would be beyond the scope of the first > version. My concern is that if we add APIs now that only support single-row inserts, we'll have to rework them again in order to support multi-row inserts. I'd like to avoid that, if possible. I think for bulk inserts we'll need an API that says "here's a row, store it or buffer it as you like" and then another API that says "flush any buffered rows to the actual table and perform any necessary cleanup". Or maybe in postgres_fdw the first API could start a COPY if not already done and send the row, and the second one could end the COPY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
(2018/03/23 21:02), Robert Haas wrote: > On Fri, Mar 23, 2018 at 7:55 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> Maybe I'm missing something, but I think the proposed FDW API could be used >> for the COPY case as well with some modifications to core. If so, my >> question is: should we support COPY into foreign tables as well? I think >> that if we support COPY tuple routing for foreign partitions, it would be >> better to support direct COPY into foreign partitions as well. > > Yes, I really, really want to be able to support COPY. If you think > you can make that work with this API -- let's see it! OK >>> 3. It looks like we're just doing an INSERT for every row, which is >>> pretty much an anti-pattern for inserting data into a PostgreSQL >>> database. COPY is way faster, and even multi-row inserts are >>> significantly faster. >> >> I planed to work on new FDW API for using COPY for COPY tuple routing [1], >> but I didn't have time for that in this development cycle, so I'm >> re-planning to work on that for PG12. I'm not sure we can optimize that >> insertion using multi-row inserts because tuple routing works row by row, as >> you know. Anyway, I think these would be beyond the scope of the first >> version. > > My concern is that if we add APIs now that only support single-row > inserts, we'll have to rework them again in order to support multi-row > inserts. I'd like to avoid that, if possible. Yeah, but we would have this issue for normal inserts into foreign tables. For the normal case, what I'm thinking to support multi-row inserts is to use DirectModify FDW API. With this API, I think we could support pushing down INSERT with multiple VALUES sublists to the remote, but I'm not sure we could extend that to the tuple-routing case. > I think for bulk > inserts we'll need an API that says "here's a row, store it or buffer > it as you like" and then another API that says "flush any buffered > rows to the actual table and perform any necessary cleanup". Or maybe > in postgres_fdw the first API could start a COPY if not already done > and send the row, and the second one could end the COPY. That's really what I have in mind. Best regards, Etsuro Fujita
On Fri, Mar 23, 2018 at 8:22 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: >> I think for bulk >> inserts we'll need an API that says "here's a row, store it or buffer >> it as you like" and then another API that says "flush any buffered >> rows to the actual table and perform any necessary cleanup". Or maybe >> in postgres_fdw the first API could start a COPY if not already done >> and send the row, and the second one could end the COPY. > That's really what I have in mind. So let's do it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
(2018/03/23 20:55), Etsuro Fujita wrote: > (2018/03/23 4:09), Robert Haas wrote: >> 1. It still doesn't work for COPY, because COPY isn't going to have a >> ModifyTableState. I really think it ought to be possible to come up >> with an API that can handle both INSERT and COPY; I don't think it >> should be necessary to have two different APIs for those two problems. >> Amit managed to do it for regular tables, and I don't really see a >> good reason why foreign tables need to be different. > > Maybe I'm missing something, but I think the proposed FDW API could be > used for the COPY case as well with some modifications to core. If so, > my question is: should we support COPY into foreign tables as well? I > think that if we support COPY tuple routing for foreign partitions, it > would be better to support direct COPY into foreign partitions as well. Done. >> 2. I previously asked why we couldn't use the existing APIs for this, >> and you gave me some answer, but I can't help noticing that >> postgresExecForeignRouting is an exact copy of >> postgresExecForeignInsert. Apparently, some code reuse is indeed >> possible here! Why not reuse the same function instead of calling a >> new one? If the answer is that planSlot might be NULL in this case, >> or something like that, then let's just document that. A needless >> proliferation of FDW APIs is really undesirable, as it raises the bar >> for every FDW author. > > You've got a point! I'll change the patch that way. Done. I modified the patch so that the existing API postgresExecForeignInsert is called as-is (ie, with the planSlot parameter) in the INSERT case and is called with that parameter set to NULL in the COPY case. So, I removed postgresExecForeignRouting and the postgres_fdw refactoring for that API. Also, I changed the names of the remaining new APIs to postgresBeginForeignInsert and postgresEndForeignInsert, which I think would be better because these are used not only for tuple routing but for directly copying into foreign tables. Also, I dropped partition_index from the parameter list for postgresBeginForeignInsert; I thought that it could be used for the FDW to access the partition info stored in mt_partition_tuple_routing for something in the case of tuple-routing, but I started to think that the FDW wouldn't need that in practice. >> However, I think that getting INSERT >> but not COPY, with bad performance, and with duplicated APIs, is >> moving more in the wrong direction than the right one. > > Will fix. Done. Attached is the new version of the patch. Patch foreign-routing-fdwapi-2.patch is created on top of patch postgres-fdw-refactoring-2.patch. (The former contains the bug-fix [1].) Any feedback is welcome! Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5ABA4074.1090500@lab.ntt.co.jp
Attachment
(2018/03/19 20:25), Amit Langote wrote: > On 2018/02/27 21:01, Etsuro Fujita wrote: >> Attached is a new version of the patch set. > * Comments postgres-fdw-refactoring-1.patch: > > 1. Just a minor nitpick: wouldn't it be better to call it > create_foreign_modify_state just like its "finish" counterpart that's > named finish_foreign_modify? Good idea! Done. > * Comments on foreign-routing-fdwapi-1.patch: > > 1. In the following sentence, s/rows/a tuple/g, to be consistent with > other text added by the patch > > +<para> > + If the<function>ExecForeignRouting</function> pointer is set to > +<literal>NULL</literal>, attempts to route rows to the foreign table > will fail > + with an error message. > +</para> I modified the patch to use the existing API ExecForeignInsert instead of that API and removed that API including this doc. > 2. If I understand the description you provided in [1] correctly, the > point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to > avoid possibly-redundantly performing following two steps in > ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos > that may not be used for tuple routing after all: > > - create the parent_child_tupconv_maps[] entry for it > - perform FDW tuple routing initialization. Sorry, my explanation was not enough, but that was just one of the reasons why I introduced those; another is to do CheckValidResultRel against a partition after the partition was chosen for tuple routing rather than in ExecSetupPartitionTupleRouting, to avoid aborting an UPDATE of a partition key unnecessarily due to non-routable foreign-partitions that may not be chosen for tuple routing after all. Now we have ON CONFLICT for partitioned tables, which requires the conversion map to be computed in ExecInitPartitionInfo, so I updated the patch so that we keep the former step in ExecInitPartitionInfo and ExecSetupPartitionTupleRouting and so that we just init the FDW in ExecInitExtraPartitionInfo (changed to ExecInitForeignRouting). And I added comments to ExecInitForeignRouting and ExecPrepareTupleRouting. > 3. You removed the following from ExecInitPartitionInfo: > > /* > - * Verify result relation is a valid target for an INSERT. An UPDATE > of a > - * partition-key becomes a DELETE+INSERT operation, so this check is > still > - * required when the operation is CMD_UPDATE. > - */ > - CheckValidResultRel(leaf_part_rri, CMD_INSERT); > > but, only added back the following in ExecInsert: > > + /* > + * Verify the specified partition is a valid target for INSERT if we > + * didn't yet. > + */ > + if (!resultRelInfo->ri_PartitionIsValid) > + { > + CheckValidResultRel(resultRelInfo, CMD_INSERT); > > Maybe, the new comment should add a "Note: " line the comment saying > something about this code being invoked as part of an UPDATE. Done. Also, I fixed a bug reported from you the way you proposed [1], and added regression tests for that. Good catch! Thanks! Thank you for the review! Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/2d72275d-3574-92c9-9241-5c9b456c87a2%40lab.ntt.co.jp
(2018/03/20 21:31), Alvaro Herrera wrote: > Amit Langote wrote: > >> 2. If I understand the description you provided in [1] correctly, the >> point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to >> avoid possibly-redundantly performing following two steps in >> ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos >> that may not be used for tuple routing after all: >> >> - create the parent_child_tupconv_maps[] entry for it >> - perform FDW tuple routing initialization. >> >> If that's true, the following comment could be expanded just a little bit >> to make that clearer: >> >> /* >> * ExecInitPartitionExtraInfo >> * Do the remaining initialization work for the given partition > > Yeah, I think the comments on why this is a separate step should be more > extensive. Probably add something to ExecInitPartitionInfo too. Done. Please see the reply to Amit that I sent just now. Thank you for the review! Best regards, Etsuro Fujita
Etsuro Fujita wrote: > Now we have ON CONFLICT for partitioned tables, which requires the > conversion map to be computed in ExecInitPartitionInfo, so I updated the > patch so that we keep the former step in ExecInitPartitionInfo and > ExecSetupPartitionTupleRouting and so that we just init the FDW in > ExecInitExtraPartitionInfo (changed to ExecInitForeignRouting). And I added > comments to ExecInitForeignRouting and ExecPrepareTupleRouting. Hmm, I may be misreading what you mean here ... but as far as I understand, ON CONFLICT does not support foreign tables, does it? If that is so, I'm not sure why the conversion map would be necessary, if it is for ON CONFLICT alone. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Fujita-san, Thanks for updating the patch. On 2018/03/30 21:56, Etsuro Fujita wrote: > I modified the patch to use the existing API ExecForeignInsert instead of > that API and removed that API including this doc. OK. >> 2. If I understand the description you provided in [1] correctly, the >> point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to >> avoid possibly-redundantly performing following two steps in >> ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos >> that may not be used for tuple routing after all: >> >> - create the parent_child_tupconv_maps[] entry for it >> - perform FDW tuple routing initialization. > > Sorry, my explanation was not enough, but that was just one of the reasons > why I introduced those; another is to do CheckValidResultRel against a > partition after the partition was chosen for tuple routing rather than in > ExecSetupPartitionTupleRouting, to avoid aborting an UPDATE of a partition > key unnecessarily due to non-routable foreign-partitions that may not be > chosen for tuple routing after all. I see. So, currently in ExecSetupPartitionTupleRouting, we call CheckValidResultRel() to check if resultRelInfos reused from those initialized for UPDATE are valid for insertions (or rather for routing inserted tuples into it). But you're proposing to delay that check until ExecPrepareTupleRouting is called for a given resultRelInfo, at which point it's clear that we actually want to insert using that resultRelInfo. That seems reasonable. > Now we have ON CONFLICT for partitioned tables, which requires the > conversion map to be computed in ExecInitPartitionInfo, so I updated the > patch so that we keep the former step in ExecInitPartitionInfo and > ExecSetupPartitionTupleRouting and so that we just init the FDW in > ExecInitExtraPartitionInfo (changed to ExecInitForeignRouting). And I > added comments to ExecInitForeignRouting and ExecPrepareTupleRouting. That looks good. I looked at the new patch. It looks good overall, although I have one question -- IIUC, BeginForeignInsert() performs actions that are equivalent of performing PlanForeignModify() + BeginForeignModify() for an INSERT as if it was directly executed on a given foreign table/partition. Did you face any problems in doing the latter itself in the core code? Doing it that way would mean no changes to a FDW itself will be required (and hence no need for additional APIs), but I may be missing something. One thing that seems good about your approach is that newly added support for COPY FROM on foreign tables/partitions takes minimal changes as implemented by using the new API, whereas with the alternative approach it would require duplicated code (same code would have to be present in both copy.c and execPartition.c, but it could perhaps be avoided). Thanks, Amit
(2018/03/30 22:28), Alvaro Herrera wrote: > Etsuro Fujita wrote: > >> Now we have ON CONFLICT for partitioned tables, which requires the >> conversion map to be computed in ExecInitPartitionInfo, so I updated the >> patch so that we keep the former step in ExecInitPartitionInfo and >> ExecSetupPartitionTupleRouting and so that we just init the FDW in >> ExecInitExtraPartitionInfo (changed to ExecInitForeignRouting). And I added >> comments to ExecInitForeignRouting and ExecPrepareTupleRouting. > > Hmm, I may be misreading what you mean here ... but as far as I > understand, ON CONFLICT does not support foreign tables, does it? It doesn't, except for ON CONFLICT DO NOTHING without an inference specification. > If > that is so, I'm not sure why the conversion map would be necessary, if > it is for ON CONFLICT alone. We wouldn't need that for foreign partitions (When DO NOTHING with an inference specification or DO UPDATE on a partitioned table containing foreign partitions, the planner would throw an error before we get to ExecInitPartitionInfo). The reason why I updated the patch that way is just to make the patch simple, but on reflection I don't think that's a good idea; I think we should delay the map-creation step as well as the FDW-initialization step for UPDATE subplan partitions as before for improved efficiency for UPDATE-of-partition-key. However, I don't think it'd still be a good idea to delay those steps for partitions created by ExecInitPartitionInfo the same way as for the subplan partitions, because in that case it'd be better to perform CheckValidResultRel against a partition right after we do InitResultRelInfo for the partition in ExecInitPartitionInfo, as that would save cycles in cases when the partition is considered nonvalid by that check. So, What I'm thinking is: a) for the subplan partitions, we delay those steps as before, and b) for the partitions created by ExecInitPartitionInfo, we do that check for a partition right after InitResultRelInfo in that function, (and if valid, we create a map and initialize the FDW for the partition in that function). Best regards, Etsuro Fujita
(2018/04/02 18:49), Amit Langote wrote: >>> 2. If I understand the description you provided in [1] correctly, the >>> point of having ri_PartitionIsValid and ExecInitExtraPartitionInfo() is to >>> avoid possibly-redundantly performing following two steps in >>> ExecSetupPartitionTupleRouting() for *reused* partition ResultRelInfos >>> that may not be used for tuple routing after all: >>> >>> - create the parent_child_tupconv_maps[] entry for it >>> - perform FDW tuple routing initialization. >> >> Sorry, my explanation was not enough, but that was just one of the reasons >> why I introduced those; another is to do CheckValidResultRel against a >> partition after the partition was chosen for tuple routing rather than in >> ExecSetupPartitionTupleRouting, to avoid aborting an UPDATE of a partition >> key unnecessarily due to non-routable foreign-partitions that may not be >> chosen for tuple routing after all. > > I see. So, currently in ExecSetupPartitionTupleRouting, we call > CheckValidResultRel() to check if resultRelInfos reused from those > initialized for UPDATE are valid for insertions (or rather for routing > inserted tuples into it). But you're proposing to delay that check until > ExecPrepareTupleRouting is called for a given resultRelInfo, at which > point it's clear that we actually want to insert using that resultRelInfo. > That seems reasonable. That's exactly what I'm thinking. >> Now we have ON CONFLICT for partitioned tables, which requires the >> conversion map to be computed in ExecInitPartitionInfo, so I updated the >> patch so that we keep the former step in ExecInitPartitionInfo and >> ExecSetupPartitionTupleRouting and so that we just init the FDW in >> ExecInitExtraPartitionInfo (changed to ExecInitForeignRouting). And I >> added comments to ExecInitForeignRouting and ExecPrepareTupleRouting. > > That looks good. I revisited this. Please see the reply to Alvaro I sent right now. > I looked at the new patch. It looks good overall, although I have one > question -- IIUC, BeginForeignInsert() performs actions that are > equivalent of performing PlanForeignModify() + BeginForeignModify() for an > INSERT as if it was directly executed on a given foreign table/partition. > Did you face any problems in doing the latter itself in the core code? > Doing it that way would mean no changes to a FDW itself will be required > (and hence no need for additional APIs), but I may be missing something. The biggest issue in performing PlanForeignModify() plus BeginForeignModify() instead of the new FDW API would be: can the core code create a valid-looking planner state passed to PlanForeignModify() such as the ModifyTable plan node or the query tree stored in PlannerInfo? > One thing that seems good about your approach is that newly added support > for COPY FROM on foreign tables/partitions takes minimal changes as > implemented by using the new API, whereas with the alternative approach it > would require duplicated code (same code would have to be present in both > copy.c and execPartition.c, but it could perhaps be avoided). I agree. Thanks for the review! Best regards, Etsuro Fujita
On 2018/04/02 21:26, Etsuro Fujita wrote: > (2018/03/30 22:28), Alvaro Herrera wrote: >> Etsuro Fujita wrote: >> >>> Now we have ON CONFLICT for partitioned tables, which requires the >>> conversion map to be computed in ExecInitPartitionInfo, so I updated the >>> patch so that we keep the former step in ExecInitPartitionInfo and >>> ExecSetupPartitionTupleRouting and so that we just init the FDW in >>> ExecInitExtraPartitionInfo (changed to ExecInitForeignRouting). And I >>> added >>> comments to ExecInitForeignRouting and ExecPrepareTupleRouting. >> >> Hmm, I may be misreading what you mean here ... but as far as I >> understand, ON CONFLICT does not support foreign tables, does it? > > It doesn't, except for ON CONFLICT DO NOTHING without an inference > specification. > >> If >> that is so, I'm not sure why the conversion map would be necessary, if >> it is for ON CONFLICT alone. > > We wouldn't need that for foreign partitions (When DO NOTHING with an > inference specification or DO UPDATE on a partitioned table containing > foreign partitions, the planner would throw an error before we get to > ExecInitPartitionInfo). Actually, as you might know, since it is not possible to create an index on a partitioned table that has a foreign partition, there is no possibility of supporting any form of ON CONFLICT that requires an inference specification. > The reason why I updated the patch that way is > just to make the patch simple, but on reflection I don't think that's a > good idea; I think we should delay the map-creation step as well as the > FDW-initialization step for UPDATE subplan partitions as before for > improved efficiency for UPDATE-of-partition-key. However, I don't think > it'd still be a good idea to delay those steps for partitions created by > ExecInitPartitionInfo the same way as for the subplan partitions, because > in that case it'd be better to perform CheckValidResultRel against a > partition right after we do InitResultRelInfo for the partition in > ExecInitPartitionInfo, as that would save cycles in cases when the > partition is considered nonvalid by that check. It seems like a good idea to perform CheckValidResultRel right after the InitResultRelInfo in ExecInitPartitionInfo. However, if the partition is considered non-valid, that results in an error afaik, so I don't understand the point about saving cycles. > So, What I'm thinking is: > a) for the subplan partitions, we delay those steps as before, and b) for > the partitions created by ExecInitPartitionInfo, we do that check for a > partition right after InitResultRelInfo in that function, (and if valid, > we create a map and initialize the FDW for the partition in that function). Sounds good to me. I'm assuming that you're talking about delaying the is-valid-for-insert-routing check (using CheckValidResultRel) and parent-to-child map creation for a sub-plan result relation until ExecPrepareTupleRouting(). On a related note, I wonder if it'd be a good idea to rename the flag ri_PartitionIsValid to something that signifies that we only need it to be true if we want to do tuple routing (aka insert) using it. Maybe, ri_PartitionReadyForRouting or ri_PartitionReadyForInsert. I'm afraid that ri_PartitionIsValid is a bit ambiguous, although I'm not saying the name options I'm suggesting are the best. Thanks, Amit
Fujita-san, On 2018/04/02 21:29, Etsuro Fujita wrote: > (2018/04/02 18:49), Amit Langote wrote: >> I looked at the new patch. It looks good overall, although I have one >> question -- IIUC, BeginForeignInsert() performs actions that are >> equivalent of performing PlanForeignModify() + BeginForeignModify() for an >> INSERT as if it was directly executed on a given foreign table/partition. >> Did you face any problems in doing the latter itself in the core code? >> Doing it that way would mean no changes to a FDW itself will be required >> (and hence no need for additional APIs), but I may be missing something. > > The biggest issue in performing PlanForeignModify() plus > BeginForeignModify() instead of the new FDW API would be: can the core > code create a valid-looking planner state passed to PlanForeignModify() > such as the ModifyTable plan node or the query tree stored in PlannerInfo? Hmm, I can see the point. Passing mostly-dummy (garbage) PlannerInfo and query tree from the core code to FDW seems bad. By defining the new API with a clean interface (receiving fully valid ModifyTableState), we're not required to do that across the interface, but only inside the FDW's implementation. I was just slightly concerned that the new FDW function would have to implement what would normally be carried out across multiple special purpose callbacks, but maybe that's OK as long as it's clearly documented what its job is. Noticed a couple of things in the patch: + <para> + When this is called by a <command>COPY FROM</command> command, the + plan-related global data in <literal>mtstate</literal> is not provided + and the <literal>planSlot</literal> parameter of + <function>ExecForeignInsert</function> called for each inserted tuple is How about s/called/subsequently called/g? + <literal>NULL</literal>, wether the foreign table is the partition chosen Typo: s/wether/whether/g Thanks, Amit
(2018/04/03 13:32), Amit Langote wrote: > On 2018/04/02 21:26, Etsuro Fujita wrote: >> We wouldn't need that for foreign partitions (When DO NOTHING with an >> inference specification or DO UPDATE on a partitioned table containing >> foreign partitions, the planner would throw an error before we get to >> ExecInitPartitionInfo). > > Actually, as you might know, since it is not possible to create an index > on a partitioned table that has a foreign partition, there is no > possibility of supporting any form of ON CONFLICT that requires an > inference specification. Right. >> The reason why I updated the patch that way is >> just to make the patch simple, but on reflection I don't think that's a >> good idea; I think we should delay the map-creation step as well as the >> FDW-initialization step for UPDATE subplan partitions as before for >> improved efficiency for UPDATE-of-partition-key. However, I don't think >> it'd still be a good idea to delay those steps for partitions created by >> ExecInitPartitionInfo the same way as for the subplan partitions, because >> in that case it'd be better to perform CheckValidResultRel against a >> partition right after we do InitResultRelInfo for the partition in >> ExecInitPartitionInfo, as that would save cycles in cases when the >> partition is considered nonvalid by that check. > > It seems like a good idea to perform CheckValidResultRel right after the > InitResultRelInfo in ExecInitPartitionInfo. However, if the partition is > considered non-valid, that results in an error afaik, so I don't > understand the point about saving cycles. I think that we could abort the query without doing the remaining work after the check in ExecInitPartitionInfo in that case. >> So, What I'm thinking is: >> a) for the subplan partitions, we delay those steps as before, and b) for >> the partitions created by ExecInitPartitionInfo, we do that check for a >> partition right after InitResultRelInfo in that function, (and if valid, >> we create a map and initialize the FDW for the partition in that function). > > Sounds good to me. I'm assuming that you're talking about delaying the > is-valid-for-insert-routing check (using CheckValidResultRel) and > parent-to-child map creation for a sub-plan result relation until > ExecPrepareTupleRouting(). That's exactly what I have in mind. I modified the patch that way. > On a related note, I wonder if it'd be a good idea to rename the flag > ri_PartitionIsValid to something that signifies that we only need it to be > true if we want to do tuple routing (aka insert) using it. Maybe, > ri_PartitionReadyForRouting or ri_PartitionReadyForInsert. I'm afraid > that ri_PartitionIsValid is a bit ambiguous, although I'm not saying the > name options I'm suggesting are the best. That's a good idea! I like the first one, so I changed the name to that. Thanks for the review! Attached is an updated version of the patch. Patch foreign-routing-fdwapi-3.patch is created on top of patch postgres-fdw-refactoring-3.patch and the bug-fix patch [1]. Other changes: * Fixed/revised docs as you pointed out in another post. * Added docs to update.sgml * Revised some code/comments a little bit * Revised regression tests * Rebased against the latest HEAD Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5ABA4074.1090500@lab.ntt.co.jp
Attachment
(2018/04/03 13:59), Amit Langote wrote: > On 2018/04/02 21:29, Etsuro Fujita wrote: >> (2018/04/02 18:49), Amit Langote wrote: >>> I looked at the new patch. It looks good overall, although I have one >>> question -- IIUC, BeginForeignInsert() performs actions that are >>> equivalent of performing PlanForeignModify() + BeginForeignModify() for an >>> INSERT as if it was directly executed on a given foreign table/partition. >>> Did you face any problems in doing the latter itself in the core code? >>> Doing it that way would mean no changes to a FDW itself will be required >>> (and hence no need for additional APIs), but I may be missing something. >> >> The biggest issue in performing PlanForeignModify() plus >> BeginForeignModify() instead of the new FDW API would be: can the core >> code create a valid-looking planner state passed to PlanForeignModify() >> such as the ModifyTable plan node or the query tree stored in PlannerInfo? > > Hmm, I can see the point. Passing mostly-dummy (garbage) PlannerInfo and > query tree from the core code to FDW seems bad. By defining the new API > with a clean interface (receiving fully valid ModifyTableState), we're not > required to do that across the interface, but only inside the FDW's > implementation. I really think so too. > I was just slightly concerned that the new FDW function > would have to implement what would normally be carried out across multiple > special purpose callbacks, but maybe that's OK as long as it's clearly > documented what its job is. OK > Noticed a couple of things in the patch: > > +<para> > + When this is called by a<command>COPY FROM</command> command, the > + plan-related global data in<literal>mtstate</literal> is not provided > + and the<literal>planSlot</literal> parameter of > +<function>ExecForeignInsert</function> called for each inserted > tuple is > > How about s/called/subsequently called/g? Done. > +<literal>NULL</literal>, wether the foreign table is the partition > chosen > > Typo: s/wether/whether/g Done. Thanks again, Amit! Best regards, Etsuro Fujita
(2018/04/03 22:01), Etsuro Fujita wrote: > Attached is an updated version of the patch. Patch > foreign-routing-fdwapi-3.patch is created on top of patch > postgres-fdw-refactoring-3.patch and the bug-fix patch [1]. One thing I noticed about patch foreign-routing-fdwapi-3.patch is this bug: the server will crash when copying data into a foreign table that doesn't support the proposed APIs (eg, file_fdw foreign tables). The reason is that the patch doesn't perform CheckValidResultRel before that operation in that case. So I modified the patch as such and added regression tests for that. Attached is an updated version of the patch set: * As before, patch foreign-routing-fdwapi-4.patch is created on top of patch postgres-fdw-refactoring-4.patch and the bug-fix patch [1]. * I revised comments, docs, and regression tests a bit further, but no code changes other than the bug fix. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5ABA4074.1090500@lab.ntt.co.jp
Attachment
(2018/04/04 19:31), Etsuro Fujita wrote: > Attached is an updated version of the patch set: > * As before, patch foreign-routing-fdwapi-4.patch is created on top of > patch postgres-fdw-refactoring-4.patch and the bug-fix patch [1]. I did a bit of cleanup and comment-rewording to patch foreign-routing-fdwapi-4.patch. Attached is a new version of the patch set. I renamed the postgres_fdw refactoring patch but no changes to that patch. Best regards, Etsuro Fujita
Attachment
Fujita-san, On 2018/04/05 15:02, Etsuro Fujita wrote: > (2018/04/04 19:31), Etsuro Fujita wrote: >> Attached is an updated version of the patch set: >> * As before, patch foreign-routing-fdwapi-4.patch is created on top of >> patch postgres-fdw-refactoring-4.patch and the bug-fix patch [1]. > > I did a bit of cleanup and comment-rewording to patch > foreign-routing-fdwapi-4.patch. Attached is a new version of the patch > set. I renamed the postgres_fdw refactoring patch but no changes to that > patch. I noticed that the 2nd patch (foreign-routing-fdwapi-5.patch) fails to apply to copy.c: Hunk #9 FAILED at 2719. Hunk #10 succeeded at 2752 (offset -1 lines). Hunk #11 succeeded at 2795 (offset -1 lines). Hunk #12 succeeded at 2843 (offset -1 lines). 1 out of 12 hunks FAILED -- saving rejects to file src/backend/commands/copy.c.rej cat src/backend/commands/copy.c.rej *** src/backend/commands/copy.c --- src/backend/commands/copy.c *************** *** 2719,2727 **** resultRelInfo->ri_TrigDesc->trig_insert_before_row)) check_partition_constr = false; ! /* Check the constraints of the tuple */ ! if (resultRelInfo->ri_RelationDesc->rd_att->constr || ! check_partition_constr) ExecConstraints(resultRelInfo, slot, estate, true); if (useHeapMultiInsert) --- 2730,2742 ---- resultRelInfo->ri_TrigDesc->trig_insert_before_row)) check_partition_constr = false; ! /* ! * If the target is a plain table, check the constraints of ! * the tuple. ! */ ! if (resultRelInfo->ri_FdwRoutine == NULL && ! (resultRelInfo->ri_RelationDesc->rd_att->constr || ! check_partition_constr)) ExecConstraints(resultRelInfo, slot, estate, true); if (useHeapMultiInsert) Can you check? Thanks, Amit
(2018/04/05 15:37), Amit Langote wrote: > On 2018/04/05 15:02, Etsuro Fujita wrote: >> (2018/04/04 19:31), Etsuro Fujita wrote: >>> Attached is an updated version of the patch set: >>> * As before, patch foreign-routing-fdwapi-4.patch is created on top of >>> patch postgres-fdw-refactoring-4.patch and the bug-fix patch [1]. >> >> I did a bit of cleanup and comment-rewording to patch >> foreign-routing-fdwapi-4.patch. Attached is a new version of the patch >> set. I renamed the postgres_fdw refactoring patch but no changes to that >> patch. > > I noticed that the 2nd patch (foreign-routing-fdwapi-5.patch) fails to > apply to copy.c: > > Hunk #9 FAILED at 2719. > Hunk #10 succeeded at 2752 (offset -1 lines). > Hunk #11 succeeded at 2795 (offset -1 lines). > Hunk #12 succeeded at 2843 (offset -1 lines). > 1 out of 12 hunks FAILED -- saving rejects to file > src/backend/commands/copy.c.rej > > cat src/backend/commands/copy.c.rej > *** src/backend/commands/copy.c > --- src/backend/commands/copy.c > *************** > *** 2719,2727 **** > resultRelInfo->ri_TrigDesc->trig_insert_before_row)) > check_partition_constr = false; > > ! /* Check the constraints of the tuple */ > ! if (resultRelInfo->ri_RelationDesc->rd_att->constr || > ! check_partition_constr) > ExecConstraints(resultRelInfo, slot, estate, true); > > if (useHeapMultiInsert) > --- 2730,2742 ---- > resultRelInfo->ri_TrigDesc->trig_insert_before_row)) > check_partition_constr = false; > > ! /* > ! * If the target is a plain table, check the constraints of > ! * the tuple. > ! */ > ! if (resultRelInfo->ri_FdwRoutine == NULL&& > ! (resultRelInfo->ri_RelationDesc->rd_att->constr || > ! check_partition_constr)) > ExecConstraints(resultRelInfo, slot, estate, true); > > if (useHeapMultiInsert) > > Can you check? I forgot to mention this: the second patch is created on top of the first patch (postgres-fdw-refactoring-5.patch) and the patch in [1] as before. Thanks for reviewing, Amit! Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5ABA4074.1090500@lab.ntt.co.jp
Fuiita-san, On 2018/04/05 15:56, Etsuro Fujita wrote: > (2018/04/05 15:37), Amit Langote wrote: >> I noticed that the 2nd patch (foreign-routing-fdwapi-5.patch) fails to >> apply to copy.c: > > I forgot to mention this: the second patch is created on top of the first > patch (postgres-fdw-refactoring-5.patch) and the patch in [1] as before. Ah, sorry I hadn't noticed that in your previous email. Might be a good idea to attach the bug-fix patch here as well, and perhaps add numbers to the file names like: 0001_postgres-fdw-refactoring-5.patch 0002_BUGFIX-copy-from-check-constraint-fix.patch 0003_foreign-routing-fdwapi-5.patch Just one minor comment: I wonder why you decided not to have the CheckValidResultRel() call and the statement that sets ri_PartitionReadyForRouting inside the newly added ExecInitRoutingInfo itself. If ExecInitRoutingInfo does the last necessary steps for a ResultRelInfo (and hence the partition) to be ready to be used for routing, why not finish everything there. So the changes to ExecPrepareTupleRouting which look like this in the patch: + if (!partrel->ri_PartitionReadyForRouting) + { + CheckValidResultRel(partrel, CMD_INSERT); + + /* Set up information needed for routing tuples to the partition */ + ExecInitRoutingInfo(mtstate, estate, proute, partrel, partidx); + + partrel->ri_PartitionReadyForRouting = true; + } will become: + if (!partrel->ri_PartitionReadyForRouting) + ExecInitRoutingInfo(mtstate, estate, proute, partrel, partidx); As I see no other issues, I will mark this as Ready for Committer. Thanks, Amit
On 2018/04/05 16:31, Amit Langote wrote: > Fuiita-san, Oops, sorry about misspelling your name here, Fujita-san. - Amit
(2018/04/05 16:31), Amit Langote wrote: > Might be a good idea to attach the bug-fix patch here as well, and perhaps > add numbers to the file names like: > > 0001_postgres-fdw-refactoring-5.patch > 0002_BUGFIX-copy-from-check-constraint-fix.patch > 0003_foreign-routing-fdwapi-5.patch OK > Just one minor comment: > > I wonder why you decided not to have the CheckValidResultRel() call and > the statement that sets ri_PartitionReadyForRouting inside the newly added > ExecInitRoutingInfo itself. If ExecInitRoutingInfo does the last > necessary steps for a ResultRelInfo (and hence the partition) to be ready > to be used for routing, why not finish everything there. So the changes > to ExecPrepareTupleRouting which look like this in the patch: > > + if (!partrel->ri_PartitionReadyForRouting) > + { > + CheckValidResultRel(partrel, CMD_INSERT); > + > + /* Set up information needed for routing tuples to the partition */ > + ExecInitRoutingInfo(mtstate, estate, proute, partrel, partidx); > + > + partrel->ri_PartitionReadyForRouting = true; > + } > > will become: > > + if (!partrel->ri_PartitionReadyForRouting) > + ExecInitRoutingInfo(mtstate, estate, proute, partrel, partidx); Good idea! Modified that way. > As I see no other issues, I will mark this as Ready for Committer. Thanks! Attached is an updated version of the patch set plus the patch in [1]. Patch 0003_foreign-routing-fdwapi-6.patch can be applied on top of patch 0001_postgres-fdw-refactoring-6.patch and 0002_copy-from-check-constraint-fix.patch. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/5ABA4074.1090500@lab.ntt.co.jp
Attachment
On Thu, Apr 5, 2018 at 6:21 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote: > Attached is an updated version of the patch set plus the patch in [1]. Patch > 0003_foreign-routing-fdwapi-6.patch can be applied on top of patch > 0001_postgres-fdw-refactoring-6.patch and > 0002_copy-from-check-constraint-fix.patch. Committed. Let me say that you do nice work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2018-04-06 19:25:20 -0400, Robert Haas wrote: > On Thu, Apr 5, 2018 at 6:21 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: > > Attached is an updated version of the patch set plus the patch in [1]. Patch > > 0003_foreign-routing-fdwapi-6.patch can be applied on top of patch > > 0001_postgres-fdw-refactoring-6.patch and > > 0002_copy-from-check-constraint-fix.patch. > > Committed. Let me say that you do nice work. The CF entry for this is still marked as 'ready for committer'. Does anything remain? https://commitfest.postgresql.org/17/1184/ Greetings, Andres Freund
On Sun, Apr 8, 2018 at 5:37 AM, Andres Freund <andres@anarazel.de> wrote: > On 2018-04-06 19:25:20 -0400, Robert Haas wrote: >> On Thu, Apr 5, 2018 at 6:21 AM, Etsuro Fujita >> <fujita.etsuro@lab.ntt.co.jp> wrote: >> > Attached is an updated version of the patch set plus the patch in [1]. Patch >> > 0003_foreign-routing-fdwapi-6.patch can be applied on top of patch >> > 0001_postgres-fdw-refactoring-6.patch and >> > 0002_copy-from-check-constraint-fix.patch. >> >> Committed. Let me say that you do nice work. > > The CF entry for this is still marked as 'ready for committer'. Does anything remain? > > https://commitfest.postgresql.org/17/1184/ Nothing remains, so marked as committed. Thanks, Amit
(2018/04/07 8:25), Robert Haas wrote: > On Thu, Apr 5, 2018 at 6:21 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp> wrote: >> Attached is an updated version of the patch set plus the patch in [1]. Patch >> 0003_foreign-routing-fdwapi-6.patch can be applied on top of patch >> 0001_postgres-fdw-refactoring-6.patch and >> 0002_copy-from-check-constraint-fix.patch. > > Committed. Let me say that you do nice work. Thanks to Robert for committing! Thanks to everyone who got involved in this patch for all of the help with review and commentary! Best regards, Etsuro Fujita