Re: [POC] Fast COPY FROM command for the table with foreign partitions - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [POC] Fast COPY FROM command for the table with foreign partitions |
Date | |
Msg-id | CA+HiwqHzM=U74RPYJbG3g8RQGxCtQCHzc2h_GQs=uLRytD962w@mail.gmail.com Whole thread Raw |
In response to | RE: [POC] Fast COPY FROM command for the table with foreign partitions ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>) |
Responses |
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Re: [POC] Fast COPY FROM command for the table with foreign partitions |
List | pgsql-hackers |
Tsunakawa-san, Andrey, On Mon, Feb 15, 2021 at 1:54 PM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote: > From: Justin Pryzby <pryzby@telsasoft.com> > > This is crashing during fdw check. > > http://cfbot.cputube.org/andrey-lepikhov.html > > > > Maybe it's related to this patch: > > |commit 6214e2b2280462cbc3aa1986e350e167651b3905 > > | Fix permission checks on constraint violation errors on partitions. > > | Security: CVE-2021-3393 > > Thank you for your kind detailed investigation. The rebased version is attached. Thanks for updating the patch. The commit message says this: Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo structure. However, it is no longer InitResultRelInfo() that sets ri_usesMultiInsert. Doing that is now left for concerned functions who set it when they have enough information to do that correctly. Maybe update the message to make that clear to interested readers. + /* + * Use multi-insert mode if the condition checking passes for the + * parent and its child. + */ + leaf_part_rri->ri_usesMultiInsert = + ExecMultiInsertAllowed(leaf_part_rri, rootResultRelInfo); Think I have mentioned upthread that this looks better as: if (rootResultRelInfo->ri_usesMultiInsert) leaf_part_rri->ri_usesMultiInsert = ExecMultiInsertAllowed(leaf_part_rri); This keeps the logic confined to ExecInitPartitionInfo() where it belongs. No point in burdening other callers of ExecMultiInsertAllowed() in deciding whether or not it should pass a valid value for the 2nd parameter. +static void +postgresBeginForeignCopy(ModifyTableState *mtstate, + ResultRelInfo *resultRelInfo) +{ ... + if (resultRelInfo->ri_RangeTableIndex == 0) + { + ResultRelInfo *rootResultRelInfo = resultRelInfo->ri_RootResultRelInfo; + + rte = exec_rt_fetch(rootResultRelInfo->ri_RangeTableIndex, estate); It's better to add an Assert(rootResultRelInfo != NULL) here. Apparently, there are cases where ri_RangeTableIndex == 0 without ri_RootResultRelInfo being set. The Assert will ensure that BeginForeignCopy() is not mistakenly called on such ResultRelInfos. +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} I can't parse what the function's comment says about "using list of parameters". Maybe it means to say "list of columns" specified in the COPY FROM statement. How about writing this as: /* * Deparse remote COPY FROM statement * * Note that this explicitly specifies the list of COPY's target columns * to account for the fact that the remote table's columns may not match * exactly with the columns declared in the local definition. */ I'm hoping that I'm interpreting the original note correctly. Andrey? + <para> + <literal>mtstate</literal> is the overall state of the + <structname>ModifyTable</structname> plan node being executed; global data about + the plan and execution state is available via this structure. ... +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate, + ResultRelInfo *rinfo); Maybe a bit late realizing this, but why does BeginForeignCopy() accept a ModifyTableState pointer whereas maybe just an EState pointer will do? I can't imagine why an FDW would want to look at the ModifyTableState. Case in point, I see that postgresBeginForeignCopy() only uses the EState from the ModifyTableState passed to it. I think the ResultRelInfo that's being passed to the Copy APIs contains most of the necessary information. Also, EndForeignCopy() seems fine with just receiving the EState. + TupleDesc tupDesc; /* COPY TO will be used for manual tuple copying + * into the destination */ ... @@ -382,19 +393,24 @@ EndCopy(CopyToState cstate) CopyToState BeginCopyTo(ParseState *pstate, Relation rel, + TupleDesc srcTupDesc, I think that either the commentary around tupDesc/srcTupDesc needs to be improved or we should really find a way to do this without maintaining TupleDesc separately from the CopyState.rel. IIUC, this change is merely to allow postgres_fdw's ExecForeignCopy() to use CopyOneRowTo() which needs to be passed a valid CopyState. postgresBeginForeignCopy() initializes one by calling BeginCopyTo(), but it can't just pass the target foreign Relation to it, because generic BeginCopyTo() has this: if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION) { ... else if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("cannot copy from foreign table \"%s\"", RelationGetRelationName(rel)), errhint("Try the COPY (SELECT ...) TO variant."))); If the intention is to only prevent this error, maybe the condition above could be changed as this: /* * Check whether we support copying data out of the specified relation, * unless the caller also passed a non-NULL data_dest_cb, in which case, * the callback will take care of it */ if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION && data_dest_cb == NULL) I just checked that this works or at least doesn't break any newly added tests. -- Amit Langote EDB: http://www.enterprisedb.com
pgsql-hackers by date: