Re: [POC] Fast COPY FROM command for the table with foreign partitions - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [POC] Fast COPY FROM command for the table with foreign partitions |
Date | |
Msg-id | CA+HiwqFxJksYQtGhje-VozoC3GYxyOtv5QxNb3fJ5O1FHOB-+g@mail.gmail.com Whole thread Raw |
In response to | RE: [POC] Fast COPY FROM command for the table with foreign partitions ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>) |
Responses |
RE: [POC] Fast COPY FROM command for the table with foreign partitions
|
List | pgsql-hackers |
On Thu, Nov 26, 2020 at 11:42 AM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote: > From: Amit Langote <amitlangote09@gmail.com> > > Anyway, one thing we could do is rename > > ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(), > > that is, to make it actually set ri_usesMultiInsert and have places > > like CopyFrom() call it if (and only if) its local logic allows > > multi-insert to be used. So, ri_usesMultiInsert starts out set to > > false and if a module wants to use multi-insert for a given target > > relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag > > on. Also, given the confusion regarding how execPartition.c > > I think separating the setting and inspection of the property into different functions will be good, at least. > > > manipulates the flag, maybe change ExecFindPartition() to accept a > > Boolean parameter multi_insert, which it will pass down to > > ExecInitPartitionInfo(), which in turn will call > > ExecSetRelationUsesMultiInsert() for a given partition. Of course, if > > the logic in ExecSetRelationUsesMultiInsert() determines that > > multi-insert can't be used, for the reasons listed in the function, > > then the caller will have to live with that decision. > > I can't say for sure, but it looks strange to me, because I can't find a good description of multi_insert argument forExecFindPartition(). If we add multi_insert, I'm afraid we may want to add further arguments for other properties inthe future like "Hey, get me the partition that has triggers.", "Next, pass me a partition that uses a foreign table.",etc. I think the current ExecFindPartition() is good -- "Get me a partition that accepts this row." > > I wonder if ri_usesMultiInsert is really necessary. Would it cut down enough costs in the intended use case(s), say theheavyweight COPY FROM? Thinking on this more, I think I'm starting to agree with you on this. I skimmed the CopyFrom()'s main loop again today and indeed it doesn't seem that the cost of checking the individual conditions for whether or not to buffer the current tuple for the given target relation is all that big to save with ri_usesMultiInsert. So my argument that it is good for performance is perhaps not that strong. Andrey's original patch had the flag to, as I understand it, make the partitioning case work correctly. When inserting into a non-partitioned table, there's only one relation to care about. In that case, CopyFrom() can use either the new COPY interface or the INSERT interface for the entire operation when talking to a foreign target relation's FDW driver. With partitions, that has to be considered separately for each partition. What complicates the matter further is that while the original target relation (the root partitioned table in the partitioning case) is fully initialized in CopyFrom(), partitions are lazily initialized by ExecFindPartition(). Note that the initialization of a given target relation can also optionally involve calling the FDW to perform any pre-COPY initializations. So if a given partition is a foreign table, whether the copy operation was initialized using the COPY interface or the INSERT interface is determined away from CopyFrom(). Andrey created ri_usesMultiInsert to remember which was used so that CopyFrom() can use the correct interface during the subsequent interactions with the partition's driver. Now, it does not seem outright impossible to do this without the flag, but maybe Andrey thinks it is good for readability? If it is confusing from a modularity standpoint, maybe we should rethink that. That said, I still think that there should be a way for CopyFrom() to tell ExecFindPartition() which FDW interface to initialize a given foreign table partition's copy operation with -- COPY if the copy allows multi-insert, INSERT if not. Maybe the multi_insert parameter I mentioned earlier would serve that purpose. -- Amit Langote EDB: http://www.enterprisedb.com
pgsql-hackers by date: