On 2017/07/11 6:56, Robert Haas wrote:
> On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> So, I dropped the COPY part.
>
> Ouch. I think we should try to figure out how the COPY part will be
> handled before we commit to a design.
I spent some time on this. To handle COPY, I'd like to propose doing
something similar to psql's \copy (frontend copy): send a "COPY ... FROM
STDIN" query to the remote server and route the data read from the file
to it. For that, I'd like to add new FDW APIs, called during CopyFrom,
that allow us to copy into foreign tables:
* BeginForeignCopyIn: this would be called after creating a
ResultRelInfo for the target table (or for each leaf partition of the
target partitioned table) if it's a foreign table, and would perform any
initialization needed before the remote copy can start. In the
postgres_fdw case, I think this function would be a good place to send
"COPY ... FROM STDIN" to the remote server.
* ExecForeignCopyInOneRow: this would be called instead of heap_insert
if the target is a foreign table, and would route the tuple read from
the file by NextCopyFrom to the remote server. In the postgres_fdw case,
I think this function would convert the tuple to text format for
portability, and then send the data to the remote server using
PQputCopyData.
* EndForeignCopyIn: this would be called at the bottom of CopyFrom, and
would release resources such as connections to the remote server. In the
postgres_fdw case, this function would call PQputCopyEnd to terminate
the data transfer.
I think that would be much more efficient than INSERTing tuples into the
remote server one by one. What do you think about that?
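
To make the call sequence of the three proposed callbacks a bit more
concrete, here is a rough, self-contained sketch in plain C. It is not
the patch and it does not use PostgreSQL or libpq at all: MockRemote,
ForeignCopyRoutine, the mock_* functions, copy_rows_to_foreign and all
of the signatures are made up for illustration. Only the three callback
names come from the proposal above. The mock just records what a real
FDW would send: the "COPY ... FROM STDIN" command, then each row in COPY
text format (tab-separated columns, \N for NULL, backslash escapes),
where postgres_fdw would call PQputCopyData and PQputCopyEnd.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a remote connection; it only records what
 * a real FDW would send over the wire. */
typedef struct MockRemote {
    char   sql[128];     /* the COPY command "sent" at start-up */
    char   buf[1024];    /* rows "sent" so far, in COPY text format */
    size_t len;
    int    copy_started;
    int    copy_ended;
} MockRemote;

/* Hypothetical callback table: member names follow the proposal, the
 * signatures are invented for this sketch. */
typedef struct ForeignCopyRoutine {
    int (*BeginForeignCopyIn)(MockRemote *remote, const char *relname);
    int (*ExecForeignCopyInOneRow)(MockRemote *remote,
                                   const char *const *values, int natts);
    int (*EndForeignCopyIn)(MockRemote *remote);
} ForeignCopyRoutine;

/* Append n bytes to the mock buffer, keeping it NUL-terminated. */
static int append_bytes(MockRemote *r, const char *s, size_t n)
{
    if (r->len + n >= sizeof(r->buf))
        return -1;
    memcpy(r->buf + r->len, s, n);
    r->len += n;
    r->buf[r->len] = '\0';
    return 0;
}

/* Where postgres_fdw would send "COPY ... FROM STDIN". */
static int mock_begin(MockRemote *r, const char *relname)
{
    int n = snprintf(r->sql, sizeof(r->sql), "COPY %s FROM STDIN", relname);

    if (n < 0 || (size_t) n >= sizeof(r->sql))
        return -1;
    r->len = 0;
    r->buf[0] = '\0';
    r->copy_started = 1;
    r->copy_ended = 0;
    return 0;
}

/* Encode one row in COPY text format; a NULL pointer is an SQL NULL.
 * Where postgres_fdw would hand the line to PQputCopyData. */
static int mock_exec_one_row(MockRemote *r, const char *const *values,
                             int natts)
{
    size_t start = r->len;

    for (int i = 0; i < natts; i++) {
        if (i > 0 && append_bytes(r, "\t", 1) != 0)
            goto overflow;
        if (values[i] == NULL) {
            if (append_bytes(r, "\\N", 2) != 0)
                goto overflow;
            continue;
        }
        for (const char *p = values[i]; *p; p++) {
            const char *esc = NULL;

            switch (*p) {
                case '\\': esc = "\\\\"; break;
                case '\t': esc = "\\t";  break;
                case '\n': esc = "\\n";  break;
                case '\r': esc = "\\r";  break;
            }
            if (append_bytes(r, esc ? esc : p, esc ? 2 : 1) != 0)
                goto overflow;
        }
    }
    if (append_bytes(r, "\n", 1) != 0)
        goto overflow;
    return 0;

overflow:
    r->len = start;          /* drop the partially written row */
    r->buf[start] = '\0';
    return -1;
}

/* Where postgres_fdw would call PQputCopyEnd. */
static int mock_end(MockRemote *r)
{
    r->copy_ended = 1;
    return 0;
}

const ForeignCopyRoutine mock_routine = {
    mock_begin, mock_exec_one_row, mock_end
};

/* Simplified stand-in for the CopyFrom loop: begin once, send each row,
 * always end. rows is a flat nrows * natts array. Returns the number of
 * rows sent, or -1 on any failure. */
int copy_rows_to_foreign(const ForeignCopyRoutine *routine,
                         MockRemote *remote, const char *relname,
                         const char *const *rows, int nrows, int natts)
{
    int sent = 0;

    if (routine->BeginForeignCopyIn(remote, relname) != 0)
        return -1;
    for (int i = 0; i < nrows; i++) {
        if (routine->ExecForeignCopyInOneRow(remote,
                                             rows + (size_t) i * natts,
                                             natts) != 0)
            break;
        sent++;
    }
    if (routine->EndForeignCopyIn(remote) != 0)
        return -1;
    return sent == nrows ? sent : -1;
}
```

The point of the sketch is only the shape of the interface: one
start-up call per target relation, one call per tuple, and one shutdown
call, so the remote side sees a single COPY instead of one INSERT per
row.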
> I have to admit that I'm a little bit fuzzy about why foreign insert
> routing requires all of these changes. I think this patch would
> benefit from being accompanied by several paragraphs of explanation
> outlining the rationale for each part of the patch.
Will do.
Best regards,
Etsuro Fujita