Re: [POC] Fast COPY FROM command for the table with foreign partitions - Mailing list pgsql-hackers

From Andrey V. Lepikhov
Subject Re: [POC] Fast COPY FROM command for the table with foreign partitions
Date
Msg-id cf1e92eb-9a8e-f5c1-f0cc-598882f1e546@postgrespro.ru
Whole thread Raw
In response to Re: [POC] Fast COPY FROM command for the table with foreign partitions  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: [POC] Fast COPY FROM command for the table with foreign partitions
List pgsql-hackers
On 7/16/20 2:14 PM, Amit Langote wrote:
> Hi Andrey,
> 
> Thanks for this work.  I have been reading through your patch and
> here's a what I understand it does and how:
> 
> The patch aims to fix the restriction that COPYing into a foreign
> table can't use multi-insert buffer mechanism effectively.  That's
> because copy.c currently uses the ExecForeignInsert() FDW API which
> can be passed only 1 row at a time.  postgres_fdw's implementation
> issues an `INSERT INTO remote_table VALUES (...)` statement to the
> remote side for each row, which is pretty inefficient for bulk loads.
> The patch introduces a new FDW API ExecForeignCopyIn() that can
> receive multiple rows and copy.c now calls it every time it flushes
> the multi-insert buffer so that all the flushed rows can be sent to
> the remote side in one go.  postgres_fdw's now issues a `COPY
> remote_table FROM STDIN` to the remote server and
> postgresExecForeignCopyIn() funnels the tuples flushed by the local
> copy to the server side waiting for tuples on the COPY protocol.

Fine

> Here are some comments on the patch.
> 
> * Why the "In" in these API names?
> 
> +   /* COPY a bulk of tuples into a foreign relation */
> +   BeginForeignCopyIn_function BeginForeignCopyIn;
> +   EndForeignCopyIn_function EndForeignCopyIn;
> +   ExecForeignCopyIn_function ExecForeignCopyIn;

I used an analogy from copy.c.

> * fdwhandler.sgml should be updated with the description of these new APIs.


> * As far as I can tell, the following copy.h additions are for an FDW
> to use copy.c to obtain an external representation (char string) to
> send to the remote side of the individual rows that are passed to
> ExecForeignCopyIn():
> 
> +typedef void (*copy_data_dest_cb) (void *outbuf, int len);
> +extern CopyState BeginForeignCopyTo(Relation rel);
> +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot);
> +extern void EndForeignCopyTo(CopyState cstate);
> 
> So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow()
> which in turn calls copy.c: CopyOneRowTo() which fills
> CopyState.fe_msgbuf.  The data_dest_cb() callback that runs after
> fe_msgbuf contains the full row simply copies it into a palloc'd char
> buffer whose pointer is returned back to ExecForeignCopyIn().  I
> wonder why not let FDWs implement the callback and pass it to copy.c
> through BeginForeignCopyTo()?  For example, you could implement a
> pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct
> pointer of fe_msgbuf to send it to the remote server.
It is good point! Thank you.

> Do you think all FDWs would want to use copy,c like above?  If not,
> maybe the above APIs are really postgres_fdw-specific?  Anyway, adding
> comments above the definitions of these functions would be helpful.
Agreed
> 
> * I see that the remote copy is performed from scratch on every call
> of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> send the `COPY remote_table FROM STDIN` in
> postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> when there are no errors during the copy?
It is not possible. FDW share one connection between all foreign 
relations from a server. If two or more partitions will be placed at one 
foreign server you will have problems with concurrent COPY command. May 
be we can create new connection for each partition?
> 
> I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> and sending `COPY remote_table FROM STDIN` only once instead of on
> every flush -- and I see significant speedup.  Please check the
> attached patch that applies on top of yours.
I integrated first change and rejected the second by the reason as above.
   One problem I spotted
> when trying my patch but didn't spend much time debugging is that
> local COPY cannot be interrupted by Ctrl+C anymore, but that should be
> fixable by adjusting PG_TRY() blocks.
Thanks
> 
> * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency.
+1

I will post a new version of the patch a little bit later.

-- 
regards,
Andrey Lepikhov
Postgres Professional



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Parallel Seq Scan vs kernel read ahead
Next
From: Amit Kapila
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions