Re: [POC] Fast COPY FROM command for the table with foreign partitions - Mailing list pgsql-hackers
From | Andrey V. Lepikhov |
---|---|
Subject | Re: [POC] Fast COPY FROM command for the table with foreign partitions |
Date | |
Msg-id | cf1e92eb-9a8e-f5c1-f0cc-598882f1e546@postgrespro.ru Whole thread Raw |
In response to | Re: [POC] Fast COPY FROM command for the table with foreign partitions (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: [POC] Fast COPY FROM command for the table with foreign partitions
|
List | pgsql-hackers |
On 7/16/20 2:14 PM, Amit Langote wrote: > Hi Andrey, > > Thanks for this work. I have been reading through your patch and > here's a what I understand it does and how: > > The patch aims to fix the restriction that COPYing into a foreign > table can't use multi-insert buffer mechanism effectively. That's > because copy.c currently uses the ExecForeignInsert() FDW API which > can be passed only 1 row at a time. postgres_fdw's implementation > issues an `INSERT INTO remote_table VALUES (...)` statement to the > remote side for each row, which is pretty inefficient for bulk loads. > The patch introduces a new FDW API ExecForeignCopyIn() that can > receive multiple rows and copy.c now calls it every time it flushes > the multi-insert buffer so that all the flushed rows can be sent to > the remote side in one go. postgres_fdw's now issues a `COPY > remote_table FROM STDIN` to the remote server and > postgresExecForeignCopyIn() funnels the tuples flushed by the local > copy to the server side waiting for tuples on the COPY protocol. Fine > Here are some comments on the patch. > > * Why the "In" in these API names? > > + /* COPY a bulk of tuples into a foreign relation */ > + BeginForeignCopyIn_function BeginForeignCopyIn; > + EndForeignCopyIn_function EndForeignCopyIn; > + ExecForeignCopyIn_function ExecForeignCopyIn; I used an analogy from copy.c. > * fdwhandler.sgml should be updated with the description of these new APIs. > * As far as I can tell, the following copy.h additions are for an FDW > to use copy.c to obtain an external representation (char string) to > send to the remote side of the individual rows that are passed to > ExecForeignCopyIn(): > > +typedef void (*copy_data_dest_cb) (void *outbuf, int len); > +extern CopyState BeginForeignCopyTo(Relation rel); > +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot); > +extern void EndForeignCopyTo(CopyState cstate); > > So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow() > which in turn calls copy.c: CopyOneRowTo() which fills > CopyState.fe_msgbuf. The data_dest_cb() callback that runs after > fe_msgbuf contains the full row simply copies it into a palloc'd char > buffer whose pointer is returned back to ExecForeignCopyIn(). I > wonder why not let FDWs implement the callback and pass it to copy.c > through BeginForeignCopyTo()? For example, you could implement a > pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct > pointer of fe_msgbuf to send it to the remote server. It is good point! Thank you. > Do you think all FDWs would want to use copy,c like above? If not, > maybe the above APIs are really postgres_fdw-specific? Anyway, adding > comments above the definitions of these functions would be helpful. Agreed > > * I see that the remote copy is performed from scratch on every call > of postgresExecForeignCopyIn(), but wouldn't it be more efficient to > send the `COPY remote_table FROM STDIN` in > postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() > when there are no errors during the copy? It is not possible. FDW share one connection between all foreign relations from a server. If two or more partitions will be placed at one foreign server you will have problems with concurrent COPY command. May be we can create new connection for each partition? > > I tried implementing these two changes -- pgfdw_copy_data_dest_cb() > and sending `COPY remote_table FROM STDIN` only once instead of on > every flush -- and I see significant speedup. Please check the > attached patch that applies on top of yours. I integrated first change and rejected the second by the reason as above. One problem I spotted > when trying my patch but didn't spend much time debugging is that > local COPY cannot be interrupted by Ctrl+C anymore, but that should be > fixable by adjusting PG_TRY() blocks. Thanks > > * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency. +1 I will post a new version of the patch a little bit later. -- regards, Andrey Lepikhov Postgres Professional
pgsql-hackers by date: