Thread: [POC] Fast COPY FROM command for the table with foreign partitions

[POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

01 June 2020, 09:29:23

Hi, hackers!

Currently i see, COPY FROM insertion into the partitioned table with 
foreign partitions is not optimal: even if table constraints allows can 
do multi insert copy, we will flush the buffers and prepare new INSERT 
query for each tuple, routed into the foreign partition.
To solve this problem i tried to use the multi insert buffers for 
foreign tuples too. Flushing of these buffers performs by the analogy 
with 'COPY .. FROM STDIN' machinery as it is done by the psql '\copy' 
command.
The patch in attachment was prepared from the private scratch developed 
by Arseny Sher a couple of years ago.
Benchmarks shows that it speeds up COPY FROM operation:
Command "COPY pgbench_accounts FROM ..." (test file contains 1e7 tuples, 
copy to three partitions) executes on my laptop in 14 minutes without 
the patch and in 1.5 minutes with the patch. Theoretical minimum here 
(with infinite buffer size) is 40 seconds.

A couple of questions:
1. Can this feature be interesting for the PostgreSQL core or not?
2. If this is a useful feature, is the correct way chosen?

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Attachment

0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Etsuro Fujita

Date:

02 June 2020, 00:02:10

Hi Andrey,

On Mon, Jun 1, 2020 at 6:29 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Currently i see, COPY FROM insertion into the partitioned table with
> foreign partitions is not optimal: even if table constraints allows can
> do multi insert copy, we will flush the buffers and prepare new INSERT
> query for each tuple, routed into the foreign partition.
> To solve this problem i tried to use the multi insert buffers for
> foreign tuples too. Flushing of these buffers performs by the analogy
> with 'COPY .. FROM STDIN' machinery as it is done by the psql '\copy'
> command.
> The patch in attachment was prepared from the private scratch developed
> by Arseny Sher a couple of years ago.
> Benchmarks shows that it speeds up COPY FROM operation:
> Command "COPY pgbench_accounts FROM ..." (test file contains 1e7 tuples,
> copy to three partitions) executes on my laptop in 14 minutes without
> the patch and in 1.5 minutes with the patch. Theoretical minimum here
> (with infinite buffer size) is 40 seconds.

Great!

> A couple of questions:
> 1. Can this feature be interesting for the PostgreSQL core or not?

Yeah, I think this is especially useful for sharding.

> 2. If this is a useful feature, is the correct way chosen?

I think I also thought something similar to this before [1].  Will take a look.

Thanks!

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp

Re: [POC] Fast COPY FROM command for the table with foreignpartitions

From

Andrey Lepikhov

Date:

02 June 2020, 05:51:22

Thank you for the answer,

02.06.2020 05:02, Etsuro Fujita пишет:
> I think I also thought something similar to this before [1]. Will take a look.

> [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp
>
I have looked into the thread.
My first version of the patch was like your idea. But when developing
the “COPY FROM” code, the following features were discovered:
1. Two or more partitions can be placed at the same node. We need to
finish COPY into one partition before start COPY into another partition
at the same node.
2. On any error we need to send EOF to all started "COPY .. FROM STDIN"
operations. Otherwise FDW can't cancel operation.

Hiding the COPY code under the buffers management machinery allows us to
generalize buffers machinery, execute one COPY operation on each buffer
and simplify error handling.

As i understand, main idea of the thread, mentioned by you, is to add
"COPY FROM" support without changes in FDW API.
It is possible to remove BeginForeignCopy() and EndForeignCopy() from
the patch. But it is not trivial to change ExecForeignInsert() for the
COPY purposes.
All that I can offer in this place now is to introduce one new
ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
STDIN" operation, send tuples and close the operation. We can use the
ExecForeignInsert() routine for each buffer tuple if
ExecForeignBulkInsert() is not supported.

One of main questions here is to use COPY TO machinery for serializing a
tuple. It is needed (if you will take a look into the patch) to
transform the CopyTo() routine to an iterative representation:
start/next/finish. May it be acceptable?

In the attachment there is a patch with the correction of a stupid error.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Attachment

0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Ashutosh Bapat

Date:

15 June 2020, 05:26:12

Thanks Andrey for the patch. I am glad that the patch has taken care
of some corner cases already but there exist still more.

COPY command constructed doesn't take care of dropped columns. There
is code in deparseAnalyzeSql which constructs list of columns for a
given foreign relation. 0002 patch attached here, moves that code to a
separate function and reuses it for COPY. If you find that code change
useful please include it in the main patch.

While working on that, I found two issues
1. The COPY command constructed an empty columns list when there were
no non-dropped columns in the relation. This caused a syntax error.
Fixed that in 0002.
2. In the same case, if the foreign table declared locally didn't have
any non-dropped columns but the relation that it referred to on the
foreign server had some non-dropped columns, COPY command fails. I
added a test case for this in 0002 but haven't fixed it.

I think this work is useful. Please add it to the next commitfest so
that it's tracked.

On Tue, Jun 2, 2020 at 11:21 AM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
>
> Thank you for the answer,
>
> 02.06.2020 05:02, Etsuro Fujita пишет:
> > I think I also thought something similar to this before [1].  Will take a look.
>
> > [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp
> >
> I have looked into the thread.
> My first version of the patch was like your idea. But when developing
> the “COPY FROM” code, the following features were discovered:
> 1. Two or more partitions can be placed at the same node. We need to
> finish COPY into one partition before start COPY into another partition
> at the same node.
> 2. On any error we need to send EOF to all started "COPY .. FROM STDIN"
> operations. Otherwise FDW can't cancel operation.
>
> Hiding the COPY code under the buffers management machinery allows us to
> generalize buffers machinery, execute one COPY operation on each buffer
> and simplify error handling.
>
> As i understand, main idea of the thread, mentioned by you, is to add
> "COPY FROM" support without changes in FDW API.
> It is possible to remove BeginForeignCopy() and EndForeignCopy() from
> the patch. But it is not trivial to change ExecForeignInsert() for the
> COPY purposes.
> All that I can offer in this place now is to introduce one new
> ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
> STDIN" operation, send tuples and close the operation. We can use the
> ExecForeignInsert() routine for each buffer tuple if
> ExecForeignBulkInsert() is not supported.
>
> One of main questions here is to use COPY TO machinery for serializing a
> tuple. It is needed (if you will take a look into the patch) to
> transform the CopyTo() routine to an iterative representation:
> start/next/finish. May it be acceptable?
>
> In the attachment there is a patch with the correction of a stupid error.
>
> --
> Andrey Lepikhov
> Postgres Professional
> https://postgrespro.com
> The Russian Postgres Company

--
Best Wishes,
Ashutosh Bapat

Attachment

Re: [POC] Fast COPY FROM command for the table with foreignpartitions

From

"Andrey V. Lepikhov"

Date:

17 June 2020, 06:24:48

On 6/15/20 10:26 AM, Ashutosh Bapat wrote:
> Thanks Andrey for the patch. I am glad that the patch has taken care
> of some corner cases already but there exist still more.
> 
> COPY command constructed doesn't take care of dropped columns. There
> is code in deparseAnalyzeSql which constructs list of columns for a
> given foreign relation. 0002 patch attached here, moves that code to a
> separate function and reuses it for COPY. If you find that code change
> useful please include it in the main patch.

Thanks, i included it.

> 2. In the same case, if the foreign table declared locally didn't have
> any non-dropped columns but the relation that it referred to on the
> foreign server had some non-dropped columns, COPY command fails. I
> added a test case for this in 0002 but haven't fixed it.

I fixed it.
This is very special corner case. The problem was that COPY FROM does 
not support semantics like the "INSERT INTO .. DEFAULT VALUES". To 
simplify the solution, i switched off bulk copying for this case.

 > I think this work is useful. Please add it to the next commitfest so
 > that it's tracked.
Ok.

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com

Attachment

v2-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Etsuro Fujita

Date:

19 June 2020, 14:58:02

On Tue, Jun 2, 2020 at 2:51 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> 02.06.2020 05:02, Etsuro Fujita пишет:
> > I think I also thought something similar to this before [1].  Will take a look.

I'm still reviewing the patch, but let me comment on it.

> > [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp

> I have looked into the thread.
> My first version of the patch was like your idea. But when developing
> the “COPY FROM” code, the following features were discovered:
> 1. Two or more partitions can be placed at the same node. We need to
> finish COPY into one partition before start COPY into another partition
> at the same node.
> 2. On any error we need to send EOF to all started "COPY .. FROM STDIN"
> operations. Otherwise FDW can't cancel operation.
>
> Hiding the COPY code under the buffers management machinery allows us to
> generalize buffers machinery, execute one COPY operation on each buffer
> and simplify error handling.

I'm not sure that it's really a good idea that the bulk-insert API is
designed the way it's tightly coupled with the bulk-insert machinery
in the core, because 1) some FDWs might want to send tuples provided
by the core to the remote, one by one, without storing them in a
buffer, or 2) some other FDWs might want to store the tuples in the
buffer and send them in a lump as postgres_fdw in the proposed patch
but might want to do so independently of MAX_BUFFERED_TUPLES and/or
MAX_BUFFERED_BYTES defined in the bulk-insert machinery.

I agree that we would need special handling for cases you mentioned
above if we design this API based on something like the idea I
proposed in that thread.

> As i understand, main idea of the thread, mentioned by you, is to add
> "COPY FROM" support without changes in FDW API.

I don't think so; I think we should introduce new API for this feature
to keep the ExecForeignInsert() API simple.

> All that I can offer in this place now is to introduce one new
> ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
> STDIN" operation, send tuples and close the operation. We can use the
> ExecForeignInsert() routine for each buffer tuple if
> ExecForeignBulkInsert() is not supported.

Agreed.

> One of main questions here is to use COPY TO machinery for serializing a
> tuple. It is needed (if you will take a look into the patch) to
> transform the CopyTo() routine to an iterative representation:
> start/next/finish. May it be acceptable?

+1 for the general idea.

> In the attachment there is a patch with the correction of a stupid error.

Thanks for the patch!

Sorry for the delay.

Best regards,
Etsuro Fujita

Re: [POC] Fast COPY FROM command for the table with foreignpartitions

From

Andrey Lepikhov

Date:

22 June 2020, 06:13:30


19.06.2020 19:58, Etsuro Fujita пишет:
> On Tue, Jun 2, 2020 at 2:51 PM Andrey Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> Hiding the COPY code under the buffers management machinery allows us to
>> generalize buffers machinery, execute one COPY operation on each buffer
>> and simplify error handling.
> 
> I'm not sure that it's really a good idea that the bulk-insert API is
> designed the way it's tightly coupled with the bulk-insert machinery
> in the core, because 1) some FDWs might want to send tuples provided
> by the core to the remote, one by one, without storing them in a
> buffer, or 2) some other FDWs might want to store the tuples in the
> buffer and send them in a lump as postgres_fdw in the proposed patch
> but might want to do so independently of MAX_BUFFERED_TUPLES and/or
> MAX_BUFFERED_BYTES defined in the bulk-insert machinery.
> 
> I agree that we would need special handling for cases you mentioned
> above if we design this API based on something like the idea I
> proposed in that thread.
Agreed
> 
>> As i understand, main idea of the thread, mentioned by you, is to add
>> "COPY FROM" support without changes in FDW API.
> 
> I don't think so; I think we should introduce new API for this feature
> to keep the ExecForeignInsert() API simple.
Ok
> 
>> All that I can offer in this place now is to introduce one new
>> ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM
>> STDIN" operation, send tuples and close the operation. We can use the
>> ExecForeignInsert() routine for each buffer tuple if
>> ExecForeignBulkInsert() is not supported.
> 
> Agreed.
In the next version (see attachment) of the patch i removed Begin/End 
fdwapi routines. Now we have only the ExecForeignBulkInsert() routine.

-- 
Andrey Lepikhov
Postgres Professional

Attachment

v3-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Ashutosh Bapat

Date:

22 June 2020, 12:11:05

On Wed, 17 Jun 2020 at 11:54, Andrey V. Lepikhov <a.lepikhov@postgrespro.ru> wrote:

On 6/15/20 10:26 AM, Ashutosh Bapat wrote:
> Thanks Andrey for the patch. I am glad that the patch has taken care
> of some corner cases already but there exist still more.
>
> COPY command constructed doesn't take care of dropped columns. There
> is code in deparseAnalyzeSql which constructs list of columns for a
> given foreign relation. 0002 patch attached here, moves that code to a
> separate function and reuses it for COPY. If you find that code change
> useful please include it in the main patch.

Thanks, i included it.

> 2. In the same case, if the foreign table declared locally didn't have
> any non-dropped columns but the relation that it referred to on the
> foreign server had some non-dropped columns, COPY command fails. I
> added a test case for this in 0002 but haven't fixed it.

I fixed it.
This is very special corner case. The problem was that COPY FROM does
not support semantics like the "INSERT INTO .. DEFAULT VALUES". To
simplify the solution, i switched off bulk copying for this case.

> I think this work is useful. Please add it to the next commitfest so
> that it's tracked.
Ok.

It looks like we call BeginForeignInsert and EndForeignInsert even though actual copy is performed using BeginForeignCopy, ExecForeignCopy and EndForeignCopy. BeginForeignInsert constructs the INSERT query which looks unnecessary. Also some of the other PgFdwModifyState members are initialized unnecessarily. It also gives an impression that we are using INSERT underneath the copy. Instead a better way would be to call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy instead of EndForeignInsert, if we are going to use COPY protocol to copy data to the foreign server. Corresponding postgres_fdw implementations need to change in order to do that.

This isn't a full review. I will continue reviewing this patch further.

Best Wishes,

Ashutosh

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

03 July 2020, 05:15:46

On 6/22/20 5:11 PM, Ashutosh Bapat wrote:

> <a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:
> It looks like we call BeginForeignInsert and EndForeignInsert even 
> though actual copy is performed using BeginForeignCopy, ExecForeignCopy 
> and EndForeignCopy. BeginForeignInsert constructs the INSERT query which 
> looks unnecessary. Also some of the other PgFdwModifyState members are 
> initialized unnecessarily. It also gives an impression that we are using 
> INSERT underneath the copy. Instead a better way would be to 
> call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy 
> instead of EndForeignInsert, if we are going to use COPY protocol to 
> copy data to the foreign server. Corresponding postgres_fdw 
> implementations need to change in order to do that.
> 
I did not answer for a long time, because of waiting for the results of 
the discussion on Tomas approach to bulk INSERT/UPDATE/DELETE. It seems 
more general.
I can move the query construction into the first execution of INSERT or 
COPY operation. But another changes seems more invasive because 
BeginForeignInsert/EndForeignInsert are used in the execPartition.c 
module. We will need to pass copy/insert state of operation into 
ExecFindPartition() and ExecCleanupTupleRouting().

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

12 July 2020, 17:46:16


22.06.2020 17:11, Ashutosh Bapat пишет:
> 
> 
> On Wed, 17 Jun 2020 at 11:54, Andrey V. Lepikhov 
> <a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:
> 
>     On 6/15/20 10:26 AM, Ashutosh Bapat wrote:
>      > Thanks Andrey for the patch. I am glad that the patch has taken care
>      > of some corner cases already but there exist still more.
>      >
>      > COPY command constructed doesn't take care of dropped columns. There
>      > is code in deparseAnalyzeSql which constructs list of columns for a
>      > given foreign relation. 0002 patch attached here, moves that code
>     to a
>      > separate function and reuses it for COPY. If you find that code
>     change
>      > useful please include it in the main patch.
> 
>     Thanks, i included it.
> 
>      > 2. In the same case, if the foreign table declared locally didn't
>     have
>      > any non-dropped columns but the relation that it referred to on the
>      > foreign server had some non-dropped columns, COPY command fails. I
>      > added a test case for this in 0002 but haven't fixed it.
> 
>     I fixed it.
>     This is very special corner case. The problem was that COPY FROM does
>     not support semantics like the "INSERT INTO .. DEFAULT VALUES". To
>     simplify the solution, i switched off bulk copying for this case.
> 
>       > I think this work is useful. Please add it to the next commitfest so
>       > that it's tracked.
>     Ok.
> 
> 
> It looks like we call BeginForeignInsert and EndForeignInsert even 
> though actual copy is performed using BeginForeignCopy, ExecForeignCopy 
> and EndForeignCopy. BeginForeignInsert constructs the INSERT query which 
> looks unnecessary. Also some of the other PgFdwModifyState members are 
> initialized unnecessarily. It also gives an impression that we are using 
> INSERT underneath the copy. Instead a better way would be to 
> call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy 
> instead of EndForeignInsert, if we are going to use COPY protocol to 
> copy data to the foreign server. Corresponding postgres_fdw 
> implementations need to change in order to do that.
Fixed.
I replaced names of CopyIn FDW API. Also the partition routing 
initializer calls BeginForeignInsert or BeginForeignCopyIn routines in 
accordance with value of ResultRelInfo::UseBulkModifying.
I introduced this parameter because foreign partitions can be placed at 
foreign servers with different types of foreign wrapper. Not all 
wrappers can support CopyIn API.
Also I ran the Tomas Vondra benchmark. At my laptop we have results:
* regular: 5000 ms.
* Tomas buffering patch: 11000 ms.
* This CopyIn patch: 8000 ms.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v4-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

16 July 2020, 09:14:04

Hi Andrey,

Thanks for this work.  I have been reading through your patch and
here's a what I understand it does and how:

The patch aims to fix the restriction that COPYing into a foreign
table can't use multi-insert buffer mechanism effectively.  That's
because copy.c currently uses the ExecForeignInsert() FDW API which
can be passed only 1 row at a time.  postgres_fdw's implementation
issues an `INSERT INTO remote_table VALUES (...)` statement to the
remote side for each row, which is pretty inefficient for bulk loads.
The patch introduces a new FDW API ExecForeignCopyIn() that can
receive multiple rows and copy.c now calls it every time it flushes
the multi-insert buffer so that all the flushed rows can be sent to
the remote side in one go.  postgres_fdw's now issues a `COPY
remote_table FROM STDIN` to the remote server and
postgresExecForeignCopyIn() funnels the tuples flushed by the local
copy to the server side waiting for tuples on the COPY protocol.

Here are some comments on the patch.

* Why the "In" in these API names?

+   /* COPY a bulk of tuples into a foreign relation */
+   BeginForeignCopyIn_function BeginForeignCopyIn;
+   EndForeignCopyIn_function EndForeignCopyIn;
+   ExecForeignCopyIn_function ExecForeignCopyIn;

* fdwhandler.sgml should be updated with the description of these new APIs.

* As far as I can tell, the following copy.h additions are for an FDW
to use copy.c to obtain an external representation (char string) to
send to the remote side of the individual rows that are passed to
ExecForeignCopyIn():

+typedef void (*copy_data_dest_cb) (void *outbuf, int len);
+extern CopyState BeginForeignCopyTo(Relation rel);
+extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot);
+extern void EndForeignCopyTo(CopyState cstate);

So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow()
which in turn calls copy.c: CopyOneRowTo() which fills
CopyState.fe_msgbuf.  The data_dest_cb() callback that runs after
fe_msgbuf contains the full row simply copies it into a palloc'd char
buffer whose pointer is returned back to ExecForeignCopyIn().  I
wonder why not let FDWs implement the callback and pass it to copy.c
through BeginForeignCopyTo()?  For example, you could implement a
pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct
pointer of fe_msgbuf to send it to the remote server.

Do you think all FDWs would want to use copy,c like above?  If not,
maybe the above APIs are really postgres_fdw-specific?  Anyway, adding
comments above the definitions of these functions would be helpful.

* I see that the remote copy is performed from scratch on every call
of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
send the `COPY remote_table FROM STDIN` in
postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
when there are no errors during the copy?

I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
and sending `COPY remote_table FROM STDIN` only once instead of on
every flush -- and I see significant speedup.  Please check the
attached patch that applies on top of yours.  One problem I spotted
when trying my patch but didn't spend much time debugging is that
local COPY cannot be interrupted by Ctrl+C anymore, but that should be
fixable by adjusting PG_TRY() blocks.

* ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgfdw-copy-buffering-amit-suggests.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

22 July 2020, 09:09:51

On 7/16/20 2:14 PM, Amit Langote wrote:
> Hi Andrey,
> 
> Thanks for this work.  I have been reading through your patch and
> here's a what I understand it does and how:
> 
> The patch aims to fix the restriction that COPYing into a foreign
> table can't use multi-insert buffer mechanism effectively.  That's
> because copy.c currently uses the ExecForeignInsert() FDW API which
> can be passed only 1 row at a time.  postgres_fdw's implementation
> issues an `INSERT INTO remote_table VALUES (...)` statement to the
> remote side for each row, which is pretty inefficient for bulk loads.
> The patch introduces a new FDW API ExecForeignCopyIn() that can
> receive multiple rows and copy.c now calls it every time it flushes
> the multi-insert buffer so that all the flushed rows can be sent to
> the remote side in one go.  postgres_fdw's now issues a `COPY
> remote_table FROM STDIN` to the remote server and
> postgresExecForeignCopyIn() funnels the tuples flushed by the local
> copy to the server side waiting for tuples on the COPY protocol.

Fine

> Here are some comments on the patch.
> 
> * Why the "In" in these API names?
> 
> +   /* COPY a bulk of tuples into a foreign relation */
> +   BeginForeignCopyIn_function BeginForeignCopyIn;
> +   EndForeignCopyIn_function EndForeignCopyIn;
> +   ExecForeignCopyIn_function ExecForeignCopyIn;

I used an analogy from copy.c.

> * fdwhandler.sgml should be updated with the description of these new APIs.


> * As far as I can tell, the following copy.h additions are for an FDW
> to use copy.c to obtain an external representation (char string) to
> send to the remote side of the individual rows that are passed to
> ExecForeignCopyIn():
> 
> +typedef void (*copy_data_dest_cb) (void *outbuf, int len);
> +extern CopyState BeginForeignCopyTo(Relation rel);
> +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot);
> +extern void EndForeignCopyTo(CopyState cstate);
> 
> So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow()
> which in turn calls copy.c: CopyOneRowTo() which fills
> CopyState.fe_msgbuf.  The data_dest_cb() callback that runs after
> fe_msgbuf contains the full row simply copies it into a palloc'd char
> buffer whose pointer is returned back to ExecForeignCopyIn().  I
> wonder why not let FDWs implement the callback and pass it to copy.c
> through BeginForeignCopyTo()?  For example, you could implement a
> pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct
> pointer of fe_msgbuf to send it to the remote server.
It is good point! Thank you.

> Do you think all FDWs would want to use copy,c like above?  If not,
> maybe the above APIs are really postgres_fdw-specific?  Anyway, adding
> comments above the definitions of these functions would be helpful.
Agreed
> 
> * I see that the remote copy is performed from scratch on every call
> of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> send the `COPY remote_table FROM STDIN` in
> postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> when there are no errors during the copy?
It is not possible. FDW share one connection between all foreign 
relations from a server. If two or more partitions will be placed at one 
foreign server you will have problems with concurrent COPY command. May 
be we can create new connection for each partition?
> 
> I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> and sending `COPY remote_table FROM STDIN` only once instead of on
> every flush -- and I see significant speedup.  Please check the
> attached patch that applies on top of yours.
I integrated first change and rejected the second by the reason as above.
   One problem I spotted
> when trying my patch but didn't spend much time debugging is that
> local COPY cannot be interrupted by Ctrl+C anymore, but that should be
> fixable by adjusting PG_TRY() blocks.
Thanks
> 
> * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency.
+1

I will post a new version of the patch a little bit later.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

23 July 2020, 06:23:42

On 7/16/20 2:14 PM, Amit Langote wrote:
> Amit Langote
> EnterpriseDB: http://www.enterprisedb.com
> 

Version 5 of the patch. With changes caused by Amit's comments.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v5-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Alexey Kondratov

Date:

27 July 2020, 16:34:46

Hi Andrey,

On 2020-07-23 09:23, Andrey V. Lepikhov wrote:
> On 7/16/20 2:14 PM, Amit Langote wrote:
>> Amit Langote
>> EnterpriseDB: http://www.enterprisedb.com
>> 
> 
> Version 5 of the patch. With changes caused by Amit's comments.

Just got a segfault with your v5 patch by deleting from a foreign table. 
Here is a part of backtrace:

   * frame #0: 0x00000001029069ec 
postgres`ExecShutdownForeignScan(node=0x00007ff28c8909b0) at 
nodeForeignscan.c:385:3
     frame #1: 0x00000001028e7b06 
postgres`ExecShutdownNode(node=0x00007ff28c8909b0) at 
execProcnode.c:779:4
     frame #2: 0x000000010299b3fa 
postgres`planstate_walk_members(planstates=0x00007ff28c8906d8, nplans=1, 
walker=(postgres`ExecShutdownNode at execProcnode.c:752), 
context=0x0000000000000000) at nodeFuncs.c:3998:7
     frame #3: 0x000000010299b010 
postgres`planstate_tree_walker(planstate=0x00007ff28c8904c0, 
walker=(postgres`ExecShutdownNode at execProcnode.c:752), 
context=0x0000000000000000) at nodeFuncs.c:3914:8
     frame #4: 0x00000001028e7ab7 
postgres`ExecShutdownNode(node=0x00007ff28c8904c0) at 
execProcnode.c:771:2

(lldb) f 0
frame #0: 0x00000001029069ec 
postgres`ExecShutdownForeignScan(node=0x00007ff28c8909b0) at 
nodeForeignscan.c:385:3
    382         FdwRoutine *fdwroutine = node->fdwroutine;
    383
    384         if (fdwroutine->ShutdownForeignScan)
-> 385             fdwroutine->ShutdownForeignScan(node);
    386     }
(lldb) p node->fdwroutine->ShutdownForeignScan
(ShutdownForeignScan_function) $1 = 0x7f7f7f7f7f7f7f7f

It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a 
correct pointer to the required function.

I haven't had a chance to look closer on the code, but you can easily 
reproduce this error with the attached script (patched Postgres binaries 
should be available in the PATH). It works well with master and fails 
with your patch applied.


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Attachment

setup2.sh

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

28 July 2020, 00:33:19


27.07.2020 21:34, Alexey Kondratov пишет:
> Hi Andrey,
> 
> On 2020-07-23 09:23, Andrey V. Lepikhov wrote:
>> On 7/16/20 2:14 PM, Amit Langote wrote:
>>> Amit Langote
>>> EnterpriseDB: http://www.enterprisedb.com
>>>
>>
>> Version 5 of the patch. With changes caused by Amit's comments.
> 
> Just got a segfault with your v5 patch by deleting from a foreign table. 
> It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a 
> correct pointer to the required function.
> 
> I haven't had a chance to look closer on the code, but you can easily 
> reproduce this error with the attached script (patched Postgres binaries 
> should be available in the PATH). It works well with master and fails 
> with your patch applied.

I used master a3ab7a707d and v5 version of the patch with your script.
No errors found. Can you check your test case?

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Alexey Kondratov

Date:

28 July 2020, 07:54:22

On 2020-07-28 03:33, Andrey Lepikhov wrote:
> 27.07.2020 21:34, Alexey Kondratov пишет:
>> Hi Andrey,
>> 
>> On 2020-07-23 09:23, Andrey V. Lepikhov wrote:
>>> On 7/16/20 2:14 PM, Amit Langote wrote:
>>>> Amit Langote
>>>> EnterpriseDB: http://www.enterprisedb.com
>>>> 
>>> 
>>> Version 5 of the patch. With changes caused by Amit's comments.
>> 
>> Just got a segfault with your v5 patch by deleting from a foreign 
>> table. It seems that ShutdownForeignScan inside node->fdwroutine 
>> doesn't have a correct pointer to the required function.
>> 
>> I haven't had a chance to look closer on the code, but you can easily 
>> reproduce this error with the attached script (patched Postgres 
>> binaries should be available in the PATH). It works well with master 
>> and fails with your patch applied.
> 
> I used master a3ab7a707d and v5 version of the patch with your script.
> No errors found. Can you check your test case?

Yes, my bad. I forgot to re-install postgres_fdw extension, only did it 
for postgres core, sorry for disturb.


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

29 July 2020, 08:03:02

Hi Andrey,

Thanks for updating the patch.  I will try to take a look later.

On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 7/16/20 2:14 PM, Amit Langote wrote:
> > * Why the "In" in these API names?
> >
> > +   /* COPY a bulk of tuples into a foreign relation */
> > +   BeginForeignCopyIn_function BeginForeignCopyIn;
> > +   EndForeignCopyIn_function EndForeignCopyIn;
> > +   ExecForeignCopyIn_function ExecForeignCopyIn;
>
> I used an analogy from copy.c.

Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it
makes sense to have "In" here, but maybe we don't, so how about
leaving out the "In" for clarity?

> > * I see that the remote copy is performed from scratch on every call
> > of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> > send the `COPY remote_table FROM STDIN` in
> > postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> > when there are no errors during the copy?
>
> It is not possible. FDW share one connection between all foreign
> relations from a server. If two or more partitions will be placed at one
> foreign server you will have problems with concurrent COPY command.

Ah, you're right.  I didn't consider multiple foreign partitions
pointing to the same server.  Indeed, we would need separate
connections to a given server to COPY to multiple remote relations on
that server in parallel.

> May be we can create new connection for each partition?

Yeah, perhaps, although it sounds like something that might be more
generally useful and so we should work on that separately if at all.

> > I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> > and sending `COPY remote_table FROM STDIN` only once instead of on
> > every flush -- and I see significant speedup.  Please check the
> > attached patch that applies on top of yours.
>
> I integrated first change and rejected the second by the reason as above.

Thanks.

Will send more comments after reading the v5 patch.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

29 July 2020, 08:36:01

On 7/29/20 1:03 PM, Amit Langote wrote:
> Hi Andrey,
> 
> Thanks for updating the patch.  I will try to take a look later.
> 
> On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 7/16/20 2:14 PM, Amit Langote wrote:
>>> * Why the "In" in these API names?
>>>
>>> +   /* COPY a bulk of tuples into a foreign relation */
>>> +   BeginForeignCopyIn_function BeginForeignCopyIn;
>>> +   EndForeignCopyIn_function EndForeignCopyIn;
>>> +   ExecForeignCopyIn_function ExecForeignCopyIn;
>>
>> I used an analogy from copy.c.
> 
> Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it
> makes sense to have "In" here, but maybe we don't, so how about
> leaving out the "In" for clarity?
Ok, sounds good.
> 
>>> * I see that the remote copy is performed from scratch on every call
>>> of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
>>> send the `COPY remote_table FROM STDIN` in
>>> postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
>>> when there are no errors during the copy?
>>
>> It is not possible. FDW share one connection between all foreign
>> relations from a server. If two or more partitions will be placed at one
>> foreign server you will have problems with concurrent COPY command.
> 
> Ah, you're right.  I didn't consider multiple foreign partitions
> pointing to the same server.  Indeed, we would need separate
> connections to a given server to COPY to multiple remote relations on
> that server in parallel.
> 
>> May be we can create new connection for each partition?
> 
> Yeah, perhaps, although it sounds like something that might be more
> generally useful and so we should work on that separately if at all.
I will try to prepare a separate patch.
> 
>>> I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
>>> and sending `COPY remote_table FROM STDIN` only once instead of on
>>> every flush -- and I see significant speedup.  Please check the
>>> attached patch that applies on top of yours.
>>
>> I integrated first change and rejected the second by the reason as above.
> 
> Thanks.
> 
> Will send more comments after reading the v5 patch.
> 
Ok. I'll be waiting for the end of your review.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

03 August 2020, 11:38:57

Hi Andrey,

On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> > Will send more comments after reading the v5 patch.
> >
> Ok. I'll be waiting for the end of your review.

Sorry about the late reply.

If you'd like to send a new version for other reviewers, please feel
free.  I haven't managed to take more than a brief look at the v5
patch, but will try to look at it (or maybe the new version if you
post) more closely this week.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

07 August 2020, 09:14:41

On Mon, Aug 3, 2020 at 8:38 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
> > > Will send more comments after reading the v5 patch.
> > >
> > Ok. I'll be waiting for the end of your review.
>
> Sorry about the late reply.
>
> If you'd like to send a new version for other reviewers, please feel
> free.  I haven't managed to take more than a brief look at the v5
> patch, but will try to look at it (or maybe the new version if you
> post) more closely this week.

I was playing around with v5 and I noticed an assertion failure which
I concluded is due to improper setting of ri_usesBulkModify.  You can
reproduce it with these steps.

create extension postgres_fdw;
create server lb foreign data wrapper postgres_fdw ;
create user mapping for current_user server lb;
create table foo (a int, b int) partition by list (a);
create table foo1 (like foo);
create foreign table ffoo1 partition of foo for values in (1) server
lb options (table_name 'foo1');
create table foo2 (like foo);
create foreign table ffoo2 partition of foo for values in (2) server
lb options (table_name 'foo2');
create function print_new_row() returns trigger language plpgsql as $$
begin raise notice '%', new; return new; end; $$;
create trigger ffoo1_br_trig before insert on ffoo1 for each row
execute function print_new_row();
copy foo from stdin csv;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,2
>> 2,3
>> \.
NOTICE:  (1,2)
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

#0  0x00007f2d5e266337 in raise () from /lib64/libc.so.6
#1  0x00007f2d5e267a28 in abort () from /lib64/libc.so.6
#2  0x0000000000aafd5d in ExceptionalCondition
(conditionName=0x7f2d37b468d0 "!resultRelInfo->ri_usesBulkModify ||
resultRelInfo->ri_FdwRoutine->BeginForeignCopyIn == NULL",
    errorType=0x7f2d37b46680 "FailedAssertion",
fileName=0x7f2d37b4677f "postgres_fdw.c", lineNumber=1863) at
assert.c:67
#3  0x00007f2d37b3b0fe in postgresExecForeignInsert (estate=0x2456320,
resultRelInfo=0x23a8f58, slot=0x23a9480, planSlot=0x0) at
postgres_fdw.c:1862
#4  0x000000000066362a in CopyFrom (cstate=0x23a8d40) at copy.c:3331

The problem is that partition ffoo1's BR trigger prevents it from
using multi-insert, but its ResultRelInfo.ri_usesBulkModify is true,
which is copied from its parent.  We should really check the same
things for a partition that CopyFrom() checks for the main target
relation (root parent) when deciding whether to use multi-insert.

However instead of duplicating the same logic to do so in two places
(CopyFrom and ExecInitPartitionInfo), I think it might be a good idea
to refactor the code to decide if multi-insert mode can be used for a
given relation by checking its properties and put it in some place
that both the main target relation and partitions need to invoke.
InitResultRelInfo() seems to be one such place.

Also, it might be a good idea to use ri_usesBulkModify more generally
than only for foreign relations as the patch currently does, because I
can see that it can replace the variable insertMethod in CopyFrom().
Having both insertMethod and ri_usesBulkModify in each ResultRelInfo
seems confusing and bug-prone.

Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to
reflect its scope.

Please check the attached delta patch that applies on top of v5 to see
what that would look like.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

ri_usesMultiInsert.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

21 August 2020, 12:19:06


On 8/7/20 2:14 PM, Amit Langote wrote:
> I was playing around with v5 and I noticed an assertion failure which
> I concluded is due to improper setting of ri_usesBulkModify.  You can
> reproduce it with these steps.
> 
> create extension postgres_fdw;
> create server lb foreign data wrapper postgres_fdw ;
> create user mapping for current_user server lb;
> create table foo (a int, b int) partition by list (a);
> create table foo1 (like foo);
> create foreign table ffoo1 partition of foo for values in (1) server
> lb options (table_name 'foo1');
> create table foo2 (like foo);
> create foreign table ffoo2 partition of foo for values in (2) server
> lb options (table_name 'foo2');
> create function print_new_row() returns trigger language plpgsql as $$
> begin raise notice '%', new; return new; end; $$;
> create trigger ffoo1_br_trig before insert on ffoo1 for each row
> execute function print_new_row();
> copy foo from stdin csv;
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
>>> 1,2
>>> 2,3
>>> \.
> NOTICE:  (1,2)
> server closed the connection unexpectedly
>          This probably means the server terminated abnormally
>          before or while processing the request.
> 
> #0  0x00007f2d5e266337 in raise () from /lib64/libc.so.6
> #1  0x00007f2d5e267a28 in abort () from /lib64/libc.so.6
> #2  0x0000000000aafd5d in ExceptionalCondition
> (conditionName=0x7f2d37b468d0 "!resultRelInfo->ri_usesBulkModify ||
> resultRelInfo->ri_FdwRoutine->BeginForeignCopyIn == NULL",
>      errorType=0x7f2d37b46680 "FailedAssertion",
> fileName=0x7f2d37b4677f "postgres_fdw.c", lineNumber=1863) at
> assert.c:67
> #3  0x00007f2d37b3b0fe in postgresExecForeignInsert (estate=0x2456320,
> resultRelInfo=0x23a8f58, slot=0x23a9480, planSlot=0x0) at
> postgres_fdw.c:1862
> #4  0x000000000066362a in CopyFrom (cstate=0x23a8d40) at copy.c:3331
> 
> The problem is that partition ffoo1's BR trigger prevents it from
> using multi-insert, but its ResultRelInfo.ri_usesBulkModify is true,
> which is copied from its parent.  We should really check the same
> things for a partition that CopyFrom() checks for the main target
> relation (root parent) when deciding whether to use multi-insert.
Thnx, I added TAP-test on this problem> However instead of duplicating 
the same logic to do so in two places
> (CopyFrom and ExecInitPartitionInfo), I think it might be a good idea
> to refactor the code to decide if multi-insert mode can be used for a
> given relation by checking its properties and put it in some place
> that both the main target relation and partitions need to invoke.
> InitResultRelInfo() seems to be one such place.
+1
> 
> Also, it might be a good idea to use ri_usesBulkModify more generally
> than only for foreign relations as the patch currently does, because I
> can see that it can replace the variable insertMethod in CopyFrom().
> Having both insertMethod and ri_usesBulkModify in each ResultRelInfo
> seems confusing and bug-prone.
> 
> Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to
> reflect its scope.
> 
> Please check the attached delta patch that applies on top of v5 to see
> what that would look like.
I merged your delta patch (see v6 in attachment) to the main patch.
Currently it seems more commitable than before.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v6-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

24 August 2020, 07:18:36

Hi Andrey,

On Fri, Aug 21, 2020 at 9:19 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 8/7/20 2:14 PM, Amit Langote wrote:
> > I was playing around with v5 and I noticed an assertion failure which
> > I concluded is due to improper setting of ri_usesBulkModify.  You can
> > reproduce it with these steps.
> >
> > create extension postgres_fdw;
> > create server lb foreign data wrapper postgres_fdw ;
> > create user mapping for current_user server lb;
> > create table foo (a int, b int) partition by list (a);
> > create table foo1 (like foo);
> > create foreign table ffoo1 partition of foo for values in (1) server
> > lb options (table_name 'foo1');
> > create table foo2 (like foo);
> > create foreign table ffoo2 partition of foo for values in (2) server
> > lb options (table_name 'foo2');
> > create function print_new_row() returns trigger language plpgsql as $$
> > begin raise notice '%', new; return new; end; $$;
> > create trigger ffoo1_br_trig before insert on ffoo1 for each row
> > execute function print_new_row();
> > copy foo from stdin csv;
> > Enter data to be copied followed by a newline.
> > End with a backslash and a period on a line by itself, or an EOF signal.
> >>> 1,2
> >>> 2,3
> >>> \.
> > NOTICE:  (1,2)
> > server closed the connection unexpectedly
> >          This probably means the server terminated abnormally
> >          before or while processing the request.
>
> Thnx, I added TAP-test on this problem> However instead of duplicating
> the same logic to do so in two places

Good call.

> > (CopyFrom and ExecInitPartitionInfo), I think it might be a good idea
> > to refactor the code to decide if multi-insert mode can be used for a
> > given relation by checking its properties and put it in some place
> > that both the main target relation and partitions need to invoke.
> > InitResultRelInfo() seems to be one such place.
> +1
> >
> > Also, it might be a good idea to use ri_usesBulkModify more generally
> > than only for foreign relations as the patch currently does, because I
> > can see that it can replace the variable insertMethod in CopyFrom().
> > Having both insertMethod and ri_usesBulkModify in each ResultRelInfo
> > seems confusing and bug-prone.
> >
> > Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to
> > reflect its scope.
> >
> > Please check the attached delta patch that applies on top of v5 to see
> > what that would look like.
>
> I merged your delta patch (see v6 in attachment) to the main patch.
> Currently it seems more commitable than before.

Thanks for accepting the changes.

Actually, I was thinking maybe making the patch to replace
CopyInsertMethod enum by ri_usesMultiInsert separate from the rest
might be better as I can see it as independent refactoring.  Attached
is how the division would look like.

I would

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

24 August 2020, 09:19:28

On Mon, Aug 24, 2020 at 4:18 PM Amit Langote <amitlangote09@gmail.com> wrote:
> I would

Oops, thought I'd continue writing, but hit send before actually doing
that.  Please ignore.

I have some comments on v6, which I will share later this week.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Michael Paquier

Date:

07 September 2020, 07:26:51

On Mon, Aug 24, 2020 at 06:19:28PM +0900, Amit Langote wrote:
> On Mon, Aug 24, 2020 at 4:18 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > I would
>
> Oops, thought I'd continue writing, but hit send before actually doing
> that.  Please ignore.
>
> I have some comments on v6, which I will share later this week.

While on it, the CF bot is telling that the documentation of the patch
fails to compile.  This needs to be fixed.
--
Michael

Attachment

signature.asc

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

07 September 2020, 10:31:09

On 9/7/20 12:26 PM, Michael Paquier wrote:
> On Mon, Aug 24, 2020 at 06:19:28PM +0900, Amit Langote wrote:
>> On Mon, Aug 24, 2020 at 4:18 PM Amit Langote <amitlangote09@gmail.com> wrote:
>>> I would
>>
>> Oops, thought I'd continue writing, but hit send before actually doing
>> that.  Please ignore.
>>
>> I have some comments on v6, which I will share later this week.
> 
> While on it, the CF bot is telling that the documentation of the patch
> fails to compile.  This needs to be fixed.
> --
> Michael
> 
v.7 (in attachment) fixes this problem.
I also accepted Amit's suggestion to rename all fdwapi routines such as 
ForeignCopyIn to *ForeignCopy.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v7-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

08 September 2020, 07:34:47

Hi Andrey,

On Mon, Sep 7, 2020 at 7:31 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 9/7/20 12:26 PM, Michael Paquier wrote:
> > While on it, the CF bot is telling that the documentation of the patch
> > fails to compile.  This needs to be fixed.
> > --
> > Michael
> >
> v.7 (in attachment) fixes this problem.
> I also accepted Amit's suggestion to rename all fdwapi routines such as
> ForeignCopyIn to *ForeignCopy.

Any thoughts on the taking out the refactoring changes out of the main
patch as I suggested?

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Alexey Kondratov

Date:

08 September 2020, 09:29:14

Hi,

I've started doing a review of v7 yesterday.

On 2020-09-08 10:34, Amit Langote wrote:
> On Mon, Sep 7, 2020 at 7:31 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> >
>> v.7 (in attachment) fixes this problem.
>> I also accepted Amit's suggestion to rename all fdwapi routines such 
>> as
>> ForeignCopyIn to *ForeignCopy.
> 

It seems that naming is quite inconsistent now:

+    /* COPY a bulk of tuples into a foreign relation */
+    BeginForeignCopyIn_function BeginForeignCopy;
+    EndForeignCopyIn_function EndForeignCopy;
+    ExecForeignCopyIn_function ExecForeignCopy;

You get rid of this 'In' in the function names, but the types are still 
with it:

+typedef void (*BeginForeignCopyIn_function) (ModifyTableState *mtstate,
+        ResultRelInfo *rinfo);
+
+typedef void (*EndForeignCopyIn_function) (EState *estate,
+        ResultRelInfo *rinfo);
+
+typedef void (*ExecForeignCopyIn_function) (ResultRelInfo *rinfo,
+        TupleTableSlot **slots,
+        int nslots);

Also docs refer to old function names:

+void
+BeginForeignCopyIn(ModifyTableState *mtstate,
+                   ResultRelInfo *rinfo);

I think that it'd be better to choose either of these two naming schemes 
and use it everywhere for consistency.

> 
> Any thoughts on the taking out the refactoring changes out of the main
> patch as I suggested?
> 

+1 for splitting the patch. It was rather difficult for me to 
distinguish changes required by COPY via postgres_fdw from this 
refactoring.

Another ambiguous part of the refactoring was in changing 
InitResultRelInfo() arguments:

@@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
                    Relation resultRelationDesc,
                    Index resultRelationIndex,
                    Relation partition_root,
+                  bool use_multi_insert,
                    int instrument_options)

Why do we need to pass this use_multi_insert flag here? Would it be 
better to set resultRelInfo->ri_usesMultiInsert in the 
InitResultRelInfo() unconditionally like it is done for 
ri_usesFdwDirectModify? And after that it will be up to the caller 
whether to use multi-insert or not based on their own circumstances. 
Otherwise now we have a flag to indicate that we want to check for 
another flag, while this check doesn't look costly.

Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

08 September 2020, 10:21:33

On 9/8/20 12:34 PM, Amit Langote wrote:
> Hi Andrey,
> 
> On Mon, Sep 7, 2020 at 7:31 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 9/7/20 12:26 PM, Michael Paquier wrote:
>>> While on it, the CF bot is telling that the documentation of the patch
>>> fails to compile.  This needs to be fixed.
>>> --
>>> Michael
>>>
>> v.7 (in attachment) fixes this problem.
>> I also accepted Amit's suggestion to rename all fdwapi routines such as
>> ForeignCopyIn to *ForeignCopy.
> 
> Any thoughts on the taking out the refactoring changes out of the main
> patch as I suggested?
> 
Sorry I thought you asked to ignore your previous letter. I'll look into 
this patch set shortly.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

08 September 2020, 14:00:12

Hi Alexey,

On Tue, Sep 8, 2020 at 6:29 PM Alexey Kondratov
<a.kondratov@postgrespro.ru> wrote:
> On 2020-09-08 10:34, Amit Langote wrote:
> > Any thoughts on the taking out the refactoring changes out of the main
> > patch as I suggested?
> >
>
> +1 for splitting the patch. It was rather difficult for me to
> distinguish changes required by COPY via postgres_fdw from this
> refactoring.
>
> Another ambiguous part of the refactoring was in changing
> InitResultRelInfo() arguments:
>
> @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
>                                   Relation resultRelationDesc,
>                                   Index resultRelationIndex,
>                                   Relation partition_root,
> +                                 bool use_multi_insert,
>                                   int instrument_options)
>
> Why do we need to pass this use_multi_insert flag here? Would it be
> better to set resultRelInfo->ri_usesMultiInsert in the
> InitResultRelInfo() unconditionally like it is done for
> ri_usesFdwDirectModify? And after that it will be up to the caller
> whether to use multi-insert or not based on their own circumstances.
> Otherwise now we have a flag to indicate that we want to check for
> another flag, while this check doesn't look costly.

Hmm, I think having two flags seems confusing and bug prone,
especially if you consider partitions.  For example, if a partition's
ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then
execPartition.c: ExecInitPartitionInfo() would wrongly perform
BeginForeignCopy() based on only ri_usesMultiInsert, because it
wouldn't know CopyFrom()'s local flag.  Am I missing something?

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Alexey Kondratov

Date:

08 September 2020, 15:34:31

On 2020-09-08 17:00, Amit Langote wrote:
> Hi Alexey,
> 
> On Tue, Sep 8, 2020 at 6:29 PM Alexey Kondratov
> <a.kondratov@postgrespro.ru> wrote:
>> On 2020-09-08 10:34, Amit Langote wrote:
>> > Any thoughts on the taking out the refactoring changes out of the main
>> > patch as I suggested?
>> >
>> 
>> +1 for splitting the patch. It was rather difficult for me to
>> distinguish changes required by COPY via postgres_fdw from this
>> refactoring.
>> 
>> Another ambiguous part of the refactoring was in changing
>> InitResultRelInfo() arguments:
>> 
>> @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
>>                                   Relation resultRelationDesc,
>>                                   Index resultRelationIndex,
>>                                   Relation partition_root,
>> +                                 bool use_multi_insert,
>>                                   int instrument_options)
>> 
>> Why do we need to pass this use_multi_insert flag here? Would it be
>> better to set resultRelInfo->ri_usesMultiInsert in the
>> InitResultRelInfo() unconditionally like it is done for
>> ri_usesFdwDirectModify? And after that it will be up to the caller
>> whether to use multi-insert or not based on their own circumstances.
>> Otherwise now we have a flag to indicate that we want to check for
>> another flag, while this check doesn't look costly.
> 
> Hmm, I think having two flags seems confusing and bug prone,
> especially if you consider partitions.  For example, if a partition's
> ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then
> execPartition.c: ExecInitPartitionInfo() would wrongly perform
> BeginForeignCopy() based on only ri_usesMultiInsert, because it
> wouldn't know CopyFrom()'s local flag.  Am I missing something?

No, you're right. If someone want to share a state and use ResultRelInfo 
(RRI) for that purpose, then it's fine, but CopyFrom() may simply 
override RRI->ri_usesMultiInsert if needed and pass this RRI further.

This is how it's done for RRI->ri_usesFdwDirectModify. 
InitResultRelInfo() initializes it to false and then 
ExecInitModifyTable() changes the flag if needed.

Probably this is just a matter of personal choice, but for me the 
current implementation with additional argument in InitResultRelInfo() 
doesn't look completely right. Maybe because a caller now should pass an 
additional argument (as false) even if it doesn't care about 
ri_usesMultiInsert at all. It also adds additional complexity and feels 
like abstractions leaking.


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

09 September 2020, 08:45:26

On 9/8/20 8:34 PM, Alexey Kondratov wrote:
> On 2020-09-08 17:00, Amit Langote wrote:
>> <a.kondratov@postgrespro.ru> wrote:
>>> On 2020-09-08 10:34, Amit Langote wrote:
>>> Another ambiguous part of the refactoring was in changing
>>> InitResultRelInfo() arguments:
>>>
>>> @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
>>>                                   Relation resultRelationDesc,
>>>                                   Index resultRelationIndex,
>>>                                   Relation partition_root,
>>> +                                 bool use_multi_insert,
>>>                                   int instrument_options)
>>>
>>> Why do we need to pass this use_multi_insert flag here? Would it be
>>> better to set resultRelInfo->ri_usesMultiInsert in the
>>> InitResultRelInfo() unconditionally like it is done for
>>> ri_usesFdwDirectModify? And after that it will be up to the caller
>>> whether to use multi-insert or not based on their own circumstances.
>>> Otherwise now we have a flag to indicate that we want to check for
>>> another flag, while this check doesn't look costly.
>>
>> Hmm, I think having two flags seems confusing and bug prone,
>> especially if you consider partitions.  For example, if a partition's
>> ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then
>> execPartition.c: ExecInitPartitionInfo() would wrongly perform
>> BeginForeignCopy() based on only ri_usesMultiInsert, because it
>> wouldn't know CopyFrom()'s local flag.  Am I missing something?
> 
> No, you're right. If someone want to share a state and use ResultRelInfo 
> (RRI) for that purpose, then it's fine, but CopyFrom() may simply 
> override RRI->ri_usesMultiInsert if needed and pass this RRI further.
> 
> This is how it's done for RRI->ri_usesFdwDirectModify. 
> InitResultRelInfo() initializes it to false and then 
> ExecInitModifyTable() changes the flag if needed.
> 
> Probably this is just a matter of personal choice, but for me the 
> current implementation with additional argument in InitResultRelInfo() 
> doesn't look completely right. Maybe because a caller now should pass an 
> additional argument (as false) even if it doesn't care about 
> ri_usesMultiInsert at all. It also adds additional complexity and feels 
> like abstractions leaking.
I didn't feel what the problem was and prepared a patch version 
according to Alexey's suggestion (see Alternate.patch).
This does not seem very convenient and will lead to errors in the 
future. So, I agree with Amit.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

alternative.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

09 September 2020, 09:38:20

Version 8 split into two patches (in accordance with Amit suggestion).
Also I eliminate naming inconsistency (thanks to Alexey).
Based on master, f481d28232.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Alexey Kondratov

Date:

09 September 2020, 09:42:09

On 2020-09-09 11:45, Andrey V. Lepikhov wrote:
> On 9/8/20 8:34 PM, Alexey Kondratov wrote:
>> On 2020-09-08 17:00, Amit Langote wrote:
>>> <a.kondratov@postgrespro.ru> wrote:
>>>> On 2020-09-08 10:34, Amit Langote wrote:
>>>> Another ambiguous part of the refactoring was in changing
>>>> InitResultRelInfo() arguments:
>>>> 
>>>> @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo 
>>>> *resultRelInfo,
>>>>                                   Relation resultRelationDesc,
>>>>                                   Index resultRelationIndex,
>>>>                                   Relation partition_root,
>>>> +                                 bool use_multi_insert,
>>>>                                   int instrument_options)
>>>> 
>>>> Why do we need to pass this use_multi_insert flag here? Would it be
>>>> better to set resultRelInfo->ri_usesMultiInsert in the
>>>> InitResultRelInfo() unconditionally like it is done for
>>>> ri_usesFdwDirectModify? And after that it will be up to the caller
>>>> whether to use multi-insert or not based on their own circumstances.
>>>> Otherwise now we have a flag to indicate that we want to check for
>>>> another flag, while this check doesn't look costly.
>>> 
>>> Hmm, I think having two flags seems confusing and bug prone,
>>> especially if you consider partitions.  For example, if a partition's
>>> ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, 
>>> then
>>> execPartition.c: ExecInitPartitionInfo() would wrongly perform
>>> BeginForeignCopy() based on only ri_usesMultiInsert, because it
>>> wouldn't know CopyFrom()'s local flag.  Am I missing something?
>> 
>> No, you're right. If someone want to share a state and use 
>> ResultRelInfo (RRI) for that purpose, then it's fine, but CopyFrom() 
>> may simply override RRI->ri_usesMultiInsert if needed and pass this 
>> RRI further.
>> 
>> This is how it's done for RRI->ri_usesFdwDirectModify. 
>> InitResultRelInfo() initializes it to false and then 
>> ExecInitModifyTable() changes the flag if needed.
>> 
>> Probably this is just a matter of personal choice, but for me the 
>> current implementation with additional argument in InitResultRelInfo() 
>> doesn't look completely right. Maybe because a caller now should pass 
>> an additional argument (as false) even if it doesn't care about 
>> ri_usesMultiInsert at all. It also adds additional complexity and 
>> feels like abstractions leaking.
> I didn't feel what the problem was and prepared a patch version
> according to Alexey's suggestion (see Alternate.patch).

Yes, that's very close to what I've meant.

+    leaf_part_rri->ri_usesMultiInsert = (leaf_part_rri->ri_usesMultiInsert 
&&
+        rootResultRelInfo->ri_usesMultiInsert) ? true : false;

This could be just:

+    leaf_part_rri->ri_usesMultiInsert = (leaf_part_rri->ri_usesMultiInsert 
&&
+        rootResultRelInfo->ri_usesMultiInsert);

> This does not seem very convenient and will lead to errors in the
> future. So, I agree with Amit.

And InitResultRelInfo() may set ri_usesMultiInsert to false by default, 
since it's used only by COPY now. Then you won't need this in several 
places:

+    resultRelInfo->ri_usesMultiInsert = false;

While the logic of turning multi-insert on with all the validations 
required could be factored out of InitResultRelInfo() to a separate 
routine.

Anyway, I don't insist at all and think it's fine to stick to the 
original v7's logic.

Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

09 September 2020, 12:51:49

On Wed, Sep 9, 2020 at 6:42 PM Alexey Kondratov
<a.kondratov@postgrespro.ru> wrote:
> On 2020-09-09 11:45, Andrey V. Lepikhov wrote:
> > This does not seem very convenient and will lead to errors in the
> > future. So, I agree with Amit.
>
> And InitResultRelInfo() may set ri_usesMultiInsert to false by default,
> since it's used only by COPY now. Then you won't need this in several
> places:
>
> +       resultRelInfo->ri_usesMultiInsert = false;
>
> While the logic of turning multi-insert on with all the validations
> required could be factored out of InitResultRelInfo() to a separate
> routine.

Interesting idea.  Maybe better to have a separate routine like Alexey says.


--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

10 September 2020, 09:57:43

On 9/9/20 5:51 PM, Amit Langote wrote:
> On Wed, Sep 9, 2020 at 6:42 PM Alexey Kondratov
> <a.kondratov@postgrespro.ru> wrote:
>> On 2020-09-09 11:45, Andrey V. Lepikhov wrote:
>>> This does not seem very convenient and will lead to errors in the
>>> future. So, I agree with Amit.
>>
>> And InitResultRelInfo() may set ri_usesMultiInsert to false by default,
>> since it's used only by COPY now. Then you won't need this in several
>> places:
>>
>> +       resultRelInfo->ri_usesMultiInsert = false;
>>
>> While the logic of turning multi-insert on with all the validations
>> required could be factored out of InitResultRelInfo() to a separate
>> routine.
> 
> Interesting idea.  Maybe better to have a separate routine like Alexey says.
Ok. I rewrited the patch 0001 with the Alexey suggestion.
Patch 0002... required minor changes (new version see in attachment).

Also I added some optimization (see 0003 and 0004 patches). Here we 
execute 'COPY .. FROM  STDIN' at foreign server only once, in the 
BeginForeignCopy routine. It is a proof-of-concept patches.

Also I see that error messages processing needs to be rewritten. Unlike 
the INSERT operation applied to each row, here we find out copy errors 
only after sending the END of copy. Currently implementations 0002 and 
0004 provide uninformative error messages for some cases.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

16 September 2020, 09:10:36

On Thu, Sep 10, 2020 at 6:57 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 9/9/20 5:51 PM, Amit Langote wrote:
> > On Wed, Sep 9, 2020 at 6:42 PM Alexey Kondratov <a.kondratov@postgrespro.ru> wrote:
> >> And InitResultRelInfo() may set ri_usesMultiInsert to false by default,
> >> since it's used only by COPY now. Then you won't need this in several
> >> places:
> >>
> >> +       resultRelInfo->ri_usesMultiInsert = false;
> >>
> >> While the logic of turning multi-insert on with all the validations
> >> required could be factored out of InitResultRelInfo() to a separate
> >> routine.
> >
> > Interesting idea.  Maybe better to have a separate routine like Alexey says.
> Ok. I rewrited the patch 0001 with the Alexey suggestion.

Thank you.  Some mostly cosmetic suggestions on that:

+bool
+checkMultiInsertMode(const ResultRelInfo *rri, const ResultRelInfo *parent)

I think we should put this definition in executor.c and export in
executor.h, not execPartition.c/h.  Also, better to match the naming
style of surrounding executor routines, say,
ExecRelationAllowsMultiInsert?  I'm not sure if we need the 'parent'
parameter but as it's pretty specific to partition's case, maybe
partition_root is a better name.

+   if (!checkMultiInsertMode(target_resultRelInfo, NULL))
+   {
+       /*
+        * Do nothing. Can't allow multi-insert mode if previous conditions
+        * checking disallow this.
+        */
+   }

Personally, I find this notation with empty blocks a bit strange.
Maybe it's easier to read this instead:

    if (!cstate->volatile_defexprs &&
        !contain_volatile_functions(cstate->whereClause) &&
        ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
        target_resultRelInfo->ri_usesMultiInsert = true;

Also, I don't really understand why we need
list_length(cstate->attnumlist) > 0 to use multi-insert on foreign
tables but apparently we do.  The next patch should add that condition
here along with a brief note on that in the comment.

-   if (resultRelInfo->ri_FdwRoutine != NULL &&
-       resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
-       resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
-                                                        resultRelInfo);
+   /*
+    * Init COPY into foreign table. Initialization of copying into foreign
+    * partitions will be done later.
+    */
+   if (target_resultRelInfo->ri_FdwRoutine != NULL &&
+       target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+       target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
+                                                               resultRelInfo);


@@ -3349,11 +3302,10 @@ CopyFrom(CopyState cstate)
    if (target_resultRelInfo->ri_FdwRoutine != NULL &&
        target_resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
        target_resultRelInfo->ri_FdwRoutine->EndForeignInsert(estate,
-
target_resultRelInfo);
+                                                       target_resultRelInfo);

These two hunks seem unnecessary, which I think I introduced into this
patch when breaking it out of the main one.

Please check the attached delta patch which contains the above changes.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment

0001-delta.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

20 September 2020, 09:12:08

16.09.2020 12:10, Amit Langote пишет:
> On Thu, Sep 10, 2020 at 6:57 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 9/9/20 5:51 PM, Amit Langote wrote:
>> Ok. I rewrited the patch 0001 with the Alexey suggestion.
> 
> Thank you.  Some mostly cosmetic suggestions on that:
> 
> +bool
> +checkMultiInsertMode(const ResultRelInfo *rri, const ResultRelInfo *parent)
> 
> I think we should put this definition in executor.c and export in
> executor.h, not execPartition.c/h.  Also, better to match the naming
> style of surrounding executor routines, say,
> ExecRelationAllowsMultiInsert?  I'm not sure if we need the 'parent'
> parameter but as it's pretty specific to partition's case, maybe
> partition_root is a better name.
Agreed

> +   if (!checkMultiInsertMode(target_resultRelInfo, NULL))
> +   {
> +       /*
> +        * Do nothing. Can't allow multi-insert mode if previous conditions
> +        * checking disallow this.
> +        */
> +   }
> 
> Personally, I find this notation with empty blocks a bit strange.
> Maybe it's easier to read this instead:
> 
>      if (!cstate->volatile_defexprs &&
>          !contain_volatile_functions(cstate->whereClause) &&
>          ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
>          target_resultRelInfo->ri_usesMultiInsert = true;
Agreed

> Also, I don't really understand why we need
> list_length(cstate->attnumlist) > 0 to use multi-insert on foreign
> tables but apparently we do.  The next patch should add that condition
> here along with a brief note on that in the comment.
This is a feature of the COPY command. It can't be used without any 
column in braces. However, foreign tables without columns can exist.
You can see this problem if you apply the 0002 patch on top of your 
delta patch. Ashutosh in [1] noticed this problem and anchored it with 
regression test.
I included this expression (with comments) into the 0002 patch.

> 
> -   if (resultRelInfo->ri_FdwRoutine != NULL &&
> -       resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
> -       resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
> -                                                        resultRelInfo);
> +   /*
> +    * Init COPY into foreign table. Initialization of copying into foreign
> +    * partitions will be done later.
> +    */
> +   if (target_resultRelInfo->ri_FdwRoutine != NULL &&
> +       target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
> +       target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
> +                                                               resultRelInfo);
> 
> 
> @@ -3349,11 +3302,10 @@ CopyFrom(CopyState cstate)
>      if (target_resultRelInfo->ri_FdwRoutine != NULL &&
>          target_resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
>          target_resultRelInfo->ri_FdwRoutine->EndForeignInsert(estate,
> -
> target_resultRelInfo);
> +                                                       target_resultRelInfo);
> 
> These two hunks seem unnecessary, which I think I introduced into this
> patch when breaking it out of the main one.
> 
> Please check the attached delta patch which contains the above changes.
I applied your delta patch to the 0001 patch and fix the 0002 patch in 
accordance with these changes.
Patches 0003 and 0004 are experimental and i will not support them 
before discussing on applicability.

[1] 
https://www.postgresql.org/message-id/CAExHW5uAtyAVL-iuu1Hsd0fycqS5UHoHCLfauYDLQwRucwC9Og%40mail.gmail.com

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

21 September 2020, 11:20:16

This patch currently looks very ready for use. And I'm taking a close 
look at the error reporting. Here we have difference in behavior of 
local and foreign table:

regression test in postgres_fdw.sql:
copy rem2 from stdin;
-1    xyzzy
\.

reports error (1):
=================
ERROR:  new row for relation "loc2" violates check constraint...
DETAIL:  Failing row contains (-1, xyzzy).
CONTEXT:  COPY loc2, line 1: "-1    xyzzy"
remote SQL command: COPY public.loc2(f1, f2) FROM STDIN
COPY rem2, line 2

But local COPY into loc2 reports another error (2):
===================================================
copy loc2 from stdin;
ERROR:  new row for relation "loc2" violates check constraint...
DETAIL:  Failing row contains (-1, xyzzy).
CONTEXT:  COPY loc2, line 1: "-1    xyzzy"

Report (2) is shorter and more specific.
Report (1) contains meaningless information.

Maybe we need to improve error report? For example like this:
ERROR: Failed COPY into foreign table "rem2":
new row for relation "loc2" violates check constraint...
DETAIL:  Failing row contains (-1, xyzzy).
remote SQL command: COPY public.loc2(f1, f2) FROM STDIN
COPY rem2, line 1

The problem here is that we run into an error after the COPY FROM 
command completes. And we need to translate lineno from foreign server 
to lineno of overall COPY command.

-- 
regards,
Andrey Lepikhov
Postgres Professional

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

19 October 2020, 04:12:25

Hello Andrey-san,


Thank you for challenging an interesting feature.  Below are my review comments.


(1)
-    /* for use by copy.c when performing multi-inserts */
+    /*
+     * The following fields are currently only relevant to copy.c.
+     *
+     * True if okay to use multi-insert on this relation
+     */
+    bool ri_usesMultiInsert;
+
+    /* Buffer allocated to this relation when using multi-insert mode */
     struct CopyMultiInsertBuffer *ri_CopyMultiInsertBuffer;
 } ResultRelInfo;

It's better to place the new bool member next to an existing bool member, so that the structure doesn't get larger.


(2)
+    Assert(rri->ri_usesMultiInsert == false);

As the above assertion represents, I'm afraid the semantics of ExecRelationAllowsMultiInsert() and
ResultRelInfo->ri_usesMultiInsertare unclear.  In CopyFrom(), ri_usesMultiInsert is set by also considering the
COPY-specificconditions:
 

+    if (!cstate->volatile_defexprs &&
+        !contain_volatile_functions(cstate->whereClause) &&
+        ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
+        target_resultRelInfo->ri_usesMultiInsert = true;

On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set purely based on the relation's
characteristics.

+    leaf_part_rri->ri_usesMultiInsert =
+        ExecRelationAllowsMultiInsert(leaf_part_rri, rootResultRelInfo);

In addition to these differences, I think it's a bit confusing that the function itself doesn't record the check result
inri_usesMultiInsert.
 

It's probably easy to understand to not add ri_usesMultiInsert, and the function just encapsulates the check logic
basedsolely on the relation characteristics and returns the result.  So, the argument is just one ResultRelInfo.  The
caller(e.g. COPY) combines the function result with other specific conditions.
 


(3)
+typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate,
+                                             ResultRelInfo *rinfo);
+
+typedef void (*EndForeignCopy_function) (EState *estate,
+                                           ResultRelInfo *rinfo);
+
+typedef void (*ExecForeignCopy_function) (ResultRelInfo *rinfo,
+                                                       TupleTableSlot **slots,
+                                                       int nslots);

To align with other function groups, it's better to place the functions in order of Begin, Exec, and End.


(4)
+    /* COPY a bulk of tuples into a foreign relation */
+    BeginForeignCopy_function BeginForeignCopy;
+    EndForeignCopy_function EndForeignCopy;
+    ExecForeignCopy_function ExecForeignCopy;

To align with the other functions' comment, the comment should be:
    /* Support functions for COPY */


(5)
+<programlisting>
+TupleTableSlot *
+ExecForeignCopy(ResultRelInfo *rinfo,
+                  TupleTableSlot **slots,
+                  int nslots);
+</programlisting>
+
+     Copy a bulk of tuples into the foreign table.
+     <literal>estate</literal> is global execution state for the query.

The return type is void.


(6)
+     <literal>nslots</literal> cis a number of tuples in the <literal>slots</literal>

cis -> is


(7)
+    <para>
+     If the <function>ExecForeignCopy</function> pointer is set to
+     <literal>NULL</literal>, attempts to insert into the foreign table will fail
+     with an error message.
+    </para>

"attempts to insert into" should be "attempts to run COPY on", because it's used for COPY.
Furthermore, if ExecForeignCopy is NULL, COPY should use ExecForeignInsert() instead, right?  Otherwise, existing FDWs
wouldbecome unable to be used for COPY.
 


(8)
+    bool        pipe = (filename == NULL) && (data_dest_cb == NULL);

The above pipe in BeginCopyTo() is changed to not match pipe in DoCopyTo(), which only refers to filename.  Should pipe
inDoCopyTo() also be changed?  If no, the use of the same variable name for different conditions is confusing.
 


(9)
-     * partitions will be done later.
+-     * partitions will be done later.

This is an unintended addition of '-'?


(10)
-    if (resultRelInfo->ri_FdwRoutine != NULL &&
-        resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
-        resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
-                                                         resultRelInfo);
+    if (target_resultRelInfo->ri_FdwRoutine != NULL)
+    {
+        if (target_resultRelInfo->ri_usesMultiInsert)
+            target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate,
+                                                                  resultRelInfo);
+        else if (target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+            target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
+                                                                    resultRelInfo);
+    }

BeginForeignCopy() should be called if it's defined, because BeginForeignCopy() is an optional function.


(11) 
+        oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+
+        table_multi_insert(resultRelInfo->ri_RelationDesc,
+                           slots,

The extra empty line seems unintended.


(12)
@@ -585,7 +583,8 @@ CopySendEndOfRow(CopyState cstate)
             (void) pq_putmessage('d', fe_msgbuf->data, fe_msgbuf->len);
             break;
         case COPY_CALLBACK:
-            Assert(false);        /* Not yet supported. */
+            CopySendChar(cstate, '\n');
+            cstate->data_dest_cb(fe_msgbuf->data, fe_msgbuf->len);

As in the COPY_FILENAME case, shouldn't the line terminator be sent only in text format, and be changed to \r\n on
Windows? I'm asking this as I'm probably a bit confused about in what situation COPY_CALLBACK could be used.  I thought
thebinary format and \r\n line terminator could be necessary depending on the FDW implementation.
 


(13)
@@ -1001,9 +1001,13 @@ ExecInitRoutingInfo(ModifyTableState *mtstate,
      * If the partition is a foreign table, let the FDW init itself for
      * routing tuples to the partition.
      */
-    if (partRelInfo->ri_FdwRoutine != NULL &&
-        partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
-        partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
+    if (partRelInfo->ri_FdwRoutine != NULL)
+    {
+        if (partRelInfo->ri_usesMultiInsert)
+            partRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, partRelInfo);
+        else if (partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+            partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
+    }

BeginForeignCopy() should be called only if it's defined, because BeginForeignCopy() is an optional function.


(14)
@@ -1205,10 +1209,18 @@ ExecCleanupTupleRouting(ModifyTableState *mtstate,
         ResultRelInfo *resultRelInfo = proute->partitions[i];
 
         /* Allow any FDWs to shut down */
-        if (resultRelInfo->ri_FdwRoutine != NULL &&
-            resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
-            resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
-                                                           resultRelInfo);
+        if (resultRelInfo->ri_FdwRoutine != NULL)
+        {
+            if (resultRelInfo->ri_usesMultiInsert)
+            {
+                Assert(resultRelInfo->ri_FdwRoutine->EndForeignCopy != NULL);
+                resultRelInfo->ri_FdwRoutine->EndForeignCopy(mtstate->ps.state,
+                                                               resultRelInfo);
+            }
+            else if (resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
+                resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
+                                                               resultRelInfo);
+        }

EndForeignCopy() is an optional function, isn't it?  That is, it's called if it's defined.


(15)
+static void
+pgfdw_copy_dest_cb(void *buf, int len)
+{
+    PGconn *conn = copy_fmstate->conn;
+
+    if (PQputCopyData(conn, (char *) buf, len) <= 0)
+    {
+        PGresult *res = PQgetResult(conn);
+
+        pgfdw_report_error(ERROR, res, conn, true, copy_fmstate->query);
+    }
+}

The following page says "Use PQerrorMessage to retrieve details if the return value is -1."  So, it's correct to not
usePGresult here and pass NULL as the second argument to pgfdw_report_error().
 

https://www.postgresql.org/docs/devel/libpq-copy.html


(16)
+        for (i = 0; i < nslots; i++)
+            CopyOneRowTo(fmstate->cstate, slots[i]);
+
+        status = true;
+    }

I'm afraid it's not intuitive what "status is true" means.  I think copy_data_sent or copy_send_success would be better
forthe variable name.
 


(17)
+        if (PQputCopyEnd(conn, status ? NULL : _("canceled by server")) <= 0 ||
+            PQflush(conn))
+            ereport(ERROR,
+                    (errmsg("error returned by PQputCopyEnd: %s",
+                            PQerrorMessage(conn))));

As the places that call PQsendQuery(), it seems preferrable to call pgfdw_report_error() here too.


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

19 October 2020, 14:58:34


19.10.2020 09:12, tsunakawa.takay@fujitsu.com пишет:
> Hello Andrey-san,
> 
> 
> Thank you for challenging an interesting feature.  Below are my review comments.
> 
> 
> (1)
> -    /* for use by copy.c when performing multi-inserts */
> +    /*
> +     * The following fields are currently only relevant to copy.c.
> +     *
> +     * True if okay to use multi-insert on this relation
> +     */
> +    bool ri_usesMultiInsert;
> +
> +    /* Buffer allocated to this relation when using multi-insert mode */
>       struct CopyMultiInsertBuffer *ri_CopyMultiInsertBuffer;
>   } ResultRelInfo;
> 
> It's better to place the new bool member next to an existing bool member, so that the structure doesn't get larger.
Here the variable position chosen in accordance with the logical 
meaning. I don't see large problem with size of this structure.
> 
> 
> (2)
> +    Assert(rri->ri_usesMultiInsert == false);
> 
> As the above assertion represents, I'm afraid the semantics of ExecRelationAllowsMultiInsert() and
ResultRelInfo->ri_usesMultiInsertare unclear.  In CopyFrom(), ri_usesMultiInsert is set by also considering the
COPY-specificconditions:
 
> 
> +    if (!cstate->volatile_defexprs &&
> +        !contain_volatile_functions(cstate->whereClause) &&
> +        ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
> +        target_resultRelInfo->ri_usesMultiInsert = true;
> 
> On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set purely based on the relation's
characteristics.
> 
> +    leaf_part_rri->ri_usesMultiInsert =
> +        ExecRelationAllowsMultiInsert(leaf_part_rri, rootResultRelInfo);
> 
> In addition to these differences, I think it's a bit confusing that the function itself doesn't record the check
resultin ri_usesMultiInsert.
 
> 
> It's probably easy to understand to not add ri_usesMultiInsert, and the function just encapsulates the check logic
basedsolely on the relation characteristics and returns the result.  So, the argument is just one ResultRelInfo.  The
caller(e.g. COPY) combines the function result with other specific conditions.
 
I can't fully agreed with this suggestion. We do so because in the 
future anyone can call this code from another subsystem for another 
purposes. And we want all the relation-related restrictions contains in 
one routine. CopyState-related restrictions used in copy.c only and 
taken out of this function.
> 
> 
> (3)
> +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate,
> +                                             ResultRelInfo *rinfo);
> +
> +typedef void (*EndForeignCopy_function) (EState *estate,
> +                                           ResultRelInfo *rinfo);
> +
> +typedef void (*ExecForeignCopy_function) (ResultRelInfo *rinfo,
> +                                                       TupleTableSlot **slots,
> +                                                       int nslots);
> 
> To align with other function groups, it's better to place the functions in order of Begin, Exec, and End.
Ok, thanks.
> 
> 
> (4)
> +    /* COPY a bulk of tuples into a foreign relation */
> +    BeginForeignCopy_function BeginForeignCopy;
> +    EndForeignCopy_function EndForeignCopy;
> +    ExecForeignCopy_function ExecForeignCopy;
> 
> To align with the other functions' comment, the comment should be:
>     /* Support functions for COPY */
Agreed
> 
> 
> (5)
> +<programlisting>
> +TupleTableSlot *
> +ExecForeignCopy(ResultRelInfo *rinfo,
> +                  TupleTableSlot **slots,
> +                  int nslots);
> +</programlisting>
> +
> +     Copy a bulk of tuples into the foreign table.
> +     <literal>estate</literal> is global execution state for the query.
> 
> The return type is void.
Agreed
> 
> 
> (6)
> +     <literal>nslots</literal> cis a number of tuples in the <literal>slots</literal>
> 
> cis -> is
Ok
> 
> 
> (7)
> +    <para>
> +     If the <function>ExecForeignCopy</function> pointer is set to
> +     <literal>NULL</literal>, attempts to insert into the foreign table will fail
> +     with an error message.
> +    </para>
> 
> "attempts to insert into" should be "attempts to run COPY on", because it's used for COPY.
> Furthermore, if ExecForeignCopy is NULL, COPY should use ExecForeignInsert() instead, right?  Otherwise, existing
FDWswould become unable to be used for COPY.
 
Thanks
> 
> 
> (8)
> +    bool        pipe = (filename == NULL) && (data_dest_cb == NULL);
> 
> The above pipe in BeginCopyTo() is changed to not match pipe in DoCopyTo(), which only refers to filename.  Should
pipein DoCopyTo() also be changed?  If no, the use of the same variable name for different conditions is confusing.
 
Ok
> 
> 
> (9)
> -     * partitions will be done later.
> +-     * partitions will be done later.
> 
> This is an unintended addition of '-'?
Ok
> 
> 
> (10)
> -    if (resultRelInfo->ri_FdwRoutine != NULL &&
> -        resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
> -        resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
> -                                                         resultRelInfo);
> +    if (target_resultRelInfo->ri_FdwRoutine != NULL)
> +    {
> +        if (target_resultRelInfo->ri_usesMultiInsert)
> +            target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate,
> +                                                                  resultRelInfo);
> +        else if (target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
> +            target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
> +                                                                    resultRelInfo);
> +    }
> 
> BeginForeignCopy() should be called if it's defined, because BeginForeignCopy() is an optional function.
Maybe
> 
> 
> (11)
> +        oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
> +
> +        table_multi_insert(resultRelInfo->ri_RelationDesc,
> +                           slots,
> 
> The extra empty line seems unintended.
> 
Ok
> 
> (12)
> @@ -585,7 +583,8 @@ CopySendEndOfRow(CopyState cstate)
>               (void) pq_putmessage('d', fe_msgbuf->data, fe_msgbuf->len);
>               break;
>           case COPY_CALLBACK:
> -            Assert(false);        /* Not yet supported. */
> +            CopySendChar(cstate, '\n');
> +            cstate->data_dest_cb(fe_msgbuf->data, fe_msgbuf->len);
> 
> As in the COPY_FILENAME case, shouldn't the line terminator be sent only in text format, and be changed to \r\n on
Windows? I'm asking this as I'm probably a bit confused about in what situation COPY_CALLBACK could be used.  I thought
thebinary format and \r\n line terminator could be necessary depending on the FDW implementation.
 
> 
Ok. I don't want to allow binary format in callback mode right now. It 
is not a subject of this patch. Maybe it will be done later.
> 
> (13)
> @@ -1001,9 +1001,13 @@ ExecInitRoutingInfo(ModifyTableState *mtstate,
>        * If the partition is a foreign table, let the FDW init itself for
>        * routing tuples to the partition.
>        */
> -    if (partRelInfo->ri_FdwRoutine != NULL &&
> -        partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
> -        partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
> +    if (partRelInfo->ri_FdwRoutine != NULL)
> +    {
> +        if (partRelInfo->ri_usesMultiInsert)
> +            partRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, partRelInfo);
> +        else if (partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
> +            partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
> +    }
> 
> BeginForeignCopy() should be called only if it's defined, because BeginForeignCopy() is an optional function.
Ok
> 
> 
> (14)
> @@ -1205,10 +1209,18 @@ ExecCleanupTupleRouting(ModifyTableState *mtstate,
>           ResultRelInfo *resultRelInfo = proute->partitions[i];
>   
>           /* Allow any FDWs to shut down */
> -        if (resultRelInfo->ri_FdwRoutine != NULL &&
> -            resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
> -            resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
> -                                                           resultRelInfo);
> +        if (resultRelInfo->ri_FdwRoutine != NULL)
> +        {
> +            if (resultRelInfo->ri_usesMultiInsert)
> +            {
> +                Assert(resultRelInfo->ri_FdwRoutine->EndForeignCopy != NULL);
> +                resultRelInfo->ri_FdwRoutine->EndForeignCopy(mtstate->ps.state,
> +                                                               resultRelInfo);
> +            }
> +            else if (resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
> +                resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
> +                                                               resultRelInfo);
> +        }
> 
> EndForeignCopy() is an optional function, isn't it?  That is, it's called if it's defined.
> 
ri_usesMultiInsert must guarantee that we will use multi-insertions. And 
we use only assertions to control this.
> 
> (15)
> +static void
> +pgfdw_copy_dest_cb(void *buf, int len)
> +{
> +    PGconn *conn = copy_fmstate->conn;
> +
> +    if (PQputCopyData(conn, (char *) buf, len) <= 0)
> +    {
> +        PGresult *res = PQgetResult(conn);
> +
> +        pgfdw_report_error(ERROR, res, conn, true, copy_fmstate->query);
> +    }
> +}
> 
> The following page says "Use PQerrorMessage to retrieve details if the return value is -1."  So, it's correct to not
usePGresult here and pass NULL as the second argument to pgfdw_report_error().
 
> 
> https://www.postgresql.org/docs/devel/libpq-copy.html
Ok
> 
> 
> (16)
> +        for (i = 0; i < nslots; i++)
> +            CopyOneRowTo(fmstate->cstate, slots[i]);
> +
> +        status = true;
> +    }
> 
> I'm afraid it's not intuitive what "status is true" means.  I think copy_data_sent or copy_send_success would be
betterfor the variable name.
 
Agreed. renamed to 'OK'. In accordance with psql/copy.c.
> 
> 
> (17)
> +        if (PQputCopyEnd(conn, status ? NULL : _("canceled by server")) <= 0 ||
> +            PQflush(conn))
> +            ereport(ERROR,
> +                    (errmsg("error returned by PQputCopyEnd: %s",
> +                            PQerrorMessage(conn))));
> 
> As the places that call PQsendQuery(), it seems preferrable to call pgfdw_report_error() here too.
Agreed

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

20 October 2020, 02:30:57

Hi Andrey-san,


Thanks for the revision.  The patch looks good except for the following two items.


(18)
+    if (target_resultRelInfo->ri_FdwRoutine != NULL)
+    {
+        if (target_resultRelInfo->ri_usesMultiInsert)
+        {
+            Assert(target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy != NULL);
+            target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate,
+                                                                  resultRelInfo);
+        }

> > (14)
> > @@ -1205,10 +1209,18 @@ ExecCleanupTupleRouting(ModifyTableState
> *mtstate,
> >           ResultRelInfo *resultRelInfo = proute->partitions[i];
> >
> >           /* Allow any FDWs to shut down */
> > -        if (resultRelInfo->ri_FdwRoutine != NULL &&
> > -            resultRelInfo->ri_FdwRoutine->EndForeignInsert !=
> NULL)
> > -
>     resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
> > -
>                        resultRelInfo);
> > +        if (resultRelInfo->ri_FdwRoutine != NULL)
> > +        {
> > +            if (resultRelInfo->ri_usesMultiInsert)
> > +            {
> > +
>     Assert(resultRelInfo->ri_FdwRoutine->EndForeignCopy != NULL);
> > +
>     resultRelInfo->ri_FdwRoutine->EndForeignCopy(mtstate->ps.state,
> > +
>                            resultRelInfo);
> > +            }
> > +            else if
> (resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
> > +
>     resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
> > +
>                            resultRelInfo);
> > +        }
> >
> > EndForeignCopy() is an optional function, isn't it?  That is, it's called if it's
> defined.
> >
> ri_usesMultiInsert must guarantee that we will use multi-insertions. And we
> use only assertions to control this.

The code appears to require both BeginForeignCopy and EndForeignCopy, while the following documentation says they are
optional. Which is correct?  (I suppose the latter is correct just like other existing Begin/End functions are
optional.)

+     If the <function>BeginForeignCopy</function> pointer is set to
+     <literal>NULL</literal>, no action is taken for the initialization.

+     If the <function>EndForeignCopy</function> pointer is set to
+     <literal>NULL</literal>, no action is taken for the termination.




> > (2)
> > +    Assert(rri->ri_usesMultiInsert == false);
> >
> > As the above assertion represents, I'm afraid the semantics of
> ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are
> unclear.  In CopyFrom(), ri_usesMultiInsert is set by also considering the
> COPY-specific conditions:
> >
> > +    if (!cstate->volatile_defexprs &&
> > +        !contain_volatile_functions(cstate->whereClause) &&
> > +        ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
> > +        target_resultRelInfo->ri_usesMultiInsert = true;
> >
> > On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set
> purely based on the relation's characteristics.
> >
> > +    leaf_part_rri->ri_usesMultiInsert =
> > +        ExecRelationAllowsMultiInsert(leaf_part_rri,
> rootResultRelInfo);
> >
> > In addition to these differences, I think it's a bit confusing that the function
> itself doesn't record the check result in ri_usesMultiInsert.
> >
> > It's probably easy to understand to not add ri_usesMultiInsert, and the
> function just encapsulates the check logic based solely on the relation
> characteristics and returns the result.  So, the argument is just one
> ResultRelInfo.  The caller (e.g. COPY) combines the function result with other
> specific conditions.

> I can't fully agreed with this suggestion. We do so because in the future anyone
> can call this code from another subsystem for another purposes. And we want
> all the relation-related restrictions contains in one routine. CopyState-related
> restrictions used in copy.c only and taken out of this function.

I'm sorry if I'm misinterpreting you, but I think the following simply serves its role sufficiently and cleanly without
usingri_usesMultiInsert.
 

bool
ExecRelationAllowsMultiInsert(RelationRelInfo *rri)
{
    check if the relation allows multiinsert based on its characteristics;
    return true or false;
}

I'm concerned that if one subsystem sets ri_usesMultiInsert to true based on its additional specific conditions, it
mightlead to another subsystem's misjudgment.  For example, when subsystem A and B want to do different things
respectively:

[Subsystem A]
if (ExecRelationAllowsMultiInsert(rri) && {A's conditions})
    rri->ri_usesMultiInsert = true;
...
if (rri->ri_usesMultiInsert)
    do A's business;

[Subsystem B]
if (rri->ri_usesMultiInsert)
    do B's business;

Here, what if subsystem A and B don't want each other's specific conditions to hold true?  That is, A wants to do A's
businessonly if B's specific conditions don't hold true.  If A sets rri->ri_usesMultiInsert to true and passes rri to
B,then B wrongly does B's business despite that A's specific conditions are true.
 

(I think this is due to some form of violation of encapsulation.)


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Tomas Vondra

Date:

10 November 2020, 18:17:53

Hi,

I needed to look at this patch while working on something related, and I
found it got broken by 6973533650c a couple days ago. So here's a fixed
version, to keep cfbot happy. I haven't done any serious review yet.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

23 November 2020, 02:49:59

Hi Andrey-san,

From: Tomas Vondra <tomas.vondra@enterprisedb.com>
> I needed to look at this patch while working on something related, and I found it
> got broken by 6973533650c a couple days ago. So here's a fixed version, to keep
> cfbot happy. I haven't done any serious review yet.

Could I or my colleague continue this patch in a few days?  It looks it's stalled over one month.


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

23 November 2020, 08:39:00


On 11/23/20 7:49 AM, tsunakawa.takay@fujitsu.com wrote:
> Hi Andrey-san,
> 
> From: Tomas Vondra <tomas.vondra@enterprisedb.com>
>> I needed to look at this patch while working on something related, and I found it
>> got broken by 6973533650c a couple days ago. So here's a fixed version, to keep
>> cfbot happy. I haven't done any serious review yet.
> 
> Could I or my colleague continue this patch in a few days?  It looks it's stalled over one month.

I don't found any problems with this patch that needed to be corrected. 
It is wait for actions from committers side, i think.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Etsuro Fujita

Date:

24 November 2020, 02:56:20

On Mon, Nov 23, 2020 at 5:39 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 11/23/20 7:49 AM, tsunakawa.takay@fujitsu.com wrote:
> > Could I or my colleague continue this patch in a few days?  It looks it's stalled over one month.
>
> I don't found any problems with this patch that needed to be corrected.
> It is wait for actions from committers side, i think.

I'm planning to review this patch.  I think it would be better for
another pair of eyes to take a look at it, though.

Best regards,
Etsuro Fujita

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

24 November 2020, 04:27:07

Andrey-san, Fujita-san,

From: Etsuro Fujita <etsuro.fujita@gmail.com>
> On Mon, Nov 23, 2020 at 5:39 PM Andrey Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
> > On 11/23/20 7:49 AM, tsunakawa.takay@fujitsu.com wrote:
> > > Could I or my colleague continue this patch in a few days?  It looks it's
> stalled over one month.
> >
> > I don't found any problems with this patch that needed to be corrected.
> > It is wait for actions from committers side, i think.
> 
> I'm planning to review this patch.  I think it would be better for
> another pair of eyes to take a look at it, though.


There are the following two issues left untouched.

https://www.postgresql.org/message-id/TYAPR01MB2990DC396B338C98F27C8ED3FE1F0%40TYAPR01MB2990.jpnprd01.prod.outlook.com


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

24 November 2020, 06:04:27


On 11/24/20 9:27 AM, tsunakawa.takay@fujitsu.com wrote:
> Andrey-san, Fujita-san,
> 
> From: Etsuro Fujita <etsuro.fujita@gmail.com>
>> On Mon, Nov 23, 2020 at 5:39 PM Andrey Lepikhov
>> <a.lepikhov@postgrespro.ru> wrote:
>>> On 11/23/20 7:49 AM, tsunakawa.takay@fujitsu.com wrote:
>>>> Could I or my colleague continue this patch in a few days?  It looks it's
>> stalled over one month.
>>>
>>> I don't found any problems with this patch that needed to be corrected.
>>> It is wait for actions from committers side, i think.
>>
>> I'm planning to review this patch.  I think it would be better for
>> another pair of eyes to take a look at it, though.
> 
> 
> There are the following two issues left untouched.
> 
>
https://www.postgresql.org/message-id/TYAPR01MB2990DC396B338C98F27C8ED3FE1F0%40TYAPR01MB2990.jpnprd01.prod.outlook.com

I disagree with your opinion about changing the interface of the 
ExecRelationAllowsMultiInsert routine. If you insist on the need for 
this change, we need another opinion.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

25 November 2020, 09:47:51

Hi,

On Tue, Oct 20, 2020 at 11:31 AM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> > > (2)
> > > +   Assert(rri->ri_usesMultiInsert == false);
> > >
> > > As the above assertion represents, I'm afraid the semantics of
> > ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are
> > unclear.  In CopyFrom(), ri_usesMultiInsert is set by also considering the
> > COPY-specific conditions:
> > >
> > > +   if (!cstate->volatile_defexprs &&
> > > +           !contain_volatile_functions(cstate->whereClause) &&
> > > +           ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
> > > +           target_resultRelInfo->ri_usesMultiInsert = true;
> > >
> > > On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set
> > purely based on the relation's characteristics.
> > >
> > > +   leaf_part_rri->ri_usesMultiInsert =
> > > +           ExecRelationAllowsMultiInsert(leaf_part_rri,
> > rootResultRelInfo);
> > >
> > > In addition to these differences, I think it's a bit confusing that the function
> > itself doesn't record the check result in ri_usesMultiInsert.
> > >
> > > It's probably easy to understand to not add ri_usesMultiInsert, and the
> > function just encapsulates the check logic based solely on the relation
> > characteristics and returns the result.  So, the argument is just one
> > ResultRelInfo.  The caller (e.g. COPY) combines the function result with other
> > specific conditions.
>
> > I can't fully agreed with this suggestion. We do so because in the future anyone
> > can call this code from another subsystem for another purposes. And we want
> > all the relation-related restrictions contains in one routine. CopyState-related
> > restrictions used in copy.c only and taken out of this function.
>
> I'm sorry if I'm misinterpreting you, but I think the following simply serves its role sufficiently and cleanly
withoutusing ri_usesMultiInsert. 
>
> bool
> ExecRelationAllowsMultiInsert(RelationRelInfo *rri)
> {
>         check if the relation allows multiinsert based on its characteristics;
>         return true or false;
> }
>
> I'm concerned that if one subsystem sets ri_usesMultiInsert to true based on its additional specific conditions, it
mightlead to another subsystem's misjudgment.  For example, when subsystem A and B want to do different things
respectively:
>
> [Subsystem A]
> if (ExecRelationAllowsMultiInsert(rri) && {A's conditions})
>         rri->ri_usesMultiInsert = true;
> ...
> if (rri->ri_usesMultiInsert)
>         do A's business;
>
> [Subsystem B]
> if (rri->ri_usesMultiInsert)
>         do B's business;
>
> Here, what if subsystem A and B don't want each other's specific conditions to hold true?  That is, A wants to do A's
businessonly if B's specific conditions don't hold true.  If A sets rri->ri_usesMultiInsert to true and passes rri to
B,then B wrongly does B's business despite that A's specific conditions are true. 
>
> (I think this is due to some form of violation of encapsulation.)

Sorry about chiming in late, but I think Tsunakawa-san raises some
valid concerns.

First, IIUC, is whether we need the ri_usesMultiInsert flag at all.  I
think yes, because computing that information repeatedly for every row
seems wasteful, especially for a bulk operation, and even more so if
we're going to call a function when doing so.

Second is whether the interface for setting ri_usesMultiInsert
encourages situations where different modules could possibly engage in
conflicting behaviors.  I can't think of a real-life example of that
with the current implementation, but maybe the interface provided in
the patch makes it harder to ensure that that remains true in the
future.  Tsunakawa-san, have you encountered an example of this, maybe
when trying to integrate this patch with some other?

Anyway, one thing we could do is rename
ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(),
that is, to make it actually set ri_usesMultiInsert and have places
like CopyFrom() call it if (and only if) its local logic allows
multi-insert to be used.  So, ri_usesMultiInsert starts out set to
false and if a module wants to use multi-insert for a given target
relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag
on.  Also, given the confusion regarding how execPartition.c
manipulates the flag, maybe change ExecFindPartition() to accept a
Boolean parameter multi_insert, which it will pass down to
ExecInitPartitionInfo(), which in turn will call
ExecSetRelationUsesMultiInsert() for a given partition.  Of course, if
the logic in ExecSetRelationUsesMultiInsert() determines that
multi-insert can't be used, for the reasons listed in the function,
then the caller will have to live with that decision.

Any other ideas on how to make this work and look better?

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/d3fbf3bc93b7bcd99ff7fa9ee41e0e20%40postgrespro.ru

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

26 November 2020, 02:42:09

From: Amit Langote <amitlangote09@gmail.com>
> Second is whether the interface for setting ri_usesMultiInsert
> encourages situations where different modules could possibly engage in
> conflicting behaviors.  I can't think of a real-life example of that
> with the current implementation, but maybe the interface provided in
> the patch makes it harder to ensure that that remains true in the
> future.  Tsunakawa-san, have you encountered an example of this, maybe
> when trying to integrate this patch with some other?

Thanks.  No, I pointed out purely from the standpoint of program modularity (based on structured programming?)


> Anyway, one thing we could do is rename
> ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(),
> that is, to make it actually set ri_usesMultiInsert and have places
> like CopyFrom() call it if (and only if) its local logic allows
> multi-insert to be used.  So, ri_usesMultiInsert starts out set to
> false and if a module wants to use multi-insert for a given target
> relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag
> on.  Also, given the confusion regarding how execPartition.c

I think separating the setting and inspection of the property into different functions will be good, at least.


> manipulates the flag, maybe change ExecFindPartition() to accept a
> Boolean parameter multi_insert, which it will pass down to
> ExecInitPartitionInfo(), which in turn will call
> ExecSetRelationUsesMultiInsert() for a given partition.  Of course, if
> the logic in ExecSetRelationUsesMultiInsert() determines that
> multi-insert can't be used, for the reasons listed in the function,
> then the caller will have to live with that decision.

I can't say for sure, but it looks strange to me, because I can't find a good description of multi_insert argument for
ExecFindPartition(). If we add multi_insert, I'm afraid we may want to add further arguments for other properties in
thefuture like "Hey, get me the partition that has triggers.", "Next, pass me a partition that uses a foreign table.",
etc. I think the current ExecFindPartition() is good -- "Get me a partition that accepts this row."
 

I wonder if ri_usesMultiInsert is really necessary.  Would it cut down enough costs in the intended use case(s), say
theheavyweight COPY FROM?
 


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

30 November 2020, 14:06:47

On Thu, Nov 26, 2020 at 11:42 AM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> From: Amit Langote <amitlangote09@gmail.com>
> > Anyway, one thing we could do is rename
> > ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(),
> > that is, to make it actually set ri_usesMultiInsert and have places
> > like CopyFrom() call it if (and only if) its local logic allows
> > multi-insert to be used.  So, ri_usesMultiInsert starts out set to
> > false and if a module wants to use multi-insert for a given target
> > relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag
> > on.  Also, given the confusion regarding how execPartition.c
>
> I think separating the setting and inspection of the property into different functions will be good, at least.
>
> > manipulates the flag, maybe change ExecFindPartition() to accept a
> > Boolean parameter multi_insert, which it will pass down to
> > ExecInitPartitionInfo(), which in turn will call
> > ExecSetRelationUsesMultiInsert() for a given partition.  Of course, if
> > the logic in ExecSetRelationUsesMultiInsert() determines that
> > multi-insert can't be used, for the reasons listed in the function,
> > then the caller will have to live with that decision.
>
> I can't say for sure, but it looks strange to me, because I can't find a good description of multi_insert argument
forExecFindPartition().  If we add multi_insert, I'm afraid we may want to add further arguments for other properties
inthe future like "Hey, get me the partition that has triggers.", "Next, pass me a partition that uses a foreign
table.",etc.  I think the current ExecFindPartition() is good -- "Get me a partition that accepts this row." 
>
> I wonder if ri_usesMultiInsert is really necessary.  Would it cut down enough costs in the intended use case(s), say
theheavyweight COPY FROM? 

Thinking on this more, I think I'm starting to agree with you on this.
I skimmed the CopyFrom()'s main loop again today and indeed it doesn't
seem that the cost of checking the individual conditions for whether
or not to buffer the current tuple for the given target relation is
all that big to save with ri_usesMultiInsert.  So my argument that it
is good for performance is perhaps not that strong.

Andrey's original patch had the flag to, as I understand it, make the
partitioning case work correctly.  When inserting into a
non-partitioned table, there's only one relation to care about.  In
that case, CopyFrom() can use either the new COPY interface or the
INSERT interface for the entire operation when talking to a foreign
target relation's FDW driver.  With partitions, that has to be
considered separately for each partition.  What complicates the matter
further is that while the original target relation (the root
partitioned table in the partitioning case) is fully initialized in
CopyFrom(), partitions are lazily initialized by ExecFindPartition().
Note that the initialization of a given target relation can also
optionally involve calling the FDW to perform any pre-COPY
initializations.  So if a given partition is a foreign table, whether
the copy operation was initialized using the COPY interface or the
INSERT interface is determined away from CopyFrom().  Andrey created
ri_usesMultiInsert to remember which was used so that CopyFrom() can
use the correct interface during the subsequent interactions with the
partition's driver.

Now, it does not seem outright impossible to do this without the flag,
but maybe Andrey thinks it is good for readability?  If it is
confusing from a modularity standpoint, maybe we should rethink that.
That said, I still think that there should be a way for CopyFrom() to
tell ExecFindPartition() which FDW interface to initialize a given
foreign table partition's copy operation with -- COPY if the copy
allows multi-insert, INSERT if not.  Maybe the multi_insert parameter
I mentioned earlier would serve that purpose.

--
Amit Langote
EDB: http://www.enterprisedb.com

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

01 December 2020, 05:39:59

From: Amit Langote <amitlangote09@gmail.com>
> Andrey's original patch had the flag to, as I understand it, make the
> partitioning case work correctly.  When inserting into a
> non-partitioned table, there's only one relation to care about.  In
> that case, CopyFrom() can use either the new COPY interface or the
> INSERT interface for the entire operation when talking to a foreign
> target relation's FDW driver.  With partitions, that has to be
> considered separately for each partition.  What complicates the matter
> further is that while the original target relation (the root
> partitioned table in the partitioning case) is fully initialized in
> CopyFrom(), partitions are lazily initialized by ExecFindPartition().

Yeah, I felt it a bit confusing to see the calls to Begin/EndForeignInsert() in both CopyFrom() and
ExecInitRoutingInfo().


> Note that the initialization of a given target relation can also
> optionally involve calling the FDW to perform any pre-COPY
> initializations.  So if a given partition is a foreign table, whether
> the copy operation was initialized using the COPY interface or the
> INSERT interface is determined away from CopyFrom().  Andrey created
> ri_usesMultiInsert to remember which was used so that CopyFrom() can
> use the correct interface during the subsequent interactions with the
> partition's driver.
> 
> Now, it does not seem outright impossible to do this without the flag,
> but maybe Andrey thinks it is good for readability?  If it is
> confusing from a modularity standpoint, maybe we should rethink that.
> That said, I still think that there should be a way for CopyFrom() to
> tell ExecFindPartition() which FDW interface to initialize a given
> foreign table partition's copy operation with -- COPY if the copy
> allows multi-insert, INSERT if not.  Maybe the multi_insert parameter
> I mentioned earlier would serve that purpose.

I agree with your idea of adding multi_insert argument to ExecFindPartition() to request a multi-insert-capable
partition. At first, I thought ExecFindPartition() is used for all operations, insert/delete/update/select, so I found
itodd to add multi_insert argument.  But ExecFindPartion() is used only for insert, so multi_insert argument seems
okay.


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

01 December 2020, 09:02:48

On Tue, Dec 1, 2020 at 2:40 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> From: Amit Langote <amitlangote09@gmail.com>
> > Andrey's original patch had the flag to, as I understand it, make the
> > partitioning case work correctly.  When inserting into a
> > non-partitioned table, there's only one relation to care about.  In
> > that case, CopyFrom() can use either the new COPY interface or the
> > INSERT interface for the entire operation when talking to a foreign
> > target relation's FDW driver.  With partitions, that has to be
> > considered separately for each partition.  What complicates the matter
> > further is that while the original target relation (the root
> > partitioned table in the partitioning case) is fully initialized in
> > CopyFrom(), partitions are lazily initialized by ExecFindPartition().
>
> Yeah, I felt it a bit confusing to see the calls to Begin/EndForeignInsert() in both CopyFrom() and
ExecInitRoutingInfo().
>
> > Note that the initialization of a given target relation can also
> > optionally involve calling the FDW to perform any pre-COPY
> > initializations.  So if a given partition is a foreign table, whether
> > the copy operation was initialized using the COPY interface or the
> > INSERT interface is determined away from CopyFrom().  Andrey created
> > ri_usesMultiInsert to remember which was used so that CopyFrom() can
> > use the correct interface during the subsequent interactions with the
> > partition's driver.
> >
> > Now, it does not seem outright impossible to do this without the flag,
> > but maybe Andrey thinks it is good for readability?  If it is
> > confusing from a modularity standpoint, maybe we should rethink that.
> > That said, I still think that there should be a way for CopyFrom() to
> > tell ExecFindPartition() which FDW interface to initialize a given
> > foreign table partition's copy operation with -- COPY if the copy
> > allows multi-insert, INSERT if not.  Maybe the multi_insert parameter
> > I mentioned earlier would serve that purpose.
>
> I agree with your idea of adding multi_insert argument to ExecFindPartition() to request a multi-insert-capable
partition. At first, I thought ExecFindPartition() is used for all operations, insert/delete/update/select, so I found
itodd to add multi_insert argument.  But ExecFindPartion() is used only for insert, so multi_insert argument seems
okay.

Good.  Andrey, any thoughts on this?

--
Amit Langote
EDB: http://www.enterprisedb.com

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

14 December 2020, 09:06:12

On 12/1/20 2:02 PM, Amit Langote wrote:
> On Tue, Dec 1, 2020 at 2:40 PM tsunakawa.takay@fujitsu.com
> <tsunakawa.takay@fujitsu.com> wrote:
>> From: Amit Langote <amitlangote09@gmail.com>
 >> The code appears to require both BeginForeignCopy and EndForeignCopy,
 >> while the following documentation says they are optional.  Which is
 >> correct?  (I suppose the latter is correct just like other existing
 >> Begin/End functions are optional.)

Fixed.

 > Anyway, one thing we could do is rename
 > ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(

Renamed.

 >> I agree with your idea of adding multi_insert argument to 
ExecFindPartition() to request a multi-insert-capable partition.  At 
first, I thought ExecFindPartition() is used for all operations, 
insert/delete/update/select, so I found it odd to add multi_insert 
argument.  But ExecFindPartion() is used only for insert, so 
multi_insert argument seems okay.
 >
 > Good.  Andrey, any thoughts on this?

I have no serious technical arguments against this, other than code 
readability and reduce of a routine parameters. Maybe we will be 
rethinking it later?

The new version rebased on commit 525e60b742 is attached.


-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Tang, Haiying"

Date:

22 December 2020, 07:04:34

Hi Andrey,

There is an error report in your patch as follows. Please take a check.

https://travis-ci.org/github/postgresql-cfbot/postgresql/jobs/750682857#L1519

>copyfrom.c:374:21: error: ‘save_cur_lineno’ is used uninitialized in this function [-Werror=uninitialized]

Regards,
Tang

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

23 December 2020, 09:00:18

On 12/22/20 12:04 PM, Tang, Haiying wrote:
> Hi Andrey,
> 
> There is an error report in your patch as follows. Please take a check.
> 
> https://travis-ci.org/github/postgresql-cfbot/postgresql/jobs/750682857#L1519
> 
>> copyfrom.c:374:21: error: ‘save_cur_lineno’ is used uninitialized in this function [-Werror=uninitialized]
> 
> Regards,
> Tang
> 
> 

Thank you,
see new version in attachment.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v13_1-0002-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Hou, Zhijie"

Date:

29 December 2020, 11:20:47

Hi

> see new version in attachment.

I took a look into the patch, and have some comments.

1.
+    PG_FINALLY();
+    {
+        copy_fmstate = NULL; /* Detect problems */
I don't quite understand this comment,
does it means we want to detect something like Null reference ?


2.
+    PG_FINALLY();
+    {
    ...
+        if (!OK)
+            PG_RE_THROW();
+    }
Is this PG_RE_THROW() necessary ? 
IMO, PG_FINALLY will reproduce the PG_RE_THROW action if we get to the code block due to an error being thrown.

3.
+            ereport(ERROR,
+                    (errmsg("unexpected extra results during COPY of table: %s",
+                            PQerrorMessage(conn))));

I found some similar message like the following:

            pg_log_warning("unexpected extra results during COPY of table \"%s\"",
                           tocEntryTag);
How about using existing messages style ?

4.
I noticed some not standard code comment[1].
I think it's better to comment like:
/*
 * line 1
 * line 2
 */

[1]-----------
+        /* Finish COPY IN protocol. It is needed to do after successful copy or
+         * after an error.
+         */


+/*
+ *
+ * postgresExecForeignCopy

+/*
+ *
+ * postgresBeginForeignCopy

-----------
Best regards,
Houzj

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

30 December 2020, 07:16:07

On 29.12.2020 16:20, Hou, Zhijie wrote:
>> see new version in attachment.
> 
> I took a look into the patch, and have some comments.
> 
> 1.
> +    PG_FINALLY();
> +    {
> +        copy_fmstate = NULL; /* Detect problems */
> I don't quite understand this comment,
> does it means we want to detect something like Null reference ?
> 
> 
> 2.
> +    PG_FINALLY();
> +    {
>     ...
> +        if (!OK)
> +            PG_RE_THROW();
> +    }
> Is this PG_RE_THROW() necessary ?
> IMO, PG_FINALLY will reproduce the PG_RE_THROW action if we get to the code block due to an error being thrown.

This is a debugging stage atavisms. fixed.
> 
> 3.
> +            ereport(ERROR,
> +                    (errmsg("unexpected extra results during COPY of table: %s",
> +                            PQerrorMessage(conn))));
> 
> I found some similar message like the following:
> 
>             pg_log_warning("unexpected extra results during COPY of table \"%s\"",
>                            tocEntryTag);
> How about using existing messages style ?

This style is intended for use in frontend utilities, not for contrib 
extensions, i think.
> 
> 4.
> I noticed some not standard code comment[1].
> I think it's better to comment like:
> /*
>   * line 1
>   * line 2
>   */
> 
> [1]-----------
> +        /* Finish COPY IN protocol. It is needed to do after successful copy or
> +         * after an error.
> +         */
> 
> 
> +/*
> + *
> + * postgresExecForeignCopy
> 
> +/*
> + *
> + * postgresBeginForeignCopy
Thanks, fixed.
The patch in attachment rebased on 107a2d4204.


-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v13_2-0002-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Tang, Haiying"

Date:

11 January 2021, 11:59:13

Hi Andrey,

I had a general look at this extension feature, I think it's beneficial for some application scenarios of PostgreSQL.
SoI did 7 performance cases test on your patch(v13). The results are really good. As you can see below we can get 7-10
timesimprovement with this patch.
 

PSA test_copy_from.sql shows my test cases detail(I didn't attach my data file since it's too big). 

Below are the test results:
'Test No' corresponds to the number(0 1...6) in attached test_copy_from.sql.
%reg=(Patched-Unpatched)/Unpatched), Unit is millisecond.

|Test No| Test Case                                                                               |Patched(ms)  |
Unpatched(ms)|%reg   |
 

|-------|-----------------------------------------------------------------------------------------|-------------|---------------|-------|
|0      |COPY FROM insertion into the partitioned table(parition is foreign table)                | 102483.223  |
1083300.907 |  -91% |
 
|1      |COPY FROM insertion into the partitioned table(parition is foreign partition)            | 104779.893  |
1207320.287 |  -91% |
 
|2      |COPY FROM insertion into the foreign table(without partition)                            | 100268.730  |
1077309.158 |  -91% |
 
|3      |COPY FROM insertion into the partitioned table(part of foreign partitions)               | 104110.620  |
1134781.855 |  -91% |
 
|4      |COPY FROM insertion into the partitioned table with constraint(part of foreign partition)| 136356.201  |
1238539.603 |  -89% |
 
|5      |COPY FROM insertion into the foreign table with constraint(without partition)            | 136818.262  |
1189921.742 |  -89% |
 
|6      |\copy insertion into the partitioned table with constraint.                              | 140368.072  |
1242689.924 |  -89% |
 

If there is any question on my tests, please feel free to ask.

Best Regard,
Tang

Attachment

test_copy_from.sql

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Tomas Vondra

Date:

11 January 2021, 18:16:45

Hi Andrey,

Unfortunately, this no longer applies :-( I tried to apply this on top
of c532d15ddd (from 2020/12/30) but even that has non-trivial conflicts.

Can you send a rebased version?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

12 January 2021, 03:45:19

On 1/11/21 4:59 PM, Tang, Haiying wrote:
> Hi Andrey,
> 
> I had a general look at this extension feature, I think it's beneficial for some application scenarios of PostgreSQL.
SoI did 7 performance cases test on your patch(v13). The results are really good. As you can see below we can get 7-10
timesimprovement with this patch.
 
> 
> PSA test_copy_from.sql shows my test cases detail(I didn't attach my data file since it's too big).
> 
> Below are the test results:
> 'Test No' corresponds to the number(0 1...6) in attached test_copy_from.sql.
> %reg=(Patched-Unpatched)/Unpatched), Unit is millisecond.
> 
> |Test No| Test Case                                                                               |Patched(ms)  |
Unpatched(ms)|%reg   |
 
>
|-------|-----------------------------------------------------------------------------------------|-------------|---------------|-------|
> |0      |COPY FROM insertion into the partitioned table(parition is foreign table)                | 102483.223  |
1083300.907 |  -91% |
 
> |1      |COPY FROM insertion into the partitioned table(parition is foreign partition)            | 104779.893  |
1207320.287 |  -91% |
 
> |2      |COPY FROM insertion into the foreign table(without partition)                            | 100268.730  |
1077309.158 |  -91% |
 
> |3      |COPY FROM insertion into the partitioned table(part of foreign partitions)               | 104110.620  |
1134781.855 |  -91% |
 
> |4      |COPY FROM insertion into the partitioned table with constraint(part of foreign partition)| 136356.201  |
1238539.603 |  -89% |
 
> |5      |COPY FROM insertion into the foreign table with constraint(without partition)            | 136818.262  |
1189921.742 |  -89% |
 
> |6      |\copy insertion into the partitioned table with constraint.                              | 140368.072  |
1242689.924 |  -89% |
 
> 
> If there is any question on my tests, please feel free to ask.
> 
> Best Regard,
> Tang
Thank you for this work.
Sometimes before i suggested additional optimization [1] which can 
additionally speed up COPY by 2-4 times. Maybe you can perform the 
benchmark for this solution too?

[1] 
https://www.postgresql.org/message-id/da7ed3f5-b596-2549-3710-4cc2a602ec17%40postgrespro.ru

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

12 January 2021, 04:13:05

On 1/11/21 11:16 PM, Tomas Vondra wrote:
> Hi Andrey,
> 
> Unfortunately, this no longer applies :-( I tried to apply this on top
> of c532d15ddd (from 2020/12/30) but even that has non-trivial conflicts.
> 
> Can you send a rebased version?
> 
> regards
> 
Applied on 044aa9e70e.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v13_3-0002-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Tang, Haiying"

Date:

28 January 2021, 02:40:32

Hi Andrey,

> Sometimes before i suggested additional optimization [1] which can 
> additionally speed up COPY by 2-4 times. Maybe you can perform the 
> benchmark for this solution too?

Sorry for the late reply, I just have time to take this test now.
But the patch no longer applies, I tried to apply on e42b3c3bd6(2021/1/26) but failed.

Can you send a rebased version?

Regards,
Tang

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

02 February 2021, 06:57:04

Hello, Andrey-san,


From: Tang, Haiying <tanghy.fnst@cn.fujitsu.com>
> > Sometimes before i suggested additional optimization [1] which can
> > additionally speed up COPY by 2-4 times. Maybe you can perform the
> > benchmark for this solution too?
...
> But the patch no longer applies, I tried to apply on e42b3c3bd6(2021/1/26) but
> failed.
> 
> Can you send a rebased version?

I think the basic part of this patch set is the following.  The latter file unfortunately no longer applies to HEAD.

v13-0001-Move-multi-insert-decision-logic-into-executor.patch
v13_3-0002-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Plus, as Tang-san said, I'm afraid the following files are older and doesn't apply.

v9-0003-Add-separated-connections-into-the-postgres_fdw.patch
v9-0004-Optimized-version-of-the-Fast-COPY-FROM-feature

When do you think you can submit the rebased version of them?  (IIUC at the off-list HighGo meeting, you're planning to
postthem late this week after the global snapshot patch.)  Just in case you are not going to do them for the moment,
canwe rebase and/or further modify them so that they can be committed in PG 14?
 


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

02 February 2021, 08:28:27

On 2/2/21 11:57, tsunakawa.takay@fujitsu.com wrote:
> Hello, Andrey-san,
> 
> 
> From: Tang, Haiying <tanghy.fnst@cn.fujitsu.com>
>>> Sometimes before i suggested additional optimization [1] which can
>>> additionally speed up COPY by 2-4 times. Maybe you can perform the
>>> benchmark for this solution too?
> ...
>> But the patch no longer applies, I tried to apply on e42b3c3bd6(2021/1/26) but
>> failed.
>>
>> Can you send a rebased version?
> When do you think you can submit the rebased version of them?  (IIUC at the off-list HighGo meeting, you're planning
topost them late this week after the global snapshot patch.)  Just in case you are not going to do them for the moment,
canwe rebase and/or further modify them so that they can be committed in PG 14?
 
Of course, you can rebase it.

-- 
regards,
Andrey Lepikhov
Postgres Professional

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

02 February 2021, 09:13:29

From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
> Of course, you can rebase it.

Thank you.  I might modify the basic part to incorporate my past proposal about improving the layering or modularity
relatedto ri_useMultiInsert.  (But I may end up giving up due to lack of energy.)
 

Also, I might defer working on the extended part (v9 0003 and 0004) and further separate them in a different thread, if
itseems to take longer.
 


Regards
Takayuki Tsunakawa

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

09 February 2021, 04:35:03

From: tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com>
> From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
> > Of course, you can rebase it.
> 
> Thank you.  I might modify the basic part to incorporate my past proposal
> about improving the layering or modularity related to ri_useMultiInsert.  (But I
> may end up giving up due to lack of energy.)

Rebased to HEAD with the following modifications.  It passes make check in the top directory and contrib/postgres_fdw.

(1)
Placed and ordered new three FDW functions consistently among their documentation, declaration and definition.


(2)
Check if BeginForeignCopy is not NULL before calling it, because the documentation says it's not mandatory.


(3)
Changed the function name ExecSetRelationUsesMultiInsert() to ExecMultiInsertAllowed() because it does *not* set
anythingbut returns a boolean value to indicate whether the relation allows multi-insert.  I was bugged about this
function'sinterface and the use of ri_usesMultiInsert in ResultRelInfo.  I still feel a bit uneasy about things like
whetherthe function should really take the partition root (parent) argument, and whether it's a good design that
ri_usesMultiInsertis used for the executor functions to determine which of Begin/EndForeignCopy() or
Begin/EndForeignInsert()should be called.  I'm fine with COPY using executor, but it feels a bit uncomfortable for the
executorfunctions to be aware of COPY.
 


That said, with the reviews from some people and good performance results, I think this can be ready for committer.


> Also, I might defer working on the extended part (v9 0003 and 0004) and further
> separate them in a different thread, if it seems to take longer.

I reviewed them but haven't rebased them (it seems to take more labor.)
Andrey-san, could you tell us:

* Why is a separate FDW connection established for each COPY?  To avoid using the same FDW connection for multiple
foreigntable partitions in a single COPY run?
 

* In what kind of test did you get 2-4x performance gain?  COPY into many foreign table partitions where the input rows
areordered randomly enough that many rows don't accumulate in the COPY buffer?
 


Regards
Takayuki Tsunakawa

Attachment

v14-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

09 February 2021, 05:57:44

On 2/9/21 9:35 AM, tsunakawa.takay@fujitsu.com wrote:
>     From: tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com>
>> From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
>> Also, I might defer working on the extended part (v9 0003 and 0004) and further
>> separate them in a different thread, if it seems to take longer.
> 
> I reviewed them but haven't rebased them (it seems to take more labor.)
> Andrey-san, could you tell us:
> 
> * Why is a separate FDW connection established for each COPY?  To avoid using the same FDW connection for multiple
foreigntable partitions in a single COPY run?
 
With separate connection you can init a 'COPY FROM' session for each 
foreign partition just one time on partition initialization.
> 
> * In what kind of test did you get 2-4x performance gain?  COPY into many foreign table partitions where the input
rowsare ordered randomly enough that many rows don't accumulate in the COPY buffer?
 
I used 'INSERT INTO .. SELECT * FROM generate_series(1, N)' to generate 
test data and HASH partitioning to avoid skews.

-- 
regards,
Andrey Lepikhov
Postgres Professional

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

09 February 2021, 07:47:05

From: Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>
> On 2/9/21 9:35 AM, tsunakawa.takay@fujitsu.com wrote:
> > * Why is a separate FDW connection established for each COPY?  To avoid
> using the same FDW connection for multiple foreign table partitions in a single
> COPY run?
> With separate connection you can init a 'COPY FROM' session for each
> foreign partition just one time on partition initialization.
> >
> > * In what kind of test did you get 2-4x performance gain?  COPY into many
> foreign table partitions where the input rows are ordered randomly enough that
> many rows don't accumulate in the COPY buffer?
> I used 'INSERT INTO .. SELECT * FROM generate_series(1, N)' to generate
> test data and HASH partitioning to avoid skews.

I guess you used many hash partitions.  Sadly, The current COPY implementation only accumulates either 1,000 rows or 64
KBof input data (very small!) before flushing all CopyMultiInsertBuffers.  One CopyMultiInsertBuffer corresponds to one
partition. Flushing a CopyMultiInsertBuffer calls ExecForeignCopy() once, which connects to a remote database, runs
COPYFROM STDIN, and disconnects.  Here, the flushing trigger (1,000 rows or 64 KB input data, whichever comes first) is
sosmall that if there are many target partitions, the amount of data for each partition is small.

Looking at the triggering threshold values, the description (of MAX_BUFFERED_TUPLES at least) seems to indicate that
theytake effect per CopyMultiInsertBuffer:

/*
 * No more than this many tuples per CopyMultiInsertBuffer
 *
 * Caution: Don't make this too big, as we could end up with this many
 * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's
 * multiInsertBuffers list.  Increasing this can cause quadratic growth in
* memory requirements during copies into partitioned tables with a large
 * number of partitions.
 */
#define MAX_BUFFERED_TUPLES     1000

/*
 * Flush buffers if there are >= this many bytes, as counted by the input
 * size, of tuples stored.
 */
#define MAX_BUFFERED_BYTES      65535

But these threshold take effect across all CopyMultiInsertBuffers:

/*
 * Returns true if the buffers are full
 */
static inline bool
CopyMultiInsertInfoIsFull(CopyMultiInsertInfo *miinfo)
{
    if (miinfo->bufferedTuples >= MAX_BUFFERED_TUPLES ||
        miinfo->bufferedBytes >= MAX_BUFFERED_BYTES)
        return true;
    return false;
}

So, I think the direction to take is to allow more data to accumulate before flushing.  I'm not very excited about the
way0003 and 0004 establishes a new connection for each partition; it adds flags to many places, and
postgresfdw_xact_callback()has to be aware of COPY-specific processing.  Plus, we have to take care of the message
differenceyou found in the regression test.

Why don't we focus on committing the basic part and addressing the extended part (0003 and 0004) separately later?  As
Tang-sanand you showed, the basic part already demonstrated impressive improvement.  If there's no objection, I'd like
tomake this ready for committer in a few days.

Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

09 February 2021, 08:22:33

On 2/9/21 12:47 PM, tsunakawa.takay@fujitsu.com wrote:
> From: Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>
> I guess you used many hash partitions.  Sadly, The current COPY implementation only accumulates either 1,000 rows or
64KB of input data (very small!) before flushing all CopyMultiInsertBuffers.  One CopyMultiInsertBuffer corresponds to
onepartition.  Flushing a CopyMultiInsertBuffer calls ExecForeignCopy() once, which connects to a remote database, runs
COPYFROM STDIN, and disconnects.  Here, the flushing trigger (1,000 rows or 64 KB input data, whichever comes first) is
sosmall that if there are many target partitions, the amount of data for each partition is small.
 
I tried to use 1E4 - 1E8 rows in a tuple buffer. But the results weren't 
impressive.
We can use one more GUC instead of a precompiled constant.

> Why don't we focus on committing the basic part and addressing the extended part (0003 and 0004) separately later?
I focused only on the 0001 and 0002 patches.
>  As Tang-san and you showed, the basic part already demonstrated impressive improvement.  If there's no objection,
I'dlike to make this ready for committer in a few days.
 
Good.

-- 
regards,
Andrey Lepikhov
Postgres Professional

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

09 February 2021, 08:30:37

From: Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>
> I tried to use 1E4 - 1E8 rows in a tuple buffer. But the results weren't
> impressive.

I guess that's because the 64 KB threshold came first.


> We can use one more GUC instead of a precompiled constant.

Yes, agreed.


> > Why don't we focus on committing the basic part and addressing the
> extended part (0003 and 0004) separately later?
> I focused only on the 0001 and 0002 patches.
> >  As Tang-san and you showed, the basic part already demonstrated
> impressive improvement.  If there's no objection, I'd like to make this ready for
> committer in a few days.
> Good.

Glad to hear that.


Regards
Takayuki Tsunakawa

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

12 February 2021, 07:43:00

From: Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>
> On 2/9/21 12:47 PM, tsunakawa.takay@fujitsu.com wrote:
> >  As Tang-san and you showed, the basic part already demonstrated
> impressive improvement.  If there's no objection, I'd like to make this ready for
> committer in a few days.
> Good.

I've marked this as ready for committer.  Good luck.


Regards
Takayuki Tsunakawa

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Justin Pryzby

Date:

14 February 2021, 21:03:38

On Tue, Feb 09, 2021 at 04:35:03AM +0000, tsunakawa.takay@fujitsu.com wrote:
> Rebased to HEAD with the following modifications.  It passes make check in the top directory and
contrib/postgres_fdw.
> That said, with the reviews from some people and good performance results, I think this can be ready for committer.

This is crashing during fdw check.
http://cfbot.cputube.org/andrey-lepikhov.html

Maybe it's related to this patch:
|commit 6214e2b2280462cbc3aa1986e350e167651b3905
|    Fix permission checks on constraint violation errors on partitions.
|    Security: CVE-2021-3393

TRAP: FailedAssertion("n >= 0 && n < list->length", File: "../../src/include/nodes/pg_list.h", Line: 259, PID: 19780)

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007fd33a557801 in __GI_abort () at abort.c:79
#2  0x000055f7f53bbc88 in ExceptionalCondition (conditionName=conditionName@entry=0x7fd33b81bc40 "n >= 0 && n <
list->length",errorType=errorType@entry=0x7fd33b81b698 "FailedAssertion", 
 
    fileName=fileName@entry=0x7fd33b81be70 "../../src/include/nodes/pg_list.h", lineNumber=lineNumber@entry=259) at
assert.c:69
#3  0x00007fd33b816b54 in list_nth_cell (n=<optimized out>, list=<optimized out>) at
../../src/include/nodes/pg_list.h:259
#4  list_nth (n=<optimized out>, list=<optimized out>) at ../../src/include/nodes/pg_list.h:281
#5  exec_rt_fetch (estate=<optimized out>, rti=<optimized out>) at ../../src/include/executor/executor.h:558
#6  postgresBeginForeignCopy (mtstate=<optimized out>, resultRelInfo=<optimized out>) at postgres_fdw.c:2208
#7  0x000055f7f5114bb4 in ExecInitRoutingInfo (mtstate=mtstate@entry=0x55f7f710a508,
estate=estate@entry=0x55f7f71a7d50,proute=proute@entry=0x55f7f710a720, dispatch=dispatch@entry=0x55f7f710a778, 
 
    partRelInfo=partRelInfo@entry=0x55f7f710eb20, partidx=partidx@entry=0) at execPartition.c:1004
#8  0x000055f7f511618d in ExecInitPartitionInfo (partidx=0, rootResultRelInfo=0x55f7f710a278, dispatch=0x55f7f710a778,
proute=0x55f7f710a720,estate=0x55f7f71a7d50, mtstate=0x55f7f710a508) at execPartition.c:742
 
#9  ExecFindPartition () at execPartition.c:400
#10 0x000055f7f50a2718 in CopyFrom () at copyfrom.c:857
#11 0x000055f7f50a1b06 in DoCopy () at copy.c:299

(gdb) up
#7  0x000055f7f5114bb4 in ExecInitRoutingInfo (mtstate=mtstate@entry=0x55f7f710a508,
estate=estate@entry=0x55f7f71a7d50,proute=proute@entry=0x55f7f710a720, dispatch=dispatch@entry=0x55f7f710a778, 
 
    partRelInfo=partRelInfo@entry=0x55f7f710eb20, partidx=partidx@entry=0) at execPartition.c:1004
1004                            partRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, partRelInfo);
(gdb) p partRelInfo->ri_RangeTableIndex
$7 = 0
(gdb) p *estate->es_range_table
$9 = {type = T_List, length = 1, max_length = 5, elements = 0x55f7f717a2c0, initial_elements = 0x55f7f717a2c0}

-- 
Justin

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

15 February 2021, 04:54:00

From: Justin Pryzby <pryzby@telsasoft.com>
> This is crashing during fdw check.
> http://cfbot.cputube.org/andrey-lepikhov.html
>
> Maybe it's related to this patch:
> |commit 6214e2b2280462cbc3aa1986e350e167651b3905
> |    Fix permission checks on constraint violation errors on partitions.
> |    Security: CVE-2021-3393

Thank you for your kind detailed investigation.  The rebased version is attached.


Regards
Takayuki Tsunakawa

Attachment

v15-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Amit Langote

Date:

15 February 2021, 08:31:11

Tsunakawa-san, Andrey,

On Mon, Feb 15, 2021 at 1:54 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> From: Justin Pryzby <pryzby@telsasoft.com>
> > This is crashing during fdw check.
> > http://cfbot.cputube.org/andrey-lepikhov.html
> >
> > Maybe it's related to this patch:
> > |commit 6214e2b2280462cbc3aa1986e350e167651b3905
> > |    Fix permission checks on constraint violation errors on partitions.
> > |    Security: CVE-2021-3393
>
> Thank you for your kind detailed investigation.  The rebased version is attached.

Thanks for updating the patch.

The commit message says this:

    Move that decision logic into InitResultRelInfo which now sets a new
    boolean field ri_usesMultiInsert of ResultRelInfo when a target
    relation is first initialized.  That prevents repeated computation
    of the same information in some cases, especially for partitions,
    and the new arrangement results in slightly more readability.
    Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert
    field of the ResultRelInfo structure.

However, it is no longer InitResultRelInfo() that sets
ri_usesMultiInsert.  Doing that is now left for concerned functions
who set it when they have enough information to do that correctly.
Maybe update the message to make that clear to interested readers.

+   /*
+    * Use multi-insert mode if the condition checking passes for the
+    * parent and its child.
+    */
+   leaf_part_rri->ri_usesMultiInsert =
+       ExecMultiInsertAllowed(leaf_part_rri, rootResultRelInfo);

Think I have mentioned upthread that this looks better as:

if (rootResultRelInfo->ri_usesMultiInsert)
    leaf_part_rri->ri_usesMultiInsert = ExecMultiInsertAllowed(leaf_part_rri);

This keeps the logic confined to ExecInitPartitionInfo() where it
belongs.  No point in burdening other callers of
ExecMultiInsertAllowed() in deciding whether or not it should pass a
valid value for the 2nd parameter.

+static void
+postgresBeginForeignCopy(ModifyTableState *mtstate,
+                          ResultRelInfo *resultRelInfo)
+{
...
+   if (resultRelInfo->ri_RangeTableIndex == 0)
+   {
+       ResultRelInfo *rootResultRelInfo = resultRelInfo->ri_RootResultRelInfo;
+
+       rte = exec_rt_fetch(rootResultRelInfo->ri_RangeTableIndex, estate);

It's better to add an Assert(rootResultRelInfo != NULL) here.
Apparently, there are cases where ri_RangeTableIndex == 0 without
ri_RootResultRelInfo being set.  The Assert will ensure that
BeginForeignCopy() is not mistakenly called on such ResultRelInfos.

+/*
+ * Deparse COPY FROM into given buf.
+ * We need to use list of parameters at each query.
+ */
+void
+deparseCopyFromSql(StringInfo buf, Relation rel)
+{
+   appendStringInfoString(buf, "COPY ");
+   deparseRelation(buf, rel);
+   (void) deparseRelColumnList(buf, rel, true);
+
+   appendStringInfoString(buf, " FROM STDIN ");
+}

I can't parse what the function's comment says about "using list of
parameters".  Maybe it means to say "list of columns" specified in the
COPY FROM statement.  How about writing this as:

/*
 * Deparse remote COPY FROM statement
 *
 * Note that this explicitly specifies the list of COPY's target columns
 * to account for the fact that the remote table's columns may not match
 * exactly with the columns declared in the local definition.
 */

I'm hoping that I'm interpreting the original note correctly.  Andrey?

+    <para>
+     <literal>mtstate</literal> is the overall state of the
+     <structname>ModifyTable</structname> plan node being executed;
global data about
+     the plan and execution state is available via this structure.
...
+typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate,
+                                          ResultRelInfo *rinfo);

Maybe a bit late realizing this, but why does BeginForeignCopy()
accept a ModifyTableState pointer whereas maybe just an EState pointer
will do?  I can't imagine why an FDW would want to look at the
ModifyTableState.  Case in point, I see that
postgresBeginForeignCopy() only uses the EState from the
ModifyTableState passed to it.  I think the ResultRelInfo that's being
passed to the Copy APIs contains most of the necessary information.
Also, EndForeignCopy() seems fine with just receiving the EState.

+   TupleDesc   tupDesc;        /* COPY TO will be used for manual tuple copying
+                                 * into the destination */
...
@@ -382,19 +393,24 @@ EndCopy(CopyToState cstate)
 CopyToState
 BeginCopyTo(ParseState *pstate,
            Relation rel,
+           TupleDesc srcTupDesc,

I think that either the commentary around tupDesc/srcTupDesc needs to
be improved or we should really find a way to do this without
maintaining TupleDesc separately from the CopyState.rel.  IIUC, this
change is merely to allow postgres_fdw's ExecForeignCopy() to use
CopyOneRowTo() which needs to be passed a valid CopyState.
postgresBeginForeignCopy() initializes one by calling BeginCopyTo(),
but it can't just pass the target foreign Relation to it, because
generic BeginCopyTo() has this:

    if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
    {
        ...
        else if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
            ereport(ERROR,
                    (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                     errmsg("cannot copy from foreign table \"%s\"",
                            RelationGetRelationName(rel)),
                     errhint("Try the COPY (SELECT ...) TO variant.")));

If the intention is to only prevent this error, maybe the condition
above could be changed as this:

    /*
     * Check whether we support copying data out of the specified relation,
     * unless the caller also passed a non-NULL data_dest_cb, in which case,
     * the callback will take care of it
     */
    if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION &&
        data_dest_cb == NULL)

I just checked that this works or at least doesn't break any newly added tests.

--
Amit Langote
EDB: http://www.enterprisedb.com

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

16 February 2021, 05:39:57

From: Amit Langote <amitlangote09@gmail.com>
> Think I have mentioned upthread that this looks better as:
> 
> if (rootResultRelInfo->ri_usesMultiInsert)
>     leaf_part_rri->ri_usesMultiInsert = ExecMultiInsertAllowed(leaf_part_rri);
> 
> This keeps the logic confined to ExecInitPartitionInfo() where it
> belongs.  No point in burdening other callers of
> ExecMultiInsertAllowed() in deciding whether or not it should pass a
> valid value for the 2nd parameter.

Oh, that's a good idea.  (Why didn't I think of such a simple idea?)



> Maybe a bit late realizing this, but why does BeginForeignCopy()
> accept a ModifyTableState pointer whereas maybe just an EState pointer
> will do?  I can't imagine why an FDW would want to look at the
> ModifyTableState.  Case in point, I see that
> postgresBeginForeignCopy() only uses the EState from the
> ModifyTableState passed to it.  I think the ResultRelInfo that's being
> passed to the Copy APIs contains most of the necessary information.

You're right.  COPY is not under the control of a ModifyTable plan, so it's strange to pass ModifyTableState.


> Also, EndForeignCopy() seems fine with just receiving the EState.

I think this can have the ResultRelInfo like EndForeignInsert() and EndForeignModify() to correspond to the Begin
function:"begin/end COPYing into this relation."
 


>     /*
>      * Check whether we support copying data out of the specified relation,
>      * unless the caller also passed a non-NULL data_dest_cb, in which case,
>      * the callback will take care of it
>      */
>     if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION &&
>         data_dest_cb == NULL)
> 
> I just checked that this works or at least doesn't break any newly added tests.

Good idea, too.  The code has become more readable.

Thank you a lot.  Your other comments that are not mentioned above are also reflected.  The attached patch passes the
postgres_fdwregression test.
 


Regards
Takayuki Tsunakawa

Attachment

v16-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

"Andrey V. Lepikhov"

Date:

16 February 2021, 08:23:34

On 2/15/21 1:31 PM, Amit Langote wrote:
> Tsunakawa-san, Andrey,
> +static void
> +postgresBeginForeignCopy(ModifyTableState *mtstate,
> +                          ResultRelInfo *resultRelInfo)
> +{
> ...
> +   if (resultRelInfo->ri_RangeTableIndex == 0)
> +   {
> +       ResultRelInfo *rootResultRelInfo = resultRelInfo->ri_RootResultRelInfo;
> +
> +       rte = exec_rt_fetch(rootResultRelInfo->ri_RangeTableIndex, estate);
> 
> It's better to add an Assert(rootResultRelInfo != NULL) here.
> Apparently, there are cases where ri_RangeTableIndex == 0 without
> ri_RootResultRelInfo being set.  The Assert will ensure that
> BeginForeignCopy() is not mistakenly called on such ResultRelInfos.

+1

> I can't parse what the function's comment says about "using list of
> parameters".  Maybe it means to say "list of columns" specified in the
> COPY FROM statement.  How about writing this as:
> 
> /*
>   * Deparse remote COPY FROM statement
>   *
>   * Note that this explicitly specifies the list of COPY's target columns
>   * to account for the fact that the remote table's columns may not match
>   * exactly with the columns declared in the local definition.
>   */
> 
> I'm hoping that I'm interpreting the original note correctly.  Andrey?

Yes, this is a good option.

> 
> +    <para>
> +     <literal>mtstate</literal> is the overall state of the
> +     <structname>ModifyTable</structname> plan node being executed;
> global data about
> +     the plan and execution state is available via this structure.
> ...
> +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate,
> +                                          ResultRelInfo *rinfo);
> 
> Maybe a bit late realizing this, but why does BeginForeignCopy()
> accept a ModifyTableState pointer whereas maybe just an EState pointer
> will do?  I can't imagine why an FDW would want to look at the
> ModifyTableState.  Case in point, I see that
> postgresBeginForeignCopy() only uses the EState from the
> ModifyTableState passed to it.  I think the ResultRelInfo that's being
> passed to the Copy APIs contains most of the necessary information.
> Also, EndForeignCopy() seems fine with just receiving the EState.

+1

> If the intention is to only prevent this error, maybe the condition
> above could be changed as this:
> 
>      /*
>       * Check whether we support copying data out of the specified relation,
>       * unless the caller also passed a non-NULL data_dest_cb, in which case,
>       * the callback will take care of it
>       */
>      if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION &&
>          data_dest_cb == NULL)

Agreed. This is an atavism. In the first versions, I did not use the 
data_dest_cb routine. But now this is a redundant parameter.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Andrey Lepikhov

Date:

22 March 2021, 18:33:23

On 5/3/21 21:54, tsunakawa.takay@fujitsu.com wrote:
> I've managed to rebased it, although it took unexpectedly long.  The patch is attached.  It passes make check against
coreand postgres_fdw.  I'll turn the CF status back to ready for committer shortly.
 

Macros _() at the postgresExecForeignCopy routine:
if (PQputCopyEnd(conn, OK ? NULL : _("canceled by server")) <= 0)

uses gettext. Under linux it is compiled ok, because (as i understood)  
uses standard implementation of gettext:
objdump -t contrib/postgres_fdw/postgres_fdw.so | grep 'gettext'
gettext@@GLIBC_2.2.5

but in MacOS (and maybe somewhere else) we need to explicitly link  
libintl library in the Makefile:
SHLIB_LINK += $(filter -lintl, $(LIBS)

Also, we may not use gettext at all in this part of the code.

-- 
regards,
Andrey Lepikhov
Postgres Professional

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

23 March 2021, 02:01:56

From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
> Macros _() at the postgresExecForeignCopy routine:
> if (PQputCopyEnd(conn, OK ? NULL : _("canceled by server")) <= 0)
>
> uses gettext. Under linux it is compiled ok, because (as i understood)
> uses standard implementation of gettext:
> objdump -t contrib/postgres_fdw/postgres_fdw.so | grep 'gettext'
> gettext@@GLIBC_2.2.5
>
> but in MacOS (and maybe somewhere else) we need to explicitly link
> libintl library in the Makefile:
> SHLIB_LINK += $(filter -lintl, $(LIBS)
>
> Also, we may not use gettext at all in this part of the code.

I'm afraid so, because no extension in contrib/ has po/ directory.  I just removed _() and rebased the patch on HEAD.


    Regards
Takayuki     Tsunakawa

Attachment

v21-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Zhihong Yu

Date:

23 March 2021, 03:18:56

Hi,

In the description:

with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination.

send text -> sending text

struct PgFdwModifyState *aux_fmstate; /* foreign-insert state, if
* created */
+ CopyToState cstate; /* foreign COPY state, if used */

Since foreign COPY is optional, should cstate be a pointer ? That would be in line with aux_fmstate.

Cheers

On Mon, Mar 22, 2021 at 7:02 PM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote:

From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
> Macros _() at the postgresExecForeignCopy routine:
> if (PQputCopyEnd(conn, OK ? NULL : _("canceled by server")) <= 0)
>
> uses gettext. Under linux it is compiled ok, because (as i understood)
> uses standard implementation of gettext:
> objdump -t contrib/postgres_fdw/postgres_fdw.so | grep 'gettext'
> gettext@@GLIBC_2.2.5
>
> but in MacOS (and maybe somewhere else) we need to explicitly link
> libintl library in the Makefile:
> SHLIB_LINK += $(filter -lintl, $(LIBS)
>
> Also, we may not use gettext at all in this part of the code.

I'm afraid so, because no extension in contrib/ has po/ directory. I just removed _() and rebased the patch on HEAD.

Regards
Takayuki Tsunakawa

RE: [POC] Fast COPY FROM command for the table with foreign partitions

From

"tsunakawa.takay@fujitsu.com"

Date:

23 March 2021, 05:05:48

From: Justin Pryzby <pryzby@telsasoft.com>
> On Mon, Mar 22, 2021 at 08:18:56PM -0700, Zhihong Yu wrote:
> > with data_dest_cb callback. It is used for send text representation of a
> > tuple to a custom destination.
> >
> > send text -> sending text
>
> I would say "It is used to send the text representation ..."

I took Justin-san's suggestion.  (It feels like I'm in a junior English class...)


> It's actually a pointer:
> src/include/commands/copy.h:typedef struct CopyToStateData *CopyToState;
>
> There's many data structures like this, where a structure is typedefed with a
> "Data" suffix and the pointer is typedefed without the "Data"

Yes.  Thank you for good explanation, Justin-san.



    Regards
Takayuki     Tsunakawa

Attachment

v22-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Zhihong Yu

Date:

09 April 2021, 01:42:40

On Thu, Apr 8, 2021 at 5:49 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.

--
Justin

Hi,

In src/backend/commands/copyfrom.c :

+ if (resultRelInfo->ri_RelationDesc->rd_rel->relkind == RELKIND_FOREIGN_TABLE)

There are a few steps of indirection. Adding assertion before the if statement on resultRelInfo->ri_RelationDesc, etc would help catch potential invalid pointer.

+CopyToStart(CopyToState cstate)

...

+CopyToFinish(CopyToState cstate)

Since 'copy to' is the action, it would be easier to read the method names if they're called StartCopyTo, FinishCopyTo, respectively.

That way, the method names would be consistent with existing ones, such as:

extern uint64 DoCopyTo(CopyToState cstate);

+ * If a partition's root parent isn't allowed to use it, neither is the

In the above sentence, 'it' refers to multi insert. It would be more readable to explicitly mention 'multi insert' instead of 'it'

Cheers

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Etsuro Fujita

Date:

09 April 2021, 03:50:28

On Fri, Apr 9, 2021 at 9:49 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.
>
> --
> Justin

Re: [POC] Fast COPY FROM command for the table with foreign partitions

From

Etsuro Fujita

Date:

09 April 2021, 03:53:55

On Fri, Apr 9, 2021 at 9:49 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6.

Thanks for rebasing!

Actually, I've started reviewing this, but I couldn't finish my
review.  My apologies for not having much time on this.  I'll continue
to work on it for PG15.

Sorry for the empty email.

Best regards,
Etsuro Fujita

Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

04 June 2021, 08:26:29

Hi,
We still have slow 'COPY FROM' operation for foreign tables in current 
master.
Now we have a foreign batch insert operation And I tried to rewrite the 
patch [1] with this machinery.

The patch (see in attachment) smaller than [1] and no changes required 
in FDW API.

Benchmarking
============
I used two data sets: with a number of 1E6 and 1E7 tuples. As a foreign 
server emulation I used loopback FDW links.

Test table:
CREATE TABLE test(a int, payload varchar(80));

Execution time of COPY FROM into single foreign table:
version    | 1E6 tuples | 1E7 tuples |
master:    | 64s        | 775s       |
Patch [1]: | 5s         | 50s        |
Current:   | 4s         | 42s        |
Execution time of the COPY operation into a plane table is 0.8s for 1E6 
tuples and 8s for 1E7 tuples.

Execution time of COPY FROM into the table partitioned by three foreign 
partitions:
version    | 1E6 tuples | 1E7 tuples |
master:    | 85s        | 900s       |
Patch [1]: | 10s        | 100s       |
Current:   | 3.5s       | 34s        |

But the bulk insert execution time in current implementation strongly 
depends on MAX_BUFFERED_TUPLES/BYTES value and in my experiments was 
reduced to 50s.

[1] 
https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

0001-Implementation-of-a-Bulk-COPY-FROM.patch

RE: Fast COPY FROM based on batch insert

From

"tsunakawa.takay@fujitsu.com"

Date:

04 June 2021, 08:45:55

From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
> We still have slow 'COPY FROM' operation for foreign tables in current master.
> Now we have a foreign batch insert operation And I tried to rewrite the patch [1]
> with this machinery.

I haven't looked at the patch, but nice performance.

However, I see the following problems.  What do you think about them?

1)
No wonder why the user would think like "Why are INSERTs run on the remote server?  I ran COPY."


2)
Without the FDW API for COPY, other FDWs won't get a chance to optimize for bulk data loading.  For example, oracle_fdw
mightuse conventional path insert for the FDW batch insert, and the direct path insert for the FDW COPY.
 


3)
INSERT and COPY in Postgres differs in whether the rule is invoked:

https://www.postgresql.org/docs/devel/sql-copy.html

"COPY FROM will invoke any triggers and check constraints on the destination table. However, it will not invoke
rules."


Regards
Takayuki Tsunakawa

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

07 June 2021, 06:36:18

On 4/6/21 13:45, tsunakawa.takay@fujitsu.com wrote:
> From: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
>> We still have slow 'COPY FROM' operation for foreign tables in current master.
>> Now we have a foreign batch insert operation And I tried to rewrite the patch [1]
>> with this machinery.
> 
> I haven't looked at the patch, but nice performance.
> 
> However, I see the following problems.  What do you think about them?
I agree with your fears.
Think about this patch as an intermediate step on the way to fast COPY 
FROM. This patch contains all logic of the previous patch, except of 
transport machinery (bulk insertion api).
It may be simpler to understand advantages of proposed 'COPY' FDW API 
having committed 'COPY FROM ...' feature based on the bulk insert FDW API.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

07 June 2021, 11:16:58

Second version of the patch fixes problems detected by the FDW 
regression tests and shows differences of error reports in 
tuple-by-tuple and batched COPY approaches.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v2-0001-Implementation-of-a-Bulk-COPY-FROM.patch

Re: Fast COPY FROM based on batch insert

From

Andres Freund

Date:

21 March 2022, 23:58:01

On 2021-06-07 16:16:58 +0500, Andrey Lepikhov wrote:
> Second version of the patch fixes problems detected by the FDW regression
> tests and shows differences of error reports in tuple-by-tuple and batched
> COPY approaches.

Patch doesn't apply and likely hasn't for a while...

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

22 March 2022, 01:54:00

On Fri, Jun 4, 2021 at 5:26 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> We still have slow 'COPY FROM' operation for foreign tables in current
> master.
> Now we have a foreign batch insert operation And I tried to rewrite the
> patch [1] with this machinery.

I’d been reviewing the previous version of the patch without noticing
this.  (Gmail grouped it in a new thread due to the subject change,
but I overlooked the whole thread.)

I agree with you that the first step for fast copy into foreign
tables/partitions is to use the foreign-batch-insert API.  (Actually,
I was also thinking the same while reviewing the previous version.)
Thanks for the new version of the patch!

The patch has been rewritten to something essentially different, but
no one reviewed it.  (Tsunakawa-san gave some comments without looking
at it, though.)  So the right status of the patch is “Needs review”,
rather than “Ready for Committer”?  Anyway, here are a few review
comments from me:

* I don’t think this assumption is correct:

@@ -359,6 +386,12 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
                 (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
                  resultRelInfo->ri_TrigDesc->trig_insert_new_table))
        {
+           /*
+            * AFTER ROW triggers aren't allowed with the foreign bulk insert
+            * method.
+            */
+           Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind !=
RELKIND_FOREIGN_TABLE);
+

In postgres_fdw we disable foreign batch insert when the target table
has AFTER ROW triggers, but the core allows it even in that case.  No?

* To allow foreign multi insert, the patch made an invasive change to
the existing logic to determine whether to use multi insert for the
target relation, adding a new member ri_usesMultiInsert to the
ResultRelInfo struct, as well as introducing a new function
ExecMultiInsertAllowed().  But I’m not sure we really need such a
change.  Isn’t it reasonable to *adjust* the existing logic to allow
foreign multi insert when possible?

I didn’t finish my review, but I’ll mark this as “Waiting on Author”.

My apologies for the long long delay.

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

22 March 2022, 02:36:00

On Tue, Mar 22, 2022 at 8:58 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-06-07 16:16:58 +0500, Andrey Lepikhov wrote:
> > Second version of the patch fixes problems detected by the FDW regression
> > tests and shows differences of error reports in tuple-by-tuple and batched
> > COPY approaches.
>
> Patch doesn't apply and likely hasn't for a while...

Actually, it has bit-rotted due to the recent fix for cross-partition
updates (i.e., commit ba9a7e392).

Thanks!

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

"Andrey V. Lepikhov"

Date:

24 March 2022, 06:43:37

On 3/22/22 06:54, Etsuro Fujita wrote:
> On Fri, Jun 4, 2021 at 5:26 PM Andrey Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> We still have slow 'COPY FROM' operation for foreign tables in current
>> master.
>> Now we have a foreign batch insert operation And I tried to rewrite the
>> patch [1] with this machinery.
> 
> The patch has been rewritten to something essentially different, but
> no one reviewed it.  (Tsunakawa-san gave some comments without looking
> at it, though.)  So the right status of the patch is “Needs review”,
> rather than “Ready for Committer”?  Anyway, here are a few review
> comments from me:
> 
> * I don’t think this assumption is correct:
> 
> @@ -359,6 +386,12 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
>                   (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
>                    resultRelInfo->ri_TrigDesc->trig_insert_new_table))
>          {
> +           /*
> +            * AFTER ROW triggers aren't allowed with the foreign bulk insert
> +            * method.
> +            */
> +           Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind !=
> RELKIND_FOREIGN_TABLE);
> +
> 
> In postgres_fdw we disable foreign batch insert when the target table
> has AFTER ROW triggers, but the core allows it even in that case.  No?
Agree

> * To allow foreign multi insert, the patch made an invasive change to
> the existing logic to determine whether to use multi insert for the
> target relation, adding a new member ri_usesMultiInsert to the
> ResultRelInfo struct, as well as introducing a new function
> ExecMultiInsertAllowed().  But I’m not sure we really need such a
> change.  Isn’t it reasonable to *adjust* the existing logic to allow
> foreign multi insert when possible?
Of course, such approach would look much better, if we implemented it. 
I'll ponder how to do it.

> I didn’t finish my review, but I’ll mark this as “Waiting on Author”.
I rebased the patch onto current master. Now it works correctly. I'll 
mark it as "Waiting for review".

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v3-0001-Implementation-of-a-Bulk-COPY-FROM.patch

Re: Fast COPY FROM based on batch insert

From

Ian Barwick

Date:

07 July 2022, 03:14:44

2022年3月24日(木) 15:44 Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>:
 >
 > On 3/22/22 06:54, Etsuro Fujita wrote:
 > > On Fri, Jun 4, 2021 at 5:26 PM Andrey Lepikhov
 > > <a.lepikhov@postgrespro.ru> wrote:
 > >> We still have slow 'COPY FROM' operation for foreign tables in current
 > >> master.
 > >> Now we have a foreign batch insert operation And I tried to rewrite the
 > >> patch [1] with this machinery.
 > >
 > > The patch has been rewritten to something essentially different, but
 > > no one reviewed it.  (Tsunakawa-san gave some comments without looking
 > > at it, though.)  So the right status of the patch is “Needs review”,
 > > rather than “Ready for Committer”?  Anyway, here are a few review
 > > comments from me:
 > >
 > > * I don’t think this assumption is correct:
 > >
 > > @@ -359,6 +386,12 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
 > >                   (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
 > >                    resultRelInfo->ri_TrigDesc->trig_insert_new_table))
 > >          {
 > > +           /*
 > > +            * AFTER ROW triggers aren't allowed with the foreign bulk insert
 > > +            * method.
 > > +            */
 > > +           Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind !=
 > > RELKIND_FOREIGN_TABLE);
 > > +
 > >
 > > In postgres_fdw we disable foreign batch insert when the target table
 > > has AFTER ROW triggers, but the core allows it even in that case.  No?
 > Agree
 >
 > > * To allow foreign multi insert, the patch made an invasive change to
 > > the existing logic to determine whether to use multi insert for the
 > > target relation, adding a new member ri_usesMultiInsert to the
 > > ResultRelInfo struct, as well as introducing a new function
 > > ExecMultiInsertAllowed().  But I’m not sure we really need such a
 > > change.  Isn’t it reasonable to *adjust* the existing logic to allow
 > > foreign multi insert when possible?
 > Of course, such approach would look much better, if we implemented it.
 > I'll ponder how to do it.
 >
 > > I didn’t finish my review, but I’ll mark this as “Waiting on Author”.
 > I rebased the patch onto current master. Now it works correctly. I'll
 > mark it as "Waiting for review".

I took a look at this patch as it would a useful optimization to have.

It applies cleanly to current HEAD, but as-is, with a large data set, it
reproducibly fails like this (using postgres_fdw):

     postgres=# COPY foo FROM '/tmp/fast-copy-from/test.csv' WITH (format csv);
     ERROR:  bind message supplies 0 parameters, but prepared statement "pgsql_fdw_prep_19422" requires 6
     CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, v1, v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5,
$6)
     COPY foo, line 17281589

This occurs because not all multi-insert buffers being flushed actually contain
tuples; the fix is simply not to call ExecForeignBatchInsert() if that's the case,
e.g:


         /* Flush into foreign table or partition */
         do {
             int size = (resultRelInfo->ri_BatchSize < nused - sent) ?
                         resultRelInfo->ri_BatchSize : (nused - sent);

             if (size)
             {
                 int inserted = size;

                 resultRelInfo->ri_FdwRoutine->ExecForeignBatchInsert(estate,
                                                                      resultRelInfo,
                                                                      &slots[sent],
                                                                      NULL,
                                                                      &inserted);
                 sent += size;
             }
         } while (sent < nused);


There might a case for arguing that the respective FDW should check that it has
actually received some tuples to insert, but IMHO it's much preferable to catch
this as early as possible and avoid a superfluous call.

FWIW, with the above fix in place, with a simple local test the patch produces a
consistent speed-up of about 8 times compared to the existing functionality.


Regards

Ian Barwick

--

EnterpriseDB - https://www.enterprisedb.com

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

07 July 2022, 13:51:54

On 7/7/2022 06:14, Ian Barwick wrote:
> 2022年3月24日(木) 15:44 Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>:
>  >
>  > On 3/22/22 06:54, Etsuro Fujita wrote:
>  > > On Fri, Jun 4, 2021 at 5:26 PM Andrey Lepikhov
>  > > <a.lepikhov@postgrespro.ru> wrote:
>  > >> We still have slow 'COPY FROM' operation for foreign tables in 
> current
>  > >> master.
>  > >> Now we have a foreign batch insert operation And I tried to 
> rewrite the
>  > >> patch [1] with this machinery.
>  > >
>  > > The patch has been rewritten to something essentially different, but
>  > > no one reviewed it.  (Tsunakawa-san gave some comments without looking
>  > > at it, though.)  So the right status of the patch is “Needs review”,
>  > > rather than “Ready for Committer”?  Anyway, here are a few review
>  > > comments from me:
>  > >
>  > > * I don’t think this assumption is correct:
>  > >
>  > > @@ -359,6 +386,12 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo 
> *miinfo,
>  > >                   
> (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
>  > >                    resultRelInfo->ri_TrigDesc->trig_insert_new_table))
>  > >          {
>  > > +           /*
>  > > +            * AFTER ROW triggers aren't allowed with the foreign 
> bulk insert
>  > > +            * method.
>  > > +            */
>  > > +           Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind !=
>  > > RELKIND_FOREIGN_TABLE);
>  > > +
>  > >
>  > > In postgres_fdw we disable foreign batch insert when the target table
>  > > has AFTER ROW triggers, but the core allows it even in that case.  No?
>  > Agree
>  >
>  > > * To allow foreign multi insert, the patch made an invasive change to
>  > > the existing logic to determine whether to use multi insert for the
>  > > target relation, adding a new member ri_usesMultiInsert to the
>  > > ResultRelInfo struct, as well as introducing a new function
>  > > ExecMultiInsertAllowed().  But I’m not sure we really need such a
>  > > change.  Isn’t it reasonable to *adjust* the existing logic to allow
>  > > foreign multi insert when possible?
>  > Of course, such approach would look much better, if we implemented it.
>  > I'll ponder how to do it.
>  >
>  > > I didn’t finish my review, but I’ll mark this as “Waiting on Author”.
>  > I rebased the patch onto current master. Now it works correctly. I'll
>  > mark it as "Waiting for review".
> 
> I took a look at this patch as it would a useful optimization to have.
> 
> It applies cleanly to current HEAD, but as-is, with a large data set, it
> reproducibly fails like this (using postgres_fdw):
> 
>      postgres=# COPY foo FROM '/tmp/fast-copy-from/test.csv' WITH 
> (format csv);
>      ERROR:  bind message supplies 0 parameters, but prepared statement 
> "pgsql_fdw_prep_19422" requires 6
>      CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, v1, 
> v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5, $6)
>      COPY foo, line 17281589
> 
> This occurs because not all multi-insert buffers being flushed actually 
> contain
> tuples; the fix is simply not to call ExecForeignBatchInsert() if that's 
> the case,
> e.g:
> 
> 
>          /* Flush into foreign table or partition */
>          do {
>              int size = (resultRelInfo->ri_BatchSize < nused - sent) ?
>                          resultRelInfo->ri_BatchSize : (nused - sent);
> 
>              if (size)
>              {
>                  int inserted = size;
> 
>                  
> resultRelInfo->ri_FdwRoutine->ExecForeignBatchInsert(estate,
>                                                                       
> resultRelInfo,
>                                                                       
> &slots[sent],
>                                                                       NULL,
>                                                                       
> &inserted);
>                  sent += size;
>              }
>          } while (sent < nused);
> 
> 
> There might a case for arguing that the respective FDW should check that 
> it has
> actually received some tuples to insert, but IMHO it's much preferable 
> to catch
> this as early as possible and avoid a superfluous call.
> 
> FWIW, with the above fix in place, with a simple local test the patch 
> produces a
> consistent speed-up of about 8 times compared to the existing 
> functionality.
Thank you for the attention to the patch.
I have a couple of questions:
1. It's a problem for me to reproduce the case you reported. Can you 
give more details on the reproduction?
2. Have you tried to use previous version, based on bulk COPY machinery, 
not bulk INSERT? Which approach looks better and have better performance 
in your opinion?

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Ian Barwick

Date:

08 July 2022, 02:12:28

On 07/07/2022 22:51, Andrey Lepikhov wrote:
 > On 7/7/2022 06:14, Ian Barwick wrote:
 >> 2022年3月24日(木) 15:44 Andrey V. Lepikhov <a.lepikhov@postgrespro.ru>:
 >>  >
 >>  > On 3/22/22 06:54, Etsuro Fujita wrote:
 >>  > > On Fri, Jun 4, 2021 at 5:26 PM Andrey Lepikhov
 >>  > > <a.lepikhov@postgrespro.ru> wrote:
 >>  > >> We still have slow 'COPY FROM' operation for foreign tables in current
 >>  > >> master.
 >>  > >> Now we have a foreign batch insert operation And I tried to rewrite the
 >>  > >> patch [1] with this machinery.
 >>  > >
 >>  > > The patch has been rewritten to something essentially different, but
 >>  > > no one reviewed it.  (Tsunakawa-san gave some comments without looking
 >>  > > at it, though.)  So the right status of the patch is “Needs review”,
 >>  > > rather than “Ready for Committer”?  Anyway, here are a few review
 >>  > > comments from me:
 >>  > >
 >>  > > * I don’t think this assumption is correct:
 >>  > >
 >>  > > @@ -359,6 +386,12 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
 >>  > > (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
 >>  > >                    resultRelInfo->ri_TrigDesc->trig_insert_new_table))
 >>  > >          {
 >>  > > +           /*
 >>  > > +            * AFTER ROW triggers aren't allowed with the foreign bulk insert
 >>  > > +            * method.
 >>  > > +            */
 >>  > > +           Assert(resultRelInfo->ri_RelationDesc->rd_rel->relkind !=
 >>  > > RELKIND_FOREIGN_TABLE);
 >>  > > +
 >>  > >
 >>  > > In postgres_fdw we disable foreign batch insert when the target table
 >>  > > has AFTER ROW triggers, but the core allows it even in that case.  No?
 >>  > Agree
 >>  >
 >>  > > * To allow foreign multi insert, the patch made an invasive change to
 >>  > > the existing logic to determine whether to use multi insert for the
 >>  > > target relation, adding a new member ri_usesMultiInsert to the
 >>  > > ResultRelInfo struct, as well as introducing a new function
 >>  > > ExecMultiInsertAllowed().  But I’m not sure we really need such a
 >>  > > change.  Isn’t it reasonable to *adjust* the existing logic to allow
 >>  > > foreign multi insert when possible?
 >>  > Of course, such approach would look much better, if we implemented it.
 >>  > I'll ponder how to do it.
 >>  >
 >>  > > I didn’t finish my review, but I’ll mark this as “Waiting on Author”.
 >>  > I rebased the patch onto current master. Now it works correctly. I'll
 >>  > mark it as "Waiting for review".
 >>
 >> I took a look at this patch as it would a useful optimization to have.
 >>
 >> It applies cleanly to current HEAD, but as-is, with a large data set, it
 >> reproducibly fails like this (using postgres_fdw):
 >>
 >>      postgres=# COPY foo FROM '/tmp/fast-copy-from/test.csv' WITH (format csv);
 >>      ERROR:  bind message supplies 0 parameters, but prepared statement "pgsql_fdw_prep_19422" requires 6
 >>      CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, v1, v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5,
$6)
 >>      COPY foo, line 17281589
 >>
 >> This occurs because not all multi-insert buffers being flushed actually contain
 >> tuples; the fix is simply not to call ExecForeignBatchInsert() if that's the case,
 >> e.g:
 >>
 >>
 >>          /* Flush into foreign table or partition */
 >>          do {
 >>              int size = (resultRelInfo->ri_BatchSize < nused - sent) ?
 >>                          resultRelInfo->ri_BatchSize : (nused - sent);
 >>
 >>              if (size)
 >>              {
 >>                  int inserted = size;
 >>
 >> resultRelInfo->ri_FdwRoutine->ExecForeignBatchInsert(estate,
 >> resultRelInfo,
 >> &slots[sent],
 >>                                                                       NULL,
 >> &inserted);
 >>                  sent += size;
 >>              }
 >>          } while (sent < nused);
 >>
 >>
 >> There might a case for arguing that the respective FDW should check that it has
 >> actually received some tuples to insert, but IMHO it's much preferable to catch
 >> this as early as possible and avoid a superfluous call.
 >>
 >> FWIW, with the above fix in place, with a simple local test the patch produces a
 >> consistent speed-up of about 8 times compared to the existing functionality.
 >
 > Thank you for the attention to the patch.
 > I have a couple of questions:
 >
 > 1. It's a problem for me to reproduce the case you reported. Can you give more
 >    details on the reproduction?

The issue seems to occur when the data spans more than one foreign partition,
probably because the accumulated data for one partition needs to be flushed
before moving on to the next partition, but not all pre-allocated multi-insert
buffers have been filled.

The reproduction method I have, which is pared down from the original bulk insert
which triggered the error, is as follows:

1. Create some data using the attached script:

     perl data.pl > /tmp/data.csv

2. Create two nodes (A and B)

3. On node B, create tables as follows:

     CREATE TABLE foo_part_1 (t timestamptz, v1 int, v2 int, v3 int, v4 text, v5 text);
     CREATE TABLE foo_part_2 (t timestamptz, v1 int, v2 int, v3 int, v4 text, v5 text);

4. On node A, create FDW and partitioned table as follows:

     -- adjust parameters as appropriate

     CREATE EXTENSION postgres_fdw;

     CREATE SERVER pg_fdw
       FOREIGN DATA WRAPPER postgres_fdw
       OPTIONS (
         host 'localhost',
         port '6301',
         dbname 'postgres',
         batch_size '100'
      );

     CREATE USER MAPPING FOR CURRENT_USER SERVER pg_fdw
       OPTIONS(user 'postgres');

     -- create parition table and partitions

     CREATE TABLE foo (t timestamptz, v1 int, v2 int, v3 int, v4 text, v5 text)  PARTITION BY RANGE(t);

     CREATE FOREIGN TABLE foo_part_1
        PARTITION OF foo
        FOR VALUES FROM ('2022-05-19 00:00:00') TO ('2022-05-20 00:00:00')
        SERVER pg_fdw;

     CREATE FOREIGN TABLE foo_part_2
        PARTITION OF foo
        FOR VALUES FROM ('2022-05-20 00:00:00') TO ('2022-05-21 00:00:00')
        SERVER pg_fdw;

5. On node A, load the previously generated data with COPY:

     COPY foo FROM '/tmp/data.csv' with (format 'csv');

This will fail like this:

     ERROR:  bind message supplies 0 parameters, but prepared statement "pgsql_fdw_prep_178" requires 6
     CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, v1, v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5,
$6)
     COPY foo, line 88160


 > 2. Have you tried to use previous version, based on bulk COPY machinery, not
 >    bulk INSERT? > Which approach looks better and have better performance in
 >    your opinion?

Aha, I didn't see that, I'll take a look.


Regards

Ian Barwick

-- 

EnterpriseDB - https://www.enterprisedb.com

Attachment

data.pl

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

08 July 2022, 15:09:35

On 8/7/2022 05:12, Ian Barwick wrote:
>      ERROR:  bind message supplies 0 parameters, but prepared statement 
> "pgsql_fdw_prep_178" requires 6
>      CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, v1, 
> v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5, $6)
>      COPY foo, line 88160
Thanks, I got it. MultiInsertBuffer are created on the first non-zero 
flush of tuples into the partition and isn't deleted from the buffers 
list until the end of COPY. And on a subsequent flush in the case of 
empty buffer we catch the error.
Your fix is correct, but I want to propose slightly different change 
(see in attachment).

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

flush_nonzero_buffer.txt

Re: Fast COPY FROM based on batch insert

From

Ian Barwick

Date:

11 July 2022, 01:12:26

On 09/07/2022 00:09, Andrey Lepikhov wrote:
> On 8/7/2022 05:12, Ian Barwick wrote:
>>      ERROR:  bind message supplies 0 parameters, but prepared statement "pgsql_fdw_prep_178" requires 6
>>      CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, v1, v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5,
$6)
>>      COPY foo, line 88160
> Thanks, I got it. MultiInsertBuffer are created on the first non-zero flush of tuples into the partition and isn't
deletedfrom the buffers list until the end of COPY. And on a subsequent flush in the case of empty buffer we catch the
error.
> Your fix is correct, but I want to propose slightly different change (see in attachment).

LGTM.

Regards

Ian Barwick

--
https://www.enterprisedb.com/

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

11 July 2022, 03:54:08

On 11/7/2022 04:12, Ian Barwick wrote:
> On 09/07/2022 00:09, Andrey Lepikhov wrote:
>> On 8/7/2022 05:12, Ian Barwick wrote:
>>>      ERROR:  bind message supplies 0 parameters, but prepared 
>>> statement "pgsql_fdw_prep_178" requires 6
>>>      CONTEXT:  remote SQL command: INSERT INTO public.foo_part_1(t, 
>>> v1, v2, v3, v4, v5) VALUES ($1, $2, $3, $4, $5, $6)
>>>      COPY foo, line 88160
>> Thanks, I got it. MultiInsertBuffer are created on the first non-zero 
>> flush of tuples into the partition and isn't deleted from the buffers 
>> list until the end of COPY. And on a subsequent flush in the case of 
>> empty buffer we catch the error.
>> Your fix is correct, but I want to propose slightly different change 
>> (see in attachment).
> 
> LGTM.
New version (with aforementioned changes) is attached.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

v4-0001-Implementation-of-a-Bulk-COPY-FROM.patch

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

18 July 2022, 08:22:49

On Thu, Mar 24, 2022 at 3:43 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 3/22/22 06:54, Etsuro Fujita wrote:
> > * To allow foreign multi insert, the patch made an invasive change to
> > the existing logic to determine whether to use multi insert for the
> > target relation, adding a new member ri_usesMultiInsert to the
> > ResultRelInfo struct, as well as introducing a new function
> > ExecMultiInsertAllowed().  But I’m not sure we really need such a
> > change.  Isn’t it reasonable to *adjust* the existing logic to allow
> > foreign multi insert when possible?
> Of course, such approach would look much better, if we implemented it.

> I'll ponder how to do it.

I rewrote the decision logic to something much simpler and much less
invasive, which reduces the patch size significantly.  Attached is an
updated patch.  What do you think about that?

While working on the patch, I fixed a few issues as well:

+       if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize != NULL)
+           resultRelInfo->ri_BatchSize =
+
resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);

When determining the batch size, I think we should check if the
ExecForeignBatchInsert callback routine is also defined, like other
places such as execPartition.c.  For consistency I fixed this by
copying-and-pasting the code from that file.

+    * Also, the COPY command requires a non-zero input list of attributes.
+    * Therefore, the length of the attribute list is checked here.
+    */
+   if (!cstate->volatile_defexprs &&
+       list_length(cstate->attnumlist) > 0 &&
+       !contain_volatile_functions(cstate->whereClause))
+       target_resultRelInfo->ri_usesMultiInsert =
+                   ExecMultiInsertAllowed(target_resultRelInfo);

I think “list_length(cstate->attnumlist) > 0” in the if-test would
break COPY FROM; it currently supports multi-inserting into *plain*
tables even in the case where they have no columns, but this would
disable the multi-insertion support in that case.  postgres_fdw would
not be able to batch into zero-column foreign tables due to the INSERT
syntax limitation (i.e., the syntax does not allow inserting multiple
empty rows into a zero-column table in a single INSERT statement).
Which is the reason why this was added to the if-test?  But I think
some other FDWs might be able to, so I think we should let the FDW
decide whether to allow batching even in that case, when called from
GetForeignModifyBatchSize.  So I removed the attnumlist test from the
patch, and modified postgresGetForeignModifyBatchSize as such.  I
might miss something, though.

Best regards,
Etsuro Fujita

Attachment

v4-0001-Implementation-of-a-Bulk-COPY-FROM-efujita-1.patch

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

19 July 2022, 09:35:22

On 18/7/2022 13:22, Etsuro Fujita wrote:
> On Thu, Mar 24, 2022 at 3:43 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 3/22/22 06:54, Etsuro Fujita wrote:
>>> * To allow foreign multi insert, the patch made an invasive change to
>>> the existing logic to determine whether to use multi insert for the
>>> target relation, adding a new member ri_usesMultiInsert to the
>>> ResultRelInfo struct, as well as introducing a new function
>>> ExecMultiInsertAllowed().  But I’m not sure we really need such a
>>> change.  Isn’t it reasonable to *adjust* the existing logic to allow
>>> foreign multi insert when possible?
>> Of course, such approach would look much better, if we implemented it.
> 
>> I'll ponder how to do it.
> 
> I rewrote the decision logic to something much simpler and much less
> invasive, which reduces the patch size significantly.  Attached is an
> updated patch.  What do you think about that?
> 
> While working on the patch, I fixed a few issues as well:
> 
> +       if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize != NULL)
> +           resultRelInfo->ri_BatchSize =
> +
> resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);
> 
> When determining the batch size, I think we should check if the
> ExecForeignBatchInsert callback routine is also defined, like other
> places such as execPartition.c.  For consistency I fixed this by
> copying-and-pasting the code from that file.
> 
> +    * Also, the COPY command requires a non-zero input list of attributes.
> +    * Therefore, the length of the attribute list is checked here.
> +    */
> +   if (!cstate->volatile_defexprs &&
> +       list_length(cstate->attnumlist) > 0 &&
> +       !contain_volatile_functions(cstate->whereClause))
> +       target_resultRelInfo->ri_usesMultiInsert =
> +                   ExecMultiInsertAllowed(target_resultRelInfo);
> 
> I think “list_length(cstate->attnumlist) > 0” in the if-test would
> break COPY FROM; it currently supports multi-inserting into *plain*
> tables even in the case where they have no columns, but this would
> disable the multi-insertion support in that case.  postgres_fdw would
> not be able to batch into zero-column foreign tables due to the INSERT
> syntax limitation (i.e., the syntax does not allow inserting multiple
> empty rows into a zero-column table in a single INSERT statement).
> Which is the reason why this was added to the if-test?  But I think
> some other FDWs might be able to, so I think we should let the FDW
> decide whether to allow batching even in that case, when called from
> GetForeignModifyBatchSize.  So I removed the attnumlist test from the
> patch, and modified postgresGetForeignModifyBatchSize as such.  I
> might miss something, though.
Thanks a lot,
maybe you forgot this code:
/*
  * If a partition's root parent isn't allowed to use it, neither is the
  * partition.
*/
if (rootResultRelInfo->ri_usesMultiInsert)
    leaf_part_rri->ri_usesMultiInsert =
                ExecMultiInsertAllowed(leaf_part_rri);

Also, maybe to describe in documentation, if the value of batch_size is 
more than 1, the ExecForeignBatchInsert routine have a chance to be called?

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

20 July 2022, 08:10:43

On Tue, Jul 19, 2022 at 6:35 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 18/7/2022 13:22, Etsuro Fujita wrote:
> > I rewrote the decision logic to something much simpler and much less
> > invasive, which reduces the patch size significantly.  Attached is an
> > updated patch.  What do you think about that?

> maybe you forgot this code:
> /*
>   * If a partition's root parent isn't allowed to use it, neither is the
>   * partition.
> */
> if (rootResultRelInfo->ri_usesMultiInsert)
>         leaf_part_rri->ri_usesMultiInsert =
>                                 ExecMultiInsertAllowed(leaf_part_rri);

I think the patch accounts for that.  Consider this bit to determine
whether to use batching for the partition chosen by
ExecFindPartition():

@@ -910,12 +962,14 @@ CopyFrom(CopyFromState cstate)

                /*
                 * Disable multi-inserts when the partition has BEFORE/INSTEAD
-                * OF triggers, or if the partition is a foreign partition.
+                * OF triggers, or if the partition is a foreign partition
+                * that can't use batching.
                 */
                leafpart_use_multi_insert = insertMethod == CIM_MULTI_CONDITION\
AL &&
                    !has_before_insert_row_trig &&
                    !has_instead_insert_row_trig &&
-                   resultRelInfo->ri_FdwRoutine == NULL;
+                   (resultRelInfo->ri_FdwRoutine == NULL ||
+                    resultRelInfo->ri_BatchSize > 1);

If the root parent isn't allowed to use batching, then we have
insertMethod=CIM_SINGLE for the parent before we get here.  So in that
case we have leafpart_use_multi_insert=false for the chosen partition,
meaning that the partition isn't allowed to use batching, either.
(The patch just extends the existing decision logic to the
foreign-partition case.)

> Also, maybe to describe in documentation, if the value of batch_size is
> more than 1, the ExecForeignBatchInsert routine have a chance to be called?

Yeah, but I think that is the existing behavior, and that the patch
doesn't change the behavior, so I would leave that for another patch.

Thanks for reviewing!

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

22 July 2022, 06:39:23

On 7/20/22 13:10, Etsuro Fujita wrote:
> On Tue, Jul 19, 2022 at 6:35 PM Andrey Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 18/7/2022 13:22, Etsuro Fujita wrote:
>>> I rewrote the decision logic to something much simpler and much less
>>> invasive, which reduces the patch size significantly.  Attached is an
>>> updated patch.  What do you think about that?
> 
>> maybe you forgot this code:
>> /*
>>    * If a partition's root parent isn't allowed to use it, neither is the
>>    * partition.
>> */
>> if (rootResultRelInfo->ri_usesMultiInsert)
>>          leaf_part_rri->ri_usesMultiInsert =
>>                                  ExecMultiInsertAllowed(leaf_part_rri);
> 
> I think the patch accounts for that.  Consider this bit to determine
> whether to use batching for the partition chosen by
> ExecFindPartition():
Agreed.

Analyzing multi-level heterogeneous partitioned configurations I 
realized, that single write into a partition with a trigger will flush 
buffers for all other partitions of the parent table even if the parent 
haven't any triggers.
It relates to the code:
else if (insertMethod == CIM_MULTI_CONDITIONAL &&
   !CopyMultiInsertInfoIsEmpty(&multiInsertInfo))
{
   /*
    * Flush pending inserts if this partition can't use
    * batching, so rows are visible to triggers etc.
    */
   CopyMultiInsertInfoFlush(&multiInsertInfo, resultRelInfo);
}

Why such cascade flush is really necessary, especially for BEFORE and 
INSTEAD OF triggers? AFTER Trigger should see all rows of the table, but 
if it isn't exists for parent, I think, we wouldn't obligate to 
guarantee order of COPY into two different tables.

-- 
Regards
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

22 July 2022, 08:14:00

On Fri, Jul 22, 2022 at 3:39 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Analyzing multi-level heterogeneous partitioned configurations I
> realized, that single write into a partition with a trigger will flush
> buffers for all other partitions of the parent table even if the parent
> haven't any triggers.
> It relates to the code:
> else if (insertMethod == CIM_MULTI_CONDITIONAL &&
>    !CopyMultiInsertInfoIsEmpty(&multiInsertInfo))
> {
>    /*
>     * Flush pending inserts if this partition can't use
>     * batching, so rows are visible to triggers etc.
>     */
>    CopyMultiInsertInfoFlush(&multiInsertInfo, resultRelInfo);
> }
>
> Why such cascade flush is really necessary, especially for BEFORE and
> INSTEAD OF triggers?

BEFORE triggers on the chosen partition might query the parent table,
not just the partition, so I think we need to do this so that such
triggers can see all the rows that have been inserted into the parent
table until then.

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

22 July 2022, 08:42:43

On 7/22/22 13:14, Etsuro Fujita wrote:
> On Fri, Jul 22, 2022 at 3:39 PM Andrey Lepikhov
>> Why such cascade flush is really necessary, especially for BEFORE and
>> INSTEAD OF triggers?
> 
> BEFORE triggers on the chosen partition might query the parent table,
> not just the partition, so I think we need to do this so that such
> triggers can see all the rows that have been inserted into the parent
> table until then.
Thanks for the explanation of your point of view. So, maybe switch 
status of this patch to 'Ready for committer'?

-- 
Regards
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

26 July 2022, 10:19:43

On Fri, Jul 22, 2022 at 5:42 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> So, maybe switch
> status of this patch to 'Ready for committer'?

Yeah, I think the patch is getting better, but I noticed some issues,
so I'm working on them.  I think I can post a new version in the next
few days.

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

27 July 2022, 05:42:28

On 7/22/22 13:14, Etsuro Fujita wrote:
> On Fri, Jul 22, 2022 at 3:39 PM Andrey Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> Analyzing multi-level heterogeneous partitioned configurations I
>> realized, that single write into a partition with a trigger will flush
>> buffers for all other partitions of the parent table even if the parent
>> haven't any triggers.
>> It relates to the code:
>> else if (insertMethod == CIM_MULTI_CONDITIONAL &&
>>     !CopyMultiInsertInfoIsEmpty(&multiInsertInfo))
>> {
>>     /*
>>      * Flush pending inserts if this partition can't use
>>      * batching, so rows are visible to triggers etc.
>>      */
>>     CopyMultiInsertInfoFlush(&multiInsertInfo, resultRelInfo);
>> }
>>
>> Why such cascade flush is really necessary, especially for BEFORE and
>> INSTEAD OF triggers?
> 
> BEFORE triggers on the chosen partition might query the parent table,
> not just the partition, so I think we need to do this so that such
> triggers can see all the rows that have been inserted into the parent
> table until then.
if you'll excuse me, I will add one more argument.
It wasn't clear, so I've made an experiment: result of a SELECT in an 
INSERT trigger function shows only data, existed in the parent table 
before the start of COPY.
So, we haven't tools to access newly inserting rows in neighboring 
partition and don't need to flush tuple buffers immediately.
Where am I wrong?

-- 
Regards
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

01 August 2022, 04:00:13

On Wed, Jul 27, 2022 at 2:42 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 7/22/22 13:14, Etsuro Fujita wrote:
> > On Fri, Jul 22, 2022 at 3:39 PM Andrey Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> Analyzing multi-level heterogeneous partitioned configurations I
> >> realized, that single write into a partition with a trigger will flush
> >> buffers for all other partitions of the parent table even if the parent
> >> haven't any triggers.
> >> It relates to the code:
> >> else if (insertMethod == CIM_MULTI_CONDITIONAL &&
> >>     !CopyMultiInsertInfoIsEmpty(&multiInsertInfo))
> >> {
> >>     /*
> >>      * Flush pending inserts if this partition can't use
> >>      * batching, so rows are visible to triggers etc.
> >>      */
> >>     CopyMultiInsertInfoFlush(&multiInsertInfo, resultRelInfo);
> >> }
> >>
> >> Why such cascade flush is really necessary, especially for BEFORE and
> >> INSTEAD OF triggers?
> >
> > BEFORE triggers on the chosen partition might query the parent table,
> > not just the partition, so I think we need to do this so that such
> > triggers can see all the rows that have been inserted into the parent
> > table until then.
> if you'll excuse me, I will add one more argument.
> It wasn't clear, so I've made an experiment: result of a SELECT in an
> INSERT trigger function shows only data, existed in the parent table
> before the start of COPY.

Is the trigger function declared VOLATILE?  If so, the trigger should
see modifications to the parent table as well.  See:

https://www.postgresql.org/docs/15/trigger-datachanges.html

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

09 August 2022, 11:44:59

On Tue, Jul 26, 2022 at 7:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> Yeah, I think the patch is getting better, but I noticed some issues,
> so I'm working on them.  I think I can post a new version in the next
> few days.

* When running AFTER ROW triggers in CopyMultiInsertBufferFlush(), the
patch uses the slots passed to ExecForeignBatchInsert(), not the ones
returned by the callback function, but I don't think that that is
always correct, as the documentation about the callback function says:

     The return value is an array of slots containing the data that was
     actually inserted (this might differ from the data supplied, for
     example as a result of trigger actions.)
     The passed-in <literal>slots</literal> can be re-used for this purpose.

postgres_fdw re-uses the passed-in slots, but other FDWs might not, so
I modified the patch to reference the returned slots when running the
AFTER ROW triggers.  I also modified the patch to initialize the
tts_tableOid.  Attached is an updated patch, in which I made some
minor adjustments to CopyMultiInsertBufferFlush() as well.

* The patch produces incorrect error context information:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server loopback;
create table t1 (f1 int, f2 text);
create foreign table ft1 (f1 int, f2 text) server loopback options
(table_name 't1');
alter table t1 add constraint t1_f1positive check (f1 >= 0);
alter foreign table ft1 add constraint ft1_f1positive check (f1 >= 0);

— single insert
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> -1 foo
>> 1 bar
>> \.
ERROR:  new row for relation "t1" violates check constraint "t1_f1positive"
DETAIL:  Failing row contains (-1, foo).
CONTEXT:  remote SQL command: INSERT INTO public.t1(f1, f2) VALUES ($1, $2)
COPY ft1, line 1: "-1 foo"

— batch insert
alter server loopback options (add batch_size '2');
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> -1 foo
>> 1 bar
>> \.
ERROR:  new row for relation "t1" violates check constraint "t1_f1positive"
DETAIL:  Failing row contains (-1, foo).
CONTEXT:  remote SQL command: INSERT INTO public.t1(f1, f2) VALUES
($1, $2), ($3, $4)
COPY ft1, line 3

In single-insert mode the error context information is correct, but in
batch-insert mode it isn’t (i.e., the line number isn’t correct).

The error occurs on the remote side, so I'm not sure if there is a
simple fix.  What I came up with is to just suppress error context
information other than the relation name, like the attached.  What do
you think about that?

(In CopyMultiInsertBufferFlush() your patch sets cstate->cur_lineno to
buffer->linenos[i] even when running AFTER ROW triggers for the i-th
row returned by ExecForeignBatchInsert(), but that wouldn’t always be
correct, as the i-th returned row might not correspond to the i-th row
originally stored in the buffer as the callback function returns only
the rows that were actually inserted on the remote side.  I think the
proposed fix would address this issue as well.)

* The patch produces incorrect row count in cases where some/all of
the rows passed to ExecForeignBatchInsert() weren’t inserted on the
remote side:

create function trig_null() returns trigger as $$ begin return NULL;
end $$ language plpgsql;
create trigger trig_null before insert on t1 for each row execute
function trig_null();

— single insert
alter server loopback options (drop batch_size);
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 0 foo
>> 1 bar
>> \.
COPY 0

— batch insert
alter server loopback options (add batch_size '2');
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 0    foo
>> 1 bar
>> \.
COPY 2

The row count is correct in single-insert mode, but isn’t in batch-insert mode.

The reason is that in batch-insert mode the row counter is updated
immediately after adding the row to the buffer, not after doing
ExecForeignBatchInsert(), which might ignore the row.  To fix, I
modified the patch to delay updating the row counter (and the progress
of the COPY command) until after doing the callback function.  For
consistency, I also modified the patch to delay it even when batching
into plain tables.  IMO I think that that would be more consistent
with the single-insert mode, as in that mode we update them after
writing the tuple out to the table or sending it to the remote side.

* I modified the patch so that when batching into foreign tables we
skip useless steps in CopyMultiInsertBufferInit() and
CopyMultiInsertBufferCleanup().

That’s all I have for now.  Sorry for the delay.

Best regards,
Etsuro Fujita

Attachment

v4-0001-Implementation-of-a-Bulk-COPY-FROM-efujita-2.patch

Re: Fast COPY FROM based on batch insert

From

Zhihong Yu

Date:

09 August 2022, 16:13:02

On Tue, Aug 9, 2022 at 4:45 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:

On Tue, Jul 26, 2022 at 7:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> Yeah, I think the patch is getting better, but I noticed some issues,
> so I'm working on them. I think I can post a new version in the next
> few days.

* When running AFTER ROW triggers in CopyMultiInsertBufferFlush(), the
patch uses the slots passed to ExecForeignBatchInsert(), not the ones
returned by the callback function, but I don't think that that is
always correct, as the documentation about the callback function says:

The return value is an array of slots containing the data that was
actually inserted (this might differ from the data supplied, for
example as a result of trigger actions.)
The passed-in <literal>slots</literal> can be re-used for this purpose.

postgres_fdw re-uses the passed-in slots, but other FDWs might not, so
I modified the patch to reference the returned slots when running the
AFTER ROW triggers. I also modified the patch to initialize the
tts_tableOid. Attached is an updated patch, in which I made some
minor adjustments to CopyMultiInsertBufferFlush() as well.

* The patch produces incorrect error context information:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server loopback;
create table t1 (f1 int, f2 text);
create foreign table ft1 (f1 int, f2 text) server loopback options
(table_name 't1');
alter table t1 add constraint t1_f1positive check (f1 >= 0);
alter foreign table ft1 add constraint ft1_f1positive check (f1 >= 0);

— single insert
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> -1 foo
>> 1 bar
>> \.
ERROR: new row for relation "t1" violates check constraint "t1_f1positive"
DETAIL: Failing row contains (-1, foo).
CONTEXT: remote SQL command: INSERT INTO public.t1(f1, f2) VALUES ($1, $2)
COPY ft1, line 1: "-1 foo"

— batch insert
alter server loopback options (add batch_size '2');
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> -1 foo
>> 1 bar
>> \.
ERROR: new row for relation "t1" violates check constraint "t1_f1positive"
DETAIL: Failing row contains (-1, foo).
CONTEXT: remote SQL command: INSERT INTO public.t1(f1, f2) VALUES
($1, $2), ($3, $4)
COPY ft1, line 3

In single-insert mode the error context information is correct, but in
batch-insert mode it isn’t (i.e., the line number isn’t correct).

The error occurs on the remote side, so I'm not sure if there is a
simple fix. What I came up with is to just suppress error context
information other than the relation name, like the attached. What do
you think about that?

(In CopyMultiInsertBufferFlush() your patch sets cstate->cur_lineno to
buffer->linenos[i] even when running AFTER ROW triggers for the i-th
row returned by ExecForeignBatchInsert(), but that wouldn’t always be
correct, as the i-th returned row might not correspond to the i-th row
originally stored in the buffer as the callback function returns only
the rows that were actually inserted on the remote side. I think the
proposed fix would address this issue as well.)

* The patch produces incorrect row count in cases where some/all of
the rows passed to ExecForeignBatchInsert() weren’t inserted on the
remote side:

create function trig_null() returns trigger as $$ begin return NULL;
end $$ language plpgsql;
create trigger trig_null before insert on t1 for each row execute
function trig_null();

— single insert
alter server loopback options (drop batch_size);
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 0 foo
>> 1 bar
>> \.
COPY 0

— batch insert
alter server loopback options (add batch_size '2');
copy ft1 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 0 foo
>> 1 bar
>> \.
COPY 2

The row count is correct in single-insert mode, but isn’t in batch-insert mode.

The reason is that in batch-insert mode the row counter is updated
immediately after adding the row to the buffer, not after doing
ExecForeignBatchInsert(), which might ignore the row. To fix, I
modified the patch to delay updating the row counter (and the progress
of the COPY command) until after doing the callback function. For
consistency, I also modified the patch to delay it even when batching
into plain tables. IMO I think that that would be more consistent
with the single-insert mode, as in that mode we update them after
writing the tuple out to the table or sending it to the remote side.

* I modified the patch so that when batching into foreign tables we
skip useless steps in CopyMultiInsertBufferInit() and
CopyMultiInsertBufferCleanup().

That’s all I have for now. Sorry for the delay.

Best regards,
Etsuro Fujita

Hi,

+ /* If any rows were inserted, run AFTER ROW INSERT triggers. */

...

+ for (i = 0; i < inserted; i++)
+ {

+ TupleTableSlot *slot = rslots[i];

...

+ slot->tts_tableOid =

+ RelationGetRelid(resultRelInfo->ri_RelationDesc);

It seems the return value of `RelationGetRelid(resultRelInfo->ri_RelationDesc)` can be stored in a variable outside the for loop.

Inside the for loop, assign this variable to slot->tts_tableOid.

Cheers

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

10 August 2022, 08:30:20

Hi,

On Wed, Aug 10, 2022 at 1:06 AM Zhihong Yu <zyu@yugabyte.com> wrote:
> On Tue, Aug 9, 2022 at 4:45 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> * When running AFTER ROW triggers in CopyMultiInsertBufferFlush(), the
>> patch uses the slots passed to ExecForeignBatchInsert(), not the ones
>> returned by the callback function, but I don't think that that is
>> always correct, as the documentation about the callback function says:
>>
>>      The return value is an array of slots containing the data that was
>>      actually inserted (this might differ from the data supplied, for
>>      example as a result of trigger actions.)
>>      The passed-in <literal>slots</literal> can be re-used for this purpose.
>>
>> postgres_fdw re-uses the passed-in slots, but other FDWs might not, so
>> I modified the patch to reference the returned slots when running the
>> AFTER ROW triggers.

I noticed that my explanation was not correct.  Let me explain.
Before commit 82593b9a3, when batching into a view referencing a
postgres_fdw foreign table that has WCO constraints, postgres_fdw used
the passed-in slots to store the first tuple that was actually
inserted to the remote table.  But that commit disabled batching in
that case, so postgres_fdw wouldn’t use the passed-in slots (until we
support batching when there are WCO constraints from the parent views
and/or AFTER ROW triggers on the foreign table).

> +           /* If any rows were inserted, run AFTER ROW INSERT triggers. */
> ...
> +               for (i = 0; i < inserted; i++)
> +               {
> +                   TupleTableSlot *slot = rslots[i];
> ...
> +                   slot->tts_tableOid =
> +                       RelationGetRelid(resultRelInfo->ri_RelationDesc);
>
> It seems the return value of `RelationGetRelid(resultRelInfo->ri_RelationDesc)` can be stored in a variable outside
thefor loop. 
> Inside the for loop, assign this variable to slot->tts_tableOid.

Actually, I did this to match the code in ExecBatchInsert(), but that
seems like a good idea, so I’ll update the patch as such in the next
version.

Thanks for reviewing!

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

15 August 2022, 05:29:22

On 8/9/22 16:44, Etsuro Fujita wrote:
>>> -1 foo
>>> 1 bar
>>> \.
> ERROR:  new row for relation "t1" violates check constraint "t1_f1positive"
> DETAIL:  Failing row contains (-1, foo).
> CONTEXT:  remote SQL command: INSERT INTO public.t1(f1, f2) VALUES
> ($1, $2), ($3, $4)
> COPY ft1, line 3
> 
> In single-insert mode the error context information is correct, but in
> batch-insert mode it isn’t (i.e., the line number isn’t correct).
> 
> The error occurs on the remote side, so I'm not sure if there is a
> simple fix.  What I came up with is to just suppress error context
> information other than the relation name, like the attached.  What do
> you think about that?
I've spent many efforts to this problem too. Your solution have a 
rationale and looks fine.
I only think, we should add a bit of info into an error report to 
simplify comprehension why don't point specific line here. For example:
'COPY %s (buffered)'
or
'COPY FOREIGN TABLE %s'

or, if instead of relname_only field to save a MultiInsertBuffer 
pointer, we might add min/max linenos into the report:
'COPY %s, line between %llu and %llu'

-- 
Regards
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

22 August 2022, 08:44:27

On Mon, Aug 15, 2022 at 2:29 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 8/9/22 16:44, Etsuro Fujita wrote:
> >>> -1 foo
> >>> 1 bar
> >>> \.
> > ERROR:  new row for relation "t1" violates check constraint "t1_f1positive"
> > DETAIL:  Failing row contains (-1, foo).
> > CONTEXT:  remote SQL command: INSERT INTO public.t1(f1, f2) VALUES
> > ($1, $2), ($3, $4)
> > COPY ft1, line 3
> >
> > In single-insert mode the error context information is correct, but in
> > batch-insert mode it isn’t (i.e., the line number isn’t correct).
> >
> > The error occurs on the remote side, so I'm not sure if there is a
> > simple fix.  What I came up with is to just suppress error context
> > information other than the relation name, like the attached.  What do
> > you think about that?

> I've spent many efforts to this problem too. Your solution have a
> rationale and looks fine.
> I only think, we should add a bit of info into an error report to
> simplify comprehension why don't point specific line here. For example:
> 'COPY %s (buffered)'
> or
> 'COPY FOREIGN TABLE %s'
>
> or, if instead of relname_only field to save a MultiInsertBuffer
> pointer, we might add min/max linenos into the report:
> 'COPY %s, line between %llu and %llu'

I think the latter is more consistent with the existing error context
information when in CopyMultiInsertBufferFlush().  Actually, I thought
this too, and I think this would be useful when the COPY FROM command
is executed on a foreign table.  My concern, however, is the case when
the command is executed on a partitioned table containing foreign
partitions; in that case the input data would not always be sorted in
the partition order, so the range for an error-occurring foreign
partition might contain many lines with rows from other partitions,
which I think makes the range information less useful.  Maybe I'm too
worried about that, though.

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

23 August 2022, 05:58:22

On 22/8/2022 11:44, Etsuro Fujita wrote:
> I think the latter is more consistent with the existing error context
> information when in CopyMultiInsertBufferFlush().  Actually, I thought
> this too, and I think this would be useful when the COPY FROM command
> is executed on a foreign table.  My concern, however, is the case when
> the command is executed on a partitioned table containing foreign
> partitions; in that case the input data would not always be sorted in
> the partition order, so the range for an error-occurring foreign
> partition might contain many lines with rows from other partitions,
> which I think makes the range information less useful.  Maybe I'm too
> worried about that, though.
I got your point. Indeed, perharps such info doesn't really needed to be 
included into the core, at least for now.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

27 September 2022, 08:47:45

On Tue, Aug 23, 2022 at 2:58 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 22/8/2022 11:44, Etsuro Fujita wrote:
> > I think the latter is more consistent with the existing error context
> > information when in CopyMultiInsertBufferFlush().  Actually, I thought
> > this too, and I think this would be useful when the COPY FROM command
> > is executed on a foreign table.  My concern, however, is the case when
> > the command is executed on a partitioned table containing foreign
> > partitions; in that case the input data would not always be sorted in
> > the partition order, so the range for an error-occurring foreign
> > partition might contain many lines with rows from other partitions,
> > which I think makes the range information less useful.  Maybe I'm too
> > worried about that, though.

> I got your point. Indeed, perharps such info doesn't really needed to be
> included into the core, at least for now.

Ok.  Sorry for the late response.

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

27 September 2022, 09:03:51

On Wed, Aug 10, 2022 at 5:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Wed, Aug 10, 2022 at 1:06 AM Zhihong Yu <zyu@yugabyte.com> wrote:
> > +           /* If any rows were inserted, run AFTER ROW INSERT triggers. */
> > ...
> > +               for (i = 0; i < inserted; i++)
> > +               {
> > +                   TupleTableSlot *slot = rslots[i];
> > ...
> > +                   slot->tts_tableOid =
> > +                       RelationGetRelid(resultRelInfo->ri_RelationDesc);
> >
> > It seems the return value of `RelationGetRelid(resultRelInfo->ri_RelationDesc)` can be stored in a variable outside
thefor loop. 
> > Inside the for loop, assign this variable to slot->tts_tableOid.
>
> Actually, I did this to match the code in ExecBatchInsert(), but that
> seems like a good idea, so I’ll update the patch as such in the next
> version.

Done.  I also adjusted the code in CopyMultiInsertBufferFlush() a bit
further.  No functional changes.  I put back in the original position
an assertion ensuring the FDW supports batching.  Sorry for the back
and forth.  Attached is an updated version of the patch.

Other changes are:

* The previous patch modified postgres_fdw.sql so that the existing
test cases for COPY FROM were tested in batch-insert mode.  But I
think we should keep them as-is to test the default behavior, so I
added test cases for this feature by copying-and-pasting some of the
existing test cases.  Also, the previous patch added this:

+create table foo (a int) partition by list (a);
+create table foo1 (like foo);
+create foreign table ffoo1 partition of foo for values in (1)
+   server loopback options (table_name 'foo1');
+create table foo2 (like foo);
+create foreign table ffoo2 partition of foo for values in (2)
+   server loopback options (table_name 'foo2');
+create function print_new_row() returns trigger language plpgsql as $$
+   begin raise notice '%', new; return new; end; $$;
+create trigger ffoo1_br_trig before insert on ffoo1
+   for each row execute function print_new_row();
+
+copy foo from stdin;
+1
+2
+\.

Rather than doing so, I think it would be better to use a partitioned
table defined in the above section “test tuple routing for
foreign-table partitions”, to save cycles.  So I changed this as such.

* I modified comments a bit further and updated docs.

That is it.  I will review the patch a bit more, but I feel that it is
in good shape.

Best regards,
Etsuro Fujita

Attachment

v4-0001-Implementation-of-a-Bulk-COPY-FROM-efujita-3.patch

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

07 October 2022, 06:18:59

On Tue, Sep 27, 2022 at 6:03 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> I will review the patch a bit more, but I feel that it is
> in good shape.

One thing I noticed is this bit added to CopyMultiInsertBufferFlush()
to run triggers on the foreign table.

+           /* Run AFTER ROW INSERT triggers */
+           if (resultRelInfo->ri_TrigDesc != NULL &&
+               (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
+                resultRelInfo->ri_TrigDesc->trig_insert_new_table))
+           {
+               Oid         relid =
RelationGetRelid(resultRelInfo->ri_RelationDesc);
+
+               for (i = 0; i < inserted; i++)
+               {
+                   TupleTableSlot *slot = rslots[i];
+
+                   /*
+                    * AFTER ROW Triggers might reference the tableoid column,
+                    * so (re-)initialize tts_tableOid before evaluating them.
+                    */
+                   slot->tts_tableOid = relid;
+
+                   ExecARInsertTriggers(estate, resultRelInfo,
+                                        slot, NIL,
+                                        cstate->transition_capture);
+               }
+           }

Since foreign tables cannot have transition tables, we have
trig_insert_new_table=false.  So I simplified the if test and added an
assertion ensuring trig_insert_new_table=false.  Attached is a new
version of the patch.  I tweaked some comments a bit as well.  I think
the patch is committable.  So I plan on committing it next week if
there are no objections.

Best regards,
Etsuro Fujita

Attachment

v4-0001-Implementation-of-a-Bulk-COPY-FROM-efujita-4.patch

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

11 October 2022, 06:06:57

On 10/7/22 11:18, Etsuro Fujita wrote:
> On Tue, Sep 27, 2022 at 6:03 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> I will review the patch a bit more, but I feel that it is
>> in good shape.
> 
> One thing I noticed is this bit added to CopyMultiInsertBufferFlush()
> to run triggers on the foreign table.
> 
> +           /* Run AFTER ROW INSERT triggers */
> +           if (resultRelInfo->ri_TrigDesc != NULL &&
> +               (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
> +                resultRelInfo->ri_TrigDesc->trig_insert_new_table))
> +           {
> +               Oid         relid =
> RelationGetRelid(resultRelInfo->ri_RelationDesc);
> +
> +               for (i = 0; i < inserted; i++)
> +               {
> +                   TupleTableSlot *slot = rslots[i];
> +
> +                   /*
> +                    * AFTER ROW Triggers might reference the tableoid column,
> +                    * so (re-)initialize tts_tableOid before evaluating them.
> +                    */
> +                   slot->tts_tableOid = relid;
> +
> +                   ExecARInsertTriggers(estate, resultRelInfo,
> +                                        slot, NIL,
> +                                        cstate->transition_capture);
> +               }
> +           }
> 
> Since foreign tables cannot have transition tables, we have
> trig_insert_new_table=false.  So I simplified the if test and added an
> assertion ensuring trig_insert_new_table=false.  Attached is a new
> version of the patch.  I tweaked some comments a bit as well.  I think
> the patch is committable.  So I plan on committing it next week if
> there are no objections.
I reviewed the patch one more time. Only one question: bistate and 
ri_FdwRoutine are strongly bounded. Maybe to add some assertion on 
(ri_FdwRoutine XOR bistate) ? Just to prevent possible errors in future.

-- 
Regards
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

12 October 2022, 02:56:17

On Tue, Oct 11, 2022 at 3:06 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> I reviewed the patch one more time. Only one question: bistate and
> ri_FdwRoutine are strongly bounded. Maybe to add some assertion on
> (ri_FdwRoutine XOR bistate) ? Just to prevent possible errors in future.

You mean the bistate member of CopyMultiInsertBuffer?

We do not use that member at all for foreign tables, so the patch
avoids initializing that member in CopyMultiInsertBufferInit() when
called for a foreign table.  So we have bistate = NULL for foreign
tables (and bistate != NULL for plain tables), as you mentioned above.
I think it is a good idea to add such assertions.  How about adding
them to CopyMultiInsertBufferFlush() and
CopyMultiInsertBufferCleanup() like the attached?  In the attached I
updated comments a bit further as well.

Thanks for reviewing!

Best regards,
Etsuro Fujita

Attachment

v4-0001-Implementation-of-a-Bulk-COPY-FROM-efujita-5.patch

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

13 October 2022, 04:38:05

On 10/12/22 07:56, Etsuro Fujita wrote:
> On Tue, Oct 11, 2022 at 3:06 PM Andrey Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> I reviewed the patch one more time. Only one question: bistate and
>> ri_FdwRoutine are strongly bounded. Maybe to add some assertion on
>> (ri_FdwRoutine XOR bistate) ? Just to prevent possible errors in future.
> 
> You mean the bistate member of CopyMultiInsertBuffer?
Yes
> 
> We do not use that member at all for foreign tables, so the patch
> avoids initializing that member in CopyMultiInsertBufferInit() when
> called for a foreign table.  So we have bistate = NULL for foreign
> tables (and bistate != NULL for plain tables), as you mentioned above.
> I think it is a good idea to add such assertions.  How about adding
> them to CopyMultiInsertBufferFlush() and
> CopyMultiInsertBufferCleanup() like the attached?  In the attached I
> updated comments a bit further as well.
Yes, quite enough.

-- 
Regards
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

13 October 2022, 09:58:14

On Thu, Oct 13, 2022 at 1:38 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 10/12/22 07:56, Etsuro Fujita wrote:
> > On Tue, Oct 11, 2022 at 3:06 PM Andrey Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> I reviewed the patch one more time. Only one question: bistate and
> >> ri_FdwRoutine are strongly bounded. Maybe to add some assertion on
> >> (ri_FdwRoutine XOR bistate) ? Just to prevent possible errors in future.
> >
> > You mean the bistate member of CopyMultiInsertBuffer?
> Yes
> >
> > We do not use that member at all for foreign tables, so the patch
> > avoids initializing that member in CopyMultiInsertBufferInit() when
> > called for a foreign table.  So we have bistate = NULL for foreign
> > tables (and bistate != NULL for plain tables), as you mentioned above.
> > I think it is a good idea to add such assertions.  How about adding
> > them to CopyMultiInsertBufferFlush() and
> > CopyMultiInsertBufferCleanup() like the attached?  In the attached I
> > updated comments a bit further as well.
> Yes, quite enough.

I have committed the patch after tweaking comments a little bit further.

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

28 October 2022, 10:12:52

On Thu, Oct 13, 2022 at 6:58 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> I have committed the patch after tweaking comments a little bit further.

I think there is another patch that improves performance of COPY FROM
for foreign tables using COPY FROM STDIN, but if Andrey (or anyone
else) want to work on it again, I think it would be better to create a
new CF entry for it (and start a new thread for it).  So I plan to
close this in the November CF unless they think otherwise.

Anyway, thanks for the patch, Andrey!  Thanks for reviewing, Ian and Zhihong!

Best regards,
Etsuro Fujita

Re: Fast COPY FROM based on batch insert

From

Andrey Lepikhov

Date:

28 October 2022, 10:53:17

On 28/10/2022 16:12, Etsuro Fujita wrote:
> On Thu, Oct 13, 2022 at 6:58 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> I have committed the patch after tweaking comments a little bit further.
> 
> I think there is another patch that improves performance of COPY FROM
> for foreign tables using COPY FROM STDIN, but if Andrey (or anyone
> else) want to work on it again, I think it would be better to create a
> new CF entry for it (and start a new thread for it).  So I plan to
> close this in the November CF unless they think otherwise.
> 
> Anyway, thanks for the patch, Andrey!  Thanks for reviewing, Ian and Zhihong!
Thanks,

I studied performance of this code in comparison to bulk INSERTions.
This patch seems to improve speed of insertion by about 20%. Also, this 
patch is very invasive. So, I don't have any plans to work on it now.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Re: Fast COPY FROM based on batch insert

From

Etsuro Fujita

Date:

31 October 2022, 08:50:02

On Fri, Oct 28, 2022 at 7:53 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 28/10/2022 16:12, Etsuro Fujita wrote:
> > I think there is another patch that improves performance of COPY FROM
> > for foreign tables using COPY FROM STDIN, but if Andrey (or anyone
> > else) want to work on it again, I think it would be better to create a
> > new CF entry for it (and start a new thread for it).  So I plan to
> > close this in the November CF unless they think otherwise.

> I studied performance of this code in comparison to bulk INSERTions.
> This patch seems to improve speed of insertion by about 20%. Also, this
> patch is very invasive. So, I don't have any plans to work on it now.

Ok, let's leave that for future work.  I closed this entry in the November CF.

Best regards,
Etsuro Fujita