Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id CAG-ACPX4a=bfVS4Tc8NPCL9fu1sQNotsAVXdq-x+FD9davJHpQ@mail.gmail.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers


On Tue, 30 Jun 2020 at 22:23, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>I didn't get this. We are executing an INSERT on the foreign server,
>so we get the number of rows INSERTed from that server. We should just
>add those up across batches. If there's a failure, it would abort the
>transaction, local as well as remote.
>

True, but it's not the FDW code doing the counting - it's the caller,
depending on whether the ExecForeignInsert returns a valid slot or NULL.
So it's not quite possible to just return a number of inserted tuples,
as returned by the remote server.

Hmm yes, now I remember that bit. So for every row buffered, we return a valid slot without knowing whether that row was inserted on the remote server or not. I think we have that problem even now where a single INSERT might result in multiple INSERTs on the remote server (rare but not completely impossible).
 

>In your patch, I see that an INSERT statement with batch is
>constructed as INSERT INTO ... VALUES (...), (...) as many values as
>the batch size. That won't work as is for UPDATE/DELETE since we can't
>pass multiple pairs of ctids and columns to be updated for each ctid
>in one statement. Maybe we could build as many UPDATE/DELETE
>statements as the size of a batch, but that would be ugly. What we
>need is a feature like a batch prepared statement in libpq similar to
>what JDBC supports
>((https://mkyong.com/jdbc/jdbc-preparedstatement-example-batch-update/).
>This will allow a single prepared statement to be executed with a
>batch of parameters, each batch corresponding to one foreign DML
>statement.
>

I'm pretty sure we could make it work with some array/unnest tricks to
build a relation, and use that as a source of data.

That sounds great. The solution will be limited to postgres_fdw only.
 
I don't see why not support both, the use cases are somewhat different I
think.

+1, if we can do both.

--
Best Wishes,
Ashutosh

pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Next
From: torikoshia
Date:
Subject: Re: Creating a function for exposing memory usage of backend process