Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id CAGRY4nzSZTn-EbvNJq+ZRVKrMv0pHGqb6xfi-SAiM5+LrX0Ebw@mail.gmail.com
Whole thread Raw
In response to RE: POC: postgres_fdw insert batching  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: POC: postgres_fdw insert batching  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
List pgsql-hackers
On Thu, Oct 8, 2020 at 10:40 AM tsunakawa.takay@fujitsu.com <tsunakawa.takay@fujitsu.com> wrote:

Thank you for picking up this.  I'm interested in this topic, too.  (As an aside, we'd like to submit a bulk insert patch for ECPG in the near future.)

As others referred, Andrey-san's fast COPY to foreign partitions is also promising.  But I think your bulk INSERT is a separate feature and offers COPY cannot do -- data transformation during loading with INSERT SELECT and CREATE TABLE AS SELECT.

Is there anything that makes you worry and stops development?  Could I give it a try to implement this (I'm not sure I can, sorry.  I'm worried if we can change the executor's call chain easily.)


I suggest that when developing this, you keep in mind the ongoing work on the libpq pipelining/batching enhancements, and also the way many interfaces to foreign data sources support asynchronous, concurrent operations.

Best results with postgres_fdw insert batching would be achieved if it can also send its batches as asynchronous queries and only block when it's required to report on the results of the work. This will also be true of any other FDW where the backing remote interface can support asynchronous concurrent or pipelined operation.

I'd argue it's pretty much vital for decent performance when talking to a cloud database from an on-prem server for example, or any other time that round-trip-time reduction is important.

The most important characteristic of an FDW API to permit this would be decoupling of request and response into separate non-blocking calls that don't have to occur in ordered pairs. Instead of "insert_foo(foo) -> insert_result", have "queue_insert_foo(foo) -> future_result", "get_result_if_available(future_result) -> maybe result" and "get_result_blocking(future_result) -> result". Permit multiple queue_insert_foo(...)s without a/b interleaving with result fetches being required.

Ideally it'd be able to accumulate small batches of inserts locally and send a batch to the remote end once it's accumulated enough. But instead of blocking waiting for the result, return control to the executor after sending, without forcing a socket flush (which might block) and without waiting to learn what the outcome was. Allow new batches to be accumulated and sent before the results of the first batch are received, so long as it's within the same executor node so we don't make any unfortunate mistakes with mixing things up in compound statements or functions etc. Only report outcomes like rowcounts lazily when results are received, or when required to do so.

If now we have

REQUEST -> [block] -> RESULT
~~ round trip delay ~~
REQUEST -> [block] -> RESULT
~~ round trip delay ~~
REQUEST -> [block] -> RESULT
~~ round trip delay ~~
REQUEST -> [block] -> RESULT

and batching would give us

{ REQUEST, REQUEST} -> [block] -> { RESULT, RESULT }
~~ round trip delay ~~
{ REQUEST, REQUEST} -> [block] -> { RESULT, RESULT }

consider if room can be left in the batching API to permit:

{ REQUEST, REQUEST} -> [nonblocking send...]
{ REQUEST, REQUEST} -> [nonblocking send...]
~~ round trip delay ~~
[....] -> RESULT, RESULT
[....] -> RESULT, RESULT


... where we only actually block at the point where the result is required as input into the next node.

Honestly I don't know the executor structure well enough to say if this is even remotely feasible right now. Maybe Andres may be able to comment. But please keep it in mind if you're thinking of making FDW API changes.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path
Next
From: Li Japin
Date:
Subject: Re: Terminate the idle sessions