Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Amit Langote
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id CA+HiwqFGENfuyC7w7TD9L8wdJk3+7XwD8Z24Kp7OjJg4+mTjqQ@mail.gmail.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses RE: POC: postgres_fdw insert batching  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
List pgsql-hackers
On Fri, Jan 15, 2021 at 12:05 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Attached is v9 with all of those tweaks,

Thanks.

> except for moving the BatchSize
> call to BeginForeignModify - I tried that, but it did not seem like an
> improvement, because we'd still need the checks for API callbacks in
> ExecInsert for example. So I decided not to do that.

Okay, so maybe not moving the whole logic into the FDW's
BeginForeignModify(), but at least if we move this...

@@ -441,6 +449,72 @@ ExecInsert(ModifyTableState *mtstate,
+       /*
+        * Determine if the FDW supports batch insert and determine the batch
+        * size (a FDW may support batching, but it may be disabled for the
+        * server/table). Do this only once, at the beginning - we don't want
+        * the batch size to change during execution.
+        */
+       if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize &&
+           resultRelInfo->ri_FdwRoutine->ExecForeignBatchInsert &&
+           resultRelInfo->ri_BatchSize == 0)
+           resultRelInfo->ri_BatchSize =
+
resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);

...into ExecInitModifyTable(), ExecInsert() only needs the following block:

       /*
+        * If the FDW supports batching, and batching is requested, accumulate
+        * rows and insert them in batches. Otherwise use the per-row inserts.
+        */
+       if (resultRelInfo->ri_BatchSize > 1)
+       {
+ ...

AFAICS, I don't see anything that will cause ri_BatchSize to become 0
once set so don't see the point of checking whether it needs to be set
again on every ExecInsert() call.  Also, maybe not that important, but
shaving off 3 comparisons for every tuple would add up nicely IMHO
especially given that we're targeting bulk loads.

--
Amit Langote
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Álvaro Herrera
Date:
Subject: Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Next
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: Determine parallel-safety of partition relations for Inserts