Hi all,
Currently on postgres_fdw we use prepared statements to insert batches
into foreign tables. Although this works fine for the most use cases the
COPY command can also be used in some scenarios to speed up large batch
inserts.
The attached patch implements this idea of using the COPY command for
batch inserts on postgres_fdw foreign tables. I've performed some
benchmarks using pgbench and the results seem good to consider this.
I've performed the benchmark using different batch_size values to see
when this optimization could be useful. The following results are the
best tps of 3 runs.
Command: pgbench -n -c 10 -j 10 -t 100 -f bench.sql postgres
batch_size: 10
master tps: 76.360406
patch tps: 68.917109
batch_size: 100
master tps: 123.427731
patch tps: 243.737055
batch_size: 1000
master tps: 132.500506
patch tps: 239.295132
It seems that using a batch_size greater than 100 we can have a
considerable speed up for batch inserts.
The attached patch uses the COPY command whenever we have a *numSlots >
1 but the tests show that maybe we should have a GUC to enable this?
I also think that we can have a better patch by removing the duplicated
code introduced on this first version, specially on the clean up phase,
but I tried to keep things more simple on this initial phase to keep the
review more easier and also just to test the idea.
Lastly, I don't know if we should change the EXPLAIN(ANALYZE, VERBOSE)
output for batch inserts that use the COPY to mention that we are
sending the COPY command to the remote server. I guess so?
(this proposal is based on a patch idea written by Tomas Vondra in one
of his blogs posts)
--
Matheus Alcantara