Re: Fdw batch insert error out when set batch_size > 65535 - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Fdw batch insert error out when set batch_size > 65535
Date
Msg-id CALj2ACXg3tPAChzFd94_k_tP3x61QTVpuqKZke3M4ef78xEcuA@mail.gmail.com
In response to Fdw batch insert error out when set batch_size > 65535  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses RE: Fdw batch insert error out when set batch_size > 65535
List pgsql-hackers
On Fri, May 21, 2021 at 8:18 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> When reading the code, I noticed a possible issue with fdw batch insert.
> When I set batch_size > 65535 and insert more than 65535 rows into a foreign table,
> it will throw an error:
>
> For example:
>
> ------------------
> CREATE FOREIGN TABLE vzdalena_tabulka2(a int, b varchar)
>   SERVER omega_db
>   OPTIONS (table_name 'tabulka', batch_size '65536');
>
> INSERT INTO vzdalena_tabulka2 SELECT i, 'AHOJ' || i FROM generate_series(1,65536) g(i);
>
> ERROR:  number of parameters must be between 0 and 65535
> CONTEXT:  remote SQL command: INSERT INTO public.tabulka(a, b) VALUES ($1, $2), ($3, $4)...
> ------------------
>
> Actually, I think if (number of columns) * (number of rows) > 65535, then we can
> get this error. But I think almost no one will set such a large value, so adjusting
> the batch_size automatically when doing INSERT seems like an acceptable solution.
>
> In postgresGetForeignModifyBatchSize(), if we find that (num of params) * batch_size
> is greater than the limit (65535), then we set it to 65535 / (num of params).
>
> Thoughts ?

+1 to limit batch_size for postgres_fdw and it's a good idea to have a
macro for the max params.

Few comments:
1) How about using the macro in the error message, something like below?
        appendPQExpBuffer(errorMessage,
                          libpq_gettext("number of parameters must be between 0 and %d\n"),
                          PQ_MAX_PARAM_NUMBER);
2) I think we can choose not to mention the 65535 in the docs because it's
hard to maintain if that's changed in the libpq protocol sometime in the
future. How about
"The final number of rows postgres_fdw inserts in a batch actually
depends on the number of columns and the provided batch_size value.
This is because of the limit the libpq protocol (which postgres_fdw
uses to connect to a remote server) has on the number of query
parameters that can be specified per query. For instance, if the
number of columns * batch_size is more than the limit, then libpq
emits an error. But postgres_fdw adjusts the batch_size to avoid this
error."
instead of
+       overrides an option specified for the server. Note if the batch size
+       exceed the protocol limit (number of columns * batch_size > 65535),
+       then the actual batch size will be less than the specified batch_size.
3) I think "postgres_fdw should insert in each insert operation"
doesn't hold after this patch. We can reword it to "postgres_fdw
inserts in each insert operation".
       This option specifies the number of rows
<filename>postgres_fdw</filename>
       should insert in each insert operation. It can be specified for a
4) How about naming the macro as PQ_QUERY_PARAM_MAX_LIMIT?
5) We can use macro in code comments as well.
+     * 65535, so set the batch_size to not exceed limit in a batch insert.
6) I think both code and docs patches can be combined into a single patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


