Re: Fdw batch insert error out when set batch_size > 65535 - Mailing list pgsql-hackers

From: Bharath Rupireddy
Subject: Re: Fdw batch insert error out when set batch_size > 65535
Msg-id: CALj2ACXg3tPAChzFd94_k_tP3x61QTVpuqKZke3M4ef78xEcuA@mail.gmail.com
In response to: Fdw batch insert error out when set batch_size > 65535 ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses: RE: Fdw batch insert error out when set batch_size > 65535
List: pgsql-hackers
On Fri, May 21, 2021 at 8:18 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> While reading the code, I noticed a possible issue with fdw batch
> insert: if I set batch_size > 65535 and insert more than 65535 rows
> into a foreign table, it throws an error.
>
> For example:
>
> ------------------
> CREATE FOREIGN TABLE vzdalena_tabulka2(a int, b varchar)
> SERVER omega_db
> OPTIONS (table_name 'tabulka', batch_size '65536');
>
> INSERT INTO vzdalena_tabulka2 SELECT i, 'AHOJ' || i FROM generate_series(1,65536) g(i);
>
> ERROR:  number of parameters must be between 0 and 65535
> CONTEXT:  remote SQL command: INSERT INTO public.tabulka(a, b) VALUES ($1, $2), ($3, $4)...
> ------------------
>
> In fact, I think we can get this error whenever (number of columns) *
> (number of rows) > 65535. But since almost no one will set such a
> large value, adjusting the batch_size automatically when doing the
> INSERT seems an acceptable solution.
>
> In postgresGetForeignModifyBatchSize(), if we find that (num of
> params) * batch_size is greater than the limit (65535), we set it to
> 65535 / (num of params).
>
> Thoughts?

+1 to limiting batch_size for postgres_fdw, and it's a good idea to
have a macro for the max number of params. A few comments:

1) How about using the macro in the error message, something like below?

    appendPQExpBuffer(errorMessage,
                      libpq_gettext("number of parameters must be between 0 and %d\n"),
                      PQ_MAX_PARAM_NUMBER);

2) I think we can choose not to mention 65535 in the docs, because
that's hard to maintain if the limit changes in the libpq protocol
sometime in the future. How about:

"The final number of rows postgres_fdw inserts in a batch actually
depends on the number of columns and the provided batch_size value.
This is because of the limit the libpq protocol (which postgres_fdw
uses to connect to a remote server) has on the number of query
parameters that can be specified per query. For instance, if the
number of columns * batch_size is more than the limit, then libpq
emits an error. But postgres_fdw adjusts the batch_size to avoid this
error."

instead of:

+     overrides an option specified for the server. Note if the batch size
+     exceed the protocol limit (number of columns * batch_size > 65535),
+     then the actual batch size will be less than the specified batch_size.

3) I think "postgres_fdw should insert in each insert operation"
doesn't hold after this patch. We can reword it to "postgres_fdw
inserts in each insert operation":

      This option specifies the number of rows <filename>postgres_fdw</filename>
      should insert in each insert operation. It can be specified for a

4) How about naming the macro PQ_QUERY_PARAM_MAX_LIMIT?

5) We can use the macro in code comments as well:

+  * 65535, so set the batch_size to not exceed limit in a batch insert.

6) I think both the code and docs patches can be combined into a
single patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
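For reference, a minimal sketch of the clamping being discussed,
assuming the macro name from comment 4 above and postgres_fdw's
per-statement parameter count (fmstate->p_nums); the actual patch may
look different:

    /* Assumed macro name (see comment 4); mirrors the libpq protocol limit. */
    #define PQ_QUERY_PARAM_MAX_LIMIT    65535

    /*
     * In postgresGetForeignModifyBatchSize(): clamp the user-requested
     * batch_size so that (params per row) * (rows per batch) stays within
     * the number of query parameters libpq allows in a single statement.
     */
    if (fmstate && fmstate->p_nums > 0 &&
        batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT)
        batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;

With the two-column table from the example above, a requested
batch_size of 65536 would be clamped to 65535 / 2 = 32767 rows per
batch, keeping each remote INSERT under the parameter limit.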