RE: Fdw batch insert error out when set batch_size > 65535 - Mailing list pgsql-hackers

From: houzj.fnst@fujitsu.com
Subject: RE: Fdw batch insert error out when set batch_size > 65535
Msg-id: OS0PR01MB5716BFBD760428859058FCBD94299@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to: Re: Fdw batch insert error out when set batch_size > 65535 (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses: Re: Fdw batch insert error out when set batch_size > 65535
List: pgsql-hackers
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Sent: Friday, May 21, 2021 1:42 PM
> On Fri, May 21, 2021 at 8:18 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > Actually, I think if (number of columns) * (number of rows) >
> > 65535, then we can get this error. But I think almost no one will set
> > such a large value, so adjusting the batch_size automatically when doing
> > INSERT seems an acceptable solution.
> >
> > In postgresGetForeignModifyBatchSize(), if we find that (num of
> > params) * batch_size is greater than the limit (65535), then we set it to
> > 65535 / (num of params).
> >
> > Thoughts?
> 
> +1 to limit batch_size for postgres_fdw and it's a good idea to have a
> macro for the max params.
>
> Few comments:

Thanks for the comments. 

> 1) How about using macro in the error message, something like below?
>         appendPQExpBuffer(errorMessage,
>                           libpq_gettext("number of parameters must be
> between 0 and %d\n"),
>                           PQ_MAX_PARAM_NUMBER);

Changed.

> 2) I think we can choose not to mention the 65535 because it's hard to
> maintain if that's changed in the libpq protocol sometime in the future.
> How about "The final number of rows postgres_fdw inserts in a batch
> actually depends on the number of columns and the provided batch_size value.
> This is because of the limit that the libpq protocol (which postgres_fdw
> uses to connect to a remote server) has on the number of query parameters
> that can be specified per query. For instance, if the number of columns *
> batch_size is more than the limit, then libpq emits an error. But
> postgres_fdw adjusts the batch_size to avoid this error."
> instead of
> +       overrides an option specified for the server. Note if the batch size
> +       exceed the protocol limit (number of columns * batch_size > 65535),
> +       then the actual batch size will be less than the specified batch_size.

Thanks, your description looks better. Changed.

> 3) I think "postgres_fdw should insert in each insert operation"
> doesn't hold after this patch. We can reword it to "postgres_fdw inserts in
> each insert operation".
>        This option specifies the number of rows
> <filename>postgres_fdw</filename>
>        should insert in each insert operation. It can be specified for a

Changed.

> 4) How about naming the macro as PQ_QUERY_PARAM_MAX_LIMIT?

Changed.

> 5) We can use macro in code comments as well.
> +     * 65535, so set the batch_size to not exceed limit in a batch insert.

Thanks, I reorganized the code comments.

> 6) I think both code and docs patches can be combined into a single patch.

Combined.

Attaching V2 patch. Please consider it for further review.

Best regards,
houzj
