Re: Fdw batch insert error out when set batch_size > 65535 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Fdw batch insert error out when set batch_size > 65535
Date
Msg-id 68b4d5bd-d9fd-4337-e677-0bc5b3b7db6f@enterprisedb.com
Whole thread Raw
In response to Re: Fdw batch insert error out when set batch_size > 65535  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Fdw batch insert error out when set batch_size > 65535
List pgsql-hackers

On 5/31/21 6:01 AM, Bharath Rupireddy wrote:
> On Mon, May 31, 2021 at 1:21 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> Hi,
>>
>> I took at this patch today. I did some minor changes, mostly:
>>
>> 1) change the code limiting batch_size from
>>
>>     if (fmstate->p_nums > 0 &&
>>        (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
>>     {
>>         batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
>>     }
>>
>> to
>>
>>     if (fmstate && fmstate->p_nums > 0)
>>         batch_size = Min(batch_size,
>>                          PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums);
>>
>> which I think is somewhat clearer / more common patter.
> 
> Agree, that looks pretty good.
> 
>> 2) I've reworded the docs a bit, splitting the single para into two. I
>> think this makes it clearer.
> 
> LGTM, except one thing that the batch_size description says "should
> insert in", but it's not that the value entered for batch_size is
> always honoured right? Because this patch might it.
> 
>        This option specifies the number of rows
> <filename>postgres_fdw</filename>
>        should insert in each insert operation. It can be specified for a
> 
> So, I suggest to remove "should" and change it to:
> 
>        This option specifies the number of rows
> <filename>postgres_fdw</filename>
>        inserts in each insert operation. It can be specified for a
> 

I think the "should" indicates exactly that postgres_fdw may adjust the
batch size. Without it it sounds much more definitive, so I kept it.

>> Attached is a patch doing this. Please check the commit message etc.
>> Barring objections I'll get it committed in a couple days.
> 
> One minor comment:
> In the commit message, Int16 is used
> The FE/BE protocol identifies parameters with an Int16 index, which
> limits the maximum number of parameters per query to 65535. With
> 
> and in the code comments uint16 is used.
> +     * parameters in a batch is limited to 64k (uint16), so make sure we don't
> 
> Isn't it uint16 in the commit message too? Also, can we use 64k in the
> commit message instead of 65535?
> 

No, the "Int16" refers to the FE/BE documentation, where we use Int16.
But in the C code we interpret it as uint16.

I've added a simple regression test to postgres_fdw, testing that batch
sizes > 65535 work fine, and pushed the fix.


I've considered checking the value in postgres_fdw_validator and just
rejecting anything over 65535, but I've decided against that. We'd still
need to adjust depending on number of columns anyway.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Next
From: Jeremy Schneider
Date:
Subject: Re: logical decoding bug: segfault in ReorderBufferToastReplace()