Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id 803d5507-299d-f109-7b71-b6bbbb17a053@enterprisedb.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers

On 2/16/21 10:25 AM, Amit Langote wrote:
> On Tue, Feb 16, 2021 at 1:36 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> On 2/5/21 3:52 AM, Amit Langote wrote:
>>> Tsunakwa-san,
>>>
>>> On Mon, Jan 25, 2021 at 1:21 PM tsunakawa.takay@fujitsu.com
>>> <tsunakawa.takay@fujitsu.com> wrote:
>>>> From: Amit Langote <amitlangote09@gmail.com>
>>>>> Yes, it can be simplified by using a local join to prevent the update of the foreign
>>>>> partition from being pushed to the remote server, for which my example in the
>>>>> previous email used a local trigger.  Note that the update of the foreign
>>>>> partition to be done locally is a prerequisite for this bug to occur.
>>>>
>>>> Thank you, I was aware that UPDATE calls ExecInsert() but forgot about it partway.  Good catch (and my bad miss.)
>>>
>>> It appears I had missed your reply, sorry.
>>>
>>>> +       PgFdwModifyState *fmstate = resultRelInfo->ri_FdwState ?
>>>> +                                                       (PgFdwModifyState *) resultRelInfo->ri_FdwState :
>>>> +                                                       NULL;
>>>>
>>>> This can be written as:
>>>>
>>>> +       PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
>>>
>>> Facepalm, yes.
>>>
>>> Patch updated.  Thanks for the review.
>>>
>>
>> Thanks for the patch, it seems fine to me.
> 
> Thanks for checking.
> 
>> I wonder it the commit
>> message needs some tweaks, though. At the moment it says:
>>
>>      Prevent FDW insert batching during cross-partition updates
>>
>> but what the patch seems to be doing is simply initializing the info
>> only for CMD_INSERT operations. Which does the trick, but it affects
>> everything, i.e. all updates, no? Not just cross-partition updates.
> 
> You're right.  Please check the message in the updated patch.
>

Thanks. I'm not sure I understand what "FDW may not be able to handle 
both the original update operation and the batched insert operation 
being performed at the same time" means. I mean, if we translate the 
UPDATE into DELETE+INSERT, then we don't run both the update and insert 
at the same time, right? What exactly is the problem with allowing 
batching for inserts in cross-partition updates?

On a closer look, it seems the problem actually lies in a small 
inconsistency between create_foreign_modify and ExecInitRoutingInfo. The 
former only set batch_size for CMD_INSERT while the latter called the 
BatchSize() for all operations, expecting >= 1 result. So we may either 
relax create_foreign_modify and set batch_size for all DML, or make 
ExecInitRoutingInfo stricter (which is what the patches here do).

Is there a reason not to do the first thing, allowing batching of 
inserts during cross-partition updates? I tried to do that, but it 
dawned on me that we can't mix batched and un-batched operations, e.g. 
DELETE + INSERT, because that'd break the order of execution, leading to 
bogus results in case the same row is modified repeatedly, etc.

Am I getting this right?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
Next
From: Tom Lane
Date:
Subject: Re: libpq PQresultErrorMessage vs PQerrorMessage API issue