Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Amit Langote
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id CA+HiwqEjV00Re-xXaiy3GKntj-B-M8Q1LOt=tGx62ZWzv+y0uw@mail.gmail.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: POC: postgres_fdw insert batching  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Thu, Jan 14, 2021 at 21:57 Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 1/14/21 9:58 AM, Amit Langote wrote:
> Hi,
>
> On Thu, Jan 14, 2021 at 2:41 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> On 1/13/21 3:43 PM, Tomas Vondra wrote:
>>> Thanks for the report. Yeah, I think there's a missing check in
>>> ExecInsert. Adding
>>>
>>>   (!resultRelInfo->ri_TrigDesc->trig_insert_after_row)
>>>
>>> solves this. But now I'm wondering if this is the wrong place to make
>>> this decision. I mean, why should we make the decision here, when the
>>> decision whether to have a RETURNING clause is made in postgres_fdw in
>>> deparseReturningList? We don't really know what the other FDWs will do,
>>> for example.
>>>
>>> So I think we should just move all of this into GetModifyBatchSize. We
>>> can start with ri_BatchSize = 0. And then do
>>>
>>>   if (resultRelInfo->ri_BatchSize == 0)
>>>     resultRelInfo->ri_BatchSize =
>>>       resultRelInfo->ri_FdwRoutine->GetModifyBatchSize(resultRelInfo);
>>>
>>>   if (resultRelInfo->ri_BatchSize > 1)
>>>   {
>>>     ... do batching ...
>>>   }
>>>
>>> The GetModifyBatchSize would always return value > 0, so either 1 (no
>>> batching) or >1 (batching).
>>>
>>
>> FWIW the attached v8 patch does this - most of the conditions are moved
>> to the GetModifyBatchSize() callback.
>
> Thanks.  A few comments:
>
> * I agree with leaving it up to an FDW to look at the properties of
> the table and of the operation being performed to decide whether or
> not to use batching, although maybe BeginForeignModify() is a better
> place for putting that logic instead of GetModifyBatchSize()?  So, in
> create_foreign_modify(), instead of PgFdwModifyState.batch_size simply
> being set to match the table's or the server's value for the
> batch_size option, make it also consider the things that prevent
> batching and set the execution state's batch_size based on that.
> GetModifyBatchSize() simply returns that value.
>
> * Regarding the timing of calling GetModifyBatchSize() to set
> ri_BatchSize, I wonder if it wouldn't be better to call it just once,
> say from ExecInitModifyTable(), right after BeginForeignModify()
> returns?  I don't quite understand why it is being called from
> ExecInsert().  Can the batch size change once the execution starts?
>

But it should be called just once. The idea is that initially we have
batch_size=0, and the fist call returns value that is >= 1. So we never
call it again. But maybe it could be called from BeginForeignModify, in
which case we'd not need this logic with first setting it to 0 etc.

Right, although I was thinking that maybe ri_BatchSize itself is not to be written to by the FDW.  Not to say that’s doing anything wrong though.

> * Lastly, how about calling it GetForeignModifyBatchSize() to be
> consistent with other nearby callbacks?
>

Yeah, good point.

>> I've removed the check for the
>> BatchInsert callback, though - the FDW knows whether it supports that,
>> and it seems a bit pointless at the moment as there are no other batch
>> callbacks. Maybe we should add an Assert somewhere, though?
>
> Hmm, not checking whether BatchInsert() exists may not be good idea,
> because if an FDW's GetModifyBatchSize() returns a value > 1 but
> there's no BatchInsert() function to call, ExecBatchInsert() would
> trip.  I don't see the newly added documentation telling FDW authors
> to either define both or none.
>

Hmm. The BatchInsert check seemed somewhat unnecessary to me, but OTOH
it can't hurt, I guess. I'll ad it back.

> Regarding how this plays with partitions, I don't think we need
> ExecGetTouchedPartitions(), because you can get the routed-to
> partitions using es_tuple_routing_result_relations.  Also, perhaps

I'm not very familiar with es_tuple_routing_result_relations, but that
doesn't seem to work. I've replaced the flushing code at the end of
ExecModifyTable with a loop over es_tuple_routing_result_relations, but
then some of the rows are missing (i.e. not flushed).

I should’ve mentioned es_opened_result_relations too which contain non-routing result relations.  So I really meant if (proute) then use es_tuple_routing_result_relations, else es_opened_result_relations.  This should work as long as batching is only used for inserts.


> it's a good idea to put the "finishing" ExecBatchInsert() calls into a
> function ExecFinishBatchInsert().  Maybe the logic to choose the
> relations to perform the finishing calls on will get complicated in
> the future as batching is added for updates/deletes too and it seems
> better to encapsulate that in the separate function than have it out
> in the open in ExecModifyTable().
>

IMO that'd be an over-engineering at this point. We don't need such
separate function yet, so why complicate the API? If we need it in the
future, we can add it.

Fair enough.
--

pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: cost_sort vs cost_agg
Next
From: Tomas Vondra
Date:
Subject: Re: WIP: BRIN multi-range indexes