Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id 70d5f38f-1271-234e-c154-3c5c81b5bcf2@enterprisedb.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Amit Langote <amitlangote09@gmail.com>)
Responses RE: POC: postgres_fdw insert batching  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Re: POC: postgres_fdw insert batching  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers

On 1/14/21 2:57 PM, Amit Langote wrote:
> On Thu, Jan 14, 2021 at 21:57 Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> wrote:
> 
>     On 1/14/21 9:58 AM, Amit Langote wrote:
>     > Hi,
>     >
>     > On Thu, Jan 14, 2021 at 2:41 AM Tomas Vondra
>     > <tomas.vondra@enterprisedb.com
>     <mailto:tomas.vondra@enterprisedb.com>> wrote:
>     >> On 1/13/21 3:43 PM, Tomas Vondra wrote:
>     >>> Thanks for the report. Yeah, I think there's a missing check in
>     >>> ExecInsert. Adding
>     >>>
>     >>>   (!resultRelInfo->ri_TrigDesc->trig_insert_after_row)
>     >>>
>     >>> solves this. But now I'm wondering if this is the wrong place to
>     make
>     >>> this decision. I mean, why should we make the decision here,
>     when the
>     >>> decision whether to have a RETURNING clause is made in
>     postgres_fdw in
>     >>> deparseReturningList? We don't really know what the other FDWs
>     will do,
>     >>> for example.
>     >>>
>     >>> So I think we should just move all of this into
>     GetModifyBatchSize. We
>     >>> can start with ri_BatchSize = 0. And then do
>     >>>
>     >>>   if (resultRelInfo->ri_BatchSize == 0)
>     >>>     resultRelInfo->ri_BatchSize =
>     >>>     
>      resultRelInfo->ri_FdwRoutine->GetModifyBatchSize(resultRelInfo);
>     >>>
>     >>>   if (resultRelInfo->ri_BatchSize > 1)
>     >>>   {
>     >>>     ... do batching ...
>     >>>   }
>     >>>
>     >>> The GetModifyBatchSize would always return value > 0, so either
>     1 (no
>     >>> batching) or >1 (batching).
>     >>>
>     >>
>     >> FWIW the attached v8 patch does this - most of the conditions are
>     moved
>     >> to the GetModifyBatchSize() callback.
>     >
>     > Thanks.  A few comments:
>     >
>     > * I agree with leaving it up to an FDW to look at the properties of
>     > the table and of the operation being performed to decide whether or
>     > not to use batching, although maybe BeginForeignModify() is a better
>     > place for putting that logic instead of GetModifyBatchSize()?  So, in
>     > create_foreign_modify(), instead of PgFdwModifyState.batch_size simply
>     > being set to match the table's or the server's value for the
>     > batch_size option, make it also consider the things that prevent
>     > batching and set the execution state's batch_size based on that.
>     > GetModifyBatchSize() simply returns that value.
>     >
>     > * Regarding the timing of calling GetModifyBatchSize() to set
>     > ri_BatchSize, I wonder if it wouldn't be better to call it just once,
>     > say from ExecInitModifyTable(), right after BeginForeignModify()
>     > returns?  I don't quite understand why it is being called from
>     > ExecInsert().  Can the batch size change once the execution starts?
>     >
> 
>     But it should be called just once. The idea is that initially we have
>     batch_size=0, and the fist call returns value that is >= 1. So we never
>     call it again. But maybe it could be called from BeginForeignModify, in
>     which case we'd not need this logic with first setting it to 0 etc.
> 
> 
> Right, although I was thinking that maybe ri_BatchSize itself is not to
> be written to by the FDW.  Not to say that’s doing anything wrong though.
> 
>     > * Lastly, how about calling it GetForeignModifyBatchSize() to be
>     > consistent with other nearby callbacks?
>     >
> 
>     Yeah, good point.
> 
>     >> I've removed the check for the
>     >> BatchInsert callback, though - the FDW knows whether it supports
>     that,
>     >> and it seems a bit pointless at the moment as there are no other
>     batch
>     >> callbacks. Maybe we should add an Assert somewhere, though?
>     >
>     > Hmm, not checking whether BatchInsert() exists may not be good idea,
>     > because if an FDW's GetModifyBatchSize() returns a value > 1 but
>     > there's no BatchInsert() function to call, ExecBatchInsert() would
>     > trip.  I don't see the newly added documentation telling FDW authors
>     > to either define both or none.
>     >
> 
>     Hmm. The BatchInsert check seemed somewhat unnecessary to me, but OTOH
>     it can't hurt, I guess. I'll ad it back.
> 
>     > Regarding how this plays with partitions, I don't think we need
>     > ExecGetTouchedPartitions(), because you can get the routed-to
>     > partitions using es_tuple_routing_result_relations.  Also, perhaps
> 
>     I'm not very familiar with es_tuple_routing_result_relations, but that
>     doesn't seem to work. I've replaced the flushing code at the end of
>     ExecModifyTable with a loop over es_tuple_routing_result_relations, but
>     then some of the rows are missing (i.e. not flushed).
> 
> 
> I should’ve mentioned es_opened_result_relations too which contain
> non-routing result relations.  So I really meant if (proute) then use
> es_tuple_routing_result_relations, else es_opened_result_relations. 
> This should work as long as batching is only used for inserts.
> 

Ah, right. That did the trick.

Attached is v9 with all of those tweaks, except for moving the BatchSize
call to BeginForeignModify - I tried that, but it did not seem like an
improvement, because we'd still need the checks for API callbacks in
ExecInsert for example. So I decided not to do that.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: [HACKERS] [PATCH] Generic type subscripting
Next
From: John Naylor
Date:
Subject: Re: outdated references to replication timeout