Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id 131a695b-9010-6751-9ea7-3dabe3555f8c@enterprisedb.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: POC: postgres_fdw insert batching  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 1/13/21 10:15 AM, Amit Langote wrote:
> Hi Tomas, Tsunakawa-san,
> 
> Thanks for your work on this.
> 
> On Tue, Jan 12, 2021 at 11:06 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> AFAICS the discussions about making this use COPY and/or libpq
>> pipelining (neither of which is committed yet) ended with the conclusion
>> that those changes are somewhat independent, and that it's worth getting
>> this committed in the current form. Barring objections, I'll push this
>> within the next couple days.
> 
> I was trying this out today (been meaning to do so for a while) and
> noticed that this fails when there are AFTER ROW triggers on the
> foreign table.  Here's an example:
> 
> create extension postgres_fdw ;
> create server lb foreign data wrapper postgres_fdw ;
> create user mapping for current_user server lb;
> create table p (a numeric primary key);
> create foreign table fp (a int) server lb options (table_name 'p');
> create function print_row () returns trigger as $$ begin raise notice
> '%', new; return null; end; $$ language plpgsql;
> create trigger after_insert_trig after insert on fp for each row
> execute function print_row();
> insert into fp select generate_series (1, 10);
> <crashes>
> 
> Apparently, the new code seems to assume that batching wouldn't be
> active when the original query contains RETURNING clause but some
> parts fail to account for the case where RETURNING is added to the
> query to retrieve the tuple to pass to the AFTER TRIGGER.
> Specifically, the Assert in the following block in
> execute_foreign_modify() is problematic:
> 
>     /* Check number of rows affected, and fetch RETURNING tuple if any */
>     if (fmstate->has_returning)
>     {
>         Assert(*numSlots == 1);
>         n_rows = PQntuples(res);
>         if (n_rows > 0)
>             store_returning_result(fmstate, slots[0], res);
>     }
> 

Thanks for the report. Yeah, I think there's a missing check in
ExecInsert. Adding

  (!resultRelInfo->ri_TrigDesc->trig_insert_after_row)

solves this. But now I'm wondering if this is the wrong place to make
this decision. I mean, why should we make the decision here, when the
decision whether to have a RETURNING clause is made in postgres_fdw in
deparseReturningList? We don't really know what the other FDWs will do,
for example.

So I think we should just move all of this into GetModifyBatchSize. We
can start with ri_BatchSize = 0. And then do

  if (resultRelInfo->ri_BatchSize == 0)
    resultRelInfo->ri_BatchSize =
      resultRelInfo->ri_FdwRoutine->GetModifyBatchSize(resultRelInfo);

  if (resultRelInfo->ri_BatchSize > 1)
  {
    ... do batching ...
  }

The GetModifyBatchSize would always return value > 0, so either 1 (no
batching) or >1 (batching).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: outdated references to replication timeout
Next
From: Sehrope Sarkuni
Date:
Subject: Re: Moving other hex functions to /common