Re: postgres_fdw: batch inserts vs. before row triggers - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: postgres_fdw: batch inserts vs. before row triggers
Date
Msg-id CAPmGK15Y=craA1T-7-35Xk-fPB7VOrh7hFRkjrqYcmZPoRACxg@mail.gmail.com
Whole thread Raw
In response to Re: postgres_fdw: batch inserts vs. before row triggers  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: postgres_fdw: batch inserts vs. before row triggers
List pgsql-hackers
On Sat, Nov 26, 2022 at 1:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't think the committed patch is acceptable at all, at least
> not in the back branches, because it creates a severe ABI break.
> Specifically, by adding a field to ResultRelInfo you have changed
> the array stride of es_result_relations, and that will break any
> previously-compiled extension code that accesses that array.

Ugh.

> I'm not terribly pleased with it having added a field to EState
> either.  That seems much more global than what we need here.

The field stores pending buffered inserts, and I added it to Estate so
that it can be shared across primary/secondary ModifyTable nodes.
(Re-)consider this:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server loopback;
create table t1 (a text, b int);
create foreign table ft1 (a text, b int) server loopback options
(table_name 't1');
create table w1 (a text, b int);
create function ft1_rowcount_trigf() returns trigger language plpgsql as
$$
begin
  raise notice '%: there are % rows in ft1',
  tg_name, (select count(*) from ft1);
  return new;
end;
$$;
create trigger ft1_rowcount_trigger before insert on w1 for each row
execute function ft1_rowcount_trigf();
alter server loopback options (add batch_size '10');

with t as (insert into w1 values ('foo', 10), ('bar', 20) returning *)
insert into ft1 select * from t;
NOTICE:  ft1_rowcount_trigger: there are 0 rows in ft1
NOTICE:  ft1_rowcount_trigger: there are 1 rows in ft1
INSERT 0 2

For this query, the primary ModifyTable node doing batch insert is
executed concurrently with the secondary ModifyTable node doing the
modifying CTE, and in the secondary ModifyTable node, any pending
buffered insert done in the primary ModifyTable node needs to be
flushed before firing the BEFORE ROW trigger, so the row is visible to
the trigger.  The field is useful for cases like this.

> Couldn't we add the field to ModifyTableState, instead?

We could probably do so, but I thought having a global list would be
more efficient to handle pending buffered inserts than that.

Anyway I will work on this further.  Thanks for looking at this!

Best regards,
Etsuro Fujita



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Avoid streaming the transaction which are skipped (in corner cases)
Next
From: Amit Kapila
Date:
Subject: Re: Avoid distributing new catalog snapshot again for the transaction being decoded.