Thread: postgres_fdw: batch inserts vs. before row triggers

postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

17 April 2022, 09:20:48

Hi,
(I added Tomas in CC:.)

One thing I noticed while reviewing the patch for fast copying into
foreign tables/partitions using batch insert [1] is that in
postgres_fdw we allow batch-inserting into foreign tables/partitions
with before row triggers, but such triggers might query the target
table/partition and act differently if the tuples that have already
been processed and prepared for batch-insertion are not there.  Here
is an example using HEAD:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server loopback;
create table t (a int);
create foreign table ft (a int) server loopback options (table_name 't');
create function ft_rowcount_tf() returns trigger as $$ begin raise
notice '%: rows = %', tg_name, (select count(*) from ft); return new;
end; $$ language plpgsql;
create trigger ft_rowcount before insert on ft for each row execute
function ft_rowcount_tf();

insert into ft select i from generate_series(1, 10) i;
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 1
NOTICE:  ft_rowcount: rows = 2
NOTICE:  ft_rowcount: rows = 3
NOTICE:  ft_rowcount: rows = 4
NOTICE:  ft_rowcount: rows = 5
NOTICE:  ft_rowcount: rows = 6
NOTICE:  ft_rowcount: rows = 7
NOTICE:  ft_rowcount: rows = 8
NOTICE:  ft_rowcount: rows = 9
INSERT 0 10

This looks good, but when batch insert is enabled, the trigger
produces incorrect results:

alter foreign table ft options (add batch_size '10');
delete from ft;

insert into ft select i from generate_series(1, 10) i;
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
NOTICE:  ft_rowcount: rows = 0
INSERT 0 10

So I think we should disable batch insert in such cases, just as we
disable multi insert when there are any before row triggers on the
target (local) tables/partitions in copyfrom.c.  Attached is a patch
for that.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/bc489202-9855-7550-d64c-ad2d83c24867%40postgrespro.ru

Attachment

postgres_fdw-disable-batching.patch

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

19 April 2022, 09:16:03

On Sun, Apr 17, 2022 at 6:20 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> Here
> is an example using HEAD:
>
> create extension postgres_fdw;
> create server loopback foreign data wrapper postgres_fdw options
> (dbname 'postgres');
> create user mapping for current_user server loopback;
> create table t (a int);
> create foreign table ft (a int) server loopback options (table_name 't');
> create function ft_rowcount_tf() returns trigger as $$ begin raise
> notice '%: rows = %', tg_name, (select count(*) from ft); return new;
> end; $$ language plpgsql;
> create trigger ft_rowcount before insert on ft for each row execute
> function ft_rowcount_tf();
>
> insert into ft select i from generate_series(1, 10) i;
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 1
> NOTICE:  ft_rowcount: rows = 2
> NOTICE:  ft_rowcount: rows = 3
> NOTICE:  ft_rowcount: rows = 4
> NOTICE:  ft_rowcount: rows = 5
> NOTICE:  ft_rowcount: rows = 6
> NOTICE:  ft_rowcount: rows = 7
> NOTICE:  ft_rowcount: rows = 8
> NOTICE:  ft_rowcount: rows = 9
> INSERT 0 10
>
> This looks good, but when batch insert is enabled, the trigger
> produces incorrect results:
>
> alter foreign table ft options (add batch_size '10');
> delete from ft;
>
> insert into ft select i from generate_series(1, 10) i;
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> INSERT 0 10

Actually, the results are correct, as we do batch-insert here.  But I
just wanted to show that the trigger behaves *differently* when doing
batch-insert.

> So I think we should disable batch insert in such cases, just as we
> disable multi insert when there are any before row triggers on the
> target (local) tables/partitions in copyfrom.c.  Attached is a patch
> for that.

If there are no objections from Tomas or anyone else, I'll commit the patch.

Best regards,
Etsuro Fujita

Re: postgres_fdw: batch inserts vs. before row triggers

From

Tomas Vondra

Date:

19 April 2022, 12:00:25


On 4/19/22 11:16, Etsuro Fujita wrote:
> On Sun, Apr 17, 2022 at 6:20 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> Here
>> is an example using HEAD:
>>
>> create extension postgres_fdw;
>> create server loopback foreign data wrapper postgres_fdw options
>> (dbname 'postgres');
>> create user mapping for current_user server loopback;
>> create table t (a int);
>> create foreign table ft (a int) server loopback options (table_name 't');
>> create function ft_rowcount_tf() returns trigger as $$ begin raise
>> notice '%: rows = %', tg_name, (select count(*) from ft); return new;
>> end; $$ language plpgsql;
>> create trigger ft_rowcount before insert on ft for each row execute
>> function ft_rowcount_tf();
>>
>> insert into ft select i from generate_series(1, 10) i;
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 1
>> NOTICE:  ft_rowcount: rows = 2
>> NOTICE:  ft_rowcount: rows = 3
>> NOTICE:  ft_rowcount: rows = 4
>> NOTICE:  ft_rowcount: rows = 5
>> NOTICE:  ft_rowcount: rows = 6
>> NOTICE:  ft_rowcount: rows = 7
>> NOTICE:  ft_rowcount: rows = 8
>> NOTICE:  ft_rowcount: rows = 9
>> INSERT 0 10
>>
>> This looks good, but when batch insert is enabled, the trigger
>> produces incorrect results:
>>
>> alter foreign table ft options (add batch_size '10');
>> delete from ft;
>>
>> insert into ft select i from generate_series(1, 10) i;
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> INSERT 0 10
> 
> Actually, the results are correct, as we do batch-insert here.  But I
> just wanted to show that the trigger behaves *differently* when doing
> batch-insert.
> 
>> So I think we should disable batch insert in such cases, just as we
>> disable multi insert when there are any before row triggers on the
>> target (local) tables/partitions in copyfrom.c.  Attached is a patch
>> for that.
> 
> If there are no objections from Tomas or anyone else, I'll commit the patch.
> 

+1, I think it's a bug to do batch insert in this case.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

21 April 2022, 06:49:45

Hi,

On Tue, Apr 19, 2022 at 9:00 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> On 4/19/22 11:16, Etsuro Fujita wrote:
> > On Sun, Apr 17, 2022 at 6:20 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> >> So I think we should disable batch insert in such cases, just as we
> >> disable multi insert when there are any before row triggers on the
> >> target (local) tables/partitions in copyfrom.c.  Attached is a patch
> >> for that.
> >
> > If there are no objections from Tomas or anyone else, I'll commit the patch.

> +1, I think it's a bug to do batch insert in this case.

Pushed and back-patched to v14, after tweaking a comment a little bit.

Thanks!

Best regards,
Etsuro Fujita

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

03 August 2022, 05:24:48

Hi,

Another thing I noticed while working on the "Fast COPY FROM based on
batch insert" patch is: batch inserts vs. WITH CHECK OPTION
constraints from parent views.  Here is an example on a normal build
producing incorrect results.

CREATE TABLE base_tbl (a int, b int);
CREATE FUNCTION row_before_insert_trigfunc() RETURNS trigger AS
$$BEGIN NEW.a := NEW.a + 10; RETURN NEW; END$$ LANGUAGE plpgsql;
CREATE TRIGGER row_before_insert_trigger BEFORE INSERT ON base_tbl FOR
EACH ROW EXECUTE PROCEDURE row_before_insert_trigfunc();
CREATE FOREIGN TABLE foreign_tbl (a int, b int) SERVER loopback
OPTIONS (table_name 'base_tbl');
CREATE VIEW rw_view AS SELECT * FROM foreign_tbl WHERE a < b WITH CHECK OPTION;
ALTER SERVER loopback OPTIONS (ADD batch_size '10');

EXPLAIN VERBOSE INSERT INTO rw_view VALUES (0, 15), (0, 5);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Insert on public.foreign_tbl  (cost=0.00..0.03 rows=0 width=0)
   Remote SQL: INSERT INTO public.base_tbl(a, b) VALUES ($1, $2) RETURNING a, b
   Batch Size: 10
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=8)
         Output: "*VALUES*".column1, "*VALUES*".column2
(5 rows)

INSERT INTO rw_view VALUES (0, 15), (0, 5);
INSERT 0 2

This isn't correct; the INSERT query should abort because the
second-inserted row violates the WCO constraint as it is changed to
(10, 5) by the BEFORE ROW trigger.

Also, the query caused an assertion failure on an assert-enabled build, like:

TRAP: FailedAssertion("*numSlots == 1", File: "postgres_fdw.c", Line:
4164, PID: 7775)

I think the root cause for these is that WCO constraints are enforced
locally, but in batch-insert mode postgres_fdw cannot currently
retrieve the data needed to enforce such constraints locally that was
actually inserted on the remote side (except for the first-inserted
row).  And I think this leads to the incorrect results on the normal
build as the WCO constraint is enforced with the data passed from the
core for the second-inserted row, and leads to the assertion failure
on the assert-enabled build.

To fix, I modified postgresGetForeignModifyBatchSize() to disable
batch insert when there are any such constraints, like when there are
any AFTER ROW triggers on the foreign table.  Attached is a patch for
that.

If there are no objections, I'll commit the patch.

Best regards,
Etsuro Fujita

Attachment

postgres_fdw-disable-batching-further.patch

Re: postgres_fdw: batch inserts vs. before row triggers

From

Matthias van de Meent

Date:

03 August 2022, 21:52:02

On Tue, 19 Apr 2022 at 14:00, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 4/19/22 11:16, Etsuro Fujita wrote:
> > On Sun, Apr 17, 2022 at 6:20 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> >> Here
> >> is an example using HEAD:
> >>
> >> create extension postgres_fdw;
> >> create server loopback foreign data wrapper postgres_fdw options
> >> (dbname 'postgres');
> >> create user mapping for current_user server loopback;
> >> create table t (a int);
> >> create foreign table ft (a int) server loopback options (table_name 't');
> >> create function ft_rowcount_tf() returns trigger as $$ begin raise
> >> notice '%: rows = %', tg_name, (select count(*) from ft); return new;
> >> end; $$ language plpgsql;
> >> create trigger ft_rowcount before insert on ft for each row execute
> >> function ft_rowcount_tf();
> >>
> >> insert into ft select i from generate_series(1, 10) i;
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 1
> >> NOTICE:  ft_rowcount: rows = 2
> >> NOTICE:  ft_rowcount: rows = 3
> >> NOTICE:  ft_rowcount: rows = 4
> >> NOTICE:  ft_rowcount: rows = 5
> >> NOTICE:  ft_rowcount: rows = 6
> >> NOTICE:  ft_rowcount: rows = 7
> >> NOTICE:  ft_rowcount: rows = 8
> >> NOTICE:  ft_rowcount: rows = 9
> >> INSERT 0 10
> >>
> >> This looks good, but when batch insert is enabled, the trigger
> >> produces incorrect results:
> >>
> >> alter foreign table ft options (add batch_size '10');
> >> delete from ft;
> >>
> >> insert into ft select i from generate_series(1, 10) i;
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> NOTICE:  ft_rowcount: rows = 0
> >> INSERT 0 10
> >
> > Actually, the results are correct, as we do batch-insert here.  But I
> > just wanted to show that the trigger behaves *differently* when doing
> > batch-insert.
>
> +1, I think it's a bug to do batch insert in this case.

I don't have a current version of the SQL spec, but one preliminary
version of SQL:2012 I retrieved via the wiki details that all BEFORE
triggers on INSERT/UPDATE/DELETE statements are all executed before
_any_ of that statements' affected data is modified.

See the "SQL:2011 (preliminary)" document you can grab on the wiki,
Part 2: for INSERT, in 15.10 (4) the BEFORE triggers on the changeset
are executed, and only after that in section 15.10 (5)(c) the
changeset is inserted into the target table. During the BEFORE-trigger
this table does not contain the rows of the changeset, thus a count(*)
on that table would result in a single value for all the BEFORE
triggers triggered on that statement, regardless of the FOR EACH ROW
specifier. The sections for DELETE are 15.7 (6) and 15.7 (7); and for
UPDATE 15.13(7) and 15.13(9) respectively.

I don't know about the semantics of triggers in the latest SQL
standard versions, but based on that sample it seems like we're
non-compliant on BEFORE trigger behaviour, and it doesn't seem like
it's documented in the trigger documentation.

I seem to recall a mail on this topic (changes in trigger execution
order with respect to the DML it is triggered by in the newest SQL
spec) but I can't seem to find that thread.

Kind regards,

Matthias van de Meent

Re: postgres_fdw: batch inserts vs. before row triggers

From

Tom Lane

Date:

03 August 2022, 21:57:57

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> I don't have a current version of the SQL spec, but one preliminary
> version of SQL:2012 I retrieved via the wiki details that all BEFORE
> triggers on INSERT/UPDATE/DELETE statements are all executed before
> _any_ of that statements' affected data is modified.
> ...
> I don't know about the semantics of triggers in the latest SQL
> standard versions, but based on that sample it seems like we're
> non-compliant on BEFORE trigger behaviour, and it doesn't seem like
> it's documented in the trigger documentation.

I think we're compliant if you declare the trigger functions as
stable (or immutable, but in any case where this matters, I think
you'd be lying).  They'll then run with the snapshot of the calling
query, in which those updates are not yet visible.

This is documented somewhere, but maybe not anywhere near triggers.

            regards, tom lane

Re: postgres_fdw: batch inserts vs. before row triggers

From

Matthias van de Meent

Date:

03 August 2022, 22:35:13

On Wed, 3 Aug 2022 at 23:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > I don't have a current version of the SQL spec, but one preliminary
> > version of SQL:2012 I retrieved via the wiki details that all BEFORE
> > triggers on INSERT/UPDATE/DELETE statements are all executed before
> > _any_ of that statements' affected data is modified.
> > ...
> > I don't know about the semantics of triggers in the latest SQL
> > standard versions, but based on that sample it seems like we're
> > non-compliant on BEFORE trigger behaviour, and it doesn't seem like
> > it's documented in the trigger documentation.
>
> I think we're compliant if you declare the trigger functions as
> stable (or immutable, but in any case where this matters, I think
> you'd be lying).  They'll then run with the snapshot of the calling
> query, in which those updates are not yet visible.
>
> This is documented somewhere, but maybe not anywhere near triggers.

Thank you for this pointer.

Looking around a bit, it seems like this behaviour for functions is
indeed documented in xfunc.sgml, but rendered docs page [0] does not
seem to mention triggers, nor does the triggers page link to that part
of the xfunc document. This makes it quite easy to overlook that this
is expected (?) behaviour for VOLATILE functions only.

Kind regards,

Matthias van de Meent

[0] https://www.postgresql.org/docs/current/xfunc-volatility.html

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

05 August 2022, 08:36:22

On Wed, Aug 3, 2022 at 2:24 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> To fix, I modified postgresGetForeignModifyBatchSize() to disable
> batch insert when there are any such constraints, like when there are
> any AFTER ROW triggers on the foreign table.  Attached is a patch for
> that.
>
> If there are no objections, I'll commit the patch.

Pushed after modifying the patch a bit so that in that function the
WCO test in the if test is done before the trigger test, as the former
would be cheaper than the latter.

Best regards,
Etsuro Fujita

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

18 November 2022, 11:46:59

Hi,

While working on something else, I notice some more oddities.  Here is
an example:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server loopback;
create table t1 (a text, b int);
create foreign table ft1 (a text, b int) server loopback options
(table_name 't1');
create table w1 (a text, b int);
create function ft1_rowcount_trigf() returns trigger language plpgsql
as $$ begin raise notice '%: there are % rows in ft1', tg_name,
(select count(*) from ft1); return new; end; $$;
create trigger ft1_rowcount_trigger before insert on w1 for each row
execute function ft1_rowcount_trigf();
alter server loopback options (add batch_size '10');

with t as (insert into w1 values ('foo', 10), ('bar', 20) returning *)
insert into ft1 select * from t;
NOTICE:  ft1_rowcount_trigger: there are 0 rows in ft1
NOTICE:  ft1_rowcount_trigger: there are 0 rows in ft1
INSERT 0 2

The command tag shows that two rows were inserted into ft1, but:

select * from ft1;
  a  | b
-----+----
 foo | 10
 bar | 20
 foo | 10
 bar | 20
(4 rows)

ft1 has four rows, which is wrong.  Also, when inserting the second
row (‘bar’, 20) into w1, the BEFORE ROW INSERT trigger should see the
first row (‘foo’, 10) in ft1, but it reports no rows were visible
there.

The reason for the former is that this bit added by commit b663a4136
is done not only when running the primary ModifyTable node but when
running the secondary ModifyTable node (with the wrong
ModifyTableState).

    /*
     * Insert remaining tuples for batch insert.
     */
    if (proute)
        relinfos = estate->es_tuple_routing_result_relations;
    else
        relinfos = estate->es_opened_result_relations;

    foreach(lc, relinfos)
    {
        resultRelInfo = lfirst(lc);
        if (resultRelInfo->ri_NumSlots > 0)
            ExecBatchInsert(node, resultRelInfo,
                            resultRelInfo->ri_Slots,
                            resultRelInfo->ri_PlanSlots,
                            resultRelInfo->ri_NumSlots,
                            estate, node->canSetTag);
    }

The reason for the latter is that that commit fails to flush pending
inserts before executing any BEFORE ROW triggers, so that rows are
visible to such triggers.

Attached is a patch for fixing these issues.  In the patch I added to
the EState struct a List member es_insert_pending_result_relations to
store ResultRelInfos for foreign tables on which batch inserts are to
be performed, so that we avoid scanning through
es_tuple_routing_result_relations or es_opened_result_relations each
time when flushing pending inserts to the foreign tables.

Best regards,
Etsuro Fujita

Attachment

fix-handling-of-pending-inserts.patch

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

24 November 2022, 11:19:10

On Fri, Nov 18, 2022 at 8:46 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> Attached is a patch for fixing these issues.

Here is an updated patch.  In the attached, I added an assertion to
ExecInsert().  Also, I tweaked comments and test cases a little bit,
for consistency.  Also, I noticed a copy-and-pasteo in a comment in
ExecBatchInsert(), so I fixed it as well.

Barring objections, I will commit the patch.

Best regards,
Etsuro Fujita

Attachment

fix-handling-of-pending-inserts-2.patch

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

25 November 2022, 08:59:22

On Thu, Nov 24, 2022 at 8:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> Here is an updated patch.  In the attached, I added an assertion to
> ExecInsert().  Also, I tweaked comments and test cases a little bit,
> for consistency.  Also, I noticed a copy-and-pasteo in a comment in
> ExecBatchInsert(), so I fixed it as well.
>
> Barring objections, I will commit the patch.

I have committed the patch.

Best regards,
Etsuro Fujita

Re: postgres_fdw: batch inserts vs. before row triggers

From

Tom Lane

Date:

25 November 2022, 16:57:33

Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> I have committed the patch.

Apologies for not having paid attention to this thread, but ...

I don't think the committed patch is acceptable at all, at least
not in the back branches, because it creates a severe ABI break.
Specifically, by adding a field to ResultRelInfo you have changed
the array stride of es_result_relations, and that will break any
previously-compiled extension code that accesses that array.

I'm not terribly pleased with it having added a field to EState
either.  That seems much more global than what we need here.
Couldn't we add the field to ModifyTableState, instead?

            regards, tom lane

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

26 November 2022, 11:38:11

On Sat, Nov 26, 2022 at 1:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't think the committed patch is acceptable at all, at least
> not in the back branches, because it creates a severe ABI break.
> Specifically, by adding a field to ResultRelInfo you have changed
> the array stride of es_result_relations, and that will break any
> previously-compiled extension code that accesses that array.

Ugh.

> I'm not terribly pleased with it having added a field to EState
> either.  That seems much more global than what we need here.

The field stores pending buffered inserts, and I added it to Estate so
that it can be shared across primary/secondary ModifyTable nodes.
(Re-)consider this:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres');
create user mapping for current_user server loopback;
create table t1 (a text, b int);
create foreign table ft1 (a text, b int) server loopback options
(table_name 't1');
create table w1 (a text, b int);
create function ft1_rowcount_trigf() returns trigger language plpgsql as
$$
begin
  raise notice '%: there are % rows in ft1',
  tg_name, (select count(*) from ft1);
  return new;
end;
$$;
create trigger ft1_rowcount_trigger before insert on w1 for each row
execute function ft1_rowcount_trigf();
alter server loopback options (add batch_size '10');

with t as (insert into w1 values ('foo', 10), ('bar', 20) returning *)
insert into ft1 select * from t;
NOTICE:  ft1_rowcount_trigger: there are 0 rows in ft1
NOTICE:  ft1_rowcount_trigger: there are 1 rows in ft1
INSERT 0 2

For this query, the primary ModifyTable node doing batch insert is
executed concurrently with the secondary ModifyTable node doing the
modifying CTE, and in the secondary ModifyTable node, any pending
buffered insert done in the primary ModifyTable node needs to be
flushed before firing the BEFORE ROW trigger, so the row is visible to
the trigger.  The field is useful for cases like this.

> Couldn't we add the field to ModifyTableState, instead?

We could probably do so, but I thought having a global list would be
more efficient to handle pending buffered inserts than that.

Anyway I will work on this further.  Thanks for looking at this!

Best regards,
Etsuro Fujita

Re: postgres_fdw: batch inserts vs. before row triggers

From

Tom Lane

Date:

26 November 2022, 15:11:15

Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> On Sat, Nov 26, 2022 at 1:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Couldn't we add the field to ModifyTableState, instead?

> We could probably do so, but I thought having a global list would be
> more efficient to handle pending buffered inserts than that.

OK, as long as there's a reason for doing it that way, it's OK
by me.  I don't think that adding a field at the end of EState
is an ABI problem.

We have to do something else than add to ResultRelInfo, though.

            regards, tom lane

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

02 December 2022, 07:54:34

On Sun, Nov 27, 2022 at 12:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> > On Sat, Nov 26, 2022 at 1:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Couldn't we add the field to ModifyTableState, instead?
>
> > We could probably do so, but I thought having a global list would be
> > more efficient to handle pending buffered inserts than that.
>
> OK, as long as there's a reason for doing it that way, it's OK
> by me.  I don't think that adding a field at the end of EState
> is an ABI problem.
>
> We have to do something else than add to ResultRelInfo, though.

OK, I removed from ResultRelInfo a field that I added in the commit to
save the owning ModifyTableState if insert-pending, and added to
EState another List member to save such ModifyTableStates, instead.  I
am planning to apply this to not only back branches but HEAD, to make
back-patching easy, if there are no objections.

Best regards,
Etsuro Fujita

Attachment

Avoid-ABI-break.patch

Re: postgres_fdw: batch inserts vs. before row triggers

From

Etsuro Fujita

Date:

08 December 2022, 07:39:31

On Fri, Dec 2, 2022 at 4:54 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Sun, Nov 27, 2022 at 12:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > OK, as long as there's a reason for doing it that way, it's OK
> > by me.  I don't think that adding a field at the end of EState
> > is an ABI problem.
> >
> > We have to do something else than add to ResultRelInfo, though.
>
> OK, I removed from ResultRelInfo a field that I added in the commit to
> save the owning ModifyTableState if insert-pending, and added to
> EState another List member to save such ModifyTableStates, instead.  I
> am planning to apply this to not only back branches but HEAD, to make
> back-patching easy, if there are no objections.

There seems to be no objection, so pushed.

Best regards,
Etsuro Fujita