Re: Problem with transition tables on partitioned tables with foreign-table partitions - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: Problem with transition tables on partitioned tables with foreign-table partitions |
Date | |
Msg-id | CA+HiwqGAEP8auzZmP4Q7e9p7uR=J+WqMhyzYyZqO+xLc4s_YUw@mail.gmail.com Whole thread Raw |
In response to | Re: Problem with transition tables on partitioned tables with foreign-table partitions (Etsuro Fujita <etsuro.fujita@gmail.com>) |
Responses |
Re: Problem with transition tables on partitioned tables with foreign-table partitions
|
List | pgsql-hackers |
On Wed, Jul 2, 2025 at 7:05 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: > On Tue, Jul 1, 2025 at 4:42 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Jul 1, 2025 at 11:55 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: > > > While working on something else, I noticed that while we disallow > > > transition tables on foreign tables, we allow transition tables on > > > partitioned tables with foreign-table partitions, which produces > > > incorrect results. Here is an example using postgres_fdw: > > > > > > create table parent (a text, b int) partition by list (a); > > > create table loct (a text, b int); > > > create foreign table child (a text, b int) > > > server loopback options (table_name 'loct'); > > > alter table parent attach partition child for values in ('AAA'); > > > > > > create function dump_insert() returns trigger language plpgsql as > > > $$ > > > begin > > > raise notice 'trigger = %, new table = %', > > > TG_NAME, > > > (select string_agg(new_table::text, ', ' order by a) > > > from new_table); > > > return null; > > > end; > > > $$; > > > create trigger parent_insert_trig > > > after insert on parent referencing new table as new_table > > > for each statement execute procedure dump_insert(); > > > > > > create function intercept_insert() returns trigger language plpgsql as > > > $$ > > > begin > > > new.b = new.b + 1000; > > > return new; > > > end; > > > $$; > > > create trigger intercept_insert_loct > > > before insert on loct > > > for each row execute procedure intercept_insert(); > > > > > > insert into parent values ('AAA', 42); > > > NOTICE: trigger = parent_insert_trig, new table = (AAA,42) > > > INSERT 0 1 > > > > > > The trigger shows the original tuple created by the core, not the > > > actual tuple inserted into the foreign-table partition, as > > > postgres_fdw does not collect the actual tuple, of course! > > > > Maybe I'm missing something, but given that the intercept_insert() > > function is applied during the "remote" operation, isn't it expected > > that the parent table's trigger for a "local" operation shows the > > original tuple? > > That is the question of how we define the after image of a row > inserted into a foreign table, but consider the case where the > partition is a plain table: > > create table parent (a text, b int) partition by list (a); > create table child partition of parent for values in ('AAA'); > create trigger intercept_insert_child > before insert on child > for each row execute procedure intercept_insert(); > insert into parent values ('AAA', 42); > NOTICE: trigger = parent_insert_trig, new table = (AAA,1042) > INSERT 0 1 > > The trigger shows the final tuple, not the original tuple. So from a > consistency perspective, I thought it would be good if the trigger > does so even in the case where the partition is a foreign table. Ok, but if you define the trigger on the foreign table partition (child) as follows, you do get what I think is the expected result? create trigger intercept_insert_foreign_child before insert on child for each row execute procedure intercept_insert(); insert into parent values ('AAA', 42); NOTICE: trigger = parent_insert_trig, new table = (AAA,1042) -- 2042, because row modified by both triggers table parent; a | b -----+------ AAA | 2042 (1 row) Or perhaps you're saying that the row returned by this line in ExecInsert(): /* * insert into foreign table: let the FDW do it */ slot = resultRelInfo->ri_FdwRoutine->ExecForeignInsert(estate, resultRelInfo, slot, planSlot); is not the expected "after image", and thus should not be added to the parent's transition table? IIUC, to prevent that, we now hit the following error in: void ExecARInsertTriggers(EState *estate, ResultRelInfo *relinfo, TupleTableSlot *slot, List *recheckIndexes, TransitionCaptureState *transition_capture) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; if (relinfo->ri_FdwRoutine && transition_capture && transition_capture->tcs_insert_new_table) { Assert(relinfo->ri_RootResultRelInfo); ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot collect transition tuples from child foreign tables"))); } > > > So I would > > > like to propose to fix this by the following: 1) disable using direct > > > modify to modify foreign-table partitions if there are any > > > transition-table triggers on the partitioned table, and then 2) throw > > > an error in ExecARInsertTriggers()/ExecARUpdateTriggers()/ExecARDeleteTriggers() > > > if they collects transition tuple(s) from a foreign-table partition. > > > > Is (2) intended to catch cases that occur during a foreign insert and > > foreign/non-direct update/delete? > > That is right; the patch forces the FDW to perform ExecForeign* > functions, and then throws an error in ExecAR* functions. One good > thing about this is that we are able to avoid throwing the error when > local/remote row-level BEFORE triggers return NULL. Given my question above, I’m not sure I fully understand the intention behind adding these checks. -- Thanks, Amit Langote
pgsql-hackers by date: