Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CAA4eK1JVynFsj+mcRWj9sewR2yNUs6LuNxJ0eN-gNJ83oKcUOQ@mail.gmail.com Whole thread Raw |
In response to | RE: Support logical replication of DDLs ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>) |
List | pgsql-hackers |
On Thu, Jun 9, 2022 at 5:14 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > Hi, > > I did some research for one potential problem[1] mentioned earlier which is related > to the function execution when replicating DDL. > > [1]> 4. Statements that have nondeterministic side effects (e.g., as caused > > by triggers, stored procedures, user-defined functions) may result in > > different side effects occurring on each subscriber. > > > > Think how to handle triggers and functions with same name but different > > purpose. > > > > Examples: > ALTER TABLE ADD CONSTRAINT func() > ALTER TABLE ADD COLUMN DEFAULT func() > CREATE TRIGGER ... execute procedure func() > ... > > When replication the above DDLs, there are some cases we need to think about. > > ---------------- > 1) The functions used in DDL have same definition among pub/sub. > > In this case, if the functions include DML/DDL operations, it will result in > unexpected behavior. > > For example: > > --- both pub and sub have the same function test_func(). > create function test_func() > returns integer as ' > begin > CREATE TABLE test(a int); > INSERT INTO test values(1); > return 0; > end > ' language plpgsql; > > --- We replicate the following DDL > ALTER TABLE t ADD COLUMN a int DEFAULT test_func(); > > There are three SQLs that would be replicated to the subscriber: > CREATE TABLE test(a int); > INSERT(1); > ALTER TABLE t ADD COLUMN a int DEFAULT test_func(); > > Then, we would create the table "test" twice and insert same value twice(first by > executing DEFAULT function, second by replaying the CREATE TABLE and INSERT > command) on subcriber. > The other kind of problem is that we don't know whether the tables being accessed by these DDL/DML are published or not. So blindly allowing such functions can allow unintended clauses like the tables accessed in those functions may not even exist on the subscriber-side. > One possible idea is that we only allow replicating 'immutable/stable' function > which won't include write action so that we don't have this problem. But it > seems a bit restrictive. > > ---------------- > > 2) The functions used in DDL have different definitions among pub/sub. > > I am not sure how to handle this case as this could result in unpredictable > behavior based on the different definitions. And it's difficult to compare the > function definition among pub/sub because the function could call other > functions nested. > > OTOH, the current behavior of this case on HEAD is that we don't check the > consistency of the functions executed on pub/sub. For example: the row triggers > will always be fired on subscriber(when enabled) without checking if the same > trigger exists on publisher. So, it might be acceptable if we document this > restriction, although I am not sure. > > ******************* > > - About the solution for the above two points. I think one solution could be: > > We can document that user should make sure the DDL to be replicated should not > execute any function which could execute DML/DDL. This seems acceptable as it's > a pretty rare case to execute DML/DDL in a CONSTRAINT function or DEFAULT > function. And the document[1] already suggest similar thing for CONSTRAINT > function. Besides, we can also document that and the functions should be > defined in a consistent way among pub/sub. > > [1] https://www.postgresql.org/docs/devel/ddl-constraints.html > > PostgreSQL assumes that CHECK constraints' conditions are immutable, that is, > > they will always give the same result for the same input row. This assumption > > is what justifies examining CHECK constraints only when rows are inserted or > > updated, and not at other times. (The warning above about not referencing > > other table data is really a special case of this restriction.) > > - Another solution could be: > > We coud introduce a new function flag called (replica_safety) and the values > could be 'safe'/'unsafe'. 'safe' indicates that it's safe to be replicated to > subscriber and it's safe to be executed when replay the DDL from other > publisher. > Yeah, something like this could be a good idea but I think for the first version we should simply raise a WARNING for such DDLs indicating that they won't be replicated. -- With Regards, Amit Kapila.
pgsql-hackers by date: