Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Jonathan S. Katz |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | 831d8428-7a62-3bc6-4e3b-93458469f6a7@postgresql.org Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Support logical replication of DDLs
|
List | pgsql-hackers |
On 2/17/23 4:15 AM, Amit Kapila wrote: > On Fri, Feb 17, 2023 at 1:13 AM Jonathan S. Katz <jkatz@postgresql.org> wrote: >> >> On 2/16/23 2:38 PM, Alvaro Herrera wrote: >>> On 2023-Feb-16, Jonathan S. Katz wrote: >>> >>>> On 2/16/23 12:53 PM, Alvaro Herrera wrote: >>> >>>>> I don't think this is the fault of logical replication. Consider that >>>>> for the backend server, the function source code is just an opaque >>>>> string that is given to the plpgsql engine to interpret. So there's no >>>>> way for the logical DDL replication engine to turn this into runnable >>>>> code if the table name is not qualified. >>>> >>>> Sure, that's fair. That said, the example above would fall under a "typical >>>> use case", i.e. I'm replicating functions that call tables without schema >>>> qualification. This is pretty common, and as logical replication becomes >>>> used for more types of workloads (e.g. high availability), we'll definitely >>>> see this. >>> >>> Hmm, I think you're saying that replay should turn check_function_bodies >>> off, and I think I agree with that. >> >> Yes, exactly. +1 >> > > But will that be sufficient? I guess such functions can give errors at > a later stage when invoked at DML or another DDL time. Consider the > following example: > > Pub: > CREATE PUBLICATION pub FOR ALL TABLES with (ddl = 'all'); > > Sub: > (Set check_function_bodies = off in postgresql.conf) > CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub; > > Pub: > CREATE FUNCTION t1(a int) RETURNS int AS $$ > select a+1; > $$ LANGUAGE sql; > CREATE FUNCTION t(a int) RETURNS int AS $$ > select t1(a); > $$ LANGUAGE sql; > CREATE TABLE tbl1 (a int primary key, b text); > create index idx on tbl1(t(a)); > > insert into tbl1 values (1,1); -- This insert on publisher causes an > error on the subscriber. Check subscriber Logs (ERROR: function > t1(integer) does not exist at character 9.) I did reproduce this as is. I also reproduced this when I rewrote the function in PL/pgSQL. I also did an experiment using PL/v8[1] where I rewrote the functions above and did two tests: one via SPI, the other via PL/v8's ability to find and call a function[2]. In both cases, the INSERT statement failed citing the inability to find the function. The calls did work when I schema-qualified. However, when I converted the SQL-only functions to use the SQL standard syntax (BEGIN ATOMIC), I did not get this error and was able to successfully use this index with the table on both publisher and subscriber. I believe this is due to the generated function body having all of the schema qualifications in it. > This happens because of the function used in the index expression. > Now, this is not the only thing, the replication can even fail during > DDL replication when the function like above is IMMUTABLE and used as > follows: ALTER TABLE tbl ADD COLUMN d int DEFAULT t(1); > > Normally, it is recommended that users can fix such errors by > schema-qualifying affected names. See commits 11da97024a and > 582edc369c. I'm very familiar with those CVEs, but even though these are our recommended best practices, there is still a lot of code that does not schema-qualify the names of functions (including many of our own examples ;) If we're going to say "You can use logical replication to replicate functions, but you have to ensure you've schema-qualified any function calls within them," I think that will prevent people from being able to use this feature, particularly on existing applications. I guess there's a connection I'm missing here. For the failing examples above, I look at the pg_proc entries on both the publisher and the subscriber and they're identical. I'm not understanding why creating and executing the functions works on the publisher, but it does not on the subscriber. What additional info would the subscriber need to be able to successfully run these functions? Would we need to pass in some additional context, e.g. what the search_path was at the time the publisher created the function? Thanks, Jonathan [1] https://plv8.github.io/ [2] https://plv8.github.io/#-code-plv8-find_function-code-
Attachment
pgsql-hackers by date: