Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From Jonathan S. Katz
Subject Re: Support logical replication of DDLs
Date
Msg-id e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org
Whole thread Raw
In response to Re: Support logical replication of DDLs  ("Jonathan S. Katz" <jkatz@postgresql.org>)
Responses Re: Support logical replication of DDLs
List pgsql-hackers
On 2/16/23 2:43 PM, Jonathan S. Katz wrote:
> On 2/16/23 2:38 PM, Alvaro Herrera wrote:
>> On 2023-Feb-16, Jonathan S. Katz wrote:
>>
>>> On 2/16/23 12:53 PM, Alvaro Herrera wrote:
>>
>>>> I don't think this is the fault of logical replication.  Consider that
>>>> for the backend server, the function source code is just an opaque
>>>> string that is given to the plpgsql engine to interpret.  So there's no
>>>> way for the logical DDL replication engine to turn this into runnable
>>>> code if the table name is not qualified.
>>>
>>> Sure, that's fair. That said, the example above would fall under a 
>>> "typical
>>> use case", i.e. I'm replicating functions that call tables without 
>>> schema
>>> qualification. This is pretty common, and as logical replication becomes
>>> used for more types of workloads (e.g. high availability), we'll 
>>> definitely
>>> see this.
>>
>> Hmm, I think you're saying that replay should turn check_function_bodies
>> off, and I think I agree with that.
> 
> Yes, exactly. +1

I drilled into this a bit more using the SQL standard bodies (BEGIN 
ATOMIC) to see if there were any other behaviors we needed to account 
for. Overall, it worked well but I ran into one issue.

First, functions with "BEGIN ATOMIC" ignores "check_function_bodies" 
which is by design based on how this feature works. We should still turn 
"check_function_bodies" to "off" though, per above discussion.

In the context of DDL replication, "BEGIN ATOMIC" does support 
schema-unqualified functions, presumably because it includes the parsed 
content?

I created an updated example[1] where I converted the SQL functions to 
use the standard syntax and I returned the table names to be schema 
unqualified. This seemed to work, but I ran into a weird case with this 
function:

CREATE OR REPLACE FUNCTION public.calendar_manage(room_id int, 
calendar_date date)
RETURNS void
LANGUAGE SQL
BEGIN ATOMIC
     WITH delete_calendar AS (
         DELETE FROM calendar
         WHERE
             room_id = $1 AND
             calendar_date = $2
     )
     INSERT INTO calendar (room_id, status, calendar_date, calendar_range)
     SELECT $1, c.status, $2, c.calendar_range
     FROM calendar_generate_calendar($1, tstzrange($2, $2 + 1)) c;
END;

This produced an error on the subscriber, with the following message:

2023-02-16 20:58:24.096 UTC [26864] ERROR:  missing FROM-clause entry 
for table "calendar_1" at character 322
2023-02-16 20:58:24.096 UTC [26864] CONTEXT:  processing remote data for 
replication origin "pg_18658" during message type "DDL" in transaction 
980, finished at 0/C099A7D8
2023-02-16 20:58:24.096 UTC [26864] STATEMENT:  CREATE OR REPLACE 
FUNCTION public.calendar_manage ( IN room_id pg_catalog.int4, IN 
calendar_date pg_catalog.date ) RETURNS pg_catalog.void LANGUAGE sql 
VOLATILE PARALLEL UNSAFE CALLED ON NULL INPUT SECURITY INVOKER COST 100 
BEGIN ATOMIC
     WITH delete_calendar AS (
              DELETE FROM public.calendar
               WHERE ((calendar_1.room_id OPERATOR(pg_catalog.=) 
calendar_manage.room_id) AND (calendar_1.calendar_date 
OPERATOR(pg_catalog.=) calendar_manage.calendar_date))
             )
      INSERT INTO public.calendar (room_id, status, calendar_date, 
calendar_range)  SELECT calendar_manage.room_id,
                 c.status,
                 calendar_manage.calendar_date,
                 c.calendar_range
                FROM 
public.calendar_generate_calendar(calendar_manage.room_id, 
pg_catalog.tstzrange((calendar_manage.calendar_date)::timestamp with 
time zone, ((calendar_manage.calendar_date OPERATOR(pg_catalog.+) 
1))::timestamp with time zone)) c(status, calendar_range);
    END

This seemed to add an additional, incorrect reference to the origin 
table for the "room_id" and "calendar_date" attributes within the CTE of 
this function. I don't know if this is directly related to the DDL 
replication patch, but reporting it as I triggered the behavior through it.

Thanks,

Jonathan

[1] https://gist.github.com/jkatz/fe29006b724fd6f32ee849a96dc01608


Attachment

pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: recovery modules
Next
From: Andres Freund
Date:
Subject: Re: recovery modules