Re: Support logical replication of DDLs - Mailing list pgsql-hackers

From vignesh C
Subject Re: Support logical replication of DDLs
Date
Msg-id CALDaNm1CMnLfZRSNUxTZrAZpjKYhx4Pz9vZ-296eLGp8wi8Bzw@mail.gmail.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  ("Jonathan S. Katz" <jkatz@postgresql.org>)
List pgsql-hackers
On Fri, 17 Feb 2023 at 02:38, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>
> On 2/16/23 2:43 PM, Jonathan S. Katz wrote:
> > On 2/16/23 2:38 PM, Alvaro Herrera wrote:
> >> On 2023-Feb-16, Jonathan S. Katz wrote:
> >>
> >>> On 2/16/23 12:53 PM, Alvaro Herrera wrote:
> >>
> >>>> I don't think this is the fault of logical replication.  Consider that
> >>>> for the backend server, the function source code is just an opaque
> >>>> string that is given to the plpgsql engine to interpret.  So there's no
> >>>> way for the logical DDL replication engine to turn this into runnable
> >>>> code if the table name is not qualified.
> >>>
> >>> Sure, that's fair. That said, the example above would fall under a
> >>> "typical
> >>> use case", i.e. I'm replicating functions that call tables without
> >>> schema
> >>> qualification. This is pretty common, and as logical replication becomes
> >>> used for more types of workloads (e.g. high availability), we'll
> >>> definitely
> >>> see this.
> >>
> >> Hmm, I think you're saying that replay should turn check_function_bodies
> >> off, and I think I agree with that.
> >
> > Yes, exactly. +1
>
> I drilled into this a bit more using the SQL standard bodies (BEGIN
> ATOMIC) to see if there were any other behaviors we needed to account
> for. Overall, it worked well but I ran into one issue.
>
> First, functions with "BEGIN ATOMIC" ignores "check_function_bodies"
> which is by design based on how this feature works. We should still turn
> "check_function_bodies" to "off" though, per above discussion.
>
> In the context of DDL replication, "BEGIN ATOMIC" does support
> schema-unqualified functions, presumably because it includes the parsed
> content?
>
> I created an updated example[1] where I converted the SQL functions to
> use the standard syntax and I returned the table names to be schema
> unqualified. This seemed to work, but I ran into a weird case with this
> function:
>
> CREATE OR REPLACE FUNCTION public.calendar_manage(room_id int,
> calendar_date date)
> RETURNS void
> LANGUAGE SQL
> BEGIN ATOMIC
>      WITH delete_calendar AS (
>          DELETE FROM calendar
>          WHERE
>              room_id = $1 AND
>              calendar_date = $2
>      )
>      INSERT INTO calendar (room_id, status, calendar_date, calendar_range)
>      SELECT $1, c.status, $2, c.calendar_range
>      FROM calendar_generate_calendar($1, tstzrange($2, $2 + 1)) c;
> END;
>
> This produced an error on the subscriber, with the following message:
>
> 2023-02-16 20:58:24.096 UTC [26864] ERROR:  missing FROM-clause entry
> for table "calendar_1" at character 322
> 2023-02-16 20:58:24.096 UTC [26864] CONTEXT:  processing remote data for
> replication origin "pg_18658" during message type "DDL" in transaction
> 980, finished at 0/C099A7D8
> 2023-02-16 20:58:24.096 UTC [26864] STATEMENT:  CREATE OR REPLACE
> FUNCTION public.calendar_manage ( IN room_id pg_catalog.int4, IN
> calendar_date pg_catalog.date ) RETURNS pg_catalog.void LANGUAGE sql
> VOLATILE PARALLEL UNSAFE CALLED ON NULL INPUT SECURITY INVOKER COST 100
> BEGIN ATOMIC
>          WITH delete_calendar AS (
>                   DELETE FROM public.calendar
>                    WHERE ((calendar_1.room_id OPERATOR(pg_catalog.=)
> calendar_manage.room_id) AND (calendar_1.calendar_date
> OPERATOR(pg_catalog.=) calendar_manage.calendar_date))
>                  )
>           INSERT INTO public.calendar (room_id, status, calendar_date,
> calendar_range)  SELECT calendar_manage.room_id,
>                      c.status,
>                      calendar_manage.calendar_date,
>                      c.calendar_range
>                     FROM
> public.calendar_generate_calendar(calendar_manage.room_id,
> pg_catalog.tstzrange((calendar_manage.calendar_date)::timestamp with
> time zone, ((calendar_manage.calendar_date OPERATOR(pg_catalog.+)
> 1))::timestamp with time zone)) c(status, calendar_range);
>         END
>
> This seemed to add an additional, incorrect reference to the origin
> table for the "room_id" and "calendar_date" attributes within the CTE of
> this function. I don't know if this is directly related to the DDL
> replication patch, but reporting it as I triggered the behavior through it.

I had analyzed this issue and found that this issue exists with
getting query definition, I could reproduce the issue with pg_dump and
pg_get_function_sqlbody. I have started a new thread for this at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1MMntjmT_NJGp-Z%3DxbF02qHGAyuSHfYHias3TqQbPF2w%40mail.gmail.com

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: Pavel Luzanov
Date:
Subject: Re: psql: Add role's membership options to the \du+ command
Next
From: Alvaro Herrera
Date:
Subject: Re: ExecRTCheckPerms() and many prunable partitions (checkAsUser)