RE: Initial Schema Sync for Logical Replication - Mailing list pgsql-hackers

From Wei Wang (Fujitsu)
Subject RE: Initial Schema Sync for Logical Replication
Date
Msg-id OS3PR01MB627503ED3CE7E2E3F14AFE9A9E6A9@OS3PR01MB6275.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Initial Schema Sync for Logical Replication  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Initial Schema Sync for Logical Replication
List pgsql-hackers
On Fri, Apr 21, 2023 at 16:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Apr 20, 2023 at 8:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Apr 17, 2023 at 9:12 AM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> > >
> > > On Fri, Apr 7, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 6, 2023 at 6:57 PM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > >
> > > > > While writing a PoC patch, I found some difficulties in this idea.
> > > > > First, I tried to add schemaname+relname to pg_subscription_rel but I
> > > > > could not define the primary key of pg_subscription_rel. The primary
> > > > > key on (srsubid, srrelid) doesn't work since srrelid could be NULL.
> > > > > Similarly, the primary key on (srsubid, srrelid, schemaname, relname)
> > > > > also doesn't work.
> > > > >
> > > >
> > > > Can we think of having a separate catalog table say
> > > > pg_subscription_remote_rel for this? You can have srsubid,
> > > > remote_schema_name, remote_rel_name, etc. We may need some other
> state
> > > > to be maintained during the initial schema sync where this table can
> > > > be used. Basically, this can be used to maintain the state till the
> > > > initial schema sync is complete because we can create a relation entry
> > > > in pg_subscritption_rel only after the initial schema sync is
> > > > complete.
> > >
> > > It might not be ideal but I guess it works. But I think we need to
> > > modify the name of replication slot for initial sync as it currently
> > > includes OID of the table:
> > >
> > > void
> > > ReplicationSlotNameForTablesync(Oid suboid, Oid relid,
> > >                                 char *syncslotname, Size szslot)
> > > {
> > >     snprintf(syncslotname, szslot, "pg_%u_sync_%u_" UINT64_FORMAT,
> suboid,
> > >              relid, GetSystemIdentifier());
> > > }
> > >
> > > If we use both schema name and table name, it's possible that slot
> > > names are duplicated if schema and/or table names are long. Another
> > > idea is to use the hash value of schema+table names, but it cannot
> > > completely eliminate that possibility, and probably would make
> > > investigation and debugging hard in case of any failure. Probably we
> > > can use the OID of each entry in pg_subscription_remote_rel instead,
> > > but I'm not sure it's a good idea, mainly because we will end up using
> > > twice as many OIDs as before.
> > >
> >
> > The other possibility is to use worker_pid. To make debugging easier,
> > we may want to LOG schema_name+rel_name vs slot_name mapping at
> DEBUG1
> > log level.
> 
> Since worker_pid changes after the worker restarts, a new worker
> cannot find the slot that had been used, no?
> 
> After thinking it over, a better solution would be that we add an oid
> column, nspname column, and relname column to pg_subscription_rel and
> the primary key on the oid. If the table is not present on the
> subscriber we store the schema name and table name to the catalog, and
> otherwise we store the local table oid same as today. The local table
> oid will be filled after the schema sync. The names of origin and
> replication slot the tablesync worker uses use the oid instead of the
> table oid.
> 
> I've attached a PoC patch of this idea (very rough patch and has many
> TODO comments). It mixes the following changes:
> 
> 1. Add oid column to the pg_subscription_rel. The oid is used as the
> primary key and in the names of origin and slot the tablesync workers
> use.
> 
> 2. Add copy_schema = on/off option to CREATE SUBSCRIPTION (not yet
> support for ALTER SUBSCRIPTION).
> 
> 3. Add CRS_EXPORT_USE_SNAPSHOT new action in order to use the same
> snapshot by both walsender and other processes (e.g. pg_dump). In this
> patch, the snapshot is exported for pg_dump and is used by the
> walsender for COPY.
> 
> It seems to work well but there might be a pitfall as I've not fully
> implemented it.

Thanks for your POC patch.
After reviewing this patch, I have a question below that want to confirm:

1. In the function synchronize_table_schema.
I think some changes to GUC and table-related object SQLs are included in the
pg_dump result. And in this POC, these SQLs will be executed. Do we need to
alter the pg_dump results to only execute the table schema related SQLs?
For example, if we have below table schema in the publisher-side:
```
create table tbl(a int, b int);
create index idx_t on tbl (a);
CREATE FUNCTION trigger_func() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$ BEGIN INSERT INTO public.tbl VALUES (NEW.*);
RETURNNEW; END; $$;
 
CREATE TRIGGER tri_tbl BEFORE INSERT ON public.tbl FOR EACH ROW EXECUTE PROCEDURE trigger_func();
```
The result of pg_dump executed on the subscriber-side:
```
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
SET default_tablespace = '';
SET default_table_access_method = heap;

CREATE TABLE public.tbl (
    a integer,
    b integer
);

ALTER TABLE public.tbl OWNER TO postgres;

CREATE INDEX idx_t ON public.tbl USING btree (a);

CREATE TRIGGER tri_tbl BEFORE INSERT ON public.tbl FOR EACH ROW EXECUTE FUNCTION public.trigger_func();
```
And this will cause an error when `CREATE TRIGGER` because we did not dump the
function trigger_func.

Regards,
Wang Wei

pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: pg_stat_io not tracking smgrwriteback() is confusing
Next
From: Amit Kapila
Date:
Subject: Re: Add two missing tests in 035_standby_logical_decoding.pl