Re: Logical Replication Custom Column Expression - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Logical Replication Custom Column Expression
Date
Msg-id CAExHW5shj=tSP553u+xRGTKhdgtF9CR-52YPhja_H_Nw_b8Zmw@mail.gmail.com
In response to Re: Logical Replication Custom Column Expression  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Logical Replication Custom Column Expression
List pgsql-hackers
On Wed, Nov 23, 2022 at 4:54 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, Nov 23, 2022 at 7:38 AM Stavros Koureas
> <koureasstavros@gmail.com> wrote:
> >
> > Reading more carefully what you described, I think you are interested in getting something you call origin from publishers, probably some metadata from the publications.
> >
> > This identifier in those metadata maybe does not have business value on the reporting side. The idea is to use a value which has specific meaning to the user at the end.
> >
> > For example assigning 1 for tenant 1, 2 for tenant 2 and so on, at the end based on a dimension table which holds this mapping the user would be able to filter the data. So programmatically the user can set the id value of the column plus creating the mapping table from an application let’s say and be able to distinguish the data.
> >
> > In addition this column should have the ability to be part of the primary key on the subscription table in order to not conflict with lines from other tenants having the same keys.
> >
> >
>
> I was wondering if a simpler syntax solution might also work here.
>
> Imagine another SUBSCRIPTION parameter that indicates to write the
> *name* of the subscription to some pre-defined table column:
> e.g. CREATE SUBSCRIPTION subname FOR PUBLICATION pub_tenant_1
> CONNECTION '...' WITH (subscription_column);
>
> Logical Replication already allows the subscriber table to have extra
> columns, so you just need to manually create the extra 'subscription'
> column up-front.
>
> Then...
>
> ~~
>
> On Publisher:
>
> test_pub=# CREATE TABLE tab(id int primary key, description varchar);
> CREATE TABLE
>
> test_pub=# INSERT INTO tab VALUES (1,'one'),(2,'two'),(3,'three');
> INSERT 0 3
>
> test_pub=# CREATE PUBLICATION tenant1 FOR ALL TABLES;
> CREATE PUBLICATION
>
> ~~
>
> On Subscriber:
>
> test_sub=# CREATE TABLE tab(id int, description varchar, subscription varchar);
> CREATE TABLE
>
> test_sub=# CREATE SUBSCRIPTION sub_tenant1 CONNECTION 'host=localhost
> dbname=test_pub' PUBLICATION tenant1 WITH (subscription_column);
> CREATE SUBSCRIPTION
>
> test_sub=# SELECT * FROM tab;
>  id | description | subscription
> ----+-------------+--------------
>   1 | one         | sub_tenant1
>   2 | two         | sub_tenant1
>   3 | three       | sub_tenant1
> (3 rows)
>
> ~~
>
Thanks for the example. This is more concrete than a purely verbal description.

In this example, do all the tables that a subscription subscribes to
need that additional column, or will the pglogical receiver somehow
figure out which tables have that column and populate rows
accordingly?

My further fear is that the subscriber will also need to match the
subscription column along with the rest of the PK so as not to update
rows from other subscriptions.
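
To make that concern concrete, here is a minimal sketch. It assumes the
subscriber table from the example above, with the extra column named
'subscription' and made part of the primary key. The subscription_column
option is only a proposal in this thread, not an existing PostgreSQL
feature, and the second tenant name (sub_tenant2) and the row values are
made up for illustration. The statements are run by hand on the subscriber
to show what the apply side would have to do; they are not what it does
today.

```sql
-- Subscriber-side table; 'subscription' is the hypothetical extra column
-- from the example above, here included in the primary key.
CREATE TABLE tab (
    id           int,
    description  varchar,
    subscription varchar,
    PRIMARY KEY (subscription, id)
);

-- Rows with the same id coming from two tenants no longer collide.
INSERT INTO tab VALUES (1, 'one', 'sub_tenant1');
INSERT INTO tab VALUES (1, 'uno', 'sub_tenant2');

-- An UPDATE of id = 1 replicated through sub_tenant1 would have to be
-- applied as if it also matched on the subscription column:
UPDATE tab SET description = 'ONE'
WHERE id = 1 AND subscription = 'sub_tenant1';

-- Matching on the publisher's key alone would also change the row that
-- belongs to sub_tenant2:
-- UPDATE tab SET description = 'ONE' WHERE id = 1;
```

The same applies to DELETE: the row lookup on the subscriber would need
(subscription, id), while the publisher only sends id as the key.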
--
Best Wishes,
Ashutosh Bapat


