Re: Logical Replication Custom Column Expression - Mailing list pgsql-hackers

From Stavros Koureas
Subject Re: Logical Replication Custom Column Expression
Date
Msg-id CA+O1jk5VztsmtcUQAG-HLg7o6gbbJuOD7tNiyX4bguN0qOiF+w@mail.gmail.com
In response to Re: Logical Replication Custom Column Expression  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
This question is easy to answer.

Imagine a software company that sells an ERP product and also offers reporting solutions. The ERP tables do not have this additional column.
Now the reporting department needs to consolidate the data from the different databases (publishers) into one multitenant database.
In an ERP like NAV, or any other, you cannot propose changing all the code for all the tables, plus all the functions, just to add one additional column to each table. Even if that were possible, you could not work with integers; you would need GUIDs, since the value of this column would have to be predefined in each ERP. Joining on GUIDs in the second phase, for reporting, would then definitely slow down performance.

In summary:
  1. Cannot touch the underlying source (important)
  2. GUID identifier column will slow down the reporting performance
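
To make the idea concrete, here is a rough sketch in plain Python of what I would like the subscriber side to end up with. It is only an illustration under my own assumptions: the publisher names, the `tenant_dim` mapping and the `consolidate` function are hypothetical and do not correspond to any existing PostgreSQL feature or API.

```python
# Hypothetical illustration only; none of these names are PostgreSQL APIs.

# Rows as they exist on each publisher (ERP database); no tenant column.
publisher_rows = {
    "erp_customer_a": [{"order_no": 1, "amount": 100}, {"order_no": 2, "amount": 50}],
    "erp_customer_b": [{"order_no": 1, "amount": 70}],
}

# Dimension table kept on the reporting side: small integer id -> tenant.
tenant_dim = {1: "erp_customer_a", 2: "erp_customer_b"}


def consolidate(rows_by_publisher, dim):
    """Tag every incoming row with the integer tenant id of its publisher."""
    id_by_publisher = {name: tid for tid, name in dim.items()}
    consolidated = []
    for publisher, rows in rows_by_publisher.items():
        for row in rows:
            # The extra column is added on the way in; the source is untouched.
            consolidated.append({"tenant_id": id_by_publisher[publisher], **row})
    return consolidated


orders = consolidate(publisher_rows, tenant_dim)

# Reporting can now filter or join on a small integer instead of a GUID.
tenant_2_orders = [r for r in orders if r["tenant_id"] == 2]
print(tenant_2_orders)
```

The point is that the integer id is assigned on the consolidation side, so the ERP schema and code stay exactly as they are.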

On Wed, Nov 23, 2022 at 5:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 23, 2022 at 1:40 AM Stavros Koureas
<koureasstavros@gmail.com> wrote:
>
> Reading more carefully what you described, I think you are interested in getting something you call origin from publishers, probably some metadata from the publications.
>
> The identifier in that metadata may not have business value on the reporting side. The idea is to use a value that has a specific meaning to the end user.
>
> For example, assign 1 to tenant 1, 2 to tenant 2, and so on; then, based on a dimension table that holds this mapping, the user would be able to filter the data. So the user can programmatically set the id value of the column and create the mapping table, from an application let's say, and be able to distinguish the data.
>

In your example, do different tenants represent different publisher
nodes? If so, why can't we have a predefined column and value for the
required tables on each publisher rather than having logical replication
generate that value while replicating data?

--
With Regards,
Amit Kapila.
