Re: [GENERAL] data transformation and replication - Mailing list pgsql-general

From Adrian Klaver
Subject Re: [GENERAL] data transformation and replication
Date
Msg-id cc5d60bf-2429-e285-7f7d-0a6184588281@aklaver.com
Whole thread Raw
In response to Re: [GENERAL] data transformation and replication  ("Armand Pirvu (home)" <armand.pirvu@gmail.com>)
Responses Re: [GENERAL] data transformation and replication  ("Armand Pirvu (home)" <armand.pirvu@gmail.com>)
List pgsql-general
On 05/08/2017 08:31 PM, Armand Pirvu (home) wrote:
>
>
> My bad
>
>
> db1 I have two tables t1 and t2 (or more)
> db2 has one table t3 for example which can get data aggregated from one
> or more multiple tables from the above set . I can
> updates/inserts/deletes in db1.t1 and/or db1.t2 which combined may mean
> related data in db.t3 would need to be inserted/deleted/updated. Think
> of it like ETL processing if you will. This is what I mean by data
> massaging/transformation
>
>
> db1 and db2 are two different servers.

What are the Postgres versions?

>
>
> So I was initially thinking that I can have on db2 the same set of
> tables from db1, replication being done using pglogical. Once data gets
> to db2 t1 and t2, I can have on db2 a set of functions/triggers which
> can transform the data and as such do the relevant
> inserts/updates/delete from db2.t3
>
> Apparently though that is not possible unless I am missing something

Probably this:

https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/

"4.11 Triggers

Apply process and the initial COPY process both run with
session_replication_role set to replica which means that ENABLE REPLICA
and ENABLE ALWAYS triggers will be fired."

https://www.postgresql.org/docs/9.6/static/sql-altertable.html

"DISABLE/ENABLE [ REPLICA | ALWAYS ] TRIGGER

   ...  The trigger firing mechanism is also affected by the
configuration variable session_replication_role. Simply enabled triggers
will fire when the replication role is "origin" (the default) or
"local". Triggers configured as ENABLE REPLICA will only fire if the
session is in "replica" mode, and triggers configured as ENABLE ALWAYS
will fire regardless of the current replication mode.
...
"

So did you ENABLE REPLICA or ALWAYS on the db2 table triggers?


>
> I reached that conclusion by using a trigger and a function like the
> auditing one to track insers/updates/deletes in an audit table
>
> Having these said I was thinking
>
> (a) -
> On db1 I will have the t3 table as is on dsb2. All data transformation
> goes into db1.t3 which on it's turn will replicate to db2.t3 using pglogical
>
> (b) -
> On db2 I will have the t1 t2 as they are on db1. Those are replicated
> using Slony/Bucardo. Once data lands on db2.t1 and db2.t2 another set of
> triggers/functions responsible for data transformation will do the
> inserts/deletes/updates in db2.t3
>
> I wold much prefer pglogical approach as stated in the what I see as a
> failed case
>
>
> If the only options is Slony/Bucardo , so be it. but that begs the
> following questions
> - which one has the smallest overhead ?
> - which one is the easiest to manage ?
> - which one is the most reliable ?
> - I recall data transformation can be used in Bucardo but did not see
> any examples on that. Any pointers ?
>
> Thanks
> Armand
>
>
>
> On May 8, 2017, at 4:49 PM, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>> On 05/08/2017 12:46 PM, Armand Pirvu (home) wrote:
>>> Hi
>>>
>>> Here it is a scenario which I am faced with  and I am hoping to find
>>> a pointer/tip/help
>>>
>>> db1 is the OLTP system
>>> db2 is the Reporting system
>>>
>>> The data from db1 needs to get to db2, but the database on those two
>>> have tables with different layout/structure and hence data will need
>>> to suffer some transformation in between in real time
>>>
>>> I was looking at something like
>>>
>>> db1 -> db2 replicates the same set of tables and with the same
>>> structures using pglogical for example
>>> db2.tbl1 -> db2.tbl2 data gets massages/transformed based on what
>>> replicates from db1.tbl1 using triggers and functions
>>>
>>>
>>> Other than that I reckon db1 -> db2 would be trigger based using
>>> something like slonik maybe (?) and data massage/transformation gets
>>> moved from db2 to db1 machine and then db1.tbl2 -> db2.tbl2 using
>>> pglogical
>>
>> I was following you until the last part, "... moved from db2 to db1
>> machine and then db1.tbl2 -> db2.tbl2 ..."
>>
>> Is this correct?
>>
>> If so why db1 --> db2 --> db1 --> db2?
>>
>> A complete answer is going to depend on at least an outline of what
>> you mean by massage/transform?
>>
>>>
>>>
>>> Is this doable ? If so any pointers as to where to look about it ?
>>>
>>>
>>> Many thanks
>>> Armand
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: vinny
Date:
Subject: Re: [GENERAL] Python versus Other Languages using PostgreSQL
Next
From: D'Arcy Cain
Date:
Subject: Re: [GENERAL] Python versus Other Languages using PostgreSQL