Re: row filtering for logical replication - Mailing list pgsql-hackers

From Erik Rijkers
Subject Re: row filtering for logical replication
Msg-id 93288fa52c63c09149a7b99d5b1e3805@xs4all.nl
In response to Re: row filtering for logical replication  (Erik Rijkers <er@xs4all.nl>)
Responses Re: row filtering for logical replication
List pgsql-hackers
On 2018-03-01 16:27, Erik Rijkers wrote:
> On 2018-03-01 00:03, Euler Taveira wrote:
>> The attached patches add support for filtering rows in the publisher.
> 
>> 001-Refactor-function-create_estate_for_relation.patch
>> 0002-Rename-a-WHERE-node.patch
>> 0003-Row-filtering-for-logical-replication.patch
> 
>> Comments?
> 
> Very, very useful.  I really do hope this patch survives the 
> late-arrival-cull.
> 
> I built this functionality into a test program I have been using and
> in simple cascading replication tests it works well.
> 
> I did find what I think is a bug (a bug easy to avoid but also easy to
> run into):
> The test I used was to cascade 3 instances (all on one machine) from 
> A->B->C
> I ran a pgbench session in instance A, and used:
>   in A: alter publication pub0_6515 add table pgbench_accounts where
> (aid between 40000 and 60000-1);
>   in B: alter publication pub1_6516 add table pgbench_accounts;
> 
> The above worked well, but when I did the same but used the filter in
> both publications:
>   in A: alter publication pub0_6515 add table pgbench_accounts where
> (aid between 40000 and 60000-1);
>   in B: alter publication pub1_6516 add table pgbench_accounts where
> (aid between 40000 and 60000-1);
> 
> then the replication only worked for (pgbench-)scale 1 (hence: very
> little data); with larger scales it became slow (taking many minutes
> where the above had taken less than 1 minute), and ended up using far
> too much memory (or blowing up/crashing altogether).  Something not
> quite right there.
> 
> Nevertheless, I am much in favour of acquiring this functionality as
> soon as possible.


Attached is 'logrep_rowfilter.sh', a demonstration of the bug described 
above.

The program runs initdb for 3 instances in /tmp (using ports 6515, 6516, 
and 6517) and sets up logical replication from 1->2->3.

It can be made to work by removing the where-clause on the second 'create 
publication' (i.e., by commenting out the $where2 variable).
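
For readers without the attachment, here is a rough sketch of the two
configurations. It is not the attached script: it only prints the
publication statements from the quoted mail (in their 'alter
publication' form) and starts no servers. The function name
publication_sql and the variables filter and where1 are made up for
illustration; only $where2 and the publication names come from the
text above.

```shell
#!/bin/sh
# Sketch only: prints the statements for instances A and B.
# The structure of the real logrep_rowfilter.sh may differ.

filter="where (aid between 40000 and 60000-1)"

where1="$filter"
where2="$filter"    # comment out this line to get the working setup

publication_sql() {
  # $1 = publication name, $2 = optional row filter (may be empty)
  printf 'alter publication %s add table pgbench_accounts%s;\n' \
    "$1" "${2:+ $2}"
}

publication_sql pub0_6515 "$where1"       # to be run on instance A
publication_sql pub1_6516 "${where2:-}"   # to be run on instance B
```

With $where2 set, both statements carry the filter, which is the case
that becomes slow and memory-hungry at larger pgbench scales. With it
commented out, only the publication on A filters rows.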


> Thanks,
> 
> 
> Erik Rijkers
