Re: row filtering for logical replication - Mailing list pgsql-hackers

From Peter Smith
Subject Re: row filtering for logical replication
Date
Msg-id CAHut+PsBXHXoQviWuayvcDUX4LKhVBn_OkL1wz7RPW3XEk9Vag@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Ajin Cherian <itsajin@gmail.com>)
List pgsql-hackers
On Wed, Feb 2, 2022 at 8:16 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> Hi Peter,
>
> I just tried scenario b that Andres suggested:
>
> For scenario b, I did some testing with row-filter-patch v74 and
> various levels of filtering. 0% replicated to 100% rows replicated.
> The times are in seconds, I did 5 runs each.
>
> Results:
>
> RUN  HEAD     "with patch 0%" "row-filter-patch 25%" "row-filter-patch
> v74 50%" "row-filter-patch 75%" "row-filter-patch v74 100%"
> 1   17.26178  12.573736       12.869635              13.742167
>           17.977112              17.75814
> 2   17.522473 12.919554       12.640879              14.202737
>           14.515481              16.961836
> 3   17.124001 12.640879       12.706631              14.220245
>           15.686613              17.219355
> 4   17.24122  12.602345       12.674566              14.219423
>           15.564312              17.432765
> 5   17.25352  12.610657       12.689842              14.210725
>           15.613708              17.403821
>
> As can see the performance seen on HEAD is similar to that which the
> patch achieves with all rows (100%) replication. The performance
> improves linearly with
> more rows filtered.
>
> The test scenario used was:
>
> 1. On publisher and subscriber:
> CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));
>
> 2. On publisher: (based on which scenario is being tested)
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000); -- 75% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000); -- 50% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000); -- 25% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000); -- 0% allowed
>
> 3. On the subscriber:
> CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432
> dbname=postgres application_name=sync_sub' PUBLICATION pub_1;
>
> 4. now modify the postgresql.conf on the publisher side
> synchronous_standby_names = 'sync_sub' and restart.
>
> 5. The test case:
>
> DO
> $do$
> BEGIN
> FOR i IN 1..1000001 BY 10 LOOP
> INSERT INTO test VALUES(i,'BAH', row_to_json(row(i)));
> UPDATE test SET value = 'FOO' WHERE key = i;
> IF I % 1000 = 0 THEN
> COMMIT;
> END IF;
> END LOOP;
> END
> $do$;
>
>

Thanks!

I have put your results as a bar chart same as for the previous workload case:

HEAD 17.25
v74 no filters NA
v74 allow 100% 17.35
v74 allow 75% 15.62
v74 allow 50% 14.21
v74 allow 25% 12.69
v74 allow 0% 12.62

PSA.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachment

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Replace pg_controldata output fields with macros for better code manageability
Next
From: Bruce Momjian
Date:
Subject: Unclear problem reports