Re: row filtering for logical replication - Mailing list pgsql-hackers

From Peter Smith
Subject Re: row filtering for logical replication
Date
Msg-id CAHut+PtMaTFXFSK+k_X6Qu1gsqqbkH6n7_aMRJ1vAi+-Pao7iA@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
> Are there any recent performance evaluations of the overhead of row filters? I
> think it'd be good to get some numbers comparing:
>
> 1) $workload with master
> 2) $workload with patch, but no row filters
> 3) $workload with patch, row filter matching everything
> 4) $workload with patch, row filter matching few rows
>
> For workload I think it'd be worth testing:
> a) bulk COPY/INSERT into one table
> b) Many transactions doing small modifications to one table
> c) Many transactions targetting many different tables
> d) Interspersed DDL + small changes to a table
>

We have collected the performance data results for the workloads "a",
"b", "c" (will do case "d" later).

This time the tests were re-run now using pg_recvlogical and steps as
Andres suggested [1].

Note - "Allow 100%" is included as a test case, but in practice, a
user is unlikely to deliberately use a filter that allows everything
to pass through it.

PSA the bar charts of the results. All other details are below.

~~~~~

RESULTS - workload "a"
======================
HEAD        18.40
No Filters  18.86
Allow 100%  17.96
Allow 75%   16.39
Allow 50%   14.60
Allow 25%   11.23
Allow 0%    9.41


RESULTS - workload "b"
======================
HEAD        2.30
No Filters  1.96
Allow 100%  1.99
Allow 75%   1.65
Allow 50%   1.35
Allow 25%   1.17
Allow 0%    0.84


RESULTS - workload "c"
======================
HEAD        20.40
No Filters  19.85
Allow 100%  20.94
Allow 75%   17.26
Allow 50%   16.13
Allow 25%   13.32
Allow 0%    10.33


RESULTS - workload "d"
======================
(later)

~~~~~~

Details - workload "a"
=======================

CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));

CREATE PUBLICATION pub_1 FOR TABLE test;
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000); -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000); -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000); -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000); -- 0% allowed

INSERT INTO test SELECT i, i::text, row_to_json(row(i)) FROM
generate_series(1,1000001)i;


Details - workload "b"
======================

CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));

CREATE PUBLICATION pub_1 FOR TABLE test;
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000); -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000); -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000); -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000); -- 0% allowed

DO
$do$
BEGIN
FOR i IN 0..1000001 BY 10 LOOP
INSERT INTO test VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test SET value = 'FOO' WHERE key = i;
IF I % 1000 = 0 THEN
COMMIT;
END IF;
END LOOP;
END
$do$;


Details - workload "c"
======================

CREATE TABLE test1 (key int, value text, data jsonb, PRIMARY KEY(key, value));
CREATE TABLE test2 (key int, value text, data jsonb, PRIMARY KEY(key, value));
CREATE TABLE test3 (key int, value text, data jsonb, PRIMARY KEY(key, value));
CREATE TABLE test4 (key int, value text, data jsonb, PRIMARY KEY(key, value));
CREATE TABLE test5 (key int, value text, data jsonb, PRIMARY KEY(key, value));

CREATE PUBLICATION pub_1 FOR TABLE test1, test2, test3, test4, test5;
CREATE PUBLICATION pub_1 FOR TABLE test1 WHERE (key > 0), test2 WHERE
(key > 0), test3 WHERE (key > 0), test4 WHERE (key > 0), test5 WHERE
(key > 0);
CREATE PUBLICATION pub_1 FOR TABLE test1 WHERE (key > 250000), test2
WHERE (key > 250000), test3 WHERE (key > 250000), test4 WHERE (key >
250000), test5 WHERE (key > 250000);
CREATE PUBLICATION pub_1 FOR TABLE test1 WHERE (key > 500000), test2
WHERE (key > 500000), test3 WHERE (key > 500000), test4 WHERE (key >
500000), test5 WHERE (key > 500000);
CREATE PUBLICATION pub_1 FOR TABLE test1 WHERE (key > 750000), test2
WHERE (key > 750000), test3 WHERE (key > 750000), test4 WHERE (key >
750000), test5 WHERE (key > 750000);
CREATE PUBLICATION pub_1 FOR TABLE test1 WHERE (key > 1000000), test2
WHERE (key > 1000000), test3 WHERE (key > 1000000), test4 WHERE (key >
1000000), test5 WHERE (key > 1000000);

DO
$do$
BEGIN
FOR i IN 0..1000001 BY 10 LOOP
-- test1
INSERT INTO test1 VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test1 SET value = 'FOO' WHERE key = i;
-- test2
INSERT INTO test2 VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test2 SET value = 'FOO' WHERE key = i;
-- test3
INSERT INTO test3 VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test3 SET value = 'FOO' WHERE key = i;
-- test4
INSERT INTO test4 VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test4 SET value = 'FOO' WHERE key = i;
-- test5
INSERT INTO test5 VALUES(i,'BAH', row_to_json(row(i)));
UPDATE test5 SET value = 'FOO' WHERE key = i;

IF I % 1000 = 0 THEN
-- raise notice 'commit: %', i;
COMMIT;
END IF;
END LOOP;
END
$do$;

Details - workload "d"
======================
(later)

------
[1] https://www.postgresql.org/message-id/20220203182922.344fhhqzjp2ah6yp%40alap3.anarazel.de

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Sergey Dudoladov
Date:
Subject: Re: Add connection active, idle time to pg_stat_activity
Next
From: Dag Lem
Date:
Subject: Re: Add psql command to list constraints