Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Nisha Moond
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CABdArM7Sq5LifxN+Gsx=bubT_8nRWP1+ucf-uEr1hzuEYE3FRA@mail.gmail.com
In response to Re: Synchronizing slots from primary to standby  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
I ran a performance test on the optimization patch
(v2-0001-optimize-the-slot-advancement.patch). Please find the
results below.

Setup:
- One primary node with 100 failover-enabled logical slots
    - 20 DBs, each having 5 failover-enabled logical replication slots
- One physical standby node with 'sync_replication_slots' set to off,
but with the other parameters required by slot sync enabled.

Node Configurations: please see config.txt

Test Plan:
1) Create 20 databases on the primary node, each with 5 failover slots
created using "pg_create_logical_replication_slot()", for 100 failover
slots overall.
2) Use pg_sync_replication_slots() to sync them to the standby. Note
the execution time of the sync and the LSN values.
3) On the primary node, run pgbench for 15 minutes on the postgres DB.
4) Advance the LSNs of all 100 slots on the primary using
pg_replication_slot_advance().
5) Use pg_sync_replication_slots() to sync the slots to the standby.
Note the execution time of the sync and the LSN values.
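
For readers without the attached scripts, steps 1 and 4 could be
sketched roughly as below. This is only an illustration, not the
contents of the tarball: the database names (db1..db20), the slot
names, and the 'test_decoding' output plugin are assumptions, and the
CREATE DATABASE statements are left out. The helper emits the SQL text
instead of running it, so it can be inspected without a server.

```python
# Sketch only: database names, slot names, and the output plugin are
# hypothetical; the scripts in the attached tarball may differ.
N_DBS = 20
SLOTS_PER_DB = 5
PLUGIN = "test_decoding"  # assumption: any installed output plugin


def db_name(i):
    return f"db{i}"


def slot_name(i, j):
    return f"slot_{i}_{j}"


def create_statements():
    """Step 1: map each database to the statements creating its failover slots."""
    return {
        db_name(i): [
            "SELECT pg_create_logical_replication_slot("
            f"'{slot_name(i, j)}', '{PLUGIN}', failover => true);"
            for j in range(1, SLOTS_PER_DB + 1)
        ]
        for i in range(1, N_DBS + 1)
    }


def advance_statements():
    """Step 4: map each database to the statements advancing its slots."""
    return {
        db_name(i): [
            "SELECT pg_replication_slot_advance("
            f"'{slot_name(i, j)}', pg_current_wal_lsn());"
            for j in range(1, SLOTS_PER_DB + 1)
        ]
        for i in range(1, N_DBS + 1)
    }


def as_psql_script(per_db):
    """Render per-database statements as a psql script, switching databases with \\c."""
    lines = []
    for db, stmts in per_db.items():
        lines.append(f"\\c {db}")
        lines.extend(stmts)
    return "\n".join(lines)


if __name__ == "__main__":
    print(as_psql_script(create_statements()))
```

The statements are grouped per database because, as far as I know, a
logical slot can only be advanced from a session connected to the
database it was created in.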

I executed the above test plan for three cases and compared the
elapsed time of pg_sync_replication_slots():

(1) HEAD
Time taken by pg_sync_replication_slots() on the standby node:
  a) The initial sync (step 2) = 140.208 ms
  b) Sync after pgbench run on primary (step 5) = 66.994 ms

(2) HEAD + v3-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch
  a) The initial sync (step 2) = 163.885 ms
  b) Sync after pgbench run on primary (step 5) = 837901.290 ms (13:57.901)

  >> With the v3 patch, pg_sync_replication_slots() takes a significant
amount of time (almost 14 minutes) to sync the slots.

(3) HEAD + v3-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch
+ v2-0001-optimize-the-slot-advancement.patch
  a) The initial sync (step 2) = 165.554 ms
  b) Sync after pgbench run on primary (step 5) = 7991.718 ms (00:07.992)

  >> With the optimization patch, the time taken by
pg_sync_replication_slots() is reduced significantly, to ~8 seconds.
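
To put the step-5 numbers side by side, the ratios can be worked out
with a few lines like these. The timings are copied from the three
cases above; the dictionary keys and the helper name are made up for
this illustration only.

```python
# Step-5 timings in milliseconds, copied from the results above.
STEP5_MS = {
    "HEAD": 66.994,
    "HEAD + v3": 837901.290,
    "HEAD + v3 + v2 optimization": 7991.718,
}


def ratio(slower, faster):
    """Return how many times longer `slower` took than `faster`."""
    return STEP5_MS[slower] / STEP5_MS[faster]


if __name__ == "__main__":
    opt = "HEAD + v3 + v2 optimization"
    print(f"v3 alone vs. v3 + optimization: {ratio('HEAD + v3', opt):.1f}x")
    print(f"v3 + optimization vs. HEAD:     {ratio(opt, 'HEAD'):.1f}x")
```

So the optimization makes the post-pgbench sync roughly 100 times
faster than v3 alone, although it is still roughly 120 times slower
than on HEAD.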

We also ran the same test with a single DB, creating all 100 failover
slots in the postgres DB, and the results were nearly the same.

The scripts used for the test are attached as
"v3_perf_test_scripts.tar.gz", which includes these files:
setup_multidb.sh : setup primary and standby nodes
createdb20.sql : create 20 DBs
createslot20.sql : create total 100 logical slots, 5 on each DB
run_sync.sql : call pg_sync_replication_slots() with timing
advance20.sql : advance the LSN of all slots on the primary node to the current LSN
advance20_perdb.sql : use on HEAD to advance the LSN on the primary node
get_synced_data.sql : get details of the
config.txt : configuration used for nodes
