Re: Replication slot stats misgivings - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Replication slot stats misgivings |
Date | |
Msg-id | CAD21AoAarZeXkzGtHpssO-G4=+5z7zTE7TjidwaN5u+XT+Ktvw@mail.gmail.com Whole thread Raw |
In response to | Re: Replication slot stats misgivings (vignesh C <vignesh21@gmail.com>) |
Responses |
Re: Replication slot stats misgivings
|
List | pgsql-hackers |
On Mon, Apr 12, 2021 at 9:16 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002. > > > > <entry role="catalog_table_entry"><para role="column_definition"> > > + <structfield>total_bytes</structfield><type>bigint</type> > > + </para> > > + <para> > > + Amount of decoded transactions data sent to the decoding output plugin > > + while decoding the changes from WAL for this slot. This can be used to > > + gauge the total amount of data sent during logical decoding. > > > > Can we slightly extend it to say something like: Note that this > > includes the bytes streamed and or spilled. Similarly, we can extend > > it for total_txns. > > > > Thanks for the comments, the comments are fixed in the v8 patch attached. > Thoughts? Here are review comments on new TAP tests: +# Create replication slots. +$node->safe_psql('postgres', + "SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot1', 'test_decoding')"); +$node->safe_psql('postgres', + "SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot2', 'test_decoding')"); +$node->safe_psql('postgres', + "SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot3', 'test_decoding')"); +$node->safe_psql('postgres', + "SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot4', 'test_decoding')"); and + +$node->safe_psql('postgres', + "SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')"); +$node->safe_psql('postgres', + "SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')"); +$node->safe_psql('postgres', + "SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')"); +$node->safe_psql('postgres', + "SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')"); I think we can do those similar queries in a single psql connection like follows: # Create replication slots. $node->safe_psql('postgres', qq[ SELECT pg_create_logical_replication_slot('regression_slot1', 'test_decoding'); SELECT pg_create_logical_replication_slot('regression_slot2', 'test_decoding'); SELECT pg_create_logical_replication_slot('regression_slot3', 'test_decoding'); SELECT pg_create_logical_replication_slot('regression_slot4', 'test_decoding'); ]); and $node->safe_psql('postgres', qq[ SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); ]); --- +# Wait for the statistics to be updated. +my $slot1_stat_check_query = + "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot1' AND total_txns > 0 AND total_bytes > 0;"; +my $slot2_stat_check_query = + "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot2' AND total_txns > 0 AND total_bytes > 0;"; +my $slot3_stat_check_query = + "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot3' AND total_txns > 0 AND total_bytes > 0;"; +my $slot4_stat_check_query = + "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name = 'regression_slot4' AND total_txns > 0 AND total_bytes > 0;"; + +# Verify that the statistics have been updated. +$node->poll_query_until('postgres', $slot1_stat_check_query) + or die "Timed out while waiting for statistics to be updated"; +$node->poll_query_until('postgres', $slot2_stat_check_query) + or die "Timed out while waiting for statistics to be updated"; +$node->poll_query_until('postgres', $slot3_stat_check_query) + or die "Timed out while waiting for statistics to be updated"; +$node->poll_query_until('postgres', $slot4_stat_check_query) + or die "Timed out while waiting for statistics to be updated"; We can simplify the above code to something like: $node->poll_query_until( 'postgres', qq[ SELECT count(slot_name) >= 4 FROM pg_stat_replication_slots WHERE slot_name ~ 'regression_slot' AND total_txns > 0 AND total_bytes > 0; ]) or die "Timed out while waiting for statistics to be updated"; --- +# Test to remove one of the replication slots and adjust max_replication_slots +# accordingly to the number of slots and verify replication statistics data is +# fine after restart. I think it's better if we explain in detail what cases we're trying to test. How about the following description? Test to remove one of the replication slots and adjust max_replication_slots accordingly to the number of slots. This leads to a mismatch of the number of slots between in the stats file and on shared memory, simulating the message for dropping a slot got lost. We verify replication statistics data is fine after restart. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: