Re: Replication slot stats misgivings - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Replication slot stats misgivings
Date
Msg-id CAD21AoAarZeXkzGtHpssO-G4=+5z7zTE7TjidwaN5u+XT+Ktvw@mail.gmail.com
Whole thread Raw
In response to Re: Replication slot stats misgivings  (vignesh C <vignesh21@gmail.com>)
Responses Re: Replication slot stats misgivings
List pgsql-hackers
On Mon, Apr 12, 2021 at 9:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.
> >
> > <entry role="catalog_table_entry"><para role="column_definition">
> > +        <structfield>total_bytes</structfield><type>bigint</type>
> > +       </para>
> > +       <para>
> > +        Amount of decoded transactions data sent to the decoding output plugin
> > +        while decoding the changes from WAL for this slot. This can be used to
> > +        gauge the total amount of data sent during logical decoding.
> >
> > Can we slightly extend it to say something like: Note that this
> > includes the bytes streamed and or spilled. Similarly, we can extend
> > it for total_txns.
> >
>
> Thanks for the comments, the comments are fixed in the v8 patch attached.
> Thoughts?

Here are review comments on new TAP tests:

+# Create replication slots.
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot1',
'test_decoding')");
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot2',
'test_decoding')");
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot3',
'test_decoding')");
+$node->safe_psql('postgres',
+       "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot4',
'test_decoding')");

and

+
+$node->safe_psql('postgres',
+       "SELECT data FROM
pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+       "SELECT data FROM
pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+        "SELECT data FROM
pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+       "SELECT data FROM
pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");

I think we can do those similar queries in a single psql connection
like follows:

 # Create replication slots.
 $node->safe_psql('postgres',
   qq[
SELECT pg_create_logical_replication_slot('regression_slot1', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot2', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot3', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot4', 'test_decoding');
]);

and

$node->safe_psql('postgres',
   qq[
SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
      ]);

---
+# Wait for the statistics to be updated.
+my $slot1_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot1' AND total_txns > 0 AND total_bytes > 0;";
+my $slot2_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot2' AND total_txns > 0 AND total_bytes > 0;";
+my $slot3_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot3' AND total_txns > 0 AND total_bytes > 0;";
+my $slot4_stat_check_query =
+  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot4' AND total_txns > 0 AND total_bytes > 0;";
+
+# Verify that the statistics have been updated.
+$node->poll_query_until('postgres', $slot1_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot2_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot3_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot4_stat_check_query)
+  or die "Timed out while waiting for statistics to be updated";

We can simplify the above code to something like:

$node->poll_query_until(
   'postgres', qq[
SELECT count(slot_name) >= 4
FROM pg_stat_replication_slots
WHERE slot_name ~ 'regression_slot'
    AND total_txns > 0
    AND total_bytes > 0;
]) or die "Timed out while waiting for statistics to be updated";

---
+# Test to remove one of the replication slots and adjust max_replication_slots
+# accordingly to the number of slots and verify replication statistics data is
+# fine after restart.

I think it's better if we explain in detail what cases we're trying to
test. How about the following description?

Test to remove one of the replication slots and adjust
max_replication_slots accordingly to the number of slots. This leads
to a mismatch of the number of slots between in the stats file and on
shared memory, simulating the message for dropping a slot got lost. We
verify replication statistics data is fine after restart.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Virender Singla
Date:
Subject: Re: vacuum freeze - possible improvements
Next
From: Kohei KaiGai
Date:
Subject: Re: TRUNCATE on foreign table