Re: Replication slot stats misgivings - Mailing list pgsql-hackers

From vignesh C
Subject Re: Replication slot stats misgivings
Date
Msg-id CALDaNm1ZkBc=ycQyi=mavZK0CUg-iQPXUc7mxszNvk7L5+7TKA@mail.gmail.com
Whole thread Raw
In response to Re: Replication slot stats misgivings  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Tue, Apr 13, 2021 at 10:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 12, 2021 at 9:16 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.
> > >
> > > <entry role="catalog_table_entry"><para role="column_definition">
> > > +        <structfield>total_bytes</structfield><type>bigint</type>
> > > +       </para>
> > > +       <para>
> > > +        Amount of decoded transactions data sent to the decoding output plugin
> > > +        while decoding the changes from WAL for this slot. This can be used to
> > > +        gauge the total amount of data sent during logical decoding.
> > >
> > > Can we slightly extend it to say something like: Note that this
> > > includes the bytes streamed and or spilled. Similarly, we can extend
> > > it for total_txns.
> > >
> >
> > Thanks for the comments, the comments are fixed in the v8 patch attached.
> > Thoughts?
>
> Here are review comments on new TAP tests:

Thanks for the comments.

> +# Create replication slots.
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot1',
> 'test_decoding')");
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot2',
> 'test_decoding')");
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot3',
> 'test_decoding')");
> +$node->safe_psql('postgres',
> +       "SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot4',
> 'test_decoding')");
>
> and
>
> +
> +$node->safe_psql('postgres',
> +       "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +       "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +        "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
> +$node->safe_psql('postgres',
> +       "SELECT data FROM
> pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1')");
>
> I think we can do those similar queries in a single psql connection
> like follows:
>
>  # Create replication slots.
>  $node->safe_psql('postgres',
>    qq[
> SELECT pg_create_logical_replication_slot('regression_slot1', 'test_decoding');
> SELECT pg_create_logical_replication_slot('regression_slot2', 'test_decoding');
> SELECT pg_create_logical_replication_slot('regression_slot3', 'test_decoding');
> SELECT pg_create_logical_replication_slot('regression_slot4', 'test_decoding');
> ]);
>
> and
>
> $node->safe_psql('postgres',
>    qq[
> SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL,
> NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
>       ]);
>

Modified.

> ---
> +# Wait for the statistics to be updated.
> +my $slot1_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot1' AND total_txns > 0 AND total_bytes > 0;";
> +my $slot2_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot2' AND total_txns > 0 AND total_bytes > 0;";
> +my $slot3_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot3' AND total_txns > 0 AND total_bytes > 0;";
> +my $slot4_stat_check_query =
> +  "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
> = 'regression_slot4' AND total_txns > 0 AND total_bytes > 0;";
> +
> +# Verify that the statistics have been updated.
> +$node->poll_query_until('postgres', $slot1_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
> +$node->poll_query_until('postgres', $slot2_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
> +$node->poll_query_until('postgres', $slot3_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
> +$node->poll_query_until('postgres', $slot4_stat_check_query)
> +  or die "Timed out while waiting for statistics to be updated";
>
> We can simplify the above code to something like:
>
> $node->poll_query_until(
>    'postgres', qq[
> SELECT count(slot_name) >= 4
> FROM pg_stat_replication_slots
> WHERE slot_name ~ 'regression_slot'
>     AND total_txns > 0
>     AND total_bytes > 0;
> ]) or die "Timed out while waiting for statistics to be updated";
>

Modified.

> ---
> +# Test to remove one of the replication slots and adjust max_replication_slots
> +# accordingly to the number of slots and verify replication statistics data is
> +# fine after restart.
>
> I think it's better if we explain in detail what cases we're trying to
> test. How about the following description?
>
> Test to remove one of the replication slots and adjust
> max_replication_slots accordingly to the number of slots. This leads
> to a mismatch of the number of slots between in the stats file and on
> shared memory, simulating the message for dropping a slot got lost. We
> verify replication statistics data is fine after restart.

Slightly reworded and modified it.

These comments are fixed as part of the v9 patch posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm3CtPUYkFjPhzX0AcuRiK2MzdCR%2B_w8ok1kCcykveuL2Q%40mail.gmail.com

Regards,
Vignesh

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Extensions not dumped when --schema is used
Next
From: Noah Misch
Date:
Subject: Re: Converting contrib SQL functions to new style