Re: Replication slot stats misgivings - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Replication slot stats misgivings |
Date | |
Msg-id | CALDaNm3jhjd5P8-w=AKRORa0Q_A1sprG8Qk7A5-0GaT1J7dZSw@mail.gmail.com Whole thread Raw |
In response to | Re: Replication slot stats misgivings (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Tue, Apr 6, 2021 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 5, 2021 at 8:51 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Few comments on the latest patches: > Comments on 0001 > -------------------------------- > 1. > @@ -659,6 +661,8 @@ ReorderBufferTXNByXid(ReorderBuffer *rb, > TransactionId xid, bool create, > dlist_push_tail(&rb->toplevel_by_lsn, &txn->node); > AssertTXNLsnOrder(rb); > } > + > + rb->totalTxns++; > } > else > txn = NULL; /* not found and not asked to create */ > @@ -3078,6 +3082,7 @@ ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb, > { > txn->size += sz; > rb->size += sz; > + rb->totalBytes += sz; > > I think this will include the txns that are aborted and for which we > don't send anything. It might be better to update these stats in > ReorderBufferProcessTXN or ReorderBufferReplay where we are sure we > have sent the data. We can probably use size/total_size in txn. We > need to be careful to not double include the totaltxn or totalBytes > for streaming xacts as we might process the same txn multiple times. > > 2. > + Amount of decoded transactions data sent to the decoding output plugin > + while decoding the changes from WAL for this slot. This and total_txns > + for this slot can be used to gauge the total amount of data during > + logical decoding. > > I think we can slightly modify the second line here: "This can be used > to gauge the total amount of data sent during logical decoding.". Why > we need to include total_txns along with it. > > 0002 > ---------- > 3. > + -- we don't want to wait forever; loop will exit after 30 seconds > + FOR i IN 1 .. 5 LOOP > + > ... > ... > + > + -- wait a little > + perform pg_sleep_for('100 milliseconds'); > > I think this loop needs to be executed 300 times instead of 5 times, > if the above comments and code needs to do what is expected here? > > > 4. > +# Test to drop one of the subscribers and verify replication statistics data is > +# fine after publisher is restarted. > +$node->safe_psql('postgres', "SELECT > pg_drop_replication_slot('regression_slot4')"); > + > +$node->stop; > +$node->start; > + > +# Verify statistics data present in pg_stat_replication_slots are sane after > +# publisher is restarted > +$result = $node->safe_psql('postgres', > + "SELECT slot_name, total_txns > 0 AS total_txn, total_bytes > 0 AS total_bytes > + FROM pg_stat_replication_slots ORDER BY slot_name" > > Various comments in the 0002 refer to publisher/subscriber which is > not what we are using here. > > 5. > +# Create table. > +$node->safe_psql('postgres', > + "CREATE TABLE test_repl_stat(col1 int)"); > +$node->safe_psql('postgres', > + "SELECT data FROM > pg_logical_slot_get_changes('regression_slot1', NULL, NULL, > 'include-xids', '0', 'skip-empty-xacts', '1')"); > +$node->safe_psql('postgres', > + "SELECT data FROM > pg_logical_slot_get_changes('regression_slot2', NULL, NULL, > 'include-xids', '0', 'skip-empty-xacts', '1')"); > +$node->safe_psql('postgres', > + "SELECT data FROM > pg_logical_slot_get_changes('regression_slot3', NULL, NULL, > 'include-xids', '0', 'skip-empty-xacts', '1')"); > +$node->safe_psql('postgres', > + "SELECT data FROM > pg_logical_slot_get_changes('regression_slot4', NULL, NULL, > 'include-xids', '0', 'skip-empty-xacts', '1')"); > > I think we can save the above calls to pg_logical_slot_get_changes if > we create table before creating the slots in this test. > > 0003 > --------- > 6. In the tests/code, publisher is used at multiple places. I think > that is not required because this can happen via plugin as well. > 7. > + if (max_replication_slots == nReplSlotStats) > + { > + ereport(pgStatRunningInCollector ? LOG : WARNING, > + (errmsg("skipping \"%s\" replication slot statistics as > pg_stat_replication_slots does not have enough slots", > + NameStr(replSlotStats[nReplSlotStats].slotname)))); > + memset(&replSlotStats[nReplSlotStats], 0, sizeof(PgStat_ReplSlotStats)); > > Do we need memset here? Isn't this location is past the max location? Thanks for the comments, I will fix and post a patch for this soon. Regards, Vignesh
pgsql-hackers by date: