Re: Replication slot stats misgivings - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Replication slot stats misgivings |
Date | |
Msg-id | CAD21AoDK5Rq1dVEVqW-x+XWU6=ji7RN063tMYKACZXAp7QoZBg@mail.gmail.com Whole thread Raw |
In response to | Re: Replication slot stats misgivings (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Replication slot stats misgivings
|
List | pgsql-hackers |
On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 15, 2021 at 4:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Thank you for the update! The patch looks good to me. > > > > I have pushed the first patch. Comments on the next patch > v13-0001-Use-HTAB-for-replication-slot-statistics: > 1. > + /* > + * Check for all replication slots in stats hash table. We do this check > + * when replSlotStats has more than max_replication_slots entries, i.e, > + * when there are stats for the already-dropped slot, to avoid frequent > + * call SearchNamedReplicationSlot() which acquires LWLock. > + */ > + if (replSlotStats && hash_get_num_entries(replSlotStats) > > max_replication_slots) > + { > + PgStat_ReplSlotEntry *slotentry; > + > + hash_seq_init(&hstat, replSlotStats); > + while ((slotentry = (PgStat_ReplSlotEntry *) hash_seq_search(&hstat)) != NULL) > + { > + if (SearchNamedReplicationSlot(NameStr(slotentry->slotname), true) == NULL) > + pgstat_report_replslot_drop(NameStr(slotentry->slotname)); > + } > + } > > Is SearchNamedReplicationSlot() so frequently used that we need to do > this only when the hash table has entries more than > max_replication_slots? I think it would be better if we can do it > without such a condition to reduce the chances of missing the slot > stats. We don't have any such restrictions for any other cases in this > function. Please see below comment on #4. > > I think it is better to add CHECK_FOR_INTERRUPTS in the above while loop? Agreed. > > 2. > /* > * Replication slot statistics kept in the stats collector > */ > -typedef struct PgStat_ReplSlotStats > +typedef struct PgStat_ReplSlotEntry > > I think the comment above this structure can be changed to "The > collector's data per slot" or something like that. Also, if we have to > follow table/function/db style, then probably this structure should be > named as PgStat_StatReplSlotEntry. Agreed. > > 3. > - * create the statistics for the replication slot. > + * create the statistics for the replication slot. In case where the > + * message for dropping the old slot gets lost and a slot with the same is > > /the same is/the same name is/. > > Can we mention something similar to what you have added here in docs as well? Agreed. > > 4. > +CREATE VIEW pg_stat_replication_slots AS > + SELECT > + s.slot_name, > + s.spill_txns, > + s.spill_count, > + s.spill_bytes, > + s.stream_txns, > + s.stream_count, > + s.stream_bytes, > + s.total_txns, > + s.total_bytes, > + s.stats_reset > + FROM pg_replication_slots as r, > + LATERAL pg_stat_get_replication_slot(slot_name) as s > + WHERE r.datoid IS NOT NULL; -- excluding physical slots > .. > .. > > -/* Get the statistics for the replication slots */ > +/* Get the statistics for the replication slot */ > Datum > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS) > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS) > { > #define PG_STAT_GET_REPLICATION_SLOT_COLS 10 > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; > + text *slotname_text = PG_GETARG_TEXT_P(0); > + NameData slotname; > > I think with the above changes getting all the slot stats has become > much costlier. Is there any reason why can't we get all the stats from > the new hash_table in one shot and return them to the user? I think the advantage of this approach would be that it can avoid showing the stats for already-dropped slots. Like other statistics views such as pg_stat_all_tables and pg_stat_all_functions, searching the stats by the name got from pg_replication_slots can show only available slot stats even if the hash table has garbage slot stats. Given that pg_stat_replication_slots doesn’t show garbage slot stats even if it has, I thought we can avoid checking those garbage stats frequently. It should not essentially be a problem for the hash table to have entries up to max_replication_slots regardless of live or already-dropped. As another design, we can get all stats from the hash table in one shot as you suggested. If we do that, it's better to check garbage slot stats every time pgstat_vacuum_stat() is called so the view doesn't show those stats but cannot avoid that completely. I'll change the code pointed out by #1 and #4 according to this design discussion. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: