Re: Flush some statistics within running transactions - Mailing list pgsql-hackers
| From | Sami Imseih |
|---|---|
| Subject | Re: Flush some statistics within running transactions |
| Date | |
| Msg-id | CAA5RZ0tGhu9jXGgFjzx7yK4cseE+t40Q4qZ+KhY66B2MZ=pBrQ@mail.gmail.com Whole thread Raw |
| In response to | Re: Flush some statistics within running transactions (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
| Responses |
Re: Flush some statistics within running transactions
|
| List | pgsql-hackers |
> Also the attached is now split in 4 sub-patches with 0002 introducing a new
> GUC to control the flush interval (default is 10s). Note that 0001 to 0003 could
> be merged as one patch but I did it that way to ease the review.
Thanks for the patches!
I do like the GUC introduced in 0002 to control the ANYTIME stats
flush interval.
But, as I was looking at this a bit more today and testing it, I am
not quite sure
I like what is happening here:
+void
+pgstat_report_anytime_stat(bool force)
+{
+ bool nowait = !force;
+
+ pgstat_assert_is_up();
+
+ /*
+ * Exit if no pending stats at all. This avoids unnecessary work when
+ * backends are idle or in sessions without stats accumulation.
+ *
+ * Note: This check isn't precise as there might be only transactional
+ * stats pending, which we'll skip during the flush. However,
maintaining
+ * precise tracking would add complexity that does not seem
worth it from
+ * a performance point of view (no noticeable performance regression has
+ * been observed with the current implementation).
+ */
+ if (dlist_is_empty(&pgStatPending) && !pgstat_report_fixed)
+ return;
+
+ /* Flush stats outside of transaction boundary */
+ pgstat_flush_pending_entries(nowait, true);
+ pgstat_flush_fixed_stats(nowait, true);
+}
There is a check if pgStatPending is empty so we can return early.
However, with FLUSH_MIXED, the scenario is we will more than likely always
have TXN_BOUNDARY stats that can only be flushed at the end of a transaction.
This means that most of the time we will call pgstat_flush_pending_entries
and then leave it up to the anytime_cb to check for pending ANYTIME stats to
flush (before taking the lock ).
+bool
+pgstat_relation_flush_anytime_cb(PgStat_EntryRef *entry_ref, bool nowait)
+{
+ Oid dboid;
+ PgStat_TableStatus *lstats; /* pending stats entry */
+ PgStatShared_Relation *shtabstats;
+ PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
+ PgStat_StatDBEntry *dbentry; /* pending database entry */
+
+ dboid = entry_ref->shared_entry->key.dboid;
+ lstats = (PgStat_TableStatus *) entry_ref->pending;
+ shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
+
+ /*
+ * Check if there are any non-transactional stats to flush. Avoid
+ * unnecessarily locking the entry if nothing accumulated.
+ */
+ if (!(lstats->counts.numscans > 0 ||
+ lstats->counts.tuples_returned > 0 ||
+ lstats->counts.tuples_fetched > 0 ||
+ lstats->counts.blocks_fetched > 0 ||
+ lstats->counts.blocks_hit > 0))
+ return true;
This makes things confusing because instead of just relying on
dlist_is_empty(&pgStatPending) to check if we need to flush anything,
the responsibility now moves to the callback, which now also has to
account for all the ANYTIME fields.
The way I can think about making this better is to somehow track if
we have ANYTIME data that needs to be flushed a different way
( maybe a second pgStatPending list for anytime stats ?? )
What do you think?
--
Sami Imseih
Amazon Web Services (AWS)
pgsql-hackers by date: