Re: Flush some statistics within running transactions - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: Flush some statistics within running transactions
Date
Msg-id CAA5RZ0tvxZTZ7Xwm7FjWxAY_OphCq7tmpCaLgas0qrh+J86pRg@mail.gmail.com
Whole thread Raw
In response to Re: Flush some statistics within running transactions  (Sami Imseih <samimseih@gmail.com>)
List pgsql-hackers
> This makes things confusing because instead of just relying on
> dlist_is_empty(&pgStatPending) to check if we need to flush anything,
> the responsibility now moves to the callback, which now also has to
> account for all the ANYTIME fields.
>
> The way I can think about making this better is to somehow track if
> we have ANYTIME data that needs to be flushed a different way
> ( maybe a second pgStatPending list for anytime stats ?? )

Thinking about this a bit more, it seems the FLUSH_MIXED may
not be needed. Instead can we just introduce a new global variable
called pgstat_report_variable_anytime which acts like
pgstat_report_fixed, except it's set to true whenever we update
anytime variable-numbered stats

```
 #define pgstat_count_heap_scan(rel)
                                 \
        do {
                                                          \
                pgstat_report_variable_anytime = true;
                         \
                if (pgstat_should_count_relation(rel))
                         \
                        (rel)->pgstat_info->counts.numscans++;
                         \
        } while (0)

````

and reset whenever we call pgstat_report_anytime_stat, like this:

```
void
pgstat_report_anytime_stat(bool force)
{
.....
.........
         if (!pgstat_report_variable_anytime && !pgstat_report_fixed)
             return;

         /* Flush stats outside of transaction boundary */
        if (pgstat_report_variable_anytime)
               pgstat_flush_pending_entries(nowait, true);
....
........
          pgstat_report_variable_anytime = false;
}
```

and then inside pgstat_flush_pending_entries, we just look
for kind_info->flush_mode == FLUSH_ANYTIME  to flush.

```
/* flush the stats (with the appropriate callback), if possible */
if (anytime_only &&
kind_info->flush_mode == FLUSH_ANYTIME &&
kind_info->flush_anytime_cb != NULL)
{
          /* Partial flush of non-transactional fields only */
        did_flush = kind_info->flush_anytime_cb(entry_ref, nowait);
        is_partial_flush = true;
}
````

With this approach, we will not enter pgstat_flush_pending_entries
inside pgstat_report_anytime_stat unless we have anytime
variable stats to report.

Also, the anytime flush callback does not need to check if there are
any variable-numbered stats to flush. This will not be needed as
it is in v4-0004

```
+ /*
+ * Check if there are any non-transactional stats to flush. Avoid
+ * unnecessarily locking the entry if nothing accumulated.
+ */
+ if (!(lstats->counts.numscans > 0 ||
+  lstats->counts.tuples_returned > 0 ||
+  lstats->counts.tuples_fetched > 0 ||
+  lstats->counts.blocks_fetched > 0 ||
+  lstats->counts.blocks_hit > 0))
+ return true;
```

This feels like an easier approach to reason about and we don't
need to add a third flush mode.

Thoughts?

--
Sami Imseih
Amazon Web Services (AWS)



pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: ABI Compliance Checker GSoC Project
Next
From: Álvaro Herrera
Date:
Subject: Re: using index to speedup add not null constraints to a table