Thread: Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

Hi,

On Thu, Aug 29, 2024 at 04:04:05PM +0200, Guillaume Lelarge wrote:
> Hello,
> 
> This patch was a bit discussed on [1], and with more details on [2]. It
> introduces four new columns in pg_stat_all_tables:
> 
> * parallel_seq_scan
> * last_parallel_seq_scan
> * parallel_idx_scan
> * last_parallel_idx_scan
> 
> and two new columns in pg_stat_all_indexes:
> 
> * parallel_idx_scan
> * last_parallel_idx_scan
> 
> As Benoit said yesterday, the intent is to help administrators evaluate the
> usage of parallel workers in their databases and help configuring
> parallelization usage.

Thanks for the patch. I think that's a good idea to provide more instrumentation
in this area. So, +1 regarding this patch.

A few random comments:

1 ===

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>parallel_seq_scan</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of parallel sequential scans initiated on this table
+      </para></entry>
+     </row>

I wonder if we should not update the seq_scan too to indicate that it
includes the parallel_seq_scan.

Same kind of comment for last_seq_scan, idx_scan and last_idx_scan.

2 ===

@@ -410,6 +410,8 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
         */
        if (scan->rs_base.rs_flags & SO_TYPE_SEQSCAN)
                pgstat_count_heap_scan(scan->rs_base.rs_rd);
+  if (scan->rs_base.rs_parallel != NULL)
+               pgstat_count_parallel_heap_scan(scan->rs_base.rs_rd);

Indentation seems broken.

Shouldn't the parallel counter relies on the "scan->rs_base.rs_flags & SO_TYPE_SEQSCAN"
test too?

What about to get rid of the pgstat_count_parallel_heap_scan and add an extra
bolean parameter to pgstat_count_heap_scan to indicate if counts.parallelnumscans
should be incremented too?

Something like:

pgstat_count_heap_scan(scan->rs_base.rs_rd, scan->rs_base.rs_parallel != NULL)

3 ===

Same comment for pgstat_count_index_scan (add an extra bolean parameter) and
get rid of pgstat_count_parallel_index_scan()).

I think that 2 === and 3 === would help to avoid missing increments should we
add those call to other places in the future.

4 ===

+       if (lstats->counts.numscans || lstats->counts.parallelnumscans)

Is it possible to have (lstats->counts.parallelnumscans) whithout having
(lstats->counts.numscans) ?

> First time I had to add new columns to a statistics catalog. I'm actually
> not sure that we were right to change pg_proc.dat manually.

I think that's the right way to do.

I don't see a CF entry for this patch. Would you mind creating one so that
we don't lost track of it?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

On Wed, Sep 04, 2024 at 02:51:51PM +0200, Guillaume Lelarge wrote:
> Le mer. 4 sept. 2024 à 10:47, Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
> a écrit :
> 
> > I don't see a CF entry for this patch. Would you mind creating one so that
> > we don't lost track of it?
> >
> >
> I don't mind adding it, though I don't know if I should add it to the
> September or November commit fest. Which one should I choose?

Thanks! That should be the November one (as the September one already started).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Le mer. 4 sept. 2024 à 14:58, Bertrand Drouvot <bertranddrouvot.pg@gmail.com> a écrit :
Hi,

On Wed, Sep 04, 2024 at 02:51:51PM +0200, Guillaume Lelarge wrote:
> Le mer. 4 sept. 2024 ą 10:47, Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
> a écrit :
>
> > I don't see a CF entry for this patch. Would you mind creating one so that
> > we don't lost track of it?
> >
> >
> I don't mind adding it, though I don't know if I should add it to the
> September or November commit fest. Which one should I choose?

Thanks! That should be the November one (as the September one already started).


I should have gone to the commit fest website, it says the same. I had the recollection that it started on the 15th. Anyway, added to the november commit fest (https://commitfest.postgresql.org/50/5238/).
 

--
Guillaume.
Hi,

On Wed, Sep 04, 2024 at 02:51:51PM +0200, Guillaume Lelarge wrote:
> Hi,
> 
> Le mer. 4 sept. 2024 à 10:47, Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
> a écrit :
> 
> > What about to get rid of the pgstat_count_parallel_heap_scan and add an
> > extra
> > bolean parameter to pgstat_count_heap_scan to indicate if
> > counts.parallelnumscans
> > should be incremented too?
> >
> > Something like:
> >
> > pgstat_count_heap_scan(scan->rs_base.rs_rd, scan->rs_base.rs_parallel !=
> > NULL)
> >
> > 3 ===
> >
> > Same comment for pgstat_count_index_scan (add an extra bolean parameter)
> > and
> > get rid of pgstat_count_parallel_index_scan()).
> >
> > I think that 2 === and 3 === would help to avoid missing increments should
> > we
> > add those call to other places in the future.
> >
> >
> Oh OK, understood.  Done for both.

Thanks for v2!

1 ===

-#define pgstat_count_heap_scan(rel)
+#define pgstat_count_heap_scan(rel, parallel)
        do {
-               if (pgstat_should_count_relation(rel))
-                       (rel)->pgstat_info->counts.numscans++;
+            if (pgstat_should_count_relation(rel)) {
+                       if (!parallel)
+                               (rel)->pgstat_info->counts.numscans++;
+                       else
+                               (rel)->pgstat_info->counts.parallelnumscans++;
+               }

I think counts.numscans has to be incremented in all the cases (so even if
"parallel" is true).

Same comment for pgstat_count_index_scan().

> 4 ===
> >
> > +       if (lstats->counts.numscans || lstats->counts.parallelnumscans)
> >
> > Is it possible to have (lstats->counts.parallelnumscans) whithout having
> > (lstats->counts.numscans) ?
> >
> >
> Nope, parallel scans are included in seq/index scans, as far as I can tell.
> I could remove the parallelnumscans testing but it would be less obvious to
> read.

2 ===

What about adding a comment instead of this extra check?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

On Wed, Sep 04, 2024 at 04:37:19PM +0200, Guillaume Lelarge wrote:
> Hi,
> 
> Le mer. 4 sept. 2024 à 16:18, Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
> a écrit :
> > What about adding a comment instead of this extra check?
> >
> >
> Done too in v3.

Thanks!

1 ===

+       /*
+        * Don't check counts.parallelnumscans because counts.numscans includes
+        * counts.parallelnumscans
+        */

"." is missing at the end of the comment.

2 ===

-               if (t > tabentry->lastscan)
+               if (t > tabentry->lastscan && lstats->counts.numscans)

The extra check on lstats->counts.numscans is not needed as it's already done
a few lines before.

3 ===

+               if (t > tabentry->parallellastscan && lstats->counts.parallelnumscans)

This one makes sense.
 
And now I'm wondering if the extra comment added in v3 is really worth it (and
does not sound confusing)? I mean, the parallel check is done once we passe
the initial test on counts.numscans. I think the code is clear enough without
this extra comment, thoughts? 

4 ===

What about adding a few tests? or do you want to wait a bit more to see if "
there's an agreement on this patch" (as you stated at the start of this thread).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com