Re: progress report for ANALYZE - Mailing list pgsql-hackers

From Tatsuro Yamada
Subject Re: progress report for ANALYZE
Date
Msg-id 725ba79a-fdfd-256a-a1ac-ace2a53b47b2@nttcom.co.jp_1
Whole thread Raw
In response to Re: progress report for ANALYZE  (Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp>)
Responses Re: progress report for ANALYZE
Re: progress report for ANALYZE
List pgsql-hackers
Hi Amit-san!

>> I think include_children and current_relid are not enough to
>> understand the progress of analyzing inheritance trees, because even
>> with current_relid being updated, I can't tell how many more there
>> will be.  I think it'd be better to show the total number of children
>> and the number of children processed, just like
>> pg_stat_progress_create_index does for partitions.  So, instead of
>> include_children and current_relid, I think it's better to have
>> child_tables_total, child_tables_done, and current_child_relid, placed
>> last in the set of columns.
> 
> Ah, I understood.
> I'll check pg_stat_progress_create_index does for partitions,
> and will create a new patch. 

Fixed.

But I just remembered I replaced column name "*_table" with "*_relid"
based on Robert's comment three months ago, see below:

> /me reviews.
> 
> +      <entry><structfield>scanning_table</structfield></entry>
> 
> I think this should be retitled to something that ends in 'relid',
> like all of the corresponding cases in existing progress views.
> Perhaps 'active_relid' or 'current_relid'.

So, it would be better to use same rule against child_tables_total and
child_tables_done. Thus I changed these column names to others and added
to the view. I also removed include_children and current_relid.
The following columns are new version.

<New columns of the view>
  pid
  datid
  datname
  relid
  phase
  sample_blks_total
  heap_blks_scanned
  ext_stats_total     <= Added (based on Alvaro's comment)
  ext_stats_computed  <= Renamed
  child_relids_total  <= Added
  child_relids_done   <= Added
  current_child_relid <= Added


>> Also, inheritance tree stats are created *after* creating single table
>> stats, so I think that it would be better to have a distinct phase
>> name for that, say "acquiring inherited sample rows".  In
>> do_analyze_rel(), you can select which of two phases to set based on
>> whether inh is true or not.  For partitioned tables, the progress
>> output will immediately switch to this phase, because partitioned
>> table itself is empty so there's nothing to do in the "acquiring
>> sample rows" phase.
>>
>> That's all for now.
> 
> 
> Okay! I'll also add the new phase "acquiring inherited sample rows" on
> the next patch. 


Fixed.

I tried to abbreviate it to "acquiring inh sample rows" because I thought
"acquiring inherited sample rows" is a little long for the phase name.

Attached WIP patch is including these fixes:
   - Remove columns: include_children and current_relid
   - Add new columns: child_relieds_total, child_relids_done and current_child_relid
   - Add new phase "acquiring inh sample rows"

   Note: the document is not updated, I'll fix it later. :)

Attached testcase.sql is for creating base table and partitioning table.
You can check the patch by using the following procedures, easily.

Terminal #1
--------------
\a \t
select * from pg_stat_progress_analyze; \watch 0.0001
--------------

Terminal #2
--------------
\i testcase.sql
analyze hoge;
analyze hoge2;
--------------

Thanks,
Tatsuro Yamada




Attachment

pgsql-hackers by date:

Previous
From: Tatsuro Yamada
Date:
Subject: Re: progress report for ANALYZE
Next
From: Michael Paquier
Date:
Subject: Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly