Thread: ANALYZE ONLY

ANALYZE ONLY

From
Michael Harris
Date:
Hello Postgres Hackers

An application that I am developing uses Postgresql, and includes a fairly large
number of partitioned tables which are used to store time series data.

The tables are partitioned by time, and typically there is only one partition
at a time - the current partition - that is actually being updated.
Older partitions
are available for query and eventually dropped.

As per the documentation, partitioned tables are not analyzed by the autovacuum
workers, although their partitions are. Statistics are needed on the partitioned
table level for at least some query planning activities.

The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the individual
partitions. There is currently no option to prevent it from including the
partitions.

This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.

I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very significant.

For my use case at least it would be fantastic if we could add an ONLY option
to ANALYZE, which would cause it to analyze the named table only and not
descend into the partitions.

I took a look at the source and it looks doable, but before launching into it
I thought I would ask a few questions here.

  1. Would such a feature be welcomed? Are there any traps I might not
have thought of?

  2. The existing ANALYZE command has the following structure:

     ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

     It would be easiest to add ONLY as another option, but that
doesn't look quite
     right to me - surely the ONLY should be attached to the table name?
     An alternative would be:

     ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.

Regards
Mike.



Re: ANALYZE ONLY

From
Jelte Fennema-Nio
Date:
On Tue, 20 Aug 2024 at 07:52, Michael Harris <harmic@gmail.com> wrote:
>   1. Would such a feature be welcomed? Are there any traps I might not
> have thought of?

I think this sounds like a reasonable feature.


>   2. The existing ANALYZE command has the following structure:
>
>      ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>
>      It would be easiest to add ONLY as another option, but that
> doesn't look quite
>      right to me - surely the ONLY should be attached to the table name?
>      An alternative would be:
>
>      ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
>
> Any feedback or advice would be great.

Personally I'd prefer a new option to be added. But I agree ONLY isn't
a good name then. Maybe something like SKIP_PARTITIONS.



Re: ANALYZE ONLY

From
Ilia Evdokimov
Date:

On 20.8.24 10:42, Jelte Fennema-Nio wrote:

On Tue, 20 Aug 2024 at 07:52, Michael Harris <harmic@gmail.com> wrote:
  1. Would such a feature be welcomed? Are there any traps I might not
have thought of?
I think this sounds like a reasonable feature.


  2. The existing ANALYZE command has the following structure:
     ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
     It would be easiest to add ONLY as another option, but that
doesn't look quite     right to me - surely the ONLY should be attached to the table name?     An alternative would be:
     ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.
Personally I'd prefer a new option to be added. But I agree ONLY isn't
a good name then. Maybe something like SKIP_PARTITIONS.


Hi everyone,

Your proposal is indeed interesting, but I have a question: can't your issue be resolved by properly configuring autovacuum instead of developing a new feature for ANALYZE?

From my perspective, ANALYZE is intended to forcibly gather statistics from all partitioned tables. If the goal is to ensure that statistics are updated at the right moment, adjusting the autovacuum_analyze_threshold and autovacuum_analyze_scale_factor parameters might be the solution.

-- 
Regards,
Ilia Evdokimov,
Tantor Labs LCC.

Re: ANALYZE ONLY

From
David Rowley
Date:
On Tue, 20 Aug 2024 at 23:25, Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:
> Your proposal is indeed interesting, but I have a question: can't your issue be resolved by properly configuring
autovacuuminstead of developing a new feature for ANALYZE?
 

Basically, no. There's a "tip" in [1] which provides information on
the limitation, namely:

"The autovacuum daemon does not issue ANALYZE commands for partitioned
tables. Inheritance parents will only be analyzed if the parent itself
is changed - changes to child tables do not trigger autoanalyze on the
parent table. If your queries require statistics on parent tables for
proper planning, it is necessary to periodically run a manual ANALYZE
on those tables to keep the statistics up to date."

There is also some discussion about removing the limitation in [2].
While I agree that it would be nice to have autovacuum handle this,
it's not clear how exactly it would work. Additionally, if we had
that, it would still be useful if the ANALYZE command could be
instructed to just gather statistics for the partitioned table only.

David

[1] https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-STATISTICS
[2] https://www.postgresql.org/message-id/flat/CAKkQ508_PwVgwJyBY%3D0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA%40mail.gmail.com



Re: ANALYZE ONLY

From
Melih Mutlu
Date:
Hi Michael,

Thanks for starting this thread. I've also spent a bit time on this after reading your first thread on this issue [1] 

Michael Harris <harmic@gmail.com>, 20 Ağu 2024 Sal, 08:52 tarihinde şunu yazdı:
The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the individual
partitions. There is currently no option to prevent it from including the
partitions.

This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.

I agree that it's a waste to analyze partitions when they're already analyzed by autovacuum. It would be nice to have a way to run analyze only on a partitioned table without its partitions.

 
I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very significant.

I quickly tweaked the code a bit to exclude partitions when a partitioned table is being analyzed. I can confirm that there is a significant gain even on a simple case like a partitioned table with 10 partitions and 1M rows in each partition.

  1. Would such a feature be welcomed? Are there any traps I might not
have thought of?

  2. The existing ANALYZE command has the following structure:

     ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

     It would be easiest to add ONLY as another option, but that
doesn't look quite
     right to me - surely the ONLY should be attached to the table name
     An alternative would be:

     ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

I feel closer to adding this as an option instead of a new keyword in ANALYZE grammar. To me, it would be easier to have this option and then give the names of partitioned tables as opposed to typing ONLY before each partition table.
But we should think of another name as ONLY is used differently (attached to the table name as you mentioned) in other contexts.

I've been also thinking about how this new option should affect inheritance tables. Should it have just no impact on them or only analyze the parent table without taking child tables into account? There are two records for an inheritance parent table in pg_statistic, one row for only the parent table and a second row including children. We might only analyze the parent table if this new "ONLY" option is specified. I'm not sure if that would be something users would need or not, but I think this option should behave similarly for both partitioned tables and inheritance tables.

If we decide to go with only partition tables and not care about inheritance, then naming this option to SKIP_PARTITIONS as Jelte suggested sounds fine. But that name wouldn't work if this option will affect inheritance tables.


Thanks,
--
Melih Mutlu
Microsoft

Re: ANALYZE ONLY

From
Melih Mutlu
Date:
Melih Mutlu <m.melihmutlu@gmail.com>, 20 Ağu 2024 Sal, 19:26 tarihinde şunu yazdı:
Hi Michael,

Thanks for starting this thread. I've also spent a bit time on this after reading your first thread on this issue [1] 
Forgot to add the reference [1]

Re: ANALYZE ONLY

From
Robert Haas
Date:
On Tue, Aug 20, 2024 at 1:52 AM Michael Harris <harmic@gmail.com> wrote:
>   2. The existing ANALYZE command has the following structure:
>
>      ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>
>      It would be easiest to add ONLY as another option, but that
> doesn't look quite
>      right to me - surely the ONLY should be attached to the table name?
>      An alternative would be:
>
>      ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
>
> Any feedback or advice would be great.

I like trying to use ONLY somehow.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: ANALYZE ONLY

From
David Rowley
Date:
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
> I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.

1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.

Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.

If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:

postgres=# analyze (skip_locked) a;
WARNING:  skipping analyze of "a" --- lock not available

David



Re: ANALYZE ONLY

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
>> I like trying to use ONLY somehow.

> Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
> or as a table modifier like gram.y's extended_relation_expr?

> Making it a command option means that the option would apply to all
> tables listed, whereas if it was more like an extended_relation_expr,
> the option would be applied per table listed in the command.

> 1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
> partitions but get stats on ptab2 and stats on its partitions too.
> 2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
> without doing that on any of their partitions.

FWIW, I think that's the right approach, for consistency with the
way that ONLY works in DML.

            regards, tom lane



Re: ANALYZE ONLY

From
Robert Haas
Date:
On Tue, Aug 20, 2024 at 6:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
> > I like trying to use ONLY somehow.
>
> Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
> or as a table modifier like gram.y's extended_relation_expr?

The table modifier idea seems nice to me.

> If we did it as a per-table option, then we'd need to consider what
> should happen if someone did: "VACUUM ONLY parttab;". Probably
> silently doing nothing wouldn't be good. Maybe a warning, akin to
> what's done in:
>
> postgres=# analyze (skip_locked) a;
> WARNING:  skipping analyze of "a" --- lock not available

Perhaps. I'm not sure about this part.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: ANALYZE ONLY

From
Melih Mutlu
Date:
David Rowley <dgrowleyml@gmail.com>, 21 Ağu 2024 Çar, 01:53 tarihinde şunu yazdı:
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
> I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.

1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.

I believe we should go this route if we want this to be called "ONLY" so that it would be consistent with other places too.

Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.

Havin it as an option would be easier to use when we have several partition tables. But I agree that if we call it "ONLY ", it may be confusing and the other approach would be appropriate.
 
If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:

postgres=# analyze (skip_locked) a;
WARNING:  skipping analyze of "a" --- lock not available

+1 to raising a warning message in that case instead of staying silent. We might also not allow ONLY if ANALYZE is not present in VACUUM query and raise an error. But that would require changes in grams.y and could complicate things. So it may not be necessary and we may be fine with just a warning.

Regards,
--
Melih Mutlu
Microsoft

Re: ANALYZE ONLY

From
Michael Harris
Date:
Thank you all for the replies & discussion.

It sounds like more are in favour of using the ONLY syntax attached to
the tables is the best way to go, with the main advantages being:
 - consistency with other commands
 - flexibility in allowing to specify whether to include partitions
for individual tables when supplying a list of tables

I will start working on an implementation along those lines. It looks
like we can simply replace qualified_name with relation_expr in the
production for vacuum_relation within gram.y.

One other thing I noticed when reading the code. The function
expand_vacuum_rel in vacuum.c seems to be responsible for adding the
partitions. If I am reading it correctly, it only adds child tables in
the case of a partitioned table, not in the case of an inheritance
parent:

        include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
..
        if (include_parts)
        {
        .. add partitions ..

This is a little different to some other contexts where the ONLY
keyword is used, in that ONLY would be the default and only available
mode of operation for an inheritance parent.

Regards,
Mike

On Wed, 21 Aug 2024 at 20:04, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> David Rowley <dgrowleyml@gmail.com>, 21 Ağu 2024 Çar, 01:53 tarihinde şunu yazdı:
>>
>> On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
>> > I like trying to use ONLY somehow.
>>
>> Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
>> or as a table modifier like gram.y's extended_relation_expr?
>>
>> Making it a command option means that the option would apply to all
>> tables listed, whereas if it was more like an extended_relation_expr,
>> the option would be applied per table listed in the command.
>>
>> 1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
>> partitions but get stats on ptab2 and stats on its partitions too.
>> 2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
>> without doing that on any of their partitions.
>
>
> I believe we should go this route if we want this to be called "ONLY" so that it would be consistent with other
placestoo. 
>
>> Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
>> behaviour of #2.
>
>
> Havin it as an option would be easier to use when we have several partition tables. But I agree that if we call it
"ONLY", it may be confusing and the other approach would be appropriate. 
>
>>
>> If we did it as a per-table option, then we'd need to consider what
>> should happen if someone did: "VACUUM ONLY parttab;". Probably
>> silently doing nothing wouldn't be good. Maybe a warning, akin to
>> what's done in:
>>
>> postgres=# analyze (skip_locked) a;
>> WARNING:  skipping analyze of "a" --- lock not available
>
>
> +1 to raising a warning message in that case instead of staying silent. We might also not allow ONLY if ANALYZE is
notpresent in VACUUM query and raise an error. But that would require changes in grams.y and could complicate things.
Soit may not be necessary and we may be fine with just a warning. 
>
> Regards,
> --
> Melih Mutlu
> Microsoft



Re: ANALYZE ONLY

From
David Rowley
Date:
On Thu, 22 Aug 2024 at 11:32, Michael Harris <harmic@gmail.com> wrote:
> One other thing I noticed when reading the code. The function
> expand_vacuum_rel in vacuum.c seems to be responsible for adding the
> partitions. If I am reading it correctly, it only adds child tables in
> the case of a partitioned table, not in the case of an inheritance
> parent:
>
>         include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
> ..
>         if (include_parts)
>         {
>         .. add partitions ..
>
> This is a little different to some other contexts where the ONLY
> keyword is used, in that ONLY would be the default and only available
> mode of operation for an inheritance parent.

That's inconvenient and quite long-established behaviour. I had a look
as far back as 9.2 and we only analyze parents there too. I'm keen on
the ONLY syntax, but it would be strange if ONLY did the same thing as
not using ONLY for inheritance parents.

I feel like we might need to either bite the bullet and make ONLY work
consistently with both, or think of another way to have ANALYZE not
recursively gather stats for each partition on partitioned tables.
Could we possibly get away with changing inheritance parent behaviour?

David



Re: ANALYZE ONLY

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> I feel like we might need to either bite the bullet and make ONLY work
> consistently with both, or think of another way to have ANALYZE not
> recursively gather stats for each partition on partitioned tables.
> Could we possibly get away with changing inheritance parent behaviour?

Yeah, my thought was to change the behavior so it's consistent in
that case too.  This doesn't seem to me like a change that'd
really cause anybody serious problems: ANALYZE without ONLY would
do more than before, but it's work you probably want done anyway.

            regards, tom lane



Re: ANALYZE ONLY

From
Michael Harris
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, my thought was to change the behavior so it's consistent in
> that case too.  This doesn't seem to me like a change that'd
> really cause anybody serious problems: ANALYZE without ONLY would
> do more than before, but it's work you probably want done anyway.

Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.

It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.



Re: ANALYZE ONLY

From
David Rowley
Date:
On Thu, 22 Aug 2024 at 13:38, Michael Harris <harmic@gmail.com> wrote:
> Would we want to apply that change to VACUUM too? That might be a
> bit drastic, especially if you had a multi-level inheritance structure featuring
> large tables.

I think they'd need to work the same way as for "VACUUM (ANALYZE)", it
would be strange to analyze some tables that you didn't vacuum. It's
just a much bigger pill to swallow in terms of the additional effort.

> It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
> For VACUUM it does not make much sense to vacuum only at the partitioned
> table level and not include the partitions, since it would do nothing
> - that might
> be why the existing code always adds the partitions.

Yeah, I suspect that's exactly why it was coded that way.

David



Re: ANALYZE ONLY

From
Melih Mutlu
Date:
Hi Michael,

Thanks for the patch.
I quickly tried running some ANALYZE ONLY queries, it seems like it works fine.

-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]

It seems like extended_relation_expr allows "tablename *" syntax too. That should be added in docs as well. (Same for VACUUM doc)

 <para>
    For partitioned tables, <command>ANALYZE</command> gathers statistics by
    sampling rows from all partitions; in addition, it will recurse into each
    partition and update its statistics.  Each leaf partition is analyzed only
    once, even with multi-level partitioning.  No statistics are collected for
    only the parent table (without data from its partitions), because with
    partitioning it's guaranteed to be empty.
  </para>

We may also want to update the above note in ANALYZE doc.

+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO only_parted1 VALUES (1, 'a');

Tests don't seem right to me.
I believe the above should be " PARTITION OF only_parted " instead of vacparted.
It may better to insert into partitioned table (only_parted) instead of the partition (only_parted1);

Also it may be a good idea to test VACUUM ONLY for inheritance tables the same way you test ANALYZE ONLY.

Lastly, the patch includes an unrelated file (compile_flags.txt) and has whitespace errors when I apply it.

Regards,
--
Melih Mutlu
Microsoft

Re: ANALYZE ONLY

From
Melih Mutlu
Date:
Hi Michael,

Re: ANALYZE ONLY

From
torikoshia
Date:
Hi Michael,

On 2024-08-23 19:01, Michael Harris wrote:

> V2 of the patch is attached.

Thanks for the proposal and the patch.

You said this patch was a first draft version, so these may be too minor 
comments, but I will share them:


-- https://www.postgresql.org/docs/devel/progress-reporting.html
> Note that when ANALYZE is run on a partitioned table, all of its 
> partitions are also recursively analyzed.

Should we also note this is the default, i.e. not using ONLY option 
behavior here?


-- https://www.postgresql.org/docs/devel/ddl-partitioning.html
> If you are using manual VACUUM or ANALYZE commands, don't forget that 
> you need to run them on each child table individually. A command like:
> 
> ANALYZE measurement;
> will only process the root table.

This part also should be modified, shouldn't it?


When running ANALYZE VERBOSE ONLY on a partition table, the INFO message 
is like this:

   =# ANALYZE VERBOSE ONLY only_parted;
   INFO:  analyzing "public.only_parted" inheritance tree

I may be wrong, but 'inheritance tree' makes me feel it includes child 
tables.
Removing 'inheritance tree' and just describing the table name as below 
might be better:

   INFO:  analyzing "public.only_parted"


-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation



Re: ANALYZE ONLY

From
jian he
Date:
On Thu, Aug 29, 2024 at 7:31 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi Michael,
>
> Michael Harris <harmic@gmail.com>, 23 Ağu 2024 Cum, 13:01 tarihinde şunu yazdı:
>>
>> V2 of the patch is attached.
>
>
> Thanks for updating the patch. I have a few more minor feedbacks.
>
>> -ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable
class="parameter">table_and_columns</replaceable>[, ...] ] 
>> +ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable
class="parameter">table_and_columns</replaceable>[, ...] ] 
>
>
> I believe moving "[ ONLY ]" to the definitions of table_and_columns below, as you did with "[ * ]", might be better
tobe consistent with other places (see [1]) 
>
>> + if ((options & VACOPT_VACUUM) && is_partitioned_table && ! include_children)
>
>

I think you are right.

ANALYZE [ ( option [, ...] ) ] [ [ ONLY ] table_and_columns [, ...] ]

seems not explain commands like:

ANALYZE ONLY only_parted(a), ONLY only_parted(b);



Re: ANALYZE ONLY

From
David Rowley
Date:
Hi Michael,

On Fri, 23 Aug 2024 at 22:01, Michael Harris <harmic@gmail.com> wrote:
> V2 of the patch is attached.

(I understand this is your first PostgreSQL patch)

I just wanted to make sure you knew about the commitfest cycle we use
for the development of PostgreSQL.  If you've not got one already, can
you register a community account. That'll allow you to include this
patch in the September commitfest that starts on the 1st. See
https://commitfest.postgresql.org/49/

I understand there's now some sort of cool-off period for new
community accounts, so if you have trouble adding the patch before the
CF starts, let me know and I should be able to do it for you. The
commitfest normally closes for new patches around 12:00 UTC on the 1st
of the month.

There's a bit more information in
https://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission

David



Re: ANALYZE ONLY

From
Michael Harris
Date:
Thanks David

I had not read that wiki page well enough, so many thanks for alerting
me that I had to create a commitfest entry. I already had a community
account so I was able to login and create it here:

  https://commitfest.postgresql.org/49/5226/

I was not sure what to put for some of the fields - for 'reviewer' should
I have put the people who have provided feedback on this thread?

Is there anything else I need to do?

Thanks again

Cheers
Mike

On Fri, 30 Aug 2024 at 18:44, David Rowley <dgrowleyml@gmail.com> wrote:
>
> Hi Michael,
>
> On Fri, 23 Aug 2024 at 22:01, Michael Harris <harmic@gmail.com> wrote:
> > V2 of the patch is attached.
>
> (I understand this is your first PostgreSQL patch)
>
> I just wanted to make sure you knew about the commitfest cycle we use
> for the development of PostgreSQL.  If you've not got one already, can
> you register a community account. That'll allow you to include this
> patch in the September commitfest that starts on the 1st. See
> https://commitfest.postgresql.org/49/
>
> I understand there's now some sort of cool-off period for new
> community accounts, so if you have trouble adding the patch before the
> CF starts, let me know and I should be able to do it for you. The
> commitfest normally closes for new patches around 12:00 UTC on the 1st
> of the month.
>
> There's a bit more information in
> https://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission
>
> David



Re: ANALYZE ONLY

From
David Rowley
Date:
On Sun, 1 Sept 2024 at 13:41, Michael Harris <harmic@gmail.com> wrote:
>   https://commitfest.postgresql.org/49/5226/
>
> I was not sure what to put for some of the fields - for 'reviewer' should
> I have put the people who have provided feedback on this thread?

I think it's fairly common that only reviewers either add themselves
or don't bother.  I don't think it's common that patch authors add
reviewers. Some reviewers like their reviews to be more informal and
putting themselves as a reviewer can put other people off reviewing as
they think it's already handled by someone else. That may result in
the patch being neglected by reviewers.

> Is there anything else I need to do?

I see you set the target version as 17. That should be blank or "18".
PG17 is about to go into release candidate, so it is too late for
that. It's been fairly consistent that we open for new patches in
early July and close before the 2nd week of April the following year.
We're currently 2 months into PG18 dev. We don't back-patch new
features.

Nothing else aside from continuing to address reviews, as you are already.

David



Re: ANALYZE ONLY

From
torikoshia
Date:
On 2024-09-01 10:31, Michael Harris wrote:
> Hello Atsushi & Melih
> 
> Thank you both for your further feedback.
> 
> On Thu, 29 Aug 2024 at 21:31, Melih Mutlu <m.melihmutlu@gmail.com> 
> wrote:
>> I believe moving "[ ONLY ]" to the definitions of table_and_columns 
>> below, as you did with "[ * ]", might be better to be consistent with 
>> other places (see [1])
> 
> Agreed. I have updated this.
> 
>>> + if ((options & VACOPT_VACUUM) && is_partitioned_table && ! 
>>> include_children)
>> 
>> There are also some issues with coding conventions in some places 
>> (e.g. the space between "!" and "include_children" abode). I think 
>> running pgindent would resolve such issue in most places.
> 
> I fixed that extra space, and then ran pgindent. It did not report any
> more problems.
> 
> On Fri, 30 Aug 2024 at 16:45, torikoshia <torikoshia@oss.nttdata.com> 
> wrote:
>> -- https://www.postgresql.org/docs/devel/progress-reporting.html
>> > Note that when ANALYZE is run on a partitioned table, all of its
>> > partitions are also recursively analyzed.
>> 
>> Should we also note this is the default, i.e. not using ONLY option
>> behavior here?
> 
>> -- https://www.postgresql.org/docs/devel/ddl-partitioning.html
>> > If you are using manual VACUUM or ANALYZE commands, don't forget that
>> > you need to run them on each child table individually. A command like:
>> >
>> > ANALYZE measurement;
>> > will only process the root table.
>> 
>> This part also should be modified, shouldn't it?
> 
> Agreed. I have updated both of these.

Thanks!

>> When running ANALYZE VERBOSE ONLY on a partition table, the INFO 
>> message
>> is like this:
>> 
>>    =# ANALYZE VERBOSE ONLY only_parted;
>>    INFO:  analyzing "public.only_parted" inheritance tree
>> 
>> I may be wrong, but 'inheritance tree' makes me feel it includes child
>> tables.
>> Removing 'inheritance tree' and just describing the table name as 
>> below
>> might be better:
>> 
>>    INFO:  analyzing "public.only_parted"
> 
> I'm not sure about that one. If I understand the source correctly,
> that particular progress
> message tells the user that the system is gathering stats from the 
> inheritance
> tree in order to update the stats of the given table, not that it is
> actually updating
> the stats of the descendant tables.

That makes sense.

> When analyzing an inheritance structure with the ONLY you see
> something like this:
> 
> => ANALYZE VERBOSE ONLY only_inh_parent;
> INFO:  analyzing "public.only_inh_parent"
> INFO:  "only_inh_parent": scanned 0 of 0 pages, containing 0 live rows
> and 0 dead rows; 0 rows in sample, 0 estimated total rows
> INFO:  analyzing "public.only_inh_parent" inheritance tree
> INFO:  "only_inh_child": scanned 1 of 1 pages, containing 3 live rows
> and 0 dead rows; 3 rows in sample, 3 estimated total rows
> ANALYZE
> 
> The reason you don't see the first one for partitioned tables is that
> it corresponds
> to sampling the contents of the parent table itself, which in the case
> of a partitioned
> table is guaranteed to be empty, so it is skipped.
> 
> I agree the text could be confusing, and in fact is probably confusing
> even today
> without the ONLY keyword,

Yeah, it seems this isn't dependent on your proposal.
(BTW I'm also wondering if the expression “inheritance" is appropriate 
when the target is a partitioned table, but this should be discussed in 
another thread)


-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation



Re: ANALYZE ONLY

From
David Rowley
Date:
On Sun, 1 Sept 2024 at 13:32, Michael Harris <harmic@gmail.com> wrote:
> v3 of the patch is attached, and I will submit it to the commitfest.

I think this patch is pretty good.

I only have a few things I'd point out:

1. The change to ddl.sgml, technically you're removing the caveat, but
I think the choice you've made to mention the updated behaviour is
likely a good idea still. So I agree with this change, but just wanted
to mention it as someone else might think otherwise.

2. I think there's some mixing of tense in the changes to analyze.sgml:

"If <literal>ONLY</literal> was not specified, it will also recurse
into each partition and update its statistics."

You've written "was" (past tense), but then the existing text uses
"will" (future tense). I guess if the point in time is after parse and
before work has been done, then that's correct, but I think using "is"
instead of "was" is better.

3. In vacuum.sgml;

"If <literal>ONLY</literal> is not specified, the table and all its
descendant tables or partitions (if any) are vacuumed"

Maybe "are also vacuumed" instead of "are vacuumed" is more clear?
It's certainly true for inheritance parents, but I guess you could
argue that's wrong for partitioned tables.

4. A very minor detail, but I think in vacuum.c the WARNING you've
added should use RelationGetRelationName(). We seem to be very
inconsistent with using that macro and I see it's not used just above
for the lock warning, which I imagine you copied.

Aside from those, that just leaves me with the behavioural change. I
noted Tom was ok with the change in behaviour for ANALYZE (mentioned
in [1]). Tom, wondering if you feel the same for VACUUM too?  If we're
doing this, I think we'd need to be quite clear about it on the
release notes.

David

[1] https://postgr.es/m/1919201.1724289136@sss.pgh.pa.us



Re: ANALYZE ONLY

From
torikoshia
Date:
On 2024-09-09 16:56, Michael Harris wrote:

Thanks for updating the patch.
Here is a minor comment.

> @@ -944,20 +948,32 @@ expand_vacuum_rel(VacuumRelation *vrel, 
> MemoryContext vac_context,
>            MemoryContextSwitchTo(oldcontext);
>        }
..
> +        * Unless the user has specified ONLY, make relation list 
> entries for
> +        * its partitions and/or descendant tables.

Regarding the addition of partition descendant tables, should we also 
update the below comment on expand_vacuum_rel? Currently it refers only 
partitions:

|  * Given a VacuumRelation, fill in the table OID if it wasn't 
specified,
|  * and optionally add VacuumRelations for partitions of the table.

Other than this and the following, it looks good to me.

On Mon, Sep 9, 2024 at 10:27 AM David Rowley <dgrowleyml@gmail.com> 
wrote:
> Aside from those, that just leaves me with the behavioural change. I
> noted Tom was ok with the change in behaviour for ANALYZE (mentioned
> in [1]). Tom, wondering if you feel the same for VACUUM too?  If we're
> doing this, I think we'd need to be quite clear about it on the
> release notes.


-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation



Re: ANALYZE ONLY

From
torikoshia
Date:
On 2024-09-11 16:47, Michael Harris wrote:
> Thanks for the feedback.
> 
> On Tue, 10 Sept 2024 at 22:03, torikoshia <torikoshia@oss.nttdata.com> 
> wrote:
>> Regarding the addition of partition descendant tables, should we also
>> update the below comment on expand_vacuum_rel? Currently it refers 
>> only
>> partitions:
>> 
>> |  * Given a VacuumRelation, fill in the table OID if it wasn't
>> specified,
>> |  * and optionally add VacuumRelations for partitions of the table.
>> 
> 
> Well spotted! I have updated the comment to read:
> 
>  * Given a VacuumRelation, fill in the table OID if it wasn't 
> specified,
>  * and optionally add VacuumRelations for partitions or descendant 
> tables
>  * of the table.

Thanks for the update!

I've switched the status on the commitfest to 'ready for committer'.

-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation



Re: ANALYZE ONLY

From
jian he
Date:
hi.

in https://www.postgresql.org/docs/current/ddl-inherit.html
<<<
Commands that do database maintenance and tuning (e.g., REINDEX,
VACUUM) typically only work on individual, physical tables and do not
support recursing over inheritance hierarchies. The respective
behavior of each individual command is documented in its reference
page (SQL Commands).
<<<
does the above para need some tweaks?


in section, your patch output is
<<<<<<<<<
and table_and_columns is:
    [ ONLY ] table_name [ * ] [ ( column_name [, ...] ) ]
<<<<<<<<<

ANALYZE ONLY only_parted*;
will fail.
Maybe we need a sentence saying ONLY and * are mutually exclusive?


>>>
If the table being analyzed has inheritance children, ANALYZE gathers
two sets of statistics: one on the rows of the parent table only, and
a second including rows of both the parent table and all of its
children. This second set of statistics is needed when planning
queries that process the inheritance tree as a whole. The child tables
themselves are not individually analyzed in this case.
>>>
is this still true for table inheritance.

for example:
drop table if exists only_inh_parent,only_inh_child;
CREATE TABLE only_inh_parent (a int , b INT) with (autovacuum_enabled = false);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent) with
(autovacuum_enabled = false);
INSERT INTO only_inh_child(a,b) select g % 80, (g + 1) % 200 from
generate_series(1,1000) g;
select pg_table_size('only_inh_parent');

ANALYZE ONLY only_inh_parent;

here, will "ANALYZE ONLY only_inh_parent;" will have minimal "effects".
since the physical table only_inh_parent has zero storage.
but
select stadistinct,starelid::regclass,staattnum, stainherit
from pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);

output is
 stadistinct |    starelid     | staattnum | stainherit
-------------+-----------------+-----------+------------
          80 | only_inh_parent |         1 | t
        -0.2 | only_inh_parent |         2 | t

I am not sure if this output and the manual description about table
inheritance is consistent.



Re: ANALYZE ONLY

From
Michael Harris
Date:
On Sun, 22 Sept 2024 at 23:09, David Rowley <dgrowleyml@gmail.com> wrote:
> I'd like to push this soon, so if anyone has any last-minute feedback,
> please let me know in the next few days.

Many thanks for your updates and assistance.

Looks good. Agreed, I was probably too conservative in some of the
doc updates.

Thanks to all reviewers for helping to get it to this stage!

Cheers
Mike



Re: ANALYZE ONLY

From
jian he
Date:
On Sun, Sep 22, 2024 at 9:09 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> v7-0002 is all my changes.
>
> I'd like to push this soon, so if anyone has any last-minute feedback,
> please let me know in the next few days.
>


drop table if exists only_inh_parent,only_inh_child;
CREATE TABLE only_inh_parent (a int , b INT) with (autovacuum_enabled = false);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent) with
(autovacuum_enabled = false);
INSERT INTO only_inh_child(a,b) select g % 80, (g + 1) % 200 from
generate_series(1,1000) g;
select pg_table_size('only_inh_parent');
ANALYZE ONLY only_inh_parent;

select  stadistinct,starelid::regclass,staattnum, stainherit
from    pg_statistic
where   starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);

 stadistinct |    starelid     | staattnum | stainherit
-------------+-----------------+-----------+------------
          80 | only_inh_parent |         1 | t
        -0.2 | only_inh_parent |         2 | t

---------------
catalog-pg-statistic.html
stainherit bool
If true, the stats include values from child tables, not just the
values in the specified relation

Normally there is one entry, with stainherit = false, for each table
column that has been analyzed. If the table has inheritance children
or partitions, a second entry with stainherit = true is also created.
This row represents the column's statistics over the inheritance tree,
i.e., statistics for the data you'd see with SELECT column FROM
table*, whereas the stainherit = false row represents the results of
SELECT column FROM ONLY table.


I do understand what Michael Harris said in [1]
---------------


Given the above context, I am still confused with this sentence in
sql-analyze.html.
"If ONLY is specified before the table name, only that table is analyzed."

like in the above sql example, only_inh_parent's child is also being analyzed.


[1] https://www.postgresql.org/message-id/CADofcAW43AD%3Dqqtj1cLkTyRpPM6JB5ZALUK7CA1KZZqpcouoYw%40mail.gmail.com



Re: ANALYZE ONLY

From
David Rowley
Date:
On Mon, 23 Sept 2024 at 15:29, jian he <jian.universality@gmail.com> wrote:
> Given the above context, I am still confused with this sentence in
> sql-analyze.html.
> "If ONLY is specified before the table name, only that table is analyzed."
>
> like in the above sql example, only_inh_parent's child is also being analyzed.

I guess it depends what you define "analyzed" to mean. In this
context, it means gathering statistics specifically for a certain
table.

Would it be more clear if "only that table is analyzed." was changed
to "then statistics are only gathered specifically for that table."?

David



Re: ANALYZE ONLY

From
jian he
Date:
On Mon, Sep 23, 2024 at 12:46 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Mon, 23 Sept 2024 at 15:29, jian he <jian.universality@gmail.com> wrote:
> > Given the above context, I am still confused with this sentence in
> > sql-analyze.html.
> > "If ONLY is specified before the table name, only that table is analyzed."
> >
> > like in the above sql example, only_inh_parent's child is also being analyzed.
>
> I guess it depends what you define "analyzed" to mean. In this
> context, it means gathering statistics specifically for a certain
> table.
>
> Would it be more clear if "only that table is analyzed." was changed
> to "then statistics are only gathered specifically for that table."?
>

looking at expand_vacuum_rel, analyze_rel.
if we
---------
if (onerel->rd_rel->relhassubclass)
do_analyze_rel(onerel, params, va_cols, acquirefunc, relpages,
  true, in_outer_xact, elevel);

change to

    if (onerel->rd_rel->relhassubclass && ((!relation ||
relation->inh) || onerel->rd_rel->relkind ==
RELKIND_PARTITIONED_TABLE))
        do_analyze_rel(onerel, params, va_cols, acquirefunc, relpages,
                       true, in_outer_xact, elevel);


then the inheritance table will behave the way the doc says.

for example:
drop table if exists only_inh_parent,only_inh_child;
CREATE TABLE only_inh_parent (a int , b INT) with (autovacuum_enabled = false);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent) with
(autovacuum_enabled = false);
INSERT INTO only_inh_child(a,b) select g % 80, (g + 1) % 200 from
generate_series(1,1000) g;
ANALYZE ONLY only_inh_parent;
select stadistinct,starelid::regclass,staattnum, stainherit
from  pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);

will return zero rows, since the physical table only_inh_parent has no storage.




sql-analyze.html
For partitioned tables, ANALYZE gathers statistics by sampling rows
from all partitions. Each leaf partition is also recursively analyzed
and the statistics updated. This recursive part may be disabled by
using the ONLY keyword, otherwise, leaf partitions are analyzed only
once, even with multi-level partitioning. No statistics are collected
for only the parent table (without data from its partitions), because
with partitioning it's guaranteed to be empty.

allow me to ask anenglish language question.
here "otherwise" means  specify ONLY or not?
As far as i understand.
if you specify ONLY, postgres will only do "For partitioned tables,
ANALYZE gathers statistics by sampling rows from all partitions"
if you not specify ONLY, postgres will do  "For partitioned tables,
ANALYZE gathers statistics by sampling rows from all partitions *AND*
also recursively analyze each leaf partition"

Is my understanding correct?


I think there is a whitespace error in "ANALYZE ONLY vacparted(a,b); "
in vacuum.out.



Re: ANALYZE ONLY

From
jian he
Date:
On Mon, Sep 23, 2024 at 6:04 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
>
> If this is confusing, I think there's a bunch of detail that I tried
> to keep that's just not that useful.  The part about analyzing
> partitions just once and the part about not collecting non-inheritance
> stats for the partitioned table seems like extra detail that's either
> obvious or just not that important.
>
> Can you have a look at the attached and let me know if it's easier to
> understand now?
>

Now the regress test passed.

  <para>
    For partitioned tables, <command>ANALYZE</command> gathers statistics by
    sampling rows from all partitions.  By default,
    <command>ANALYZE</command> will also recursively collect and update the
    statistics for each partition.  The <literal>ONLY</literal> keyword may be
    used to disable this.
  </para>

is very clear to me!


  <para>
    If the table being analyzed has inheritance children,
    <command>ANALYZE</command> gathers two sets of statistics: one on the rows
    of the parent table only, and a second including rows of both the parent
    table and all of its children.  This second set of statistics is needed when
    planning queries that process the inheritance tree as a whole.  The
    autovacuum daemon, however, will only consider inserts or updates on the
    parent table itself when deciding whether to trigger an automatic analyze
    for that table.  If that table is rarely inserted into or updated, the
    inheritance statistics will not be up to date unless you run
    <command>ANALYZE</command> manually.  By default,
    <command>ANALYZE</command> will also recursively collect and update the
    statistics for each inheritance child table.  The <literal>ONLY</literal>
    keyword may be used to disable this.
  </para>


looks fine. but maybe we can add the following information
"if The <literal>ONLY</literal> is specified, the second set of
statistics won't include each children individual statistics"
I think that's the main difference between specifying ONLY or not?

catalog-pg-statistic.html second paragraph seems very clear to me.
Maybe we can link it somehow

Other than that, it looks good to me.



Re: ANALYZE ONLY

From
David Rowley
Date:
On Mon, 23 Sept 2024 at 23:23, jian he <jian.universality@gmail.com> wrote:
> looks fine. but maybe we can add the following information
> "if The <literal>ONLY</literal> is specified, the second set of
> statistics won't include each children individual statistics"
> I think that's the main difference between specifying ONLY or not?

Ok, I think you're not understanding this yet and I'm not sure what I
can make more clear in the documents.

Let me explain... For inheritance parent tables, ANALYZE ONLY will
gather inheritance and non-inheritance statistics for ONLY the parent.

Here's an example of that:

drop table if exists parent,child;
create table parent(a int);
create table child () inherits (parent);
insert into parent values(1);
insert into child values(1);

analyze ONLY parent;
select starelid::regclass,stainherit,stadistinct from pg_statistic
where starelid in ('parent'::regclass,'child'::regclass);
 starelid | stainherit | stadistinct
----------+------------+-------------
 parent   | f          |          -1  <- this is the distinct estimate
for SELECT * FROM ONLY parent;
 parent   | t          |        -0.5 <- this is the distinct estimate
for SELECT * FROM parent;
(2 rows)

For the stainherit==false stats, only 1 row is sampled here as that's
the only row directly located in the "parent" table.
For the stainherit==true stats, 2 rows are sampled, both of them have
"a" == 1. The stadistinct reflects that fact.

Note there have been no statistics recorded for "child". However,
analyze did sample rows in that table as part of gathering sample rows
for "parent" for the stainherit==true row.

Now let's try again without ONLY.

analyze parent;

select starelid::regclass,stainherit,stadistinct from pg_statistic
where starelid in ('parent'::regclass,'child'::regclass);
 starelid | stainherit | stadistinct
----------+------------+-------------
 parent   | f          |          -1
 parent   | t          |        -0.5
 child    | f          |          -1
(3 rows)

All of the above rows were re-calculated with the "analyze parent"
command, the first two rows have the same values as nothing changed in
the table, however, there are now statistics stored for the "child"
table.

> catalog-pg-statistic.html second paragraph seems very clear to me.
> Maybe we can link it somehow

I don't know what "link it" means in this context.

David



Re: ANALYZE ONLY

From
jian he
Date:
On Mon, Sep 23, 2024 at 7:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Mon, 23 Sept 2024 at 23:23, jian he <jian.universality@gmail.com> wrote:
> > looks fine. but maybe we can add the following information
> > "if The <literal>ONLY</literal> is specified, the second set of
> > statistics won't include each children individual statistics"
> > I think that's the main difference between specifying ONLY or not?
>
> Ok, I think you're not understanding this yet and I'm not sure what I
> can make more clear in the documents.
>
> Let me explain... For inheritance parent tables, ANALYZE ONLY will
> gather inheritance and non-inheritance statistics for ONLY the parent.
>
> Here's an example of that:
>
> drop table if exists parent,child;
> create table parent(a int);
> create table child () inherits (parent);
> insert into parent values(1);
> insert into child values(1);
>
> analyze ONLY parent;
> select starelid::regclass,stainherit,stadistinct from pg_statistic
> where starelid in ('parent'::regclass,'child'::regclass);
>  starelid | stainherit | stadistinct
> ----------+------------+-------------
>  parent   | f          |          -1  <- this is the distinct estimate
> for SELECT * FROM ONLY parent;
>  parent   | t          |        -0.5 <- this is the distinct estimate
> for SELECT * FROM parent;
> (2 rows)
>
> For the stainherit==false stats, only 1 row is sampled here as that's
> the only row directly located in the "parent" table.
> For the stainherit==true stats, 2 rows are sampled, both of them have
> "a" == 1. The stadistinct reflects that fact.
>
> Note there have been no statistics recorded for "child". However,
> analyze did sample rows in that table as part of gathering sample rows
> for "parent" for the stainherit==true row.
>
> Now let's try again without ONLY.
>
> analyze parent;
>
> select starelid::regclass,stainherit,stadistinct from pg_statistic
> where starelid in ('parent'::regclass,'child'::regclass);
>  starelid | stainherit | stadistinct
> ----------+------------+-------------
>  parent   | f          |          -1
>  parent   | t          |        -0.5
>  child    | f          |          -1
> (3 rows)
>
> All of the above rows were re-calculated with the "analyze parent"
> command, the first two rows have the same values as nothing changed in
> the table, however, there are now statistics stored for the "child"
> table.

thanks for your explanation!
now I don't have any questions about this patch.



> > catalog-pg-statistic.html second paragraph seems very clear to me.
> > Maybe we can link it somehow
>
> I don't know what "link it" means in this context.
>

i mean, change to:

By default,
    <command>ANALYZE</command> will also recursively collect and update the
    statistics for each inheritance child table.  The <literal>ONLY</literal>
    keyword may be used to disable this.
    You may also refer to catalog <link
linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
    description about <literal>stainherit</literal>.


but <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
already mentioned once.
maybe not a good idea.



Re: ANALYZE ONLY

From
David Rowley
Date:
On Mon, 23 Sept 2024 at 22:04, David Rowley <dgrowleyml@gmail.com> wrote:
> Can you have a look at the attached and let me know if it's easier to
> understand now?

Pushed.

David