Re: Autovacuum on partitioned table (autoanalyze) - Mailing list pgsql-hackers

From yuzuko
Subject Re: Autovacuum on partitioned table (autoanalyze)
Date
Msg-id CAKkQ509GJpRHRSYkrvcAnLbO3KEs4=5Nbydpv54Gu2wOQtZPJw@mail.gmail.com
Whole thread Raw
In response to Re: Autovacuum on partitioned table (autoanalyze)  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Autovacuum on partitioned table (autoanalyze)
List pgsql-hackers
Hello Alvaro,

Thank you for your comments.

>
> > In second thought about the reason for the "toprel_oid". It is perhaps
> > to avoid "wrongly" propagated values to ancestors after a manual
> > ANALYZE on a partitioned table.  But the same happens after an
> > autoanalyze iteration if some of the ancestors of a leaf relation are
> > analyzed before the leaf relation in a autoanalyze iteration.  That
> > can trigger an unnecessary analyzing for some of the ancestors.
>
> I'm not sure I understand this point.  I think we should only trigger
> this on analyzes of *leaf* partitions, not intermediate partitioned
> relations.  That way you would never get these unnecesary analyzes.
> Am I missing something?
>
> (So with my proposal in the previous email, we would send the list of
> ancestor relations after analyzing a leaf partition.  Whenever we
> analyze a non-leaf, then the list of ancestors is sent as an empty
> list.)
>
The problem Horiguchi-san mentioned is as follows:

create table p1 (i int) partition by range(i);
create table p1_1 partition of p1 for values from (0) to (500)
partition by range(i);
create table p1_1_1 partition of p1_1 for values from (0) to (300);
insert into p1 select generate_series(0,299);

-- After auto analyze (first time)
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
 relname | n_mod_since_analyze |       last_autoanalyze
---------+---------------------+-------------------------------
 p1       |                  300 |
 p1_1    |                 300 |
 p1_1_1  |                   0 | 2020-12-02 22:24:18.753574+09
(3 rows)

-- Insert more rows
postgres=# insert into p1 select generate_series(0,199);
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
 relname | n_mod_since_analyze |       last_autoanalyze
---------+---------------------+-------------------------------
 p1      |                   300 |
 p1_1    |                 300 |
 p1_1_1  |                 200 | 2020-12-02 22:24:18.753574+09
(3 rows)

-- After auto analyze (second time)
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
relname | n_mod_since_analyze |       last_autoanalyze
---------+---------------------+-------------------------------
 p1      |                       0 | 2020-12-02 22:25:18.857248+09
 p1_1    |                 200 | 2020-12-02 22:25:18.661932+09
 p1_1_1  |                   0 | 2020-12-02 22:25:18.792078+09

After 2nd auto analyze, all relations' n_mod_since_analyze should be 0,
but p1_1's is not.  This is because p1_1 was analyzed before p1_1_1.
So p1_1 will be analyzed again in the 3rd auto analyze.
That is propagating changes_since_analyze to *all ancestors* may cause
unnecessary analyzes even if we do this only when analyzing leaf partitions.
So I think we should track ancestors which are not analyzed in the current
iteration as Horiguchi-san proposed.

Regarding your idea:
> typedef struct PgStat_MsgAnalyze
> {
>    PgStat_MsgHdr  m_hdr;
>    Oid            m_databaseid;
>    Oid            m_tableoid;
>    bool           m_autovacuum;
>    bool           m_resetcounter;
>    TimestampTz    m_analyzetime;
>    PgStat_Counter m_live_tuples;
>    PgStat_Counter m_dead_tuples;
>    int            m_nancestors;
>    Oid            m_ancestors[PGSTAT_NUM_ANCESTORENTRIES];
> } PgStat_MsgAnalyze;

I'm not sure but how about storing only ancestors that aren't analyzed
in the current
iteration in m_ancestors[PGSTAT_NUM_ANCESTORENTRIES] ?


> > > > Regarding the counters in pg_stat_all_tables: maybe some of these should be
> > > > null rather than zero ?  Or else you should make an 0001 patch to fully
> > > > implement this view, with all relevant counters, not just n_mod_since_analyze,
> > > > last_*analyze, and *analyze_count.  These are specifically misleading:
> > > >
> > > > last_vacuum         |
> > > > last_autovacuum     |
> > > > n_ins_since_vacuum  | 0
> > > > vacuum_count        | 0
> > > > autovacuum_count    | 0
> > > >
> > > I haven't modified this part yet, but you meant that we should set
> > > null to counters
> > > about vacuum because partitioned tables are not vacuumed?
> >
> > Perhaps bacause partitioned tables *cannot* be vacuumed.  I'm not sure
> > what is the best way here.  Showing null seems reasonable but I'm not
> > sure that doesn't break anything.
>
> I agree that showing NULLs for the vacuum columns is better.  Perhaps
> the most reasonable way to do this is use -1 as an indicator that NULL
> ought to be returned from pg_stat_get_vacuum_count() et al, and add a
> boolean in PgStat_TableCounts next to t_truncated, maybe "t_nullvacuum"
> that says to store -1 instead of 0 in pgstat_recv_tabstat.
>
Thank you for the advice.  I'll fix it based on this idea.

-- 
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: convert elog(LOG) calls to ereport
Next
From: Sergei Kornilov
Date:
Subject: Re: pg_stat_statements oddity with track = all