Thread: Partitioned index can be not dumped

Partitioned index can be not dumped

From
Alexander Pyhalov
Date:
Hi.

I've seen the following effect on PostgreSQL 14 stable branch.
Index, created on partitioned table, disappears from pg_dump or psql \d 
output.
This seems to begin after analyze. Partitoned relation relhasindex 
pg_class field suddenly becomes false.

The issue happens after

commit 0e69f705cc1a3df273b38c9883fb5765991e04fe (HEAD, refs/bisect/bad)
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date:   Fri Apr 9 11:29:08 2021 -0400

     Set pg_class.reltuples for partitioned tables

     When commit 0827e8af70f4 added auto-analyze support for partitioned
     tables, it included code to obtain reltuples for the partitioned 
table
     as a number of catalog accesses to read pg_class.reltuples for each
     partition.  That's not only very inefficient, but also problematic
     because autovacuum doesn't hold any locks on any of those tables -- 
and
     doesn't want to.  Replace that code with a read of 
pg_class.reltuples
     for the partitioned table, and make sure ANALYZE and TRUNCATE 
properly
     maintain that value.

     I found no code that would be affected by the change of relpages 
from
     zero to non-zero for partitioned tables, and no other code that 
should
     be maintaining it, but if there is, hopefully it'll be an easy fix.

     Per buildfarm.

     Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
     Reviewed-by: Zhihong Yu <zyu@yugabyte.com>

It seems that in this commit we unconditionally overwrite this data with 
0.
I've tried to fix it by getting this information when inh is true and 
ignoring nindexes when inh is not true.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachment

Re: Partitioned index can be not dumped

From
Alexander Pyhalov
Date:
Alexander Pyhalov писал 2021-06-30 17:26:
> Hi.
> 
> 

Sorry, test had an issue.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachment

Re: Partitioned index can be not dumped

From
Álvaro Herrera
Date:
On 2021-Jun-30, Alexander Pyhalov wrote:

> Hi.
> 
> I've seen the following effect on PostgreSQL 14 stable branch.
> Index, created on partitioned table, disappears from pg_dump or psql \d
> output.
> This seems to begin after analyze. Partitoned relation relhasindex pg_class
> field suddenly becomes false.

Uh, ouch.

I'll look into this shortly.



-- 
Álvaro Herrera                        Valdivia, Chile
                        https://www.EnterpriseDB.com/



Re: Partitioned index can be not dumped

From
Álvaro Herrera
Date:
On 2021-Jun-30, Alexander Pyhalov wrote:

> I've seen the following effect on PostgreSQL 14 stable branch.
> Index, created on partitioned table, disappears from pg_dump or psql \d
> output.
> This seems to begin after analyze. Partitoned relation relhasindex pg_class
> field suddenly becomes false.

Yeah, that seems correct.

I didn't verify your test case, but after looking at the code I thought
there was a bit too much churn and the new conditions looked quite messy
and unexplained.  It seems simpler to be explicit at the start about
what we're doing, and keep nindexes=0 for partitioned tables; with that,
the code works unchanged because the "for" loops do nothing without
having to check for anything.  My proposal is attached.

I did run the tests and they do pass, but I didn't look very closely at
what the tests are actually doing.

I noticed that part of that comment seems to be a leftover from ... I
don't know when: "We do not analyze index columns if there was an
explicit column list in the ANALYZE command, however."  I suppose this
is about some code that was removed, but I didn't dig into it.

-- 
Álvaro Herrera                        Valdivia, Chile
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it."         (ncm, http://lwn.net/Articles/174769/)

Attachment

Re: Partitioned index can be not dumped

From
Zhihong Yu
Date:


On Wed, Jun 30, 2021 at 11:54 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jun-30, Alexander Pyhalov wrote:

> I've seen the following effect on PostgreSQL 14 stable branch.
> Index, created on partitioned table, disappears from pg_dump or psql \d
> output.
> This seems to begin after analyze. Partitoned relation relhasindex pg_class
> field suddenly becomes false.

Yeah, that seems correct.

I didn't verify your test case, but after looking at the code I thought
there was a bit too much churn and the new conditions looked quite messy
and unexplained.  It seems simpler to be explicit at the start about
what we're doing, and keep nindexes=0 for partitioned tables; with that,
the code works unchanged because the "for" loops do nothing without
having to check for anything.  My proposal is attached.

I did run the tests and they do pass, but I didn't look very closely at
what the tests are actually doing.

I noticed that part of that comment seems to be a leftover from ... I
don't know when: "We do not analyze index columns if there was an
explicit column list in the ANALYZE command, however."  I suppose this
is about some code that was removed, but I didn't dig into it.

--
Álvaro Herrera                        Valdivia, Chile
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it."         (ncm, http://lwn.net/Articles/174769/)
Hi,
nit:
-       if (hasindex)
+       if (nindexes > 0) 

It seems hasindex is no longer needed since nindexes is checked.

Cheers

Re: Partitioned index can be not dumped

From
Alexander Pyhalov
Date:
Álvaro Herrera писал 2021-06-30 21:54:
> On 2021-Jun-30, Alexander Pyhalov wrote:
> 
>> I've seen the following effect on PostgreSQL 14 stable branch.
>> Index, created on partitioned table, disappears from pg_dump or psql 
>> \d
>> output.
>> This seems to begin after analyze. Partitoned relation relhasindex 
>> pg_class
>> field suddenly becomes false.
> 
> Yeah, that seems correct.
> 
> I didn't verify your test case, but after looking at the code I thought
> there was a bit too much churn and the new conditions looked quite 
> messy
> and unexplained.  It seems simpler to be explicit at the start about
> what we're doing, and keep nindexes=0 for partitioned tables; with 
> that,
> the code works unchanged because the "for" loops do nothing without
> having to check for anything.  My proposal is attached.
> 
> I did run the tests and they do pass, but I didn't look very closely at
> what the tests are actually doing.
> 
> I noticed that part of that comment seems to be a leftover from ... I
> don't know when: "We do not analyze index columns if there was an
> explicit column list in the ANALYZE command, however."  I suppose this
> is about some code that was removed, but I didn't dig into it.

Looks good. It seems this comment refers to line 455.

  445     if (nindexes > 0)
  446     {
  447         indexdata = (AnlIndexData *) palloc0(nindexes * 
sizeof(AnlIndexData));
  448         for (ind = 0; ind < nindexes; ind++)
  449         {
  450             AnlIndexData *thisdata = &indexdata[ind];
  451             IndexInfo  *indexInfo;
  452
  453             thisdata->indexInfo = indexInfo = 
BuildIndexInfo(Irel[ind]);
  454             thisdata->tupleFract = 1.0; /* fix later if partial */
  455             if (indexInfo->ii_Expressions != NIL && va_cols == NIL)
  456             {
  457                 ListCell   *indexpr_item = 
list_head(indexInfo->ii_Expressions);
  458
  459                 thisdata->vacattrstats = (VacAttrStats **)
  460                     palloc(indexInfo->ii_NumIndexAttrs * 
sizeof(VacAttrStats *));

Also I've added non-necessary new line in test.
Restored comment and removed new line.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachment

Re: Partitioned index can be not dumped

From
Álvaro Herrera
Date:
On 2021-Jun-30, Zhihong Yu wrote:

> Hi,
> nit:
> -       if (hasindex)
> +       if (nindexes > 0)
> 
> It seems hasindex is no longer needed since nindexes is checked.

It's still used to call vac_update_relstats().  We want nindexes to be 0
for partitioned tables, but still pass true when there are indexes.

Please don't forget to trim the text of the email you're replying to.

-- 
Álvaro Herrera                        Valdivia, Chile
                        https://www.EnterpriseDB.com/



Re: Partitioned index can be not dumped

From
Zhihong Yu
Date:


On Wed, Jun 30, 2021 at 2:32 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jun-30, Zhihong Yu wrote:

> Hi,
> nit:
> -       if (hasindex)
> +       if (nindexes > 0)
>
> It seems hasindex is no longer needed since nindexes is checked.

It's still used to call vac_update_relstats().  We want nindexes to be 0
for partitioned tables, but still pass true when there are indexes.
Hi,
In that case, I wonder whether nindexes can be negated following the call to vac_open_indexes().

        vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
+       nindexes = -nindexes;

That way, hasindex can be dropped.
vac_update_relstats() call would become:

        vac_update_relstats(onerel, -1, totalrows,
-                           0, false, InvalidTransactionId,
+                           0, nindexes != 0, InvalidTransactionId,

My thinking is that without hasindex, the code is easier to maintain.

Thanks


Please don't forget to trim the text of the email you're replying to.

--
Álvaro Herrera                        Valdivia, Chile
                        https://www.EnterpriseDB.com/

Re: Partitioned index can be not dumped

From
Álvaro Herrera
Date:
On 2021-Jun-30, Zhihong Yu wrote:

> Hi,
> In that case, I wonder whether nindexes can be negated following the call
> to vac_open_indexes().
> 
>         vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
> +       nindexes = -nindexes;
> 
> That way, hasindex can be dropped.
> vac_update_relstats() call would become:
> 
>         vac_update_relstats(onerel, -1, totalrows,
> -                           0, false, InvalidTransactionId,
> +                           0, nindexes != 0, InvalidTransactionId,

Perhaps this works, but I don't think it's a readability improvement.

> My thinking is that without hasindex, the code is easier to maintain.

You have one less variable but one additional concept (negative
nindexes).  It doesn't seem easier to me, TBH, rather the opposite.

-- 
Álvaro Herrera                            39°49'30"S 73°17'W
                               https://www.EnterpriseDB.com/



Re: Partitioned index can be not dumped

From
Álvaro Herrera
Date:
On 2021-Jun-30, Alexander Pyhalov wrote:

> Looks good. It seems this comment refers to line 455.

Ah, so it does.

I realized that we don't need to do vac_open_indexes for partitioned
tables: it is sufficient to know whether there are any indexes at all.
So I replaced that with RelationGetIndexList() and checking if the list
is nonempty, specifically for the partitioned table case.  Pushed now.

Thanks for reporting and fixing this, and to Zhihong Yu for reviewing.

-- 
Álvaro Herrera              Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)