Thread: Partitioned index can be not dumped
Hi. I've seen the following effect on PostgreSQL 14 stable branch. Index, created on partitioned table, disappears from pg_dump or psql \d output. This seems to begin after analyze. Partitoned relation relhasindex pg_class field suddenly becomes false. The issue happens after commit 0e69f705cc1a3df273b38c9883fb5765991e04fe (HEAD, refs/bisect/bad) Author: Alvaro Herrera <alvherre@alvh.no-ip.org> Date: Fri Apr 9 11:29:08 2021 -0400 Set pg_class.reltuples for partitioned tables When commit 0827e8af70f4 added auto-analyze support for partitioned tables, it included code to obtain reltuples for the partitioned table as a number of catalog accesses to read pg_class.reltuples for each partition. That's not only very inefficient, but also problematic because autovacuum doesn't hold any locks on any of those tables -- and doesn't want to. Replace that code with a read of pg_class.reltuples for the partitioned table, and make sure ANALYZE and TRUNCATE properly maintain that value. I found no code that would be affected by the change of relpages from zero to non-zero for partitioned tables, and no other code that should be maintaining it, but if there is, hopefully it'll be an easy fix. Per buildfarm. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> It seems that in this commit we unconditionally overwrite this data with 0. I've tried to fix it by getting this information when inh is true and ignoring nindexes when inh is not true. -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
Alexander Pyhalov писал 2021-06-30 17:26: > Hi. > > Sorry, test had an issue. -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
On 2021-Jun-30, Alexander Pyhalov wrote: > Hi. > > I've seen the following effect on PostgreSQL 14 stable branch. > Index, created on partitioned table, disappears from pg_dump or psql \d > output. > This seems to begin after analyze. Partitoned relation relhasindex pg_class > field suddenly becomes false. Uh, ouch. I'll look into this shortly. -- Álvaro Herrera Valdivia, Chile https://www.EnterpriseDB.com/
On 2021-Jun-30, Alexander Pyhalov wrote: > I've seen the following effect on PostgreSQL 14 stable branch. > Index, created on partitioned table, disappears from pg_dump or psql \d > output. > This seems to begin after analyze. Partitoned relation relhasindex pg_class > field suddenly becomes false. Yeah, that seems correct. I didn't verify your test case, but after looking at the code I thought there was a bit too much churn and the new conditions looked quite messy and unexplained. It seems simpler to be explicit at the start about what we're doing, and keep nindexes=0 for partitioned tables; with that, the code works unchanged because the "for" loops do nothing without having to check for anything. My proposal is attached. I did run the tests and they do pass, but I didn't look very closely at what the tests are actually doing. I noticed that part of that comment seems to be a leftover from ... I don't know when: "We do not analyze index columns if there was an explicit column list in the ANALYZE command, however." I suppose this is about some code that was removed, but I didn't dig into it. -- Álvaro Herrera Valdivia, Chile "How strange it is to find the words "Perl" and "saner" in such close proximity, with no apparent sense of irony. I doubt that Larry himself could have managed it." (ncm, http://lwn.net/Articles/174769/)
Attachment
On Wed, Jun 30, 2021 at 11:54 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jun-30, Alexander Pyhalov wrote:
> I've seen the following effect on PostgreSQL 14 stable branch.
> Index, created on partitioned table, disappears from pg_dump or psql \d
> output.
> This seems to begin after analyze. Partitoned relation relhasindex pg_class
> field suddenly becomes false.
Yeah, that seems correct.
I didn't verify your test case, but after looking at the code I thought
there was a bit too much churn and the new conditions looked quite messy
and unexplained. It seems simpler to be explicit at the start about
what we're doing, and keep nindexes=0 for partitioned tables; with that,
the code works unchanged because the "for" loops do nothing without
having to check for anything. My proposal is attached.
I did run the tests and they do pass, but I didn't look very closely at
what the tests are actually doing.
I noticed that part of that comment seems to be a leftover from ... I
don't know when: "We do not analyze index columns if there was an
explicit column list in the ANALYZE command, however." I suppose this
is about some code that was removed, but I didn't dig into it.
--
Álvaro Herrera Valdivia, Chile
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it." (ncm, http://lwn.net/Articles/174769/)
Hi,
nit:
- if (hasindex)+ if (nindexes > 0)
It seems hasindex is no longer needed since nindexes is checked.
Cheers
Álvaro Herrera писал 2021-06-30 21:54: > On 2021-Jun-30, Alexander Pyhalov wrote: > >> I've seen the following effect on PostgreSQL 14 stable branch. >> Index, created on partitioned table, disappears from pg_dump or psql >> \d >> output. >> This seems to begin after analyze. Partitoned relation relhasindex >> pg_class >> field suddenly becomes false. > > Yeah, that seems correct. > > I didn't verify your test case, but after looking at the code I thought > there was a bit too much churn and the new conditions looked quite > messy > and unexplained. It seems simpler to be explicit at the start about > what we're doing, and keep nindexes=0 for partitioned tables; with > that, > the code works unchanged because the "for" loops do nothing without > having to check for anything. My proposal is attached. > > I did run the tests and they do pass, but I didn't look very closely at > what the tests are actually doing. > > I noticed that part of that comment seems to be a leftover from ... I > don't know when: "We do not analyze index columns if there was an > explicit column list in the ANALYZE command, however." I suppose this > is about some code that was removed, but I didn't dig into it. Looks good. It seems this comment refers to line 455. 445 if (nindexes > 0) 446 { 447 indexdata = (AnlIndexData *) palloc0(nindexes * sizeof(AnlIndexData)); 448 for (ind = 0; ind < nindexes; ind++) 449 { 450 AnlIndexData *thisdata = &indexdata[ind]; 451 IndexInfo *indexInfo; 452 453 thisdata->indexInfo = indexInfo = BuildIndexInfo(Irel[ind]); 454 thisdata->tupleFract = 1.0; /* fix later if partial */ 455 if (indexInfo->ii_Expressions != NIL && va_cols == NIL) 456 { 457 ListCell *indexpr_item = list_head(indexInfo->ii_Expressions); 458 459 thisdata->vacattrstats = (VacAttrStats **) 460 palloc(indexInfo->ii_NumIndexAttrs * sizeof(VacAttrStats *)); Also I've added non-necessary new line in test. Restored comment and removed new line. -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
On 2021-Jun-30, Zhihong Yu wrote: > Hi, > nit: > - if (hasindex) > + if (nindexes > 0) > > It seems hasindex is no longer needed since nindexes is checked. It's still used to call vac_update_relstats(). We want nindexes to be 0 for partitioned tables, but still pass true when there are indexes. Please don't forget to trim the text of the email you're replying to. -- Álvaro Herrera Valdivia, Chile https://www.EnterpriseDB.com/
On Wed, Jun 30, 2021 at 2:32 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jun-30, Zhihong Yu wrote:
> Hi,
> nit:
> - if (hasindex)
> + if (nindexes > 0)
>
> It seems hasindex is no longer needed since nindexes is checked.
It's still used to call vac_update_relstats(). We want nindexes to be 0
for partitioned tables, but still pass true when there are indexes.
Hi,
In that case, I wonder whether nindexes can be negated following the call to vac_open_indexes().
vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
+ nindexes = -nindexes;
+ nindexes = -nindexes;
That way, hasindex can be dropped.
vac_update_relstats() call would become:
vac_update_relstats(onerel, -1, totalrows,
- 0, false, InvalidTransactionId,
+ 0, nindexes != 0, InvalidTransactionId,
- 0, false, InvalidTransactionId,
+ 0, nindexes != 0, InvalidTransactionId,
My thinking is that without hasindex, the code is easier to maintain.
Thanks
Please don't forget to trim the text of the email you're replying to.
--
Álvaro Herrera Valdivia, Chile
https://www.EnterpriseDB.com/
On 2021-Jun-30, Zhihong Yu wrote: > Hi, > In that case, I wonder whether nindexes can be negated following the call > to vac_open_indexes(). > > vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel); > + nindexes = -nindexes; > > That way, hasindex can be dropped. > vac_update_relstats() call would become: > > vac_update_relstats(onerel, -1, totalrows, > - 0, false, InvalidTransactionId, > + 0, nindexes != 0, InvalidTransactionId, Perhaps this works, but I don't think it's a readability improvement. > My thinking is that without hasindex, the code is easier to maintain. You have one less variable but one additional concept (negative nindexes). It doesn't seem easier to me, TBH, rather the opposite. -- Álvaro Herrera 39°49'30"S 73°17'W https://www.EnterpriseDB.com/
On 2021-Jun-30, Alexander Pyhalov wrote: > Looks good. It seems this comment refers to line 455. Ah, so it does. I realized that we don't need to do vac_open_indexes for partitioned tables: it is sufficient to know whether there are any indexes at all. So I replaced that with RelationGetIndexList() and checking if the list is nonempty, specifically for the partitioned table case. Pushed now. Thanks for reporting and fixing this, and to Zhihong Yu for reviewing. -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/ "Investigación es lo que hago cuando no sé lo que estoy haciendo" (Wernher von Braun)