Thread: Correct docs re: rewriting indexes when table rewrite is skipped

Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

29 March 2022, 14:03:45

Back in 367bc42 (for 9.2!) we "avoid[ed] index rebuild[ing] for
no-rewrite ALTER TABLE
.. ALTER TYPE." However the docs still claim that "a table rewrite is
not needed; but any indexes on the affected columns must still be
rebuilt."

I've attached a simple patch to update the docs to match the current behavior.

Thanks,
James Coleman

Attachment

v1-0001-Docs-When-table-rewriting-is-skipped-indexes-are-.patch

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Matthias van de Meent

Date:

29 March 2022, 15:29:32

On Tue, 29 Mar 2022 at 16:04, James Coleman <jtc331@gmail.com> wrote:
>
> Back in 367bc42 (for 9.2!) we "avoid[ed] index rebuild[ing] for
> no-rewrite ALTER TABLE
> .. ALTER TYPE." However the docs still claim that "a table rewrite is
> not needed; but any indexes on the affected columns must still be
> rebuilt."

Although indexes might indeed not need a rebuild, in many cases they
still do (e.g. when the type changes between text and a domain of text
with a different collation).

I think that the current state of the docs is better in that regard;
as it explicitly warns for index rebuilds, even when the letter of the
docs is incorrect: there are indeed cases we don't need to rebuild the
indexes; but that would require more elaboration.

Kind regards,

Matthias van de Meent

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

30 March 2022, 14:04:21

On Tue, Mar 29, 2022 at 11:29 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> On Tue, 29 Mar 2022 at 16:04, James Coleman <jtc331@gmail.com> wrote:
> >
> > Back in 367bc42 (for 9.2!) we "avoid[ed] index rebuild[ing] for
> > no-rewrite ALTER TABLE
> > .. ALTER TYPE." However the docs still claim that "a table rewrite is
> > not needed; but any indexes on the affected columns must still be
> > rebuilt."
>
> Although indexes might indeed not need a rebuild, in many cases they
> still do (e.g. when the type changes between text and a domain of text
> with a different collation).
>
> I think that the current state of the docs is better in that regard;
> as it explicitly warns for index rebuilds, even when the letter of the
> docs is incorrect: there are indeed cases we don't need to rebuild the
> indexes; but that would require more elaboration.

Admittedly I hadn't thought of that case. But isn't it already covered
in the existing docs by the phrase "or an unconstrained domain over
the new type"? I don't love the word "or" there because there's a
sense in which the first clause "binary coercible to the new type" is
still accurate for your example unless you narrowly separate "domain"
and "type", but I think that narrow distinction is what's technically
there already.

That being said, I could instead of removing the clause entirely
replace it with something like "indexes may still need to be rebuilt
when the new type is a constrained domain".

Thoughts?

James Coleman

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Robert Haas

Date:

30 March 2022, 15:41:03

On Wed, Mar 30, 2022 at 10:04 AM James Coleman <jtc331@gmail.com> wrote:
> Admittedly I hadn't thought of that case. But isn't it already covered
> in the existing docs by the phrase "or an unconstrained domain over
> the new type"? I don't love the word "or" there because there's a
> sense in which the first clause "binary coercible to the new type" is
> still accurate for your example unless you narrowly separate "domain"
> and "type", but I think that narrow distinction is what's technically
> there already.
>
> That being said, I could instead of removing the clause entirely
> replace it with something like "indexes may still need to be rebuilt
> when the new type is a constrained domain".

You're talking about this as if the normal cases is that indexes don't
need to rebuilt, but sometimes they do. However, if I recall
correctly, the way the code is structured, it supposes that the
indexes do need a rebuild, and then tries to prove that they actually
don't. That disconnect makes me nervous, because it seems to me that
it could easily lead to a situation where our documentation contains
subtle errors. I think somebody should go through and study the
algorithm as it exists in the code, and then write documentation to
match it. And I think that when they do that, they should approach it
from the same point of view that the code does i.e. "these are the
conditions for skipping the index rebuild" rather than "these are the
conditions for performing an index rebuild." By doing it that way, I
think we minimize the likelihood of disconnects between code and
documentation, and also make it easier to update in future if the
algorithm gets changed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

30 March 2022, 20:33:18

On Wed, Mar 30, 2022 at 11:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:04 AM James Coleman <jtc331@gmail.com> wrote:
> > Admittedly I hadn't thought of that case. But isn't it already covered
> > in the existing docs by the phrase "or an unconstrained domain over
> > the new type"? I don't love the word "or" there because there's a
> > sense in which the first clause "binary coercible to the new type" is
> > still accurate for your example unless you narrowly separate "domain"
> > and "type", but I think that narrow distinction is what's technically
> > there already.
> >
> > That being said, I could instead of removing the clause entirely
> > replace it with something like "indexes may still need to be rebuilt
> > when the new type is a constrained domain".
>
> You're talking about this as if the normal cases is that indexes don't
> need to rebuilt, but sometimes they do. However, if I recall
> correctly, the way the code is structured, it supposes that the
> indexes do need a rebuild, and then tries to prove that they actually
> don't. That disconnect makes me nervous, because it seems to me that
> it could easily lead to a situation where our documentation contains
> subtle errors. I think somebody should go through and study the
> algorithm as it exists in the code, and then write documentation to
> match it. And I think that when they do that, they should approach it
> from the same point of view that the code does i.e. "these are the
> conditions for skipping the index rebuild" rather than "these are the
> conditions for performing an index rebuild." By doing it that way, I
> think we minimize the likelihood of disconnects between code and
> documentation, and also make it easier to update in future if the
> algorithm gets changed.

Hmm, having it match the way it works makes sense. Would you feel
comfortable with an intermediate step (queueing up that as a larger
change) changing the clause to something like "indexes will still have
to be rebuilt unless the system can guarantee that the sort order is
proven to be unchanged" (with appropriate wordsmithing to be a bit
less verbose if possible)?

Thanks,
James Coleman

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Robert Haas

Date:

30 March 2022, 21:41:36

On Wed, Mar 30, 2022 at 4:33 PM James Coleman <jtc331@gmail.com> wrote:
> Hmm, having it match the way it works makes sense. Would you feel
> comfortable with an intermediate step (queueing up that as a larger
> change) changing the clause to something like "indexes will still have
> to be rebuilt unless the system can guarantee that the sort order is
> proven to be unchanged" (with appropriate wordsmithing to be a bit
> less verbose if possible)?

Yeah, that seems fine. It's arguable how much detail we should go into
here - but a statement of the form you propose is not misleading, and
that's what seems most important to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

31 March 2022, 13:17:46

On Wed, Mar 30, 2022 at 5:41 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 30, 2022 at 4:33 PM James Coleman <jtc331@gmail.com> wrote:
> > Hmm, having it match the way it works makes sense. Would you feel
> > comfortable with an intermediate step (queueing up that as a larger
> > change) changing the clause to something like "indexes will still have
> > to be rebuilt unless the system can guarantee that the sort order is
> > proven to be unchanged" (with appropriate wordsmithing to be a bit
> > less verbose if possible)?
>
> Yeah, that seems fine. It's arguable how much detail we should go into
> here - but a statement of the form you propose is not misleading, and
> that's what seems most important to me.

All right, thanks for feedback. Attached is v2 with such a change.
I've not included examples, and I'm about 50/50 on doing so. What are
your thoughts on adding in parens "e.g., changing from varchar to text
avoids rebuilding indexes while changing from text to a domain of text
with a different collation will require rebuilding indexes"?

Thanks,
James Coleman

Attachment

v2-0001-Docs-Index-rebuilding-is-sometimes-skipped-along-.patch

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Robert Haas

Date:

31 March 2022, 13:43:41

On Thu, Mar 31, 2022 at 9:17 AM James Coleman <jtc331@gmail.com> wrote:
> All right, thanks for feedback. Attached is v2 with such a change.
> I've not included examples, and I'm about 50/50 on doing so. What are
> your thoughts on adding in parens "e.g., changing from varchar to text
> avoids rebuilding indexes while changing from text to a domain of text
> with a different collation will require rebuilding indexes"?

On the patch, I suggest that instead of saying "can verify that sort
order and/or hashing semantics are unchanged" you say something like
"can verify that the new index would be logically equivalent to the
current one", mostly because I do not think that "and/or" looks very
good in formal writing.

I think it would be fine to include examples, but I think that the
phrasing you suggest here doesn't seem great. I'm not sure how to fix
it exactly. Maybe it needs a little more explanation?

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

31 March 2022, 14:13:57

On Thu, Mar 31, 2022 at 9:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 31, 2022 at 9:17 AM James Coleman <jtc331@gmail.com> wrote:
> > All right, thanks for feedback. Attached is v2 with such a change.
> > I've not included examples, and I'm about 50/50 on doing so. What are
> > your thoughts on adding in parens "e.g., changing from varchar to text
> > avoids rebuilding indexes while changing from text to a domain of text
> > with a different collation will require rebuilding indexes"?
>
> On the patch, I suggest that instead of saying "can verify that sort
> order and/or hashing semantics are unchanged" you say something like
> "can verify that the new index would be logically equivalent to the
> current one", mostly because I do not think that "and/or" looks very
> good in formal writing.

Agreed re: "and/or".

> I think it would be fine to include examples, but I think that the
> phrasing you suggest here doesn't seem great. I'm not sure how to fix
> it exactly. Maybe it needs a little more explanation?

Is the attached more along the lines of what you were thinking?

Thanks,
James Coleman

Attachment

v3-0001-Docs-Index-rebuilding-is-sometimes-skipped-along-.patch

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Robert Haas

Date:

31 March 2022, 14:28:45

On Thu, Mar 31, 2022 at 10:14 AM James Coleman <jtc331@gmail.com> wrote:
> Is the attached more along the lines of what you were thinking?

Yeah. Maybe this would be a little clearer: "For example, if the
collation for a column has been changed, an index rebuild is always
required, because the new sort order might be different. However, in
the absence of a collation change, a column can be changed from text
to varchar or vice versa without rebuilding the indexes, because these
data types sort identically."

We don't seem to be very consistent about whether we write type names
like VARCHAR in upper case or lower case in the documentation. I'd
vote for using lower-case, but it probably doesn't matter much.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

31 March 2022, 14:51:28

On Thu, Mar 31, 2022 at 10:29 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 31, 2022 at 10:14 AM James Coleman <jtc331@gmail.com> wrote:
> > Is the attached more along the lines of what you were thinking?
>
> Yeah. Maybe this would be a little clearer: "For example, if the
> collation for a column has been changed, an index rebuild is always
> required, because the new sort order might be different. However, in
> the absence of a collation change, a column can be changed from text
> to varchar or vice versa without rebuilding the indexes, because these
> data types sort identically."

Updated.

> We don't seem to be very consistent about whether we write type names
> like VARCHAR in upper case or lower case in the documentation. I'd
> vote for using lower-case, but it probably doesn't matter much.

Grepping suggests lower case is more common (2 to 1) so I switched to
that. Interestingly for "text" it's 700+ to 1 :)

Thanks,
James Coleman

Attachment

v4-0001-Docs-Index-rebuilding-is-sometimes-skipped-along-.patch

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Robert Haas

Date:

31 March 2022, 19:24:57

On Thu, Mar 31, 2022 at 10:51 AM James Coleman <jtc331@gmail.com> wrote:
> Updated.

This version looks fine to me. If nobody objects I will commit it and
credit myself as a co-author.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

31 March 2022, 20:19:08

On Thu, Mar 31, 2022 at 3:25 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 31, 2022 at 10:51 AM James Coleman <jtc331@gmail.com> wrote:
> > Updated.
>
> This version looks fine to me. If nobody objects I will commit it and
> credit myself as a co-author.

Sounds great; thanks again for the review.

James Coleman

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

Robert Haas

Date:

01 April 2022, 12:58:27

On Thu, Mar 31, 2022 at 4:19 PM James Coleman <jtc331@gmail.com> wrote:
> On Thu, Mar 31, 2022 at 3:25 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Mar 31, 2022 at 10:51 AM James Coleman <jtc331@gmail.com> wrote:
> > > Updated.
> >
> > This version looks fine to me. If nobody objects I will commit it and
> > credit myself as a co-author.
>
> Sounds great; thanks again for the review.

Done.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Correct docs re: rewriting indexes when table rewrite is skipped

From

James Coleman

Date:

01 April 2022, 13:01:49

On Fri, Apr 1, 2022 at 8:58 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Mar 31, 2022 at 4:19 PM James Coleman <jtc331@gmail.com> wrote:
> > On Thu, Mar 31, 2022 at 3:25 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > > On Thu, Mar 31, 2022 at 10:51 AM James Coleman <jtc331@gmail.com> wrote:
> > > > Updated.
> > >
> > > This version looks fine to me. If nobody objects I will commit it and
> > > credit myself as a co-author.
> >
> > Sounds great; thanks again for the review.
>
> Done.

Thanks. I marked the CF entry as committed.

James Coleman