Thread: Allowing REINDEX to have an optional name

Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

10 May 2022, 09:13:13

A minor issue, and patch.

REINDEX DATABASE currently requires you to write REINDEX DATABASE
dbname, which makes this a little less usable than we might like.

REINDEX on the catalog can cause deadlocks, which also makes REINDEX
DATABASE not much use in practice, and is the reason there is no test
for REINDEX DATABASE. Another reason why it is a little less usable
than we might like.

Seems we should do something about these historic issues in the name
of product usability.

Attached patch allows new syntax for REINDEX DATABASE, without needing
to specify dbname. That version of the command skips catalog tables,
as a way of avoiding the known deadlocks. Patch also adds a test.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

reindex_not_require_database_name.v2.patch

Re: Allowing REINDEX to have an optional name

From

Bernd Helmle

Date:

10 May 2022, 12:29:31

Hi,

Am Dienstag, dem 10.05.2022 um 10:13 +0100 schrieb Simon Riggs:
> A minor issue, and patch.
>
> REINDEX DATABASE currently requires you to write REINDEX DATABASE
> dbname, which makes this a little less usable than we might like.
>
> REINDEX on the catalog can cause deadlocks, which also makes REINDEX
> DATABASE not much use in practice, and is the reason there is no test
> for REINDEX DATABASE. Another reason why it is a little less usable
> than we might like.
>
> Seems we should do something about these historic issues in the name
> of product usability.
>

Wow, i just recently had a look into that code and talked with my
colleagues on how the current behavior annoyed me last week....and
there you are! This community rocks ;)

> Attached patch allows new syntax for REINDEX DATABASE, without
> needing
> to specify dbname. That version of the command skips catalog tables,
> as a way of avoiding the known deadlocks. Patch also adds a test.
>

+        /* Unqualified REINDEX DATABASE will skip catalog
tables */
+        if (objectKind == REINDEX_OBJECT_DATABASE &&
+            objectName == NULL &&
+            IsSystemClass(relid, classtuple))
+            continue;

Hmm, shouldn't we just print a NOTICE or something like this in
addition to this check to tell the user that we are *not* really
reindexing all things (and probably give him a hint to use REINDEX
SYSTEM to cover them)?

    Thanks,
        Bernd

Re: Allowing REINDEX to have an optional name

From

Ashutosh Bapat

Date:

10 May 2022, 13:47:35

On Tue, May 10, 2022 at 2:43 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> A minor issue, and patch.
>
> REINDEX DATABASE currently requires you to write REINDEX DATABASE
> dbname, which makes this a little less usable than we might like.
>
> REINDEX on the catalog can cause deadlocks, which also makes REINDEX
> DATABASE not much use in practice, and is the reason there is no test
> for REINDEX DATABASE. Another reason why it is a little less usable
> than we might like.
>
> Seems we should do something about these historic issues in the name
> of product usability.
>
> Attached patch allows new syntax for REINDEX DATABASE, without needing
> to specify dbname. That version of the command skips catalog tables,
> as a way of avoiding the known deadlocks. Patch also adds a test.
>

From the patch it looks like with the patch applied running REINDEX
DATABASE is equivalent to running REINDEX DATABASE <current database>
except reindexing the shared catalogs. Is that correct?

Though the patch adds following change
+      Indexes on shared system catalogs are also processed, unless the
+      database name is omitted, in which case system catalog indexes
are skipped.

the syntax looks unintuitive.

I think REINDEX DATABASE reindexing the current database is a good
usability improvement in itself. But skipping the shared catalogs
needs an explicity syntax. Not sure how feasible it is but something
like REINDEX DATABASE skip SHARED/SYSTEM.

-- 
Best Wishes,
Ashutosh Bapat

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

10 May 2022, 14:00:11

On Tue, 10 May 2022 at 14:47, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, May 10, 2022 at 2:43 PM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
> >
> > A minor issue, and patch.
> >
> > REINDEX DATABASE currently requires you to write REINDEX DATABASE
> > dbname, which makes this a little less usable than we might like.
> >
> > REINDEX on the catalog can cause deadlocks, which also makes REINDEX
> > DATABASE not much use in practice, and is the reason there is no test
> > for REINDEX DATABASE. Another reason why it is a little less usable
> > than we might like.
> >
> > Seems we should do something about these historic issues in the name
> > of product usability.
> >
> > Attached patch allows new syntax for REINDEX DATABASE, without needing
> > to specify dbname. That version of the command skips catalog tables,
> > as a way of avoiding the known deadlocks. Patch also adds a test.
> >
>
> From the patch it looks like with the patch applied running REINDEX
> DATABASE is equivalent to running REINDEX DATABASE <current database>
> except reindexing the shared catalogs. Is that correct?

Yes

> Though the patch adds following change
> +      Indexes on shared system catalogs are also processed, unless the
> +      database name is omitted, in which case system catalog indexes
> are skipped.
>
> the syntax looks unintuitive.
>
> I think REINDEX DATABASE reindexing the current database is a good
> usability improvement in itself. But skipping the shared catalogs
> needs an explicity syntax. Not sure how feasible it is but something
> like REINDEX DATABASE skip SHARED/SYSTEM.

There are two commands:

REINDEX DATABASE does every except system catalogs
REINDEX SYSTEM does system catalogs only

So taken together, the two commands seem intuitive to me.

It is designed like this because it is dangerous to REINDEX the system
catalogs because of potential deadlocks, so we want a way to avoid
that problem.

Perhaps I can improve the docs more, will look.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Ashutosh Bapat

Date:

11 May 2022, 04:24:17

On Tue, May 10, 2022 at 7:30 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:

On Tue, 10 May 2022 at 14:47, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, May 10, 2022 at 2:43 PM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
> >
> > A minor issue, and patch.
> >
> > REINDEX DATABASE currently requires you to write REINDEX DATABASE
> > dbname, which makes this a little less usable than we might like.
> >
> > REINDEX on the catalog can cause deadlocks, which also makes REINDEX
> > DATABASE not much use in practice, and is the reason there is no test
> > for REINDEX DATABASE. Another reason why it is a little less usable
> > than we might like.
> >
> > Seems we should do something about these historic issues in the name
> > of product usability.
> >
> > Attached patch allows new syntax for REINDEX DATABASE, without needing
> > to specify dbname. That version of the command skips catalog tables,
> > as a way of avoiding the known deadlocks. Patch also adds a test.
> >
>
> From the patch it looks like with the patch applied running REINDEX
> DATABASE is equivalent to running REINDEX DATABASE <current database>
> except reindexing the shared catalogs. Is that correct?

Yes

> Though the patch adds following change
> + Indexes on shared system catalogs are also processed, unless the
> + database name is omitted, in which case system catalog indexes
> are skipped.
>
> the syntax looks unintuitive.
>
> I think REINDEX DATABASE reindexing the current database is a good
> usability improvement in itself. But skipping the shared catalogs
> needs an explicity syntax. Not sure how feasible it is but something
> like REINDEX DATABASE skip SHARED/SYSTEM.

There are two commands:

REINDEX DATABASE does every except system catalogs
REINDEX SYSTEM does system catalogs only

IIUC

REINDEX DATABASE <database name> does system catalogs as well

REINDEX DATABASE does everything except system catalogs

That's confusing and unintuitive.

Not providing the database name leads to ignoring system catalogs. I won't expect that from this syntax.

So taken together, the two commands seem intuitive to me.

It is designed like this because it is dangerous to REINDEX the system
catalogs because of potential deadlocks, so we want a way to avoid
that problem.

It's more clear if we add SKIP SYSTEM CATALOGS or some such explicit syntax.

Best Wishes,

Ashutosh

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

11 May 2022, 05:42:26

On Wed, May 11, 2022 at 09:54:17AM +0530, Ashutosh Bapat wrote:
> REINDEX DATABASE <database name> does system catalogs as well
> REINDEX DATABASE does everything except system catalogs
>
> That's confusing and unintuitive.

Agreed.  Nobody is going to remember the difference.  REINDEX's
parsing grammar is designed to be extensible because we have the
parenthesized flavor.  Why don't you add an option there to skip the
catalogs, like a SKIP_CATALOG?

> Not providing the database name leads to ignoring system catalogs. I won't
> expect that from this syntax.

I don't disagree with having a shortened grammar where the database
name is not required because one cannot reindex a database different
than the one connected to, but changing a behavior based on such a
grammar difference is not a good user experience.
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Bernd Helmle

Date:

11 May 2022, 08:00:56

Am Mittwoch, dem 11.05.2022 um 14:42 +0900 schrieb Michael Paquier:
> Agreed.  Nobody is going to remember the difference.  REINDEX's
> parsing grammar is designed to be extensible because we have the
> parenthesized flavor.  Why don't you add an option there to skip the
> catalogs, like a SKIP_CATALOG?

+1

Having an option is probably the best idea. Though we have REINDEX
SYSTEM, so i throw SKIP_SYSTEM into the ring as an alternative. This
would be consistent with the meaning of both commands/options.

    Thanks,
        Bernd

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

11 May 2022, 14:33:52

On Wed, 11 May 2022 at 05:24, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

>> It is designed like this because it is dangerous to REINDEX the system
>> catalogs because of potential deadlocks, so we want a way to avoid
>> that problem.
>
> It's more clear if we add SKIP SYSTEM CATALOGS or some such explicit syntax.

Clarity is not the issue. I am opposed to a default mode that does
something bad and non-useful.

If you want to reindex the system catalogs then we already have REINDEX SYSTEM.
So REINDEX (SKIP_SYSTEM_CATALOGS OFF) DATABASE would do the same thing.
But you don't want to run either of them because of deadlocking.

The only action that makes sense is to reindex the database, skipping
the catalog tables.

So I'm proposing a command that has useful default behavior.
i.e. REINDEX DATABASE is the same as REINDEX (SKIP_SYSTEM_CATALOGS ON) DATABASE.

If you make REINDEX DATABASE the same as REINDEX (SKIP_SYSTEM_CATALOGS
OFF) DATABASE then it is just dangerous and annoying, i.e. a POLA
violation.

The point of this was a usability improvement, not just new syntax.

--
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Cary Huang

Date:

27 May 2022, 19:08:51

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hello

The patch applies and tests fine and I think this patch has good intentions to prevent the default behavior of REINDEX
DATABASEto cause a deadlock. However, I am not in favor of simply omitting the database name after DATABASE clause
becauseof consistency. Almost all other queries involving the DATABASE clause require database name to be given
followingafter. For example, ALTER DATABASE [dbname].  
 

Being able to omit database name for REINDEX DATABASE seems inconsistent to me.

The documentation states that REINDEX DATABASE only works on the current database, but it still requires the user to
providea database name and require that it must match the current database. Not very useful option, isn’t it? But it is
stillrequired from the user to stay consistent with other DATABASE clauses.
 

Maybe the best way is to keep the query clause as is (with the database name still required) and simply don’t let it
reindexsystem catalog to prevent deadlock. At the end, give user a notification that system catalogs have not been
reindexed,and tell them to use REINDEX SYSTEM instead.
 

Cary Huang
-----------------
HighGo Software Canada
www.highgo.ca

Re: Allowing REINDEX to have an optional name

From

Bernd Helmle

Date:

31 May 2022, 08:51:39

Am Freitag, dem 27.05.2022 um 19:08 +0000 schrieb Cary Huang:


[...]

> The patch applies and tests fine and I think this patch has good
> intentions to prevent the default behavior of REINDEX DATABASE to
> cause a deadlock. However, I am not in favor of simply omitting the
> database name after DATABASE clause because of consistency. Almost
> all other queries involving the DATABASE clause require database name
> to be given following after. For example, ALTER DATABASE [dbname]. 
>
> Being able to omit database name for REINDEX DATABASE seems
> inconsistent to me.
>
> The documentation states that REINDEX DATABASE only works on the
> current database, but it still requires the user to provide a
> database name and require that it must match the current database.
> Not very useful option, isn’t it? But it is still required from the
> user to stay consistent with other DATABASE clauses.
>

Hmm, right, but you can see this from another perspective, too: For
example, ALTER DATABASE works by adjusting properties of other
databases very well, SET TABLESPACE can be used when not connected to
the target database only, so you are required to specify its name in
that case.
REINDEX DATABASE cannot reindex other databases than the one we're
connected to. Seen from that point, i currently can't see the logical
justification to have the database name there, besides of "yes, i
really meant that database i am connected to" or consistency.

> Maybe the best way is to keep the query clause as is (with the
> database name still required) and simply don’t let it reindex system
> catalog to prevent deadlock. At the end, give user a notification
> that system catalogs have not been reindexed, and tell them to use
> REINDEX SYSTEM instead.

+1

Re: Allowing REINDEX to have an optional name

From

Bernd Helmle

Date:

31 May 2022, 09:04:58

Am Dienstag, dem 10.05.2022 um 15:00 +0100 schrieb Simon Riggs:

[...]

>
> > I think REINDEX DATABASE reindexing the current database is a good
> > usability improvement in itself. But skipping the shared catalogs
> > needs an explicity syntax. Not sure how feasible it is but
> > something
> > like REINDEX DATABASE skip SHARED/SYSTEM.
>
> There are two commands:
>
> REINDEX DATABASE does every except system catalogs
> REINDEX SYSTEM does system catalogs only
>
> So taken together, the two commands seem intuitive to me.
>
> It is designed like this because it is dangerous to REINDEX the
> system
> catalogs because of potential deadlocks, so we want a way to avoid
> that problem.
>
> Perhaps I can improve the docs more, will look.
>

And we already have a situation where this already happens with REINDEX
DATABASE: if you use CONCURRENTLY, it skips system catalogs already and
prints a warning. In both cases there are good technical reasons to
skip catalog indexes and to change the workflow to use separate
commands.

    Bernd

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

31 May 2022, 12:09:29

On Tue, May 31, 2022 at 11:04:58AM +0200, Bernd Helmle wrote:
> And we already have a situation where this already happens with REINDEX
> DATABASE: if you use CONCURRENTLY, it skips system catalogs already and
> prints a warning. In both cases there are good technical reasons to
> skip catalog indexes and to change the workflow to use separate
> commands.

The case with CONCURRENTLY is different though: the option will never
work on system catalogs so we have to skip them.  Echoing with others
on this thread, I don't think that we should introduce a different
behavior on what's basically the same grammar.  That's just going to
lead to more confusion.  So REINDEX DATABASE with or without a
database name appended to it should always mean to reindex the
catalogs on top of the existing relations.
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Alvaro Herrera

Date:

31 May 2022, 12:30:32

On 2022-May-31, Michael Paquier wrote:

> The case with CONCURRENTLY is different though: the option will never
> work on system catalogs so we have to skip them.  Echoing with others
> on this thread, I don't think that we should introduce a different
> behavior on what's basically the same grammar.  That's just going to
> lead to more confusion.  So REINDEX DATABASE with or without a
> database name appended to it should always mean to reindex the
> catalogs on top of the existing relations.

I was thinking the opposite: REINDEX DATABASE with or without a database
name should always process the user relations and skip system catalogs.
If the user wants to do both, then they can use REINDEX SYSTEM in
addition.

The reason for doing it like this is that there is no way to process
only user tables and skip catalogs.  So this is better for
composability.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."                               (Nathaniel Smith)

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

02 June 2022, 00:02:33

On Tue, May 31, 2022 at 02:30:32PM +0200, Alvaro Herrera wrote:
> I was thinking the opposite: REINDEX DATABASE with or without a database
> name should always process the user relations and skip system catalogs.
> If the user wants to do both, then they can use REINDEX SYSTEM in
> addition.
>
> The reason for doing it like this is that there is no way to process
> only user tables and skip catalogs.  So this is better for
> composability.

No objections from me to keep this distinction at the end, as long as
the the database name in the command has no impact on the chosen
behavior.  Could there be a point in having a REINDEX ALL though that
would process both the user relations and the catalogs, doing the same
thing as REINDEX DATABASE today?

By the way, the patch had better avoid putting a global REINDEX
command that would process everything.  As far as I recall, we've
avoided such things on purpose because they are expensive, keeping
around only cases that generate errors or skip all the relations.
So having that in a TAP test would be better, I assume, for
isolation.
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Justin Pryzby

Date:

28 June 2022, 00:18:07

Maybe the answer is to 1) add a parenthesized option REINDEX(SYSTEM) (to allow
the current behavior); and 2) make REINDEX DATABASE an alias which implies
"SYSTEM false"; 3) prohibit REINDEX (SYSTEM true) SYSTEM, or consider removing
"REINDEX SYSTEM;".

That avoids the opaque and surprising behavior that "REINDEX DATABASE" skips
system stuff but "REINDEX DATABASE foo" doesn't, while allowing the old
behavior (disabled by default).

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

28 June 2022, 07:29:07

On Thu, 2 Jun 2022 at 01:02, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, May 31, 2022 at 02:30:32PM +0200, Alvaro Herrera wrote:
> > I was thinking the opposite: REINDEX DATABASE with or without a database
> > name should always process the user relations and skip system catalogs.
> > If the user wants to do both, then they can use REINDEX SYSTEM in
> > addition.
> >
> > The reason for doing it like this is that there is no way to process
> > only user tables and skip catalogs.  So this is better for
> > composability.
>
> No objections from me to keep this distinction at the end, as long as
> the the database name in the command has no impact on the chosen
> behavior.

OK, that's clear. Will progress.

> Could there be a point in having a REINDEX ALL though that
> would process both the user relations and the catalogs, doing the same
> thing as REINDEX DATABASE today?

A key point is that REINDEX SYSTEM has problems, so should be avoided.
Hence, including both database and system together in a new command
would not be great idea, at this time.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

28 June 2022, 10:02:25

On Tue, 28 Jun 2022 at 08:29, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Thu, 2 Jun 2022 at 01:02, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Tue, May 31, 2022 at 02:30:32PM +0200, Alvaro Herrera wrote:
> > > I was thinking the opposite: REINDEX DATABASE with or without a database
> > > name should always process the user relations and skip system catalogs.
> > > If the user wants to do both, then they can use REINDEX SYSTEM in
> > > addition.
> > >
> > > The reason for doing it like this is that there is no way to process
> > > only user tables and skip catalogs.  So this is better for
> > > composability.
> >
> > No objections from me to keep this distinction at the end, as long as
> > the the database name in the command has no impact on the chosen
> > behavior.
>
> OK, that's clear. Will progress.

Attached patch is tested, documented and imho ready to be committed,
so I will mark it so in CFapp.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

reindex_not_require_database_name.v3.patch

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

29 June 2022, 04:35:44

On Tue, Jun 28, 2022 at 11:02:25AM +0100, Simon Riggs wrote:
> Attached patch is tested, documented and imho ready to be committed,
> so I will mark it so in CFapp.

The behavior introduced by this patch should be reflected in
reindexdb.  See in particular reindex_one_database(), where a
REINDEX_SYSTEM is enforced first on the catalogs for the
non-concurrent mode when running the reindex on a database.

+-- unqualified reindex database
+-- if you want to test REINDEX DATABASE, uncomment the following line,
+-- but note that this adds about 0.5s to the regression tests and the
+-- results are volatile when run in parallel to other tasks. Note also
+-- that REINDEX SYSTEM is specifically not tested because it can deadlock.
+-- REINDEX (VERBOSE) DATABASE;

No need to add that IMHO.

 REINDEX [ ( <replaceable class="parameter">option</replaceable> [,
 ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [
 CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
+REINDEX [ ( <replaceable class="parameter">option</replaceable> [,
 ...] ) ] { DATABASE | SYSTEM } [ CONCURRENTLY ] [ <replaceable
 class="parameter">name</replaceable> ]

Shouldn't you remove DATABASE and SYSTEM from the first line, keeping
only INDEX. TABLE and SCHEMA?  The second line, with its optional
"name" would cover the DATABASE and SYSTEM cases at 100%.

-        if (strcmp(objectName, get_database_name(objectOid)) != 0)
+        if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0)
             ereport(ERROR,
                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                      errmsg("can only reindex the currently open database")));
         if (!pg_database_ownercheck(objectOid, GetUserId()))
             aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE,
-                           objectName);
+                           get_database_name(objectOid));

This could call get_database_name() just once.

+            * You might think it would be good to include catalogs,
+            * but doing that can deadlock, so isn't much use in real world,
+            * nor can we safely test that it even works.

Not sure what you mean here exactly.
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

29 June 2022, 14:02:11

On Wed, 29 Jun 2022 at 05:35, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jun 28, 2022 at 11:02:25AM +0100, Simon Riggs wrote:
> > Attached patch is tested, documented and imho ready to be committed,
> > so I will mark it so in CFapp.

Thanks for the review Michael.

> The behavior introduced by this patch should be reflected in
> reindexdb.  See in particular reindex_one_database(), where a
> REINDEX_SYSTEM is enforced first on the catalogs for the
> non-concurrent mode when running the reindex on a database.

Originally, I was trying to avoid changing prior behavior, but now
that we have agreed to do so, this makes sense.

That section of code has been removed, tests updated. No changes to
docs seem to be required.

> +-- unqualified reindex database
> +-- if you want to test REINDEX DATABASE, uncomment the following line,
> +-- but note that this adds about 0.5s to the regression tests and the
> +-- results are volatile when run in parallel to other tasks. Note also
> +-- that REINDEX SYSTEM is specifically not tested because it can deadlock.
> +-- REINDEX (VERBOSE) DATABASE;
>
> No need to add that IMHO.

That was more a comment to reviewer, but I think something should be
said for later developers.

>  REINDEX [ ( <replaceable class="parameter">option</replaceable> [,
>  ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [
>  CONCURRENTLY ] <replaceable class="parameter">name</replaceable>
> +REINDEX [ ( <replaceable class="parameter">option</replaceable> [,
>  ...] ) ] { DATABASE | SYSTEM } [ CONCURRENTLY ] [ <replaceable
>  class="parameter">name</replaceable> ]
>
> Shouldn't you remove DATABASE and SYSTEM from the first line, keeping
> only INDEX. TABLE and SCHEMA?  The second line, with its optional
> "name" would cover the DATABASE and SYSTEM cases at 100%.

I agree that your proposal is clearer. Done.

> -        if (strcmp(objectName, get_database_name(objectOid)) != 0)
> +        if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0)
>              ereport(ERROR,
>                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                       errmsg("can only reindex the currently open database")));
>          if (!pg_database_ownercheck(objectOid, GetUserId()))
>              aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE,
> -                           objectName);
> +                           get_database_name(objectOid));
>
> This could call get_database_name() just once.

It could, but I couldn't see any benefit in changing that for the code
under discussion.

If calling get_database_name() multiple times is an issue, I've added
a cache for that - another patch attached, if you think its worth it.

> +            * You might think it would be good to include catalogs,
> +            * but doing that can deadlock, so isn't much use in real world,
> +            * nor can we safely test that it even works.
>
> Not sure what you mean here exactly.

REINDEX SYSTEM can deadlock, which is why we are avoiding it.

This was a comment to later developers as to why things are done that
way. Feel free to update the wording or location, but something should
be mentioned to avoid later discussion.

Thanks for the review, new version attached.


--
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: Allowing REINDEX to have an optional name

From

Tom Lane

Date:

03 July 2022, 21:41:31

Simon Riggs <simon.riggs@enterprisedb.com> writes:
> Thanks for the review, new version attached.

This is marked as Ready for Committer, but that seems unduly
optimistic.  The cfbot shows that it's failing on all platforms ---
and not in the same way on each, suggesting there are multiple
problems.

            regards, tom lane

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

04 July 2022, 06:59:55

On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote:
> This is marked as Ready for Committer, but that seems unduly
> optimistic.

Please note that patch authors should not switch a patch as RfC by
themselves.  This is something that a reviewer should do.

> The cfbot shows that it's failing on all platforms ---
> and not in the same way on each, suggesting there are multiple
> problems.

A wild guess is that this comes from the patch that manipulates
get_database_name(), something that there is no need for as long as
the routine is called once in ReindexMultipleTables().
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Alvaro Herrera

Date:

04 July 2022, 08:26:19

On 2022-Jun-29, Simon Riggs wrote:

> > -        if (strcmp(objectName, get_database_name(objectOid)) != 0)
> > +        if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0)
> >              ereport(ERROR,
> >                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> >                       errmsg("can only reindex the currently open database")));
> >          if (!pg_database_ownercheck(objectOid, GetUserId()))
> >              aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE,
> > -                           objectName);
> > +                           get_database_name(objectOid));
> >
> > This could call get_database_name() just once.
> 
> It could, but I couldn't see any benefit in changing that for the code
> under discussion.
> 
> If calling get_database_name() multiple times is an issue, I've added
> a cache for that - another patch attached, if you think its worth it.

TBH I doubt that this is an issue: since we're throwing an error anyway,
the memory would be released, and error cases are not considered worth
of performance optimization anyway.

Putting that thought aside, if we were to think that this is an issue, I
don't think the cache as implemented here is a good idea, because then
caller is responsible for tracking whether to free or not the return
value.

I think that Michaël's idea could be implemented more easily by having a
local variable that receives the return value from get_database_name.
But I think the coding as Simon had it was all right.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Las navajas y los monos deben estar siempre distantes"   (Germán Poo)

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

15 July 2022, 03:44:16

On Mon, Jul 04, 2022 at 03:59:55PM +0900, Michael Paquier wrote:
> Please note that patch authors should not switch a patch as RfC by
> themselves.  This is something that a reviewer should do.

This patch has been marked as waiting for a review, however the CF bot
is completely red:
http://commitfest.cputube.org/simon-riggs.html

Could you take care of those issues first?
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

15 July 2022, 07:47:58

On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Jul 04, 2022 at 03:59:55PM +0900, Michael Paquier wrote:
> > Please note that patch authors should not switch a patch as RfC by
> > themselves.  This is something that a reviewer should do.
>
> http://commitfest.cputube.org/simon-riggs.html

Thanks for showing me that, it is very helpful.

> This patch has been marked as waiting for a review, however the CF bot
> is completely red:

Yes, it is failing, but so is current HEAD, with some kind of libpq
pipelining error.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Alvaro Herrera

Date:

15 July 2022, 08:12:12

On 2022-Jul-15, Simon Riggs wrote:

> On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote:

> > This patch has been marked as waiting for a review, however the CF bot
> > is completely red:
> 
> Yes, it is failing, but so is current HEAD, with some kind of libpq
> pipelining error.

Hmm, is it?  Where can I see that?  The buildfarm looks OK ...

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

15 July 2022, 11:15:32

On Fri, 15 Jul 2022 at 09:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Jul-15, Simon Riggs wrote:
>
> > On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote:
>
> > > This patch has been marked as waiting for a review, however the CF bot
> > > is completely red:
> >
> > Yes, it is failing, but so is current HEAD, with some kind of libpq
> > pipelining error.
>
> Hmm, is it?  Where can I see that?  The buildfarm looks OK ...

I got this... so cleaning up and retesting.

# Looks like you failed 2 tests of 20.
t/001_libpq_pipeline.pl .. Dubious, test returned 2 (wstat 512, 0x200)

Failed 2/20 subtests

Test Summary Report
-------------------
t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2)
  Failed tests:  9-10
  Non-zero exit status: 2
Files=1, Tests=20,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.66 cusr
 0.64 csys =  1.32 CPU)
Result: FAIL

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

15 July 2022, 11:19:37

On Fri, 15 Jul 2022 at 12:15, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Fri, 15 Jul 2022 at 09:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2022-Jul-15, Simon Riggs wrote:
> >
> > > On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > > > This patch has been marked as waiting for a review, however the CF bot
> > > > is completely red:
> > >
> > > Yes, it is failing, but so is current HEAD, with some kind of libpq
> > > pipelining error.
> >
> > Hmm, is it?  Where can I see that?  The buildfarm looks OK ...
>
> I got this... so cleaning up and retesting.
>
> # Looks like you failed 2 tests of 20.
> t/001_libpq_pipeline.pl .. Dubious, test returned 2 (wstat 512, 0x200)
>
> Failed 2/20 subtests
>
> Test Summary Report
> -------------------
> t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2)
>   Failed tests:  9-10
>   Non-zero exit status: 2
> Files=1, Tests=20,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.66 cusr
>  0.64 csys =  1.32 CPU)
> Result: FAIL

Yeh, repeated failures on MacOS Catalina with HEAD.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

15 July 2022, 11:25:33

On Mon, 4 Jul 2022 at 08:00, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote:
> > This is marked as Ready for Committer, but that seems unduly
> > optimistic.
>
> Please note that patch authors should not switch a patch as RfC by
> themselves.  This is something that a reviewer should do.
>
> > The cfbot shows that it's failing on all platforms ---
> > and not in the same way on each, suggesting there are multiple
> > problems.
>
> A wild guess is that this comes from the patch that manipulates
> get_database_name(), something that there is no need for as long as
> the routine is called once in ReindexMultipleTables().

OK, let me repost the new patch and see if CFbot likes that better.

This includes Michael's other requested changes for reindexdb.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

reindex_not_require_database_name.v5.patch

Re: Allowing REINDEX to have an optional name

From

Alvaro Herrera

Date:

15 July 2022, 11:29:33

On 2022-Jul-15, Simon Riggs wrote:

> > Test Summary Report
> > -------------------
> > t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2)
> >   Failed tests:  9-10
> >   Non-zero exit status: 2
> > Files=1, Tests=20,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.66 cusr
> >  0.64 csys =  1.32 CPU)
> > Result: FAIL
> 
> Yeh, repeated failures on MacOS Catalina with HEAD.

Can you share the contents of src/test/modules/libpq_pipeline/tmp_check?
Since these failures are not visible in the buildfarm (where we do have
11.0 as well as some 10.X versions of macOS), this is surprising.

Anyway, the errors shown for your patch in the cfbot are in the
pg_upgrade test, so I suggest to have a look at the logs for those
anyway.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Y una voz del caos me habló y me dijo
"Sonríe y sé feliz, podría ser peor".
Y sonreí. Y fui feliz.
Y fue peor.

Re: Allowing REINDEX to have an optional name

From

Alvaro Herrera

Date:

15 July 2022, 11:58:56

On 2022-Jul-15, Alvaro Herrera wrote:

> > Yeh, repeated failures on MacOS Catalina with HEAD.
> 
> Can you share the contents of src/test/modules/libpq_pipeline/tmp_check?
> Since these failures are not visible in the buildfarm (where we do have
> 11.0 as well as some 10.X versions of macOS), this is surprising.

So I got the log files, and the error is clear:

: # Running: libpq_pipeline -r 700 -t
/Users/sriggs/pg/pg-git/postgresql/src/test/modules/libpq_pipeline/tmp_check/traces/pipeline_idle.tracepipeline_idle
port=57444host=/var/folders/rd/kxv86w7567z9jk5qt7lhxfwr0000gn/T/Od6QFSH7TB dbname='postgres'

: 
: pipeline idle...
: NOTICE 1: message type 0x33 arrived from server while idle
: 
: libpq_pipeline:1037: got 1 notice(s)
: [12:17:00.181](0.016s) not ok 9 - libpq_pipeline pipeline_idle

then the trace test also fails, but only because it is truncated at the
point where the notice is reported; up to that point, the trace matches
correctly.

Not sure what to make of this.  Maybe the fix in 054325c5eeb3 is not
right, but I don't understand why it doesn't fail in the macOS machines
in the buildfarm.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio)

Re: Allowing REINDEX to have an optional name

From

Alvaro Herrera

Date:

15 July 2022, 13:03:34

On 2022-Jul-15, Alvaro Herrera wrote:

> Not sure what to make of this.  Maybe the fix in 054325c5eeb3 is not
> right, but I don't understand why it doesn't fail in the macOS machines
> in the buildfarm.

Ah, one theory is that the libpq_pipeline program is getting linked to
an installed version of libpq that doesn't contain the fixes.  Maybe you
can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it
is picking up?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

Re: Allowing REINDEX to have an optional name

From

Tom Lane

Date:

15 July 2022, 14:03:24

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Ah, one theory is that the libpq_pipeline program is getting linked to
> an installed version of libpq that doesn't contain the fixes.  Maybe you
> can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it
> is picking up?

That's pronounced "otool -L" on macOS.  But in any case, it's going
to point at the installation directory.  One of the moving parts here
is that "make check" will try to override the rpath that otool tells
you about to make test programs use the libpq.dylib from the build tree.
I say "try" because if you've got SIP enabled (see what "csrutil status"
tells you), it will fail to do so and the installed libpq will be used.
Maybe that's old.

Standard recommendation on macOS with SIP on is to always do "make
install" before "make check".

            regards, tom lane

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

15 July 2022, 17:20:16

On Fri, 15 Jul 2022 at 15:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Ah, one theory is that the libpq_pipeline program is getting linked to
> > an installed version of libpq that doesn't contain the fixes.  Maybe you
> > can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it
> > is picking up?
>
> That's pronounced "otool -L" on macOS.  But in any case, it's going
> to point at the installation directory.  One of the moving parts here
> is that "make check" will try to override the rpath that otool tells
> you about to make test programs use the libpq.dylib from the build tree.
> I say "try" because if you've got SIP enabled (see what "csrutil status"
> tells you), it will fail to do so and the installed libpq will be used.
> Maybe that's old.
>
> Standard recommendation on macOS with SIP on is to always do "make
> install" before "make check".

Thanks, will investigate.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

15 July 2022, 17:21:22

On Fri, 15 Jul 2022 at 12:25, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Mon, 4 Jul 2022 at 08:00, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote:
> > > This is marked as Ready for Committer, but that seems unduly
> > > optimistic.
> >
> > Please note that patch authors should not switch a patch as RfC by
> > themselves.  This is something that a reviewer should do.
> >
> > > The cfbot shows that it's failing on all platforms ---
> > > and not in the same way on each, suggesting there are multiple
> > > problems.
> >
> > A wild guess is that this comes from the patch that manipulates
> > get_database_name(), something that there is no need for as long as
> > the routine is called once in ReindexMultipleTables().
>
> OK, let me repost the new patch and see if CFbot likes that better.
>
> This includes Michael's other requested changes for reindexdb.

That's fixed it on the CFbot. Over to you, Michael. Thanks.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

17 July 2022, 06:19:47

On Fri, Jul 15, 2022 at 06:21:22PM +0100, Simon Riggs wrote:
> That's fixed it on the CFbot. Over to you, Michael. Thanks.

Sure.  I have looked over that, and this looks fine overall.  I have
made two changes though.

        if (objectKind == REINDEX_OBJECT_SYSTEM &&
-           !IsSystemClass(relid, classtuple))
+           !IsCatalogRelationOid(relid))
+           continue;
+       else if (objectKind == REINDEX_OBJECT_DATABASE &&
+                IsCatalogRelationOid(relid))

The patch originally relied on IsSystemClass() to decide if a relation
is a catalog table or not.  This is not wrong in itself because
ReindexMultipleTables() discards RELKIND_TOAST a couple of lines
above, but I think that we should switch to IsCatalogRelationOid() as
that's the line drawn to check for the catalog-ness of a relation.

The second thing is test coverage.  Using a REINDEX DATABASE/SYSTEM
within the main regression test suite is not a good idea, but we
already have those commands running in the reindexdb suite so I could
not resist expanding the test section to track and check relfilenode
changes through four relations for these cases:
- Catalog index.
- Catalog toast index.
- User table index.
- User toast index.
The relfilenodes of those relations are saved in a table and
cross-checked with the contents of pg_class after each REINDEX, on
SYSTEM or DATABASE.  There are no new heavy commands, so it does not
make the test longer.

With all that, I finish with the attached.  Does that look fine to
you?
--
Michael

Attachment

Re: Allowing REINDEX to have an optional name

From

Simon Riggs

Date:

17 July 2022, 09:58:26

On Sun, 17 Jul 2022 at 07:19, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jul 15, 2022 at 06:21:22PM +0100, Simon Riggs wrote:
> > That's fixed it on the CFbot. Over to you, Michael. Thanks.
>
> Sure.  I have looked over that, and this looks fine overall.  I have
> made two changes though.
>
>         if (objectKind == REINDEX_OBJECT_SYSTEM &&
> -           !IsSystemClass(relid, classtuple))
> +           !IsCatalogRelationOid(relid))
> +           continue;
> +       else if (objectKind == REINDEX_OBJECT_DATABASE &&
> +                IsCatalogRelationOid(relid))
>
> The patch originally relied on IsSystemClass() to decide if a relation
> is a catalog table or not.  This is not wrong in itself because
> ReindexMultipleTables() discards RELKIND_TOAST a couple of lines
> above, but I think that we should switch to IsCatalogRelationOid() as
> that's the line drawn to check for the catalog-ness of a relation.
>
> The second thing is test coverage.  Using a REINDEX DATABASE/SYSTEM
> within the main regression test suite is not a good idea, but we
> already have those commands running in the reindexdb suite so I could
> not resist expanding the test section to track and check relfilenode
> changes through four relations for these cases:
> - Catalog index.
> - Catalog toast index.
> - User table index.
> - User toast index.
> The relfilenodes of those relations are saved in a table and
> cross-checked with the contents of pg_class after each REINDEX, on
> SYSTEM or DATABASE.  There are no new heavy commands, so it does not
> make the test longer.
>
> With all that, I finish with the attached.  Does that look fine to
> you?

Sounds great, looks fine. Thanks for your review.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Re: Allowing REINDEX to have an optional name

From

Justin Pryzby

Date:

19 July 2022, 02:26:53

Sorry, I meant to send this earlier..

On Sun, Jul 17, 2022 at 03:19:47PM +0900, Michael Paquier wrote:
> The second thing is test coverage.  Using a REINDEX DATABASE/SYSTEM

> +my $catalog_toast_index = $node->safe_psql('postgres',
> +    "SELECT indexrelid::regclass FROM pg_index WHERE indrelid = '$toast_table'::regclass;"
> +);
> +
> +# Set of SQL queries to cross-check the state of relfilenodes across
> +# REINDEX operations.  A set of relfilenodes is saved from the catalogs
> +# and then compared with pg_class.
> +$node->safe_psql('postgres',
> +    'CREATE TABLE toast_relfilenodes (relname regclass, relfilenode oid);');

It looks like you named the table "toast_relfilenodes", but then also store
to it data for non-toast tables.

It's also a bit weird to call the column "relname" but use it to store the
::regclass.  You later need to cast the column to text, so you may as well
store it as text, either relname or oid::regclass.

It seems like cluster.sql does this more succinctly.

-- Check that clustering sets new relfilenodes:
CREATE TEMP TABLE old_cluster_info AS SELECT relname, level, relfilenode, relkind FROM
pg_partition_tree('clstrpart'::regclass)AS tree JOIN pg_class c ON c.oid=tree.relid ;
 
CLUSTER clstrpart USING clstrpart_idx;
CREATE TEMP TABLE new_cluster_info AS SELECT relname, level, relfilenode, relkind FROM
pg_partition_tree('clstrpart'::regclass)AS tree JOIN pg_class c ON c.oid=tree.relid ;
 
SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM old_cluster_info AS old JOIN
new_cluster_infoAS new USING (relname) ORDER BY relname COLLATE "C";
 

> +# Save the relfilenode of a set of toast indexes, one from the catalog
> +# pg_constraint and one from the test table.  This data is used for checks
> +# after some of the REINDEX operations done below, checking if they are
> +# changed.
> +my $fetch_toast_relfilenodes = qq{SELECT c.oid::regclass, c.relfilenode
> +  FROM pg_class a
> +    JOIN pg_class b ON (a.oid = b.reltoastrelid)
> +    JOIN pg_index i on (a.oid = i.indrelid)
> +    JOIN pg_class c on (i.indexrelid = c.oid)
> +  WHERE b.oid IN ('pg_constraint'::regclass, 'test1'::regclass)};
> +# Same for relfilenodes of normal indexes.  This saves the relfilenode
> +# from a catalog of pg_constraint, and the one from the test table.
> +my $fetch_index_relfilenodes = qq{SELECT oid, relfilenode
> +  FROM pg_class
> +  WHERE relname IN ('pg_constraint_oid_index', 'test1x')};
> +my $save_relfilenodes =
> +    "INSERT INTO toast_relfilenodes $fetch_toast_relfilenodes;"
> +  . "INSERT INTO toast_relfilenodes $fetch_index_relfilenodes;";
> +
> +# Query to compare a set of relfilenodes saved with the contents of pg_class.
> +# Note that this does not join using OIDs, as CONCURRENTLY would change them
> +# when reindexing.  A filter is applied on the toast index names, even if this
> +# does not make a difference between the catalog and normal ones, the ordering
> +# based on the name is enough to ensure a fixed output.
> +my $compare_relfilenodes =
> +  qq(SELECT regexp_replace(b.relname::text, '(pg_toast.pg_toast_)\\d{4,5}(_index)', '\\1<oid>\\2'),

Why {4,5} ?

> +  CASE WHEN a.relfilenode = b.relfilenode THEN 'relfilenode is unchanged'
> +    ELSE 'relfilenode has changed' END
> +  FROM toast_relfilenodes b
> +    JOIN pg_class a ON b.relname::text = a.oid::regclass::text
> +  ORDER BY b.relname::text);
> +
> +# Save the set of relfilenodes and compare them.
> +$node->safe_psql('postgres', $save_relfilenodes);
> +$node->issues_sql_like(
> +    [ 'reindexdb', 'postgres' ],
> +    qr/statement: REINDEX DATABASE postgres;/,
> +    'SQL REINDEX run');
> +my $relnode_info = $node->safe_psql('postgres', $compare_relfilenodes);
> +is( $relnode_info, qq(pg_constraint_oid_index|relfilenode is unchanged
> +pg_toast.pg_toast_<oid>_index|relfilenode has changed
> +pg_toast.pg_toast_<oid>_index|relfilenode is unchanged
> +test1x|relfilenode has changed), 'relfilenode change after REINDEX DATABASE');
> +
> +# Re-save and run the second one.
> +$node->safe_psql('postgres',
> +    "TRUNCATE toast_relfilenodes; $save_relfilenodes");
> +$node->issues_sql_like(
> +    [ 'reindexdb', '-s', 'postgres' ],
> +    qr/statement: REINDEX SYSTEM postgres;/,
> +    'reindex system tables');
> +$relnode_info = $node->safe_psql('postgres', $compare_relfilenodes);
> +is( $relnode_info, qq(pg_constraint_oid_index|relfilenode has changed
> +pg_toast.pg_toast_<oid>_index|relfilenode is unchanged
> +pg_toast.pg_toast_<oid>_index|relfilenode has changed
> +test1x|relfilenode is unchanged), 'relfilenode change after REINDEX SYSTEM');

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

19 July 2022, 03:43:56

On Sun, Jul 17, 2022 at 10:58:26AM +0100, Simon Riggs wrote:
> Sounds great, looks fine. Thanks for your review.

Ok, cool.  At the end, I have decided to split the tests and the main
patch into two different commits, as each is useful on its own.  Doing
so also helps in seeing the difference of behavior when issuing a
REINDEX DATABASE.  Another thing that was itching me with the test is
that it was not possible to make the difference between the toast
index of the catalog and of the user table, so I have added the parent
table name as an extra thing stored in the table storing the
relfilenodes.
--
Michael

Attachment

signature.asc

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

19 July 2022, 04:13:34

On Mon, Jul 18, 2022 at 09:26:53PM -0500, Justin Pryzby wrote:
> Sorry, I meant to send this earlier..

No problem.

> It looks like you named the table "toast_relfilenodes", but then also store
> to it data for non-toast tables.

How about naming that index_relfilenodes?  One difference with what I
posted previously and 5fb5b6 is the addition of an extra regclass that
stores the parent table, for reference in the output.

> It's also a bit weird to call the column "relname" but use it to store the
> ::regclass.  You later need to cast the column to text, so you may as well
> store it as text, either relname or oid::regclass.

I have used "indname" at the end.

> It seems like cluster.sql does this more succinctly.

Except that this does not include the relfilenodes from the toast
indexes, which is something I wanted to add a check for when it comes
to both user tables and catalogs.

> Why {4,5} ?

Looks like a brain fade from here, while looking the relation names
this generated.  This could just match with an integer.
--
Michael

Attachment

Re: Allowing REINDEX to have an optional name

From

Justin Pryzby

Date:

20 July 2022, 12:27:07

On Tue, Jul 19, 2022 at 01:13:34PM +0900, Michael Paquier wrote:
> > It looks like you named the table "toast_relfilenodes", but then also store
> > to it data for non-toast tables.
> 
> How about naming that index_relfilenodes?  One difference with what I
> posted previously and 5fb5b6 is the addition of an extra regclass that
> stores the parent table, for reference in the output.

Looks fine

> -    'CREATE TABLE toast_relfilenodes (parent regclass, indname regclass, relfilenode oid);'
> +    'CREATE TABLE index_relfilenodes (parent regclass, indname regclass, relfilenode oid);'

-- 
Justi

Re: Allowing REINDEX to have an optional name

From

Michael Paquier

Date:

21 July 2022, 02:02:14

On Wed, Jul 20, 2022 at 07:27:07AM -0500, Justin Pryzby wrote:
> Looks fine

Thanks for checking, applied.
--
Michael

Attachment

signature.asc