Thread: Allowing REINDEX to have an optional name
A minor issue, and patch. REINDEX DATABASE currently requires you to write REINDEX DATABASE dbname, which makes this a little less usable than we might like. REINDEX on the catalog can cause deadlocks, which also makes REINDEX DATABASE not much use in practice, and is the reason there is no test for REINDEX DATABASE. Another reason why it is a little less usable than we might like. Seems we should do something about these historic issues in the name of product usability. Attached patch allows new syntax for REINDEX DATABASE, without needing to specify dbname. That version of the command skips catalog tables, as a way of avoiding the known deadlocks. Patch also adds a test. -- Simon Riggs http://www.EnterpriseDB.com/
Attachment
Hi, Am Dienstag, dem 10.05.2022 um 10:13 +0100 schrieb Simon Riggs: > A minor issue, and patch. > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > dbname, which makes this a little less usable than we might like. > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > DATABASE not much use in practice, and is the reason there is no test > for REINDEX DATABASE. Another reason why it is a little less usable > than we might like. > > Seems we should do something about these historic issues in the name > of product usability. > Wow, i just recently had a look into that code and talked with my colleagues on how the current behavior annoyed me last week....and there you are! This community rocks ;) > Attached patch allows new syntax for REINDEX DATABASE, without > needing > to specify dbname. That version of the command skips catalog tables, > as a way of avoiding the known deadlocks. Patch also adds a test. > + /* Unqualified REINDEX DATABASE will skip catalog tables */ + if (objectKind == REINDEX_OBJECT_DATABASE && + objectName == NULL && + IsSystemClass(relid, classtuple)) + continue; Hmm, shouldn't we just print a NOTICE or something like this in addition to this check to tell the user that we are *not* really reindexing all things (and probably give him a hint to use REINDEX SYSTEM to cover them)? Thanks, Bernd
On Tue, May 10, 2022 at 2:43 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote: > > A minor issue, and patch. > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > dbname, which makes this a little less usable than we might like. > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > DATABASE not much use in practice, and is the reason there is no test > for REINDEX DATABASE. Another reason why it is a little less usable > than we might like. > > Seems we should do something about these historic issues in the name > of product usability. > > Attached patch allows new syntax for REINDEX DATABASE, without needing > to specify dbname. That version of the command skips catalog tables, > as a way of avoiding the known deadlocks. Patch also adds a test. > From the patch it looks like with the patch applied running REINDEX DATABASE is equivalent to running REINDEX DATABASE <current database> except reindexing the shared catalogs. Is that correct? Though the patch adds following change + Indexes on shared system catalogs are also processed, unless the + database name is omitted, in which case system catalog indexes are skipped. the syntax looks unintuitive. I think REINDEX DATABASE reindexing the current database is a good usability improvement in itself. But skipping the shared catalogs needs an explicity syntax. Not sure how feasible it is but something like REINDEX DATABASE skip SHARED/SYSTEM. -- Best Wishes, Ashutosh Bapat
On Tue, 10 May 2022 at 14:47, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Tue, May 10, 2022 at 2:43 PM Simon Riggs > <simon.riggs@enterprisedb.com> wrote: > > > > A minor issue, and patch. > > > > REINDEX DATABASE currently requires you to write REINDEX DATABASE > > dbname, which makes this a little less usable than we might like. > > > > REINDEX on the catalog can cause deadlocks, which also makes REINDEX > > DATABASE not much use in practice, and is the reason there is no test > > for REINDEX DATABASE. Another reason why it is a little less usable > > than we might like. > > > > Seems we should do something about these historic issues in the name > > of product usability. > > > > Attached patch allows new syntax for REINDEX DATABASE, without needing > > to specify dbname. That version of the command skips catalog tables, > > as a way of avoiding the known deadlocks. Patch also adds a test. > > > > From the patch it looks like with the patch applied running REINDEX > DATABASE is equivalent to running REINDEX DATABASE <current database> > except reindexing the shared catalogs. Is that correct? Yes > Though the patch adds following change > + Indexes on shared system catalogs are also processed, unless the > + database name is omitted, in which case system catalog indexes > are skipped. > > the syntax looks unintuitive. > > I think REINDEX DATABASE reindexing the current database is a good > usability improvement in itself. But skipping the shared catalogs > needs an explicity syntax. Not sure how feasible it is but something > like REINDEX DATABASE skip SHARED/SYSTEM. There are two commands: REINDEX DATABASE does every except system catalogs REINDEX SYSTEM does system catalogs only So taken together, the two commands seem intuitive to me. It is designed like this because it is dangerous to REINDEX the system catalogs because of potential deadlocks, so we want a way to avoid that problem. Perhaps I can improve the docs more, will look. -- Simon Riggs http://www.EnterpriseDB.com/
On Tue, May 10, 2022 at 7:30 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Tue, 10 May 2022 at 14:47, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, May 10, 2022 at 2:43 PM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
> >
> > A minor issue, and patch.
> >
> > REINDEX DATABASE currently requires you to write REINDEX DATABASE
> > dbname, which makes this a little less usable than we might like.
> >
> > REINDEX on the catalog can cause deadlocks, which also makes REINDEX
> > DATABASE not much use in practice, and is the reason there is no test
> > for REINDEX DATABASE. Another reason why it is a little less usable
> > than we might like.
> >
> > Seems we should do something about these historic issues in the name
> > of product usability.
> >
> > Attached patch allows new syntax for REINDEX DATABASE, without needing
> > to specify dbname. That version of the command skips catalog tables,
> > as a way of avoiding the known deadlocks. Patch also adds a test.
> >
>
> From the patch it looks like with the patch applied running REINDEX
> DATABASE is equivalent to running REINDEX DATABASE <current database>
> except reindexing the shared catalogs. Is that correct?
Yes
> Though the patch adds following change
> + Indexes on shared system catalogs are also processed, unless the
> + database name is omitted, in which case system catalog indexes
> are skipped.
>
> the syntax looks unintuitive.
>
> I think REINDEX DATABASE reindexing the current database is a good
> usability improvement in itself. But skipping the shared catalogs
> needs an explicity syntax. Not sure how feasible it is but something
> like REINDEX DATABASE skip SHARED/SYSTEM.
There are two commands:
REINDEX DATABASE does every except system catalogs
REINDEX SYSTEM does system catalogs only
IIUC
REINDEX DATABASE <database name> does system catalogs as well
REINDEX DATABASE does everything except system catalogs
That's confusing and unintuitive.
Not providing the database name leads to ignoring system catalogs. I won't expect that from this syntax.
So taken together, the two commands seem intuitive to me.
It is designed like this because it is dangerous to REINDEX the system
catalogs because of potential deadlocks, so we want a way to avoid
that problem.
It's more clear if we add SKIP SYSTEM CATALOGS or some such explicit syntax.
Best Wishes,
Ashutosh
On Wed, May 11, 2022 at 09:54:17AM +0530, Ashutosh Bapat wrote: > REINDEX DATABASE <database name> does system catalogs as well > REINDEX DATABASE does everything except system catalogs > > That's confusing and unintuitive. Agreed. Nobody is going to remember the difference. REINDEX's parsing grammar is designed to be extensible because we have the parenthesized flavor. Why don't you add an option there to skip the catalogs, like a SKIP_CATALOG? > Not providing the database name leads to ignoring system catalogs. I won't > expect that from this syntax. I don't disagree with having a shortened grammar where the database name is not required because one cannot reindex a database different than the one connected to, but changing a behavior based on such a grammar difference is not a good user experience. -- Michael
Attachment
Am Mittwoch, dem 11.05.2022 um 14:42 +0900 schrieb Michael Paquier: > Agreed. Nobody is going to remember the difference. REINDEX's > parsing grammar is designed to be extensible because we have the > parenthesized flavor. Why don't you add an option there to skip the > catalogs, like a SKIP_CATALOG? +1 Having an option is probably the best idea. Though we have REINDEX SYSTEM, so i throw SKIP_SYSTEM into the ring as an alternative. This would be consistent with the meaning of both commands/options. Thanks, Bernd
On Wed, 11 May 2022 at 05:24, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: >> It is designed like this because it is dangerous to REINDEX the system >> catalogs because of potential deadlocks, so we want a way to avoid >> that problem. > > It's more clear if we add SKIP SYSTEM CATALOGS or some such explicit syntax. Clarity is not the issue. I am opposed to a default mode that does something bad and non-useful. If you want to reindex the system catalogs then we already have REINDEX SYSTEM. So REINDEX (SKIP_SYSTEM_CATALOGS OFF) DATABASE would do the same thing. But you don't want to run either of them because of deadlocking. The only action that makes sense is to reindex the database, skipping the catalog tables. So I'm proposing a command that has useful default behavior. i.e. REINDEX DATABASE is the same as REINDEX (SKIP_SYSTEM_CATALOGS ON) DATABASE. If you make REINDEX DATABASE the same as REINDEX (SKIP_SYSTEM_CATALOGS OFF) DATABASE then it is just dangerous and annoying, i.e. a POLA violation. The point of this was a usability improvement, not just new syntax. -- Simon Riggs http://www.EnterpriseDB.com/
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed Hello The patch applies and tests fine and I think this patch has good intentions to prevent the default behavior of REINDEX DATABASEto cause a deadlock. However, I am not in favor of simply omitting the database name after DATABASE clause becauseof consistency. Almost all other queries involving the DATABASE clause require database name to be given followingafter. For example, ALTER DATABASE [dbname]. Being able to omit database name for REINDEX DATABASE seems inconsistent to me. The documentation states that REINDEX DATABASE only works on the current database, but it still requires the user to providea database name and require that it must match the current database. Not very useful option, isn’t it? But it is stillrequired from the user to stay consistent with other DATABASE clauses. Maybe the best way is to keep the query clause as is (with the database name still required) and simply don’t let it reindexsystem catalog to prevent deadlock. At the end, give user a notification that system catalogs have not been reindexed,and tell them to use REINDEX SYSTEM instead. Cary Huang ----------------- HighGo Software Canada www.highgo.ca
Am Freitag, dem 27.05.2022 um 19:08 +0000 schrieb Cary Huang: [...] > The patch applies and tests fine and I think this patch has good > intentions to prevent the default behavior of REINDEX DATABASE to > cause a deadlock. However, I am not in favor of simply omitting the > database name after DATABASE clause because of consistency. Almost > all other queries involving the DATABASE clause require database name > to be given following after. For example, ALTER DATABASE [dbname]. > > Being able to omit database name for REINDEX DATABASE seems > inconsistent to me. > > The documentation states that REINDEX DATABASE only works on the > current database, but it still requires the user to provide a > database name and require that it must match the current database. > Not very useful option, isn’t it? But it is still required from the > user to stay consistent with other DATABASE clauses. > Hmm, right, but you can see this from another perspective, too: For example, ALTER DATABASE works by adjusting properties of other databases very well, SET TABLESPACE can be used when not connected to the target database only, so you are required to specify its name in that case. REINDEX DATABASE cannot reindex other databases than the one we're connected to. Seen from that point, i currently can't see the logical justification to have the database name there, besides of "yes, i really meant that database i am connected to" or consistency. > Maybe the best way is to keep the query clause as is (with the > database name still required) and simply don’t let it reindex system > catalog to prevent deadlock. At the end, give user a notification > that system catalogs have not been reindexed, and tell them to use > REINDEX SYSTEM instead. +1
Am Dienstag, dem 10.05.2022 um 15:00 +0100 schrieb Simon Riggs: [...] > > > I think REINDEX DATABASE reindexing the current database is a good > > usability improvement in itself. But skipping the shared catalogs > > needs an explicity syntax. Not sure how feasible it is but > > something > > like REINDEX DATABASE skip SHARED/SYSTEM. > > There are two commands: > > REINDEX DATABASE does every except system catalogs > REINDEX SYSTEM does system catalogs only > > So taken together, the two commands seem intuitive to me. > > It is designed like this because it is dangerous to REINDEX the > system > catalogs because of potential deadlocks, so we want a way to avoid > that problem. > > Perhaps I can improve the docs more, will look. > And we already have a situation where this already happens with REINDEX DATABASE: if you use CONCURRENTLY, it skips system catalogs already and prints a warning. In both cases there are good technical reasons to skip catalog indexes and to change the workflow to use separate commands. Bernd
On Tue, May 31, 2022 at 11:04:58AM +0200, Bernd Helmle wrote: > And we already have a situation where this already happens with REINDEX > DATABASE: if you use CONCURRENTLY, it skips system catalogs already and > prints a warning. In both cases there are good technical reasons to > skip catalog indexes and to change the workflow to use separate > commands. The case with CONCURRENTLY is different though: the option will never work on system catalogs so we have to skip them. Echoing with others on this thread, I don't think that we should introduce a different behavior on what's basically the same grammar. That's just going to lead to more confusion. So REINDEX DATABASE with or without a database name appended to it should always mean to reindex the catalogs on top of the existing relations. -- Michael
Attachment
On 2022-May-31, Michael Paquier wrote: > The case with CONCURRENTLY is different though: the option will never > work on system catalogs so we have to skip them. Echoing with others > on this thread, I don't think that we should introduce a different > behavior on what's basically the same grammar. That's just going to > lead to more confusion. So REINDEX DATABASE with or without a > database name appended to it should always mean to reindex the > catalogs on top of the existing relations. I was thinking the opposite: REINDEX DATABASE with or without a database name should always process the user relations and skip system catalogs. If the user wants to do both, then they can use REINDEX SYSTEM in addition. The reason for doing it like this is that there is no way to process only user tables and skip catalogs. So this is better for composability. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Most hackers will be perfectly comfortable conceptualizing users as entropy sources, so let's move on." (Nathaniel Smith)
On Tue, May 31, 2022 at 02:30:32PM +0200, Alvaro Herrera wrote: > I was thinking the opposite: REINDEX DATABASE with or without a database > name should always process the user relations and skip system catalogs. > If the user wants to do both, then they can use REINDEX SYSTEM in > addition. > > The reason for doing it like this is that there is no way to process > only user tables and skip catalogs. So this is better for > composability. No objections from me to keep this distinction at the end, as long as the the database name in the command has no impact on the chosen behavior. Could there be a point in having a REINDEX ALL though that would process both the user relations and the catalogs, doing the same thing as REINDEX DATABASE today? By the way, the patch had better avoid putting a global REINDEX command that would process everything. As far as I recall, we've avoided such things on purpose because they are expensive, keeping around only cases that generate errors or skip all the relations. So having that in a TAP test would be better, I assume, for isolation. -- Michael
Attachment
Maybe the answer is to 1) add a parenthesized option REINDEX(SYSTEM) (to allow the current behavior); and 2) make REINDEX DATABASE an alias which implies "SYSTEM false"; 3) prohibit REINDEX (SYSTEM true) SYSTEM, or consider removing "REINDEX SYSTEM;". That avoids the opaque and surprising behavior that "REINDEX DATABASE" skips system stuff but "REINDEX DATABASE foo" doesn't, while allowing the old behavior (disabled by default).
On Thu, 2 Jun 2022 at 01:02, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, May 31, 2022 at 02:30:32PM +0200, Alvaro Herrera wrote: > > I was thinking the opposite: REINDEX DATABASE with or without a database > > name should always process the user relations and skip system catalogs. > > If the user wants to do both, then they can use REINDEX SYSTEM in > > addition. > > > > The reason for doing it like this is that there is no way to process > > only user tables and skip catalogs. So this is better for > > composability. > > No objections from me to keep this distinction at the end, as long as > the the database name in the command has no impact on the chosen > behavior. OK, that's clear. Will progress. > Could there be a point in having a REINDEX ALL though that > would process both the user relations and the catalogs, doing the same > thing as REINDEX DATABASE today? A key point is that REINDEX SYSTEM has problems, so should be avoided. Hence, including both database and system together in a new command would not be great idea, at this time. -- Simon Riggs http://www.EnterpriseDB.com/
On Tue, 28 Jun 2022 at 08:29, Simon Riggs <simon.riggs@enterprisedb.com> wrote: > > On Thu, 2 Jun 2022 at 01:02, Michael Paquier <michael@paquier.xyz> wrote: > > > > On Tue, May 31, 2022 at 02:30:32PM +0200, Alvaro Herrera wrote: > > > I was thinking the opposite: REINDEX DATABASE with or without a database > > > name should always process the user relations and skip system catalogs. > > > If the user wants to do both, then they can use REINDEX SYSTEM in > > > addition. > > > > > > The reason for doing it like this is that there is no way to process > > > only user tables and skip catalogs. So this is better for > > > composability. > > > > No objections from me to keep this distinction at the end, as long as > > the the database name in the command has no impact on the chosen > > behavior. > > OK, that's clear. Will progress. Attached patch is tested, documented and imho ready to be committed, so I will mark it so in CFapp. -- Simon Riggs http://www.EnterpriseDB.com/
Attachment
On Tue, Jun 28, 2022 at 11:02:25AM +0100, Simon Riggs wrote: > Attached patch is tested, documented and imho ready to be committed, > so I will mark it so in CFapp. The behavior introduced by this patch should be reflected in reindexdb. See in particular reindex_one_database(), where a REINDEX_SYSTEM is enforced first on the catalogs for the non-concurrent mode when running the reindex on a database. +-- unqualified reindex database +-- if you want to test REINDEX DATABASE, uncomment the following line, +-- but note that this adds about 0.5s to the regression tests and the +-- results are volatile when run in parallel to other tasks. Note also +-- that REINDEX SYSTEM is specifically not tested because it can deadlock. +-- REINDEX (VERBOSE) DATABASE; No need to add that IMHO. REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable> +REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { DATABASE | SYSTEM } [ CONCURRENTLY ] [ <replaceable class="parameter">name</replaceable> ] Shouldn't you remove DATABASE and SYSTEM from the first line, keeping only INDEX. TABLE and SCHEMA? The second line, with its optional "name" would cover the DATABASE and SYSTEM cases at 100%. - if (strcmp(objectName, get_database_name(objectOid)) != 0) + if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("can only reindex the currently open database"))); if (!pg_database_ownercheck(objectOid, GetUserId())) aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE, - objectName); + get_database_name(objectOid)); This could call get_database_name() just once. + * You might think it would be good to include catalogs, + * but doing that can deadlock, so isn't much use in real world, + * nor can we safely test that it even works. Not sure what you mean here exactly. -- Michael
Attachment
On Wed, 29 Jun 2022 at 05:35, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Jun 28, 2022 at 11:02:25AM +0100, Simon Riggs wrote: > > Attached patch is tested, documented and imho ready to be committed, > > so I will mark it so in CFapp. Thanks for the review Michael. > The behavior introduced by this patch should be reflected in > reindexdb. See in particular reindex_one_database(), where a > REINDEX_SYSTEM is enforced first on the catalogs for the > non-concurrent mode when running the reindex on a database. Originally, I was trying to avoid changing prior behavior, but now that we have agreed to do so, this makes sense. That section of code has been removed, tests updated. No changes to docs seem to be required. > +-- unqualified reindex database > +-- if you want to test REINDEX DATABASE, uncomment the following line, > +-- but note that this adds about 0.5s to the regression tests and the > +-- results are volatile when run in parallel to other tasks. Note also > +-- that REINDEX SYSTEM is specifically not tested because it can deadlock. > +-- REINDEX (VERBOSE) DATABASE; > > No need to add that IMHO. That was more a comment to reviewer, but I think something should be said for later developers. > REINDEX [ ( <replaceable class="parameter">option</replaceable> [, > ...] ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ > CONCURRENTLY ] <replaceable class="parameter">name</replaceable> > +REINDEX [ ( <replaceable class="parameter">option</replaceable> [, > ...] ) ] { DATABASE | SYSTEM } [ CONCURRENTLY ] [ <replaceable > class="parameter">name</replaceable> ] > > Shouldn't you remove DATABASE and SYSTEM from the first line, keeping > only INDEX. TABLE and SCHEMA? The second line, with its optional > "name" would cover the DATABASE and SYSTEM cases at 100%. I agree that your proposal is clearer. Done. > - if (strcmp(objectName, get_database_name(objectOid)) != 0) > + if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0) > ereport(ERROR, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("can only reindex the currently open database"))); > if (!pg_database_ownercheck(objectOid, GetUserId())) > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE, > - objectName); > + get_database_name(objectOid)); > > This could call get_database_name() just once. It could, but I couldn't see any benefit in changing that for the code under discussion. If calling get_database_name() multiple times is an issue, I've added a cache for that - another patch attached, if you think its worth it. > + * You might think it would be good to include catalogs, > + * but doing that can deadlock, so isn't much use in real world, > + * nor can we safely test that it even works. > > Not sure what you mean here exactly. REINDEX SYSTEM can deadlock, which is why we are avoiding it. This was a comment to later developers as to why things are done that way. Feel free to update the wording or location, but something should be mentioned to avoid later discussion. Thanks for the review, new version attached. -- Simon Riggs http://www.EnterpriseDB.com/
Attachment
Simon Riggs <simon.riggs@enterprisedb.com> writes: > Thanks for the review, new version attached. This is marked as Ready for Committer, but that seems unduly optimistic. The cfbot shows that it's failing on all platforms --- and not in the same way on each, suggesting there are multiple problems. regards, tom lane
On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote: > This is marked as Ready for Committer, but that seems unduly > optimistic. Please note that patch authors should not switch a patch as RfC by themselves. This is something that a reviewer should do. > The cfbot shows that it's failing on all platforms --- > and not in the same way on each, suggesting there are multiple > problems. A wild guess is that this comes from the patch that manipulates get_database_name(), something that there is no need for as long as the routine is called once in ReindexMultipleTables(). -- Michael
Attachment
On 2022-Jun-29, Simon Riggs wrote: > > - if (strcmp(objectName, get_database_name(objectOid)) != 0) > > + if (objectName && strcmp(objectName, get_database_name(objectOid)) != 0) > > ereport(ERROR, > > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > > errmsg("can only reindex the currently open database"))); > > if (!pg_database_ownercheck(objectOid, GetUserId())) > > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_DATABASE, > > - objectName); > > + get_database_name(objectOid)); > > > > This could call get_database_name() just once. > > It could, but I couldn't see any benefit in changing that for the code > under discussion. > > If calling get_database_name() multiple times is an issue, I've added > a cache for that - another patch attached, if you think its worth it. TBH I doubt that this is an issue: since we're throwing an error anyway, the memory would be released, and error cases are not considered worth of performance optimization anyway. Putting that thought aside, if we were to think that this is an issue, I don't think the cache as implemented here is a good idea, because then caller is responsible for tracking whether to free or not the return value. I think that Michaël's idea could be implemented more easily by having a local variable that receives the return value from get_database_name. But I think the coding as Simon had it was all right. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Las navajas y los monos deben estar siempre distantes" (Germán Poo)
On Mon, Jul 04, 2022 at 03:59:55PM +0900, Michael Paquier wrote: > Please note that patch authors should not switch a patch as RfC by > themselves. This is something that a reviewer should do. This patch has been marked as waiting for a review, however the CF bot is completely red: http://commitfest.cputube.org/simon-riggs.html Could you take care of those issues first? -- Michael
Attachment
On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Jul 04, 2022 at 03:59:55PM +0900, Michael Paquier wrote: > > Please note that patch authors should not switch a patch as RfC by > > themselves. This is something that a reviewer should do. > > http://commitfest.cputube.org/simon-riggs.html Thanks for showing me that, it is very helpful. > This patch has been marked as waiting for a review, however the CF bot > is completely red: Yes, it is failing, but so is current HEAD, with some kind of libpq pipelining error. -- Simon Riggs http://www.EnterpriseDB.com/
On 2022-Jul-15, Simon Riggs wrote: > On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote: > > This patch has been marked as waiting for a review, however the CF bot > > is completely red: > > Yes, it is failing, but so is current HEAD, with some kind of libpq > pipelining error. Hmm, is it? Where can I see that? The buildfarm looks OK ... -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Before you were born your parents weren't as boring as they are now. They got that way paying your bills, cleaning up your room and listening to you tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers
On Fri, 15 Jul 2022 at 09:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Jul-15, Simon Riggs wrote: > > > On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote: > > > > This patch has been marked as waiting for a review, however the CF bot > > > is completely red: > > > > Yes, it is failing, but so is current HEAD, with some kind of libpq > > pipelining error. > > Hmm, is it? Where can I see that? The buildfarm looks OK ... I got this... so cleaning up and retesting. # Looks like you failed 2 tests of 20. t/001_libpq_pipeline.pl .. Dubious, test returned 2 (wstat 512, 0x200) Failed 2/20 subtests Test Summary Report ------------------- t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2) Failed tests: 9-10 Non-zero exit status: 2 Files=1, Tests=20, 4 wallclock secs ( 0.02 usr 0.00 sys + 0.66 cusr 0.64 csys = 1.32 CPU) Result: FAIL -- Simon Riggs http://www.EnterpriseDB.com/
On Fri, 15 Jul 2022 at 12:15, Simon Riggs <simon.riggs@enterprisedb.com> wrote: > > On Fri, 15 Jul 2022 at 09:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2022-Jul-15, Simon Riggs wrote: > > > > > On Fri, 15 Jul 2022 at 04:44, Michael Paquier <michael@paquier.xyz> wrote: > > > > > > This patch has been marked as waiting for a review, however the CF bot > > > > is completely red: > > > > > > Yes, it is failing, but so is current HEAD, with some kind of libpq > > > pipelining error. > > > > Hmm, is it? Where can I see that? The buildfarm looks OK ... > > I got this... so cleaning up and retesting. > > # Looks like you failed 2 tests of 20. > t/001_libpq_pipeline.pl .. Dubious, test returned 2 (wstat 512, 0x200) > > Failed 2/20 subtests > > Test Summary Report > ------------------- > t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2) > Failed tests: 9-10 > Non-zero exit status: 2 > Files=1, Tests=20, 4 wallclock secs ( 0.02 usr 0.00 sys + 0.66 cusr > 0.64 csys = 1.32 CPU) > Result: FAIL Yeh, repeated failures on MacOS Catalina with HEAD. -- Simon Riggs http://www.EnterpriseDB.com/
On Mon, 4 Jul 2022 at 08:00, Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote: > > This is marked as Ready for Committer, but that seems unduly > > optimistic. > > Please note that patch authors should not switch a patch as RfC by > themselves. This is something that a reviewer should do. > > > The cfbot shows that it's failing on all platforms --- > > and not in the same way on each, suggesting there are multiple > > problems. > > A wild guess is that this comes from the patch that manipulates > get_database_name(), something that there is no need for as long as > the routine is called once in ReindexMultipleTables(). OK, let me repost the new patch and see if CFbot likes that better. This includes Michael's other requested changes for reindexdb. -- Simon Riggs http://www.EnterpriseDB.com/
Attachment
On 2022-Jul-15, Simon Riggs wrote: > > Test Summary Report > > ------------------- > > t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2) > > Failed tests: 9-10 > > Non-zero exit status: 2 > > Files=1, Tests=20, 4 wallclock secs ( 0.02 usr 0.00 sys + 0.66 cusr > > 0.64 csys = 1.32 CPU) > > Result: FAIL > > Yeh, repeated failures on MacOS Catalina with HEAD. Can you share the contents of src/test/modules/libpq_pipeline/tmp_check? Since these failures are not visible in the buildfarm (where we do have 11.0 as well as some 10.X versions of macOS), this is surprising. Anyway, the errors shown for your patch in the cfbot are in the pg_upgrade test, so I suggest to have a look at the logs for those anyway. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ Y una voz del caos me habló y me dijo "Sonríe y sé feliz, podría ser peor". Y sonreí. Y fui feliz. Y fue peor.
On 2022-Jul-15, Alvaro Herrera wrote: > > Yeh, repeated failures on MacOS Catalina with HEAD. > > Can you share the contents of src/test/modules/libpq_pipeline/tmp_check? > Since these failures are not visible in the buildfarm (where we do have > 11.0 as well as some 10.X versions of macOS), this is surprising. So I got the log files, and the error is clear: : # Running: libpq_pipeline -r 700 -t /Users/sriggs/pg/pg-git/postgresql/src/test/modules/libpq_pipeline/tmp_check/traces/pipeline_idle.tracepipeline_idle port=57444host=/var/folders/rd/kxv86w7567z9jk5qt7lhxfwr0000gn/T/Od6QFSH7TB dbname='postgres' : : pipeline idle... : NOTICE 1: message type 0x33 arrived from server while idle : : libpq_pipeline:1037: got 1 notice(s) : [12:17:00.181](0.016s) not ok 9 - libpq_pipeline pipeline_idle then the trace test also fails, but only because it is truncated at the point where the notice is reported; up to that point, the trace matches correctly. Not sure what to make of this. Maybe the fix in 054325c5eeb3 is not right, but I don't understand why it doesn't fail in the macOS machines in the buildfarm. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio)
On 2022-Jul-15, Alvaro Herrera wrote: > Not sure what to make of this. Maybe the fix in 054325c5eeb3 is not > right, but I don't understand why it doesn't fail in the macOS machines > in the buildfarm. Ah, one theory is that the libpq_pipeline program is getting linked to an installed version of libpq that doesn't contain the fixes. Maybe you can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it is picking up? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "No deja de ser humillante para una persona de ingenio saber que no hay tonto que no le pueda enseñar algo." (Jean B. Say)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Ah, one theory is that the libpq_pipeline program is getting linked to > an installed version of libpq that doesn't contain the fixes. Maybe you > can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it > is picking up? That's pronounced "otool -L" on macOS. But in any case, it's going to point at the installation directory. One of the moving parts here is that "make check" will try to override the rpath that otool tells you about to make test programs use the libpq.dylib from the build tree. I say "try" because if you've got SIP enabled (see what "csrutil status" tells you), it will fail to do so and the installed libpq will be used. Maybe that's old. Standard recommendation on macOS with SIP on is to always do "make install" before "make check". regards, tom lane
On Fri, 15 Jul 2022 at 15:03, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > Ah, one theory is that the libpq_pipeline program is getting linked to > > an installed version of libpq that doesn't contain the fixes. Maybe you > > can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it > > is picking up? > > That's pronounced "otool -L" on macOS. But in any case, it's going > to point at the installation directory. One of the moving parts here > is that "make check" will try to override the rpath that otool tells > you about to make test programs use the libpq.dylib from the build tree. > I say "try" because if you've got SIP enabled (see what "csrutil status" > tells you), it will fail to do so and the installed libpq will be used. > Maybe that's old. > > Standard recommendation on macOS with SIP on is to always do "make > install" before "make check". Thanks, will investigate. -- Simon Riggs http://www.EnterpriseDB.com/
On Fri, 15 Jul 2022 at 12:25, Simon Riggs <simon.riggs@enterprisedb.com> wrote: > > On Mon, 4 Jul 2022 at 08:00, Michael Paquier <michael@paquier.xyz> wrote: > > > > On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote: > > > This is marked as Ready for Committer, but that seems unduly > > > optimistic. > > > > Please note that patch authors should not switch a patch as RfC by > > themselves. This is something that a reviewer should do. > > > > > The cfbot shows that it's failing on all platforms --- > > > and not in the same way on each, suggesting there are multiple > > > problems. > > > > A wild guess is that this comes from the patch that manipulates > > get_database_name(), something that there is no need for as long as > > the routine is called once in ReindexMultipleTables(). > > OK, let me repost the new patch and see if CFbot likes that better. > > This includes Michael's other requested changes for reindexdb. That's fixed it on the CFbot. Over to you, Michael. Thanks. -- Simon Riggs http://www.EnterpriseDB.com/
On Fri, Jul 15, 2022 at 06:21:22PM +0100, Simon Riggs wrote: > That's fixed it on the CFbot. Over to you, Michael. Thanks. Sure. I have looked over that, and this looks fine overall. I have made two changes though. if (objectKind == REINDEX_OBJECT_SYSTEM && - !IsSystemClass(relid, classtuple)) + !IsCatalogRelationOid(relid)) + continue; + else if (objectKind == REINDEX_OBJECT_DATABASE && + IsCatalogRelationOid(relid)) The patch originally relied on IsSystemClass() to decide if a relation is a catalog table or not. This is not wrong in itself because ReindexMultipleTables() discards RELKIND_TOAST a couple of lines above, but I think that we should switch to IsCatalogRelationOid() as that's the line drawn to check for the catalog-ness of a relation. The second thing is test coverage. Using a REINDEX DATABASE/SYSTEM within the main regression test suite is not a good idea, but we already have those commands running in the reindexdb suite so I could not resist expanding the test section to track and check relfilenode changes through four relations for these cases: - Catalog index. - Catalog toast index. - User table index. - User toast index. The relfilenodes of those relations are saved in a table and cross-checked with the contents of pg_class after each REINDEX, on SYSTEM or DATABASE. There are no new heavy commands, so it does not make the test longer. With all that, I finish with the attached. Does that look fine to you? -- Michael
Attachment
On Sun, 17 Jul 2022 at 07:19, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Jul 15, 2022 at 06:21:22PM +0100, Simon Riggs wrote: > > That's fixed it on the CFbot. Over to you, Michael. Thanks. > > Sure. I have looked over that, and this looks fine overall. I have > made two changes though. > > if (objectKind == REINDEX_OBJECT_SYSTEM && > - !IsSystemClass(relid, classtuple)) > + !IsCatalogRelationOid(relid)) > + continue; > + else if (objectKind == REINDEX_OBJECT_DATABASE && > + IsCatalogRelationOid(relid)) > > The patch originally relied on IsSystemClass() to decide if a relation > is a catalog table or not. This is not wrong in itself because > ReindexMultipleTables() discards RELKIND_TOAST a couple of lines > above, but I think that we should switch to IsCatalogRelationOid() as > that's the line drawn to check for the catalog-ness of a relation. > > The second thing is test coverage. Using a REINDEX DATABASE/SYSTEM > within the main regression test suite is not a good idea, but we > already have those commands running in the reindexdb suite so I could > not resist expanding the test section to track and check relfilenode > changes through four relations for these cases: > - Catalog index. > - Catalog toast index. > - User table index. > - User toast index. > The relfilenodes of those relations are saved in a table and > cross-checked with the contents of pg_class after each REINDEX, on > SYSTEM or DATABASE. There are no new heavy commands, so it does not > make the test longer. > > With all that, I finish with the attached. Does that look fine to > you? Sounds great, looks fine. Thanks for your review. -- Simon Riggs http://www.EnterpriseDB.com/
Sorry, I meant to send this earlier.. On Sun, Jul 17, 2022 at 03:19:47PM +0900, Michael Paquier wrote: > The second thing is test coverage. Using a REINDEX DATABASE/SYSTEM > +my $catalog_toast_index = $node->safe_psql('postgres', > + "SELECT indexrelid::regclass FROM pg_index WHERE indrelid = '$toast_table'::regclass;" > +); > + > +# Set of SQL queries to cross-check the state of relfilenodes across > +# REINDEX operations. A set of relfilenodes is saved from the catalogs > +# and then compared with pg_class. > +$node->safe_psql('postgres', > + 'CREATE TABLE toast_relfilenodes (relname regclass, relfilenode oid);'); It looks like you named the table "toast_relfilenodes", but then also store to it data for non-toast tables. It's also a bit weird to call the column "relname" but use it to store the ::regclass. You later need to cast the column to text, so you may as well store it as text, either relname or oid::regclass. It seems like cluster.sql does this more succinctly. -- Check that clustering sets new relfilenodes: CREATE TEMP TABLE old_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass)AS tree JOIN pg_class c ON c.oid=tree.relid ; CLUSTER clstrpart USING clstrpart_idx; CREATE TEMP TABLE new_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass)AS tree JOIN pg_class c ON c.oid=tree.relid ; SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM old_cluster_info AS old JOIN new_cluster_infoAS new USING (relname) ORDER BY relname COLLATE "C"; > +# Save the relfilenode of a set of toast indexes, one from the catalog > +# pg_constraint and one from the test table. This data is used for checks > +# after some of the REINDEX operations done below, checking if they are > +# changed. > +my $fetch_toast_relfilenodes = qq{SELECT c.oid::regclass, c.relfilenode > + FROM pg_class a > + JOIN pg_class b ON (a.oid = b.reltoastrelid) > + JOIN pg_index i on (a.oid = i.indrelid) > + JOIN pg_class c on (i.indexrelid = c.oid) > + WHERE b.oid IN ('pg_constraint'::regclass, 'test1'::regclass)}; > +# Same for relfilenodes of normal indexes. This saves the relfilenode > +# from a catalog of pg_constraint, and the one from the test table. > +my $fetch_index_relfilenodes = qq{SELECT oid, relfilenode > + FROM pg_class > + WHERE relname IN ('pg_constraint_oid_index', 'test1x')}; > +my $save_relfilenodes = > + "INSERT INTO toast_relfilenodes $fetch_toast_relfilenodes;" > + . "INSERT INTO toast_relfilenodes $fetch_index_relfilenodes;"; > + > +# Query to compare a set of relfilenodes saved with the contents of pg_class. > +# Note that this does not join using OIDs, as CONCURRENTLY would change them > +# when reindexing. A filter is applied on the toast index names, even if this > +# does not make a difference between the catalog and normal ones, the ordering > +# based on the name is enough to ensure a fixed output. > +my $compare_relfilenodes = > + qq(SELECT regexp_replace(b.relname::text, '(pg_toast.pg_toast_)\\d{4,5}(_index)', '\\1<oid>\\2'), Why {4,5} ? > + CASE WHEN a.relfilenode = b.relfilenode THEN 'relfilenode is unchanged' > + ELSE 'relfilenode has changed' END > + FROM toast_relfilenodes b > + JOIN pg_class a ON b.relname::text = a.oid::regclass::text > + ORDER BY b.relname::text); > + > +# Save the set of relfilenodes and compare them. > +$node->safe_psql('postgres', $save_relfilenodes); > +$node->issues_sql_like( > + [ 'reindexdb', 'postgres' ], > + qr/statement: REINDEX DATABASE postgres;/, > + 'SQL REINDEX run'); > +my $relnode_info = $node->safe_psql('postgres', $compare_relfilenodes); > +is( $relnode_info, qq(pg_constraint_oid_index|relfilenode is unchanged > +pg_toast.pg_toast_<oid>_index|relfilenode has changed > +pg_toast.pg_toast_<oid>_index|relfilenode is unchanged > +test1x|relfilenode has changed), 'relfilenode change after REINDEX DATABASE'); > + > +# Re-save and run the second one. > +$node->safe_psql('postgres', > + "TRUNCATE toast_relfilenodes; $save_relfilenodes"); > +$node->issues_sql_like( > + [ 'reindexdb', '-s', 'postgres' ], > + qr/statement: REINDEX SYSTEM postgres;/, > + 'reindex system tables'); > +$relnode_info = $node->safe_psql('postgres', $compare_relfilenodes); > +is( $relnode_info, qq(pg_constraint_oid_index|relfilenode has changed > +pg_toast.pg_toast_<oid>_index|relfilenode is unchanged > +pg_toast.pg_toast_<oid>_index|relfilenode has changed > +test1x|relfilenode is unchanged), 'relfilenode change after REINDEX SYSTEM');
On Sun, Jul 17, 2022 at 10:58:26AM +0100, Simon Riggs wrote: > Sounds great, looks fine. Thanks for your review. Ok, cool. At the end, I have decided to split the tests and the main patch into two different commits, as each is useful on its own. Doing so also helps in seeing the difference of behavior when issuing a REINDEX DATABASE. Another thing that was itching me with the test is that it was not possible to make the difference between the toast index of the catalog and of the user table, so I have added the parent table name as an extra thing stored in the table storing the relfilenodes. -- Michael
Attachment
On Mon, Jul 18, 2022 at 09:26:53PM -0500, Justin Pryzby wrote: > Sorry, I meant to send this earlier.. No problem. > It looks like you named the table "toast_relfilenodes", but then also store > to it data for non-toast tables. How about naming that index_relfilenodes? One difference with what I posted previously and 5fb5b6 is the addition of an extra regclass that stores the parent table, for reference in the output. > It's also a bit weird to call the column "relname" but use it to store the > ::regclass. You later need to cast the column to text, so you may as well > store it as text, either relname or oid::regclass. I have used "indname" at the end. > It seems like cluster.sql does this more succinctly. Except that this does not include the relfilenodes from the toast indexes, which is something I wanted to add a check for when it comes to both user tables and catalogs. > Why {4,5} ? Looks like a brain fade from here, while looking the relation names this generated. This could just match with an integer. -- Michael
Attachment
On Tue, Jul 19, 2022 at 01:13:34PM +0900, Michael Paquier wrote: > > It looks like you named the table "toast_relfilenodes", but then also store > > to it data for non-toast tables. > > How about naming that index_relfilenodes? One difference with what I > posted previously and 5fb5b6 is the addition of an extra regclass that > stores the parent table, for reference in the output. Looks fine > - 'CREATE TABLE toast_relfilenodes (parent regclass, indname regclass, relfilenode oid);' > + 'CREATE TABLE index_relfilenodes (parent regclass, indname regclass, relfilenode oid);' -- Justi
On Wed, Jul 20, 2022 at 07:27:07AM -0500, Justin Pryzby wrote: > Looks fine Thanks for checking, applied. -- Michael