Thread: reindex concurrently and two toast indexes
Forking old, long thread: https://www.postgresql.org/message-id/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote: > About reindex invalid indexes - i found one good question in archives [1]: how about toast indexes? > I check it now, i am able drop invalid toast index, but i can not drop reduntant valid index. > Reproduce: > session 1: begin; select from test_toast ... for update; > session 2: reindex table CONCURRENTLY test_toast ; > session 2: interrupt by ctrl+C > session 1: commit > session 2: reindex table test_toast ; > and now we have two toast indexes. DROP INDEX is able to remove only invalid ones. Valid index gives "ERROR: permission denied: "pg_toast_16426_index_ccnew" is a system catalog" > [1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com It looks like this was never addressed. I noticed a ccnew toast index sitting around since October - what do I do with it? ts=# DROP INDEX pg_toast.pg_toast_463881620_index_ccnew; ERROR: permission denied: "pg_toast_463881620_index_ccnew" is a system catalog -- Justin
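A minimal SQL sketch of the reproduction steps quoted above; the table definition is illustrative (any table with a TOAST-able column such as text will do):

-- session 1
CREATE TABLE test_toast (id int, val text);  -- gets a TOAST table because of the text column
BEGIN;
SELECT * FROM test_toast FOR UPDATE;         -- holds a lock that makes the concurrent reindex wait

-- session 2
REINDEX TABLE CONCURRENTLY test_toast;       -- blocks behind session 1; interrupt it with Ctrl+C

-- session 1
COMMIT;

-- session 2
REINDEX TABLE test_toast;                    -- rebuilds the leftover invalid toast index, making it valid
-- The toast table now has two valid indexes, and the extra _ccnew one cannot
-- be dropped: DROP INDEX fails with "permission denied: ... is a system catalog".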
On Sun, Feb 16, 2020 at 01:08:35PM -0600, Justin Pryzby wrote: > Forking old, long thread: > https://www.postgresql.org/message-id/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net > On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote: >> About reindex invalid indexes - i found one good question in archives [1]: how about toast indexes? >> I check it now, i am able drop invalid toast index, but i can not drop reduntant valid index. >> Reproduce: >> session 1: begin; select from test_toast ... for update; >> session 2: reindex table CONCURRENTLY test_toast ; >> session 2: interrupt by ctrl+C >> session 1: commit >> session 2: reindex table test_toast ; >> and now we have two toast indexes. DROP INDEX is able to remove >> only invalid ones. Valid index gives "ERROR: permission denied: >> "pg_toast_16426_index_ccnew" is a system catalog" >> [1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com > > It looks like this was never addressed. On HEAD, this exact scenario leads to the presence of an old toast index pg_toast.pg_toast_*_index_ccold, causing the index to be skipped on a follow-up concurrent reindex: =# reindex table CONCURRENTLY test_toast ; WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16385_index_ccold" concurrently, skipping LOCATION: ReindexRelationConcurrently, indexcmds.c:2863 REINDEX And this toast index can be dropped while it remains invalid: =# drop index pg_toast.pg_toast_16385_index_ccold; DROP INDEX I recall testing that stuff for all the interrupts which could be triggered and in this case, this waits at step 5 within WaitForLockersMultiple(). Now, in your case you take an extra step with a plain REINDEX, which forces a rebuild of the invalid toast index, making it per se valid, and not droppable. Hmm. There could be an argument here for skipping invalid toast indexes within reindex_index(), because we are sure about having at least one valid toast index at anytime, and these are not concerned with CIC. Any thoughts? -- Michael
On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Feb 16, 2020 at 01:08:35PM -0600, Justin Pryzby wrote: > > Forking old, long thread: > > https://www.postgresql.org/message-id/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net > > On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote: > >> About reindex invalid indexes - i found one good question in archives [1]: how about toast indexes? > >> I check it now, i am able drop invalid toast index, but i can not drop reduntant valid index. > >> Reproduce: > >> session 1: begin; select from test_toast ... for update; > >> session 2: reindex table CONCURRENTLY test_toast ; > >> session 2: interrupt by ctrl+C > >> session 1: commit > >> session 2: reindex table test_toast ; > >> and now we have two toast indexes. DROP INDEX is able to remove > >> only invalid ones. Valid index gives "ERROR: permission denied: > >> "pg_toast_16426_index_ccnew" is a system catalog" > >> [1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com > > > > It looks like this was never addressed. > > On HEAD, this exact scenario leads to the presence of an old toast > index pg_toast.pg_toast_*_index_ccold, causing the index to be skipped > on a follow-up concurrent reindex: > =# reindex table CONCURRENTLY test_toast ; > WARNING: XX002: cannot reindex invalid index > "pg_toast.pg_toast_16385_index_ccold" concurrently, skipping > LOCATION: ReindexRelationConcurrently, indexcmds.c:2863 > REINDEX > > And this toast index can be dropped while it remains invalid: > =# drop index pg_toast.pg_toast_16385_index_ccold; > DROP INDEX > > I recall testing that stuff for all the interrupts which could be > triggered and in this case, this waits at step 5 within > WaitForLockersMultiple(). Now, in your case you take an extra step > with a plain REINDEX, which forces a rebuild of the invalid toast > index, making it per se valid, and not droppable. > > Hmm. There could be an argument here for skipping invalid toast > indexes within reindex_index(), because we are sure about having at > least one valid toast index at anytime, and these are not concerned > with CIC. Or even automatically drop any invalid index on toast relation in reindex_relation, since those can't be due to a failed CIC?
On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote: > On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote: >> Hmm. There could be an argument here for skipping invalid toast >> indexes within reindex_index(), because we are sure about having at >> least one valid toast index at anytime, and these are not concerned >> with CIC. > > Or even automatically drop any invalid index on toast relation in > reindex_relation, since those can't be due to a failed CIC? No, I don't like much outsmarting REINDEX with more index drops than it needs to do. And this would not take care of the case with REINDEX INDEX done directly on a toast index. -- Michael
On Tue, Feb 18, 2020 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote: > > On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote: > >> Hmm. There could be an argument here for skipping invalid toast > >> indexes within reindex_index(), because we are sure about having at > >> least one valid toast index at anytime, and these are not concerned > >> with CIC. > > > > Or even automatically drop any invalid index on toast relation in > > reindex_relation, since those can't be due to a failed CIC? > > No, I don't like much outsmarting REINDEX with more index drops than > it needs to do. And this would not take care of the case with REINDEX > INDEX done directly on a toast index. Well, we could still do both but I get the objection. Then skipping invalid toast indexes in reindex_relation looks like the best fix.
On Tue, Feb 18, 2020 at 07:39:49AM +0100, Julien Rouhaud wrote: > On Tue, Feb 18, 2020 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote: > > > On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote: > > >> Hmm. There could be an argument here for skipping invalid toast > > >> indexes within reindex_index(), because we are sure about having at > > >> least one valid toast index at anytime, and these are not concerned > > >> with CIC. > > > > > > Or even automatically drop any invalid index on toast relation in > > > reindex_relation, since those can't be due to a failed CIC? > > > > No, I don't like much outsmarting REINDEX with more index drops than > > it needs to do. And this would not take care of the case with REINDEX > > INDEX done directly on a toast index. > > Well, we could still do both but I get the objection. Then skipping > invalid toast indexes in reindex_relation looks like the best fix. PFA a patch to fix the problem using this approach. I also added isolation tester regression tests. The failure is simulated using a pg_cancel_backend() on top of pg_stat_activity, using filters on a specifically set application name and the query text to avoid any unwanted interaction. I also added a 1s locking delay, to ensure that even slow/CCA machines can consistently reproduce the failure. Maybe that's not enough, or maybe testing this scenario is not worth the extra time.
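The cancellation trick described here boils down to a query of this shape (the application name and filter are illustrative, not necessarily the exact ones used in the patch):

SELECT pg_cancel_backend(pid)
  FROM pg_stat_activity
 WHERE application_name = 'isolation_reindex_toast'
   AND query LIKE 'REINDEX%';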
On Tue, Feb 18, 2020 at 02:29:33PM +0900, Michael Paquier wrote: > On Sun, Feb 16, 2020 at 01:08:35PM -0600, Justin Pryzby wrote: > > Forking old, long thread: > > https://www.postgresql.org/message-id/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net > > On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote: > >> About reindex invalid indexes - i found one good question in archives [1]: how about toast indexes? > >> I check it now, i am able drop invalid toast index, but i can not drop reduntant valid index. > >> Reproduce: > >> session 1: begin; select from test_toast ... for update; > >> session 2: reindex table CONCURRENTLY test_toast ; > >> session 2: interrupt by ctrl+C > >> session 1: commit > >> session 2: reindex table test_toast ; > >> and now we have two toast indexes. DROP INDEX is able to remove > >> only invalid ones. Valid index gives "ERROR: permission denied: > >> "pg_toast_16426_index_ccnew" is a system catalog" > >> [1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com > > > > It looks like this was never addressed. > > On HEAD, this exact scenario leads to the presence of an old toast > index pg_toast.pg_toast_*_index_ccold, causing the index to be skipped > on a follow-up concurrent reindex: > =# reindex table CONCURRENTLY test_toast ; > WARNING: XX002: cannot reindex invalid index > "pg_toast.pg_toast_16385_index_ccold" concurrently, skipping > LOCATION: ReindexRelationConcurrently, indexcmds.c:2863 > REINDEX > > And this toast index can be dropped while it remains invalid: > =# drop index pg_toast.pg_toast_16385_index_ccold; > DROP INDEX > > I recall testing that stuff for all the interrupts which could be > triggered and in this case, this waits at step 5 within > WaitForLockersMultiple(). Now, in your case you take an extra step > with a plain REINDEX, which forces a rebuild of the invalid toast > index, making it per se valid, and not droppable. > > Hmm. There could be an argument here for skipping invalid toast > indexes within reindex_index(), because we are sure about having at > least one valid toast index at anytime, and these are not concerned > with CIC. Julien sent a patch for that, but here are my ideas (which you are free to reject): Could you require an AEL for that case, or something which will preclude reindex table test_toast from working ? Could you use atomic updates to ensure that exactly one index in an {old,new} pair is invalid at any given time ? Could you make the new (invalid) toast index not visible to other transactions? -- Justin Pryzby
On Sat, Feb 22, 2020 at 08:09:24AM +0100, Julien Rouhaud wrote: > On Tue, Feb 18, 2020 at 07:39:49AM +0100, Julien Rouhaud wrote: > > On Tue, Feb 18, 2020 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote: > > > > On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote: > > > >> Hmm. There could be an argument here for skipping invalid toast > > > >> indexes within reindex_index(), because we are sure about having at > > > >> least one valid toast index at anytime, and these are not concerned > > > >> with CIC. > > PFA a patch to fix the problem using this approach. > > I also added isolation tester regression tests. The failure is simulated using > a pg_cancel_backend() on top of pg_stat_activity, using filters on a > specifically set application name and the query text to avoid any unwanted > interaction. I also added a 1s locking delay, to ensure that even slow/CCA > machines can consistently reproduce the failure. Maybe that's not enough, or > maybe testing this scenario is not worth the extra time. Sorry, I just realized that I forgot to commit the last changes before sending the patch, so here's the correct v2.
On Sat, Feb 22, 2020 at 04:06:57PM +0100, Julien Rouhaud wrote: > Sorry, I just realized that I forgot to commit the last changes before sending > the patch, so here's the correct v2. Thanks for the patch. > + if (skipit) > + { > + ereport(NOTICE, > + (errmsg("skipping invalid index \"%s.%s\"", > + get_namespace_name(get_rel_namespace(indexOid)), > + get_rel_name(indexOid)))); ReindexRelationConcurrently() issues a WARNING when bumping on an invalid index, shouldn't the same log level be used? Even with this patch, it is possible to reindex an invalid toast index with REINDEX INDEX (with and without CONCURRENTLY), which is the problem I mentioned upthread (Er, actually only for the non-concurrent case as told about reindex_index). Shouldn't both cases be prevented as well with an ERROR? -- Michael
On Thu, Feb 27, 2020 at 04:32:11PM +0900, Michael Paquier wrote: > On Sat, Feb 22, 2020 at 04:06:57PM +0100, Julien Rouhaud wrote: > > Sorry, I just realized that I forgot to commit the last changes before sending > > the patch, so here's the correct v2. > > Thanks for the patch. > > > + if (skipit) > > + { > > + ereport(NOTICE, > > + (errmsg("skipping invalid index \"%s.%s\"", > > + get_namespace_name(get_rel_namespace(indexOid)), > > + get_rel_name(indexOid)))); > > ReindexRelationConcurrently() issues a WARNING when bumping on an > invalid index, shouldn't the same log level be used? For ReindexRelationConcurrently, the index is skipped because the feature isn't supported, thus a warning. In this case that would work, it's just that we don't want to process such indexes, so I used a notice instead. I'm not opposed to using a warning instead if you prefer. What errcode should be used though, ERRCODE_WARNING? ERRCODE_FEATURE_NOT_SUPPORTED doesn't feel right. > Even with this patch, it is possible to reindex an invalid toast index > with REINDEX INDEX (with and without CONCURRENTLY), which is the > problem I mentioned upthread (Er, actually only for the non-concurrent > case as told about reindex_index). Shouldn't both cases be prevented > as well with an ERROR? Ah indeed, sorry I missed that. While looking at it, I see that invalid indexes seem to be leaked when the table is dropped, with no way to get rid of them: s1: CREATE TABLE t1(val text); CREATE INDEX ON t1 (val); BEGIN; SELECT * FROM t1 FOR UPDATE; s2: REINDEX TABLE CONCURRENTLY t1; [stuck and canceled] SELECT indexrelid::regclass, indrelid::regclass FROM pg_index WHERE NOT indisvalid; indexrelid | indrelid -------------------------------------+------------------------- t1_val_idx_ccold | t1 pg_toast.pg_toast_16385_index_ccold | pg_toast.pg_toast_16385 (2 rows) s1: ROLLBACK; DROP TABLE t1; SELECT indexrelid::regclass, indrelid::regclass FROM pg_index WHERE NOT indisvalid; indexrelid | indrelid -------------------------------------+---------- t1_val_idx_ccold | 16385 pg_toast.pg_toast_16385_index_ccold | 16388 (2 rows) REINDEX INDEX t1_val_idx_ccold; ERROR: XX000: could not open relation with OID 16385 LOCATION: relation_open, relation.c:62 DROP INDEX t1_val_idx_ccold; ERROR: XX000: could not open relation with OID 16385 LOCATION: relation_open, relation.c:62 REINDEX INDEX pg_toast.pg_toast_16385_index_ccold; ERROR: XX000: could not open relation with OID 16388 LOCATION: relation_open, relation.c:62 DROP INDEX pg_toast.pg_toast_16385_index_ccold; ERROR: XX000: could not open relation with OID 16388 LOCATION: relation_open, relation.c:62 REINDEX DATABASE rjuju; REINDEX SELECT indexrelid::regclass, indrelid::regclass FROM pg_index WHERE NOT indisvalid; indexrelid | indrelid -------------------------------------+---------- t1_val_idx_ccold | 16385 pg_toast.pg_toast_16385_index_ccold | 16388 (2 rows) Shouldn't DROP TABLE be fixed to also drop invalid indexes?
On Thu, Feb 27, 2020 at 09:07:35AM +0100, Julien Rouhaud wrote: > While looking at it, I see that invalid indexes seem to leaked when the table > is dropped, with no way to get rid of them: > > Shouldn't DROP TABLE be fixed to also drop invalid indexes? Hmm. The problem here is that I think that we don't have the correct interface to handle the dependency switching between the old and new indexes from the start, and 68ac9cf made things better in some aspects (like non-cancellation and old index drop) but not in others (like yours, or even a column drop). changeDependenciesOf/On() have been added especially for REINDEX CONCURRENTLY, but they are not actually able to handle the case we want them to handle: do a switch for both relations within the same scan. It is possible to use the existing routines three times with a couple of CCIs in-between and what I would call a fake placeholder OID to switch all the records cleanly, but it would actually be cleaner to do a single scan of pg_depend and switch the dependencies of both objects at once. Attached is a draft patch to take care of that problem for HEAD. It still needs a lot of polishing (variable names are not actually old or new anymore, etc.) but that's enough to show the idea. If a version reaches PG12, we would need to keep around the past routines to avoid an ABI breakage, even if I doubt there are callers of it, but who knows.. -- Michael
On Tue, Mar 03, 2020 at 05:06:42PM +0900, Michael Paquier wrote: > Attached is a draft patch to take care of that problem for HEAD. It > still needs a lot of polishing (variable names are not actually old > or new anymore, etc.) but that's enough to show the idea. If a version > reaches PG12, we would need to keep around the past routines to avoid > an ABI breakage, even if I doubt there are callers of it, but who > knows.. Or actually, a simpler solution is to abuse the two existing routines so that the dependency switch is done the other way around, from the new index to the old one. That would visibly work because there is no CCI between each scan, and that's faster because the scan of pg_depend is done only on the entries in need of an update. I'll look at that again tomorrow, it is late here and I may be missing something obvious. -- Michael
On Tue, Mar 03, 2020 at 06:25:51PM +0900, Michael Paquier wrote: > Or actually, a more simple solution is to abuse of the two existing > routines so as the dependency switch is done the other way around, > from the new index to the old one. That would visibly work because > there is no CCI between each scan, and that's faster because the scan > of pg_depend is done only on the entries in need of an update. I'll > look at that again tomorrow, it is late here and I may be missing > something obvious. It was a good inspiration. I have been torturing this patch today and played with it by injecting elog(ERROR) calls in the middle of reindex concurrently for all the phases, and checked manually the handling of entries in pg_depend for the new and old indexes, and these correctly map. So this is taking care of your problem. Attached is an updated patch with an updated comment about the dependency of this code with CCIs. I'd like to go fix this issue first. -- Michael
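A sketch of the kind of manual pg_depend check described here, for a hypothetical index t1_val_idx; before and after the swap, the rows referencing the index should point at the expected relation:

SELECT classid::regclass AS catalog,
       objid, objsubid,
       refclassid::regclass AS ref_catalog,
       refobjid, deptype
  FROM pg_depend
 WHERE (classid = 'pg_class'::regclass AND objid = 't1_val_idx'::regclass)
    OR (refclassid = 'pg_class'::regclass AND refobjid = 't1_val_idx'::regclass);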
On Wed, Mar 4, 2020 at 6:15 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Mar 03, 2020 at 06:25:51PM +0900, Michael Paquier wrote: > > Or actually, a more simple solution is to abuse of the two existing > > routines so as the dependency switch is done the other way around, > > from the new index to the old one. That would visibly work because > > there is no CCI between each scan, and that's faster because the scan > > of pg_depend is done only on the entries in need of an update. I'll > > look at that again tomorrow, it is late here and I may be missing > > something obvious. > > It was a good inspiration. I have been torturing this patch today and > played with it by injecting elog(ERROR) calls in the middle of reindex > concurrently for all the phases, and checked manually the handling of > entries in pg_depend for the new and old indexes, and these correctly > map. So this is taking care of your problem. Attached is an updated > patch with an updated comment about the dependency of this code with > CCIs. I'd like to go fix this issue first. Thanks for the patch! I started to look at it during the weekend, but I got interrupted and unfortunately didn't have time to look at it since. The fix looks good to me. I also tried multiple failure scenarios and it's unsurprisingly working just fine. Should we add some regression tests for that? I guess most of it could be borrowed from the patch to fix the toast index issue I sent last week.
On Wed, Mar 04, 2020 at 09:21:45AM +0100, Julien Rouhaud wrote: > Thanks for the patch! I started to look at it during the weekend, but > I got interrupted and unfortunately didn't had time to look at it > since. No problem, thanks for looking at it. I have looked at it again this morning, and applied it. > The fix looks good to me. I also tried multiple failure scenario and > it's unsurprisingly working just fine. Should we add some regression > tests for that? I guess most of it could be borrowed from the patch > to fix the toast index issue I sent last week. I have doubts when it comes to use a strategy based on pg_cancel_backend() and a match of application_name (see for example 5ad72ce but I cannot find the associated thread). I think that we could design something more robust here and usable by all tests, with two things coming into my mind: - A new meta-command for isolation tests to be able to cancel a session with PQcancel(). - Fault injection in the backend. For the case of this thread, the cancellation command would be a better match. -- Michael
On Thu, Mar 05, 2020 at 12:53:54PM +0900, Michael Paquier wrote: > On Wed, Mar 04, 2020 at 09:21:45AM +0100, Julien Rouhaud wrote: > > Thanks for the patch! I started to look at it during the weekend, but > > I got interrupted and unfortunately didn't had time to look at it > > since. > > No problem, thanks for looking at it. I have looked at it again this > morning, and applied it. > > > The fix looks good to me. I also tried multiple failure scenario and > > it's unsurprisingly working just fine. Should we add some regression > > tests for that? I guess most of it could be borrowed from the patch > > to fix the toast index issue I sent last week. > > I have doubts when it comes to use a strategy based on > pg_cancel_backend() and a match of application_name (see for example > 5ad72ce but I cannot find the associated thread). I think that we > could design something more robust here and usable by all tests, with > two things coming into my mind: > - A new meta-command for isolation tests to be able to cancel a > session with PQcancel(). > - Fault injection in the backend. > For the case of this thread, the cancellation command would be a better > match. I agree that the approach wasn't quite robust. I'll try to look at adding a new command for isolationtester, but that's probably not something we want to put in pg13? Here's a v3 that addresses the various comments you previously noted, and for which I also removed the regression tests. Note that while looking at it, I noticed another bug in RIC: # create table t1(id integer, val text); create index on t1(val); CREATE TABLE CREATE INDEX # reindex table concurrently t1; ^CCancel request sent ERROR: 57014: canceling statement due to user request LOCATION: ProcessInterrupts, postgres.c:3171 # select indexrelid::regclass, indrelid::regclass, indexrelid, indrelid from pg_index where not indisvalid; indexrelid | indrelid | indexrelid | indrelid -------------------------------------+-------------------------+------------+---------- t1_val_idx_ccold | t1 | 16401 | 16395 pg_toast.pg_toast_16395_index_ccold | pg_toast.pg_toast_16395 | 16400 | 16398 (2 rows) # reindex table concurrently t1; WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping LOCATION: ReindexRelationConcurrently, indexcmds.c:2821 WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16395_index_ccold" concurrently, skipping LOCATION: ReindexRelationConcurrently, indexcmds.c:2867 REINDEX # reindex index concurrently t1_val_idx_ccold; REINDEX That case is also fixed in this patch.
On Thu, Mar 05, 2020 at 05:57:07PM +0100, Julien Rouhaud wrote: > I agree that the approach wasn't quite robust. I'll try to look at adding a > new command for isolationtester, but that's probably not something we want to > put in pg13? Yes, that's too late. > Note that while looking at it, I noticed another bug in RIC: > > [...] > > # reindex table concurrently t1; > WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping > LOCATION: ReindexRelationConcurrently, indexcmds.c:2821 > WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16395_index_ccold" concurrently, skipping > LOCATION: ReindexRelationConcurrently, indexcmds.c:2867 > REINDEX > # reindex index concurrently t1_val_idx_ccold; > REINDEX > > That case is also fixed in this patch. This choice is intentional. The idea about bypassing invalid indexes for table-level REINDEX is that this would lead to a bloat in the number of relations to handle if multiple runs are failing, leading to more and more invalid indexes to handle each time. Allowing a single invalid non-toast index to be reindexed with CONCURRENTLY can be helpful in some cases, like for example a CIC for a unique index that failed and was invalid, where the relation already defined can be reused. -- Michael
On Fri, Mar 06, 2020 at 10:38:44AM +0900, Michael Paquier wrote: > On Thu, Mar 05, 2020 at 05:57:07PM +0100, Julien Rouhaud wrote: > > I agree that the approach wasn't quite robust. I'll try to look at adding a > > new command for isolationtester, but that's probably not something we want to > > put in pg13? > > Yes, that's too late. > > > Note that while looking at it, I noticed another bug in RIC: > > > > [...] > > > > # reindex table concurrently t1; > > WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping > > LOCATION: ReindexRelationConcurrently, indexcmds.c:2821 > > WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16395_index_ccold" concurrently, skipping > > LOCATION: ReindexRelationConcurrently, indexcmds.c:2867 > > REINDEX > > # reindex index concurrently t1_val_idx_ccold; > > REINDEX > > > > That case is also fixed in this patch. > > This choice is intentional. The idea about bypassing invalid indexes > for table-level REINDEX is that this would lead to a bloat in the > number of relations to handling if multiple runs are failing, leading > to more and more invalid indexes to handle each time. Allowing a > single invalid non-toast index to be reindexed with CONCURRENTLY can > be helpful in some cases, like for example a CIC for a unique index > that failed and was invalid, where the relation already defined can be > reused. Ah I see, thanks for the clarification. I guess there's room for improvement in the comments about that, since the ERRCODE_FEATURE_NOT_SUPPORTED usage is quite misleading there. v4 attached, which doesn't prevent a REINDEX INDEX CONCURRENTLY on any invalid non-TOAST index anymore.
On Thu, Mar 05, 2020 at 12:53:54PM +0900, Michael Paquier wrote: > On Wed, Mar 04, 2020 at 09:21:45AM +0100, Julien Rouhaud wrote: > > > Should we add some regression > > tests for that? I guess most of it could be borrowed from the patch > > to fix the toast index issue I sent last week. > > I have doubts when it comes to use a strategy based on > pg_cancel_backend() and a match of application_name (see for example > 5ad72ce but I cannot find the associated thread). I think that we > could design something more robust here and usable by all tests, with > two things coming into my mind: > - A new meta-command for isolation tests to be able to cancel a > session with PQcancel(). > - Fault injection in the backend. > For the case of this thread, the cancellation command would be a better > match. Here's a patch to add an optional "timeout val" clause to isolationtester's step definition. When used, isolationtester will actively wait on the query rather than continuing with the permutation next step, and will issue a cancel once the defined timeout is reached. I also added as a POC the previous regression tests for invalid TOAST indexes, updated to use this new infrastructure (which won't pass as long as the original bug for invalid TOAST indexes isn't fixed). I'll park that in the next commitfest, with a v14 target version.
On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote: > Here's a patch to add an optional "timeout val" clause to isolationtester's > step definition. When used, isolationtester will actively wait on the query > rather than continuing with the permutation next step, and will issue a cancel > once the defined timeout is reached. I also added as a POC the previous > regression tests for invalid TOAST indexes, updated to use this new > infrastructure (which won't pass as long as the original bug for invalid TOAST > indexes isn't fixed). One problem with this approach is that it does not address the stability of the test on very slow machines, and there are some of them in the buildfarm. Taking your patch, I can make the test fail by applying the following sleep because the query would be cancelled before some of the indexes are marked as invalid: --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -3046,6 +3046,8 @@ ReindexRelationConcurrently(Oid relationOid, int options) CommitTransactionCommand(); StartTransactionCommand(); + pg_usleep(100000L * 10); /* 10s */ + /* * Phase 2 of REINDEX CONCURRENTLY Another problem is that on faster machines the test is slow because of the timeout used. What are your thoughts about having a cancel meta-command instead? -- Michael
On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote: > On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote: > > Here's a patch to add an optional "timeout val" clause to isolationtester's > > step definition. When used, isolationtester will actively wait on the query > > rather than continuing with the permutation next step, and will issue a cancel > > once the defined timeout is reached. I also added as a POC the previous > > regression tests for invalid TOAST indexes, updated to use this new > > infrastructure (which won't pass as long as the original bug for invalid TOAST > > indexes isn't fixed). > > One problem with this approach is that it does address the stability > of the test on very slow machines, and there are some of them in the > buildfarm. Taking your patch, I can make the test fail by applying > the following sleep because the query would be cancelled before some > of the indexes are marked as invalid: > --- a/src/backend/commands/indexcmds.c > +++ b/src/backend/commands/indexcmds.c > @@ -3046,6 +3046,8 @@ ReindexRelationConcurrently(Oid relationOid, int > options) > CommitTransactionCommand(); > StartTransactionCommand(); > > + pg_usleep(100000L * 10); /* 10s */ > + > /* > * Phase 2 of REINDEX CONCURRENTLY > > Another problem is that on faster machines the test is slow because of > the timeout used. What are your thoughts about having instead a > cancel meta-command instead? Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose to fix this problem by having a timeout long enough to satisfy the slower buildfarm members, even when running on fast machines, so I assumed that the same approach could be used here. I agree that the 1s timeout I used is maybe too low, but that's easy enough to change. Another point is that it's possible to have a close behavior without this patch by using a statement_timeout (the active wait does change things though), but the spec files would be more verbose.
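For reference, the statement_timeout variant mentioned above would amount to something like this inside a spec step (the value is arbitrary and carries the same slow-machine caveat):

SET statement_timeout = '1s';
REINDEX TABLE CONCURRENTLY test_toast;  -- cancelled by the timeout while it waits on the blocking session
RESET statement_timeout;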
Julien Rouhaud <rjuju123@gmail.com> writes: > On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote: >> On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote: >>> Here's a patch to add an optional "timeout val" clause to isolationtester's >>> step definition. When used, isolationtester will actively wait on the query >>> rather than continuing with the permutation next step, and will issue a cancel >>> once the defined timeout is reached. >> One problem with this approach is that it does address the stability >> of the test on very slow machines, and there are some of them in the >> buildfarm. > Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose > to fix this problem by having a timeout long enough to statisfy the slower > buildfarm members, even when running on fast machines, so I assumed that the > same approach could be used here. The arbitrarily-set timeouts that exist in some of the isolation tests are horrid kluges that have caused us lots of headaches in the past and no doubt will again in the future. Aside from occasionally failing when a machine is particularly overloaded, they cause the tests to take far longer than necessary on decently-fast machines. So ideally we'd get rid of those entirely in favor of some more-dynamic approach. Admittedly, I have no proposal for what that would be. But adding yet more ways to set a (guaranteed-to-be-wrong) timeout seems like the wrong direction to be going in. What's the actual need that you're trying to deal with? regards, tom lane
On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote: > >> On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote: > >>> Here's a patch to add an optional "timeout val" clause to isolationtester's > >>> step definition. When used, isolationtester will actively wait on the query > >>> rather than continuing with the permutation next step, and will issue a cancel > >>> once the defined timeout is reached. > > >> One problem with this approach is that it does address the stability > >> of the test on very slow machines, and there are some of them in the > >> buildfarm. > > > Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose > > to fix this problem by having a timeout long enough to statisfy the slower > > buildfarm members, even when running on fast machines, so I assumed that the > > same approach could be used here. > > The arbitrarily-set timeouts that exist in some of the isolation tests > are horrid kluges that have caused us lots of headaches in the past > and no doubt will again in the future. Aside from occasionally failing > when a machine is particularly overloaded, they cause the tests to > take far longer than necessary on decently-fast machines. Yeah, I have no doubt that it has been a pain, and this patch is clearly not a bullet-proof solution. > So ideally > we'd get rid of those entirely in favor of some more-dynamic approach. > Admittedly, I have no proposal for what that would be. The fault injection framework that was previously discussed would cover most of the usecase I can think of, but that's a way bigger project. > But adding yet > more ways to set a (guaranteed-to-be-wrong) timeout seems like the > wrong direction to be going in. Fair enough, I'll mark the patch as rejected then. > What's the actual need that you're trying to deal with? Testing the correct behavior of non trivial commands, such as CIC/reindex concurrently, that fails during the execution.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote: >> What's the actual need that you're trying to deal with? > Testing the correct behavior of non trivial commands, such as CIC/reindex > concurrently, that fails during the execution. Hmm ... don't see how a timeout helps with that? regards, tom lane
On Sat, Mar 07, 2020 at 04:09:31PM -0500, Tom Lane wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote: > >> What's the actual need that you're trying to deal with? > > > Testing the correct behavior of non trivial commands, such as CIC/reindex > > concurrently, that fails during the execution. > > Hmm ... don't see how a timeout helps with that? For reindex concurrently, a SELECT FOR UPDATE on a different connection can ensure that the reindex will be stuck at some point, so canceling the command after a long enough timeout reproduces the original faulty behavior.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Sat, Mar 07, 2020 at 04:09:31PM -0500, Tom Lane wrote: >> Julien Rouhaud <rjuju123@gmail.com> writes: >>> On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote: >>>> What's the actual need that you're trying to deal with? >>> Testing the correct behavior of non trivial commands, such as CIC/reindex >>> concurrently, that fails during the execution. >> Hmm ... don't see how a timeout helps with that? > For reindex concurrently, a SELECT FOR UPDATE on a different connection can > ensure that the reindex will be stuck at some point, so canceling the command > after a long enough timeout reproduces the original faulty behavior. Hmm, seems like a pretty arbitrary (and slow) way to test that. I'd envision testing that by setting up a case with an expression index where the expression is designed to fail at some point partway through the build -- say, with a divide-by-zero triggered by one of the tuples to be indexed. regards, tom lane
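A sketch of the data-driven failure Tom describes, using a hypothetical table; shown here with CREATE INDEX CONCURRENTLY, where the failed build leaves the index behind marked INVALID:

CREATE TABLE t_div (a int);
INSERT INTO t_div VALUES (1), (0);
CREATE INDEX CONCURRENTLY t_div_idx ON t_div ((1 / a));
-- ERROR:  division by zero
-- \d t_div now lists t_div_idx as INVALID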
On Sat, Mar 07, 2020 at 04:23:58PM -0500, Tom Lane wrote: > Hmm, seems like a pretty arbitrary (and slow) way to test that. I'd > envision testing that by setting up a case with an expression index > where the expression is designed to fail at some point partway through > the build -- say, with a divide-by-zero triggered by one of the tuples > to be indexed. I am not sure about that. I think it is very tricky to get an invalid index _ccold after the swap phase with what the existing test facility provides, because the new index is already built at the point where the dependencies are switched, so you cannot rely on a failure when building the index. Note also that some tests of CREATE INDEX CONCURRENTLY rely on uniqueness violations to create invalid index entries (division by zero is fine as well). And, actually, if you rely on that, you can get invalid _ccnew entries easily: create table aa (a int); insert into aa values (1),(1); create unique index concurrently aai on aa (a); reindex index concurrently aai; =# \d aa Table "public.aa" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | Indexes: "aai" UNIQUE, btree (a) INVALID "aai_ccnew" UNIQUE, btree (a) INVALID That's before the dependency swapping is done though... With a fault injection facility, it would be possible to test the stability of the operation by enforcing, for example, failures after the start of each inner transaction of REINDEX CONCURRENTLY. -- Michael
On Fri, Mar 06, 2020 at 01:36:48PM +0100, Julien Rouhaud wrote: > Ah I see, thanks for the clarification. I guess there's room for improvement > in the comments about that, since the ERRCODE_FEATURE_NOT_SUPPORTED usage is > quite misleading there. > > v4 attached, which doesn't prevent a REINDEX INDEX CONCURRENTLY on any invalid > non-TOAST index anymore. Thanks. The position of the error check in reindex_relation() is correct, but as it opens a relation for the cache lookup let's invent a new routine in lsyscache.c to grab pg_index.indisvalid. It is possible to make use of this routine with all the other checks: - WARNING for REINDEX TABLE (non-concurrent) - ERROR for REINDEX INDEX (non-concurrent) - ERROR for REINDEX INDEX CONCURRENTLY (There is already a WARNING for REINDEX TABLE CONCURRENTLY.) I did not find the addition of an error check in ReindexIndex() consistent with the existing practice to check the state of the relation reindexed in reindex_index() (for the non-concurrent case) and ReindexRelationConcurrently() (for the concurrent case). Okay, this leads to the introduction of two new ERROR messages related to invalid toast indexes for the concurrent and the non-concurrent cases when using REINDEX INDEX instead of one, but having two messages leads to something much more consistent with the rest, and all checks remain centralized in the same routines. For the index-level operation, issuing a WARNING is not consistent with the existing practice to use an ERROR, which is more adapted as the operation is done on a single index at a time. For the check in reindex_relation, it is more consistent to check the namespace of the index instead of the parent relation IMO (the previous patch used "rel", which refers to the parent table). This has in practice no consequence though. It would have been nice to test this stuff. However, this requires an invalid toast index and we cannot create that except by canceling a concurrent reindex, leading us back to the upthread discussion about isolation tests, timeouts and fault injection :/ Any opinions? -- Michael
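In SQL terms, the lookup such an lsyscache helper would wrap is simply this (index name taken from the earlier example, purely for illustration):

SELECT indisvalid
  FROM pg_index
 WHERE indexrelid = 'pg_toast.pg_toast_16385_index_ccold'::regclass;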
On Mon, Mar 09, 2020 at 02:52:31PM +0900, Michael Paquier wrote: > On Fri, Mar 06, 2020 at 01:36:48PM +0100, Julien Rouhaud wrote: > > > > v4 attached, which doesn't prevent a REINDEX INDEX CONCURRENTLY on any invalid > > non-TOAST index anymore. > > Thanks. The position of the error check in reindex_relation() is > correct, but as it opens a relation for the cache lookup let's invent > a new routine in lsyscache.c to grab pg_index.indisvalid. It is > possible to make use of this routine with all the other checks: > - WARNING for REINDEX TABLE (non-conurrent) > - ERROR for REINDEX INDEX (non-conurrent) > - ERROR for REINDEX INDEX CONCURRENTLY > (There is already a WARNING for REINDEX TABLE CONCURRENTLY.) > > I did not find the addition of an error check in ReindexIndex() > consistent with the existing practice to check the state of the > relation reindexed in reindex_index() (for the non-concurrent case) > and ReindexRelationConcurrently() (for the concurrent case). Okay, > this leads to the introduction of two new ERROR messages related to > invalid toast indexes for the concurrent and the non-concurrent cases > when using REINDEX INDEX instead of one, but having two messages leads > to something much more consistent with the rest, and all checks remain > centralized in the same routines. I wanted to go this way at first but hesitated and finally chose to add less checks, so I'm fine with this approach, and patch looks good to me. > For the index-level operation, issuing a WARNING is not consistent > with the existing practice to use an ERROR, which is more adapted as > the operation is done on a single index at a time. Agreed. > For the check in reindex_relation, it is more consistent to check the > namespace of the index instead of the parent relation IMO (the > previous patch used "rel", which refers to the parent table). This > has in practice no consequence though. Oops yes. > It would have been nice to test this stuff. However, this requires an > invalid toast index and we cannot create that except by canceling a > concurrent reindex, leading us back to the upthread discussion about > isolation tests, timeouts and fault injection :/ Yes, unfortunately I don't see an acceptable way to add tests for that without some kind of fault injection, so this will have to wait :(
On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote: > The arbitrarily-set timeouts that exist in some of the isolation tests > are horrid kluges that have caused us lots of headaches in the past > and no doubt will again in the future. Aside from occasionally failing > when a machine is particularly overloaded, they cause the tests to > take far longer than necessary on decently-fast machines. So ideally > we'd get rid of those entirely in favor of some more-dynamic approach. > Admittedly, I have no proposal for what that would be. But adding yet > more ways to set a (guaranteed-to-be-wrong) timeout seems like the > wrong direction to be going in. What's the actual need that you're > trying to deal with? As a matter of fact, the buildfarm member petalura just reported a failure with the isolation test "timeouts", the machine being extremely slow: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-08%2011%3A20%3A05 test timeouts ... FAILED 60330 ms [...] -step update: DELETE FROM accounts WHERE accountid = 'checking'; <waiting ...> -step update: <... completed> +step update: DELETE FROM accounts WHERE accountid = 'checking'; ERROR: canceling statement due to statement timeout -- Michael
On Mon, Mar 09, 2020 at 04:47:27PM +0900, Michael Paquier wrote: > On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote: > > The arbitrarily-set timeouts that exist in some of the isolation tests > > are horrid kluges that have caused us lots of headaches in the past > > and no doubt will again in the future. Aside from occasionally failing > > when a machine is particularly overloaded, they cause the tests to > > take far longer than necessary on decently-fast machines. So ideally > > we'd get rid of those entirely in favor of some more-dynamic approach. > > Admittedly, I have no proposal for what that would be. But adding yet > > more ways to set a (guaranteed-to-be-wrong) timeout seems like the > > wrong direction to be going in. What's the actual need that you're > > trying to deal with? > > As a matter of fact, the buildfarm member petalura just reported a > failure with the isolation test "timeouts", the machine being > extremely slow: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-08%2011%3A20%3A05 > > test timeouts ... FAILED 60330 ms > [...] > -step update: DELETE FROM accounts WHERE accountid = 'checking'; <waiting ...> > -step update: <... completed> > +step update: DELETE FROM accounts WHERE accountid = 'checking'; > ERROR: canceling statement due to statement timeout Indeed. I guess we could add some kind of environment variable facility in isolationtester to let slow machine owner put a way bigger timeout without making the test super slow for everyone else, but that seems overkill for just one test, and given the other thread about deploying REL_11 build-farm client, that wouldn't be an immediate fix either.
Hi, On 2020-03-07 22:17:09 +0100, Julien Rouhaud wrote: > For reindex concurrently, a SELECT FOR UPDATE on a different connection can > ensure that the reindex will be stuck at some point, so canceling the command > after a long enough timeout reproduces the original faulty behavior. That kind of thing can already be done using statement_timeout or lock_timeout, no? Greetings, Andres Freund
On Mon, Mar 09, 2020 at 03:15:58PM -0700, Andres Freund wrote: > On 2020-03-07 22:17:09 +0100, Julien Rouhaud wrote: >> For reindex concurrently, a SELECT FOR UPDATE on a different connection can >> ensure that the reindex will be stuck at some point, so canceling the command >> after a long enough timeout reproduces the original faulty behavior. > > That kind of thing can already be done using statement_timeout or > lock_timeout, no? Yep, still that's not something I would recommend to commit in the tree as that's a double-edged sword as you already know. For slower machines, you need a statement_timeout large enough so as you make sure that the state you want the query to wait for is reached, which has a cost on all other faster machines as it makes the tests slower. -- Michael
Michael Paquier <michael@paquier.xyz> writes: > On Mon, Mar 09, 2020 at 03:15:58PM -0700, Andres Freund wrote: >> That kind of thing can already be done using statement_timeout or >> lock_timeout, no? > Yep, still that's not something I would recommend to commit in the > tree as that's a double-edged sword as you already know. For slower > machines, you need a statement_timeout large enough so as you make > sure that the state you want the query to wait for is reached, which > has a cost on all other faster machines as it makes the tests slower. It strikes me to wonder whether we could improve matters by teaching isolationtester to watch for particular values in a connected backend's pg_stat_activity.wait_event_type/wait_event columns. Those columns didn't exist when isolationtester was designed, IIRC, so it's not surprising that they're not used in the current design. But we could use them perhaps to detect that a backend has arrived at some state that's not a heavyweight-lock-wait state. regards, tom lane
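The wait-state information Tom refers to is already exposed through SQL, so the check could be as simple as polling something like this (the pid being the backend under test):

SELECT wait_event_type, wait_event
  FROM pg_stat_activity
 WHERE pid = 12345;  -- hypothetical pid of the session being watched
-- e.g. consider the step "blocked" once this reports the expected
-- wait_event_type/wait_event pair rather than a heavyweight-lock wait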
On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote: > It strikes me to wonder whether we could improve matters by teaching > isolationtester to watch for particular values in a connected backend's > pg_stat_activity.wait_event_type/wait_event columns. Those columns > didn't exist when isolationtester was designed, IIRC, so it's not > surprising that they're not used in the current design. But we could > use them perhaps to detect that a backend has arrived at some state > that's not a heavyweight-lock-wait state. Interesting idea. So that would be basically an equivalent of PostgresNode::poll_query_until but for the isolation tester? In short we gain a meta-command that runs a SELECT query that waits until the query defined in the command returns true. The polling interval may be tricky to set though. If set too low it would consume resources for nothing, and if set too large it would make the tests using this meta-command slower than they actually need to be. Perhaps something like 100ms may be fine.. -- Michael
On Mon, Mar 09, 2020 at 08:04:27AM +0100, Julien Rouhaud wrote: > On Mon, Mar 09, 2020 at 02:52:31PM +0900, Michael Paquier wrote: >> For the index-level operation, issuing a WARNING is not consistent >> with the existing practice to use an ERROR, which is more adapted as >> the operation is done on a single index at a time. > > Agreed. Thanks for checking the patch. >> It would have been nice to test this stuff. However, this requires an >> invalid toast index and we cannot create that except by canceling a >> concurrent reindex, leading us back to the upthread discussion about >> isolation tests, timeouts and fault injection :/ > > Yes, unfortunately I don't see an acceptable way to add tests for that without > some kind of fault injection, so this will have to wait :( Let's discuss that separately. I have also been reviewing the isolation test you have added upthread about the dependency handling of invalid indexes, and one thing that we cannot really do is attempting a reindex at index or table level with invalid toast indexes as this leads to unstable ERROR or WARNING messages. But at least one thing we can do is to extend the query you sent directly so that it exposes the toast relation name filtered with regexp_replace(). This gives us a stable output, and this way the test makes sure that the query cancellation happened after the dependencies are swapped, and not at build or validation time (indisvalid got appended to the end of the output): +pg_toast.pg_toast_<oid>_index_ccoldf +pg_toast.pg_toast_<oid>_indext Please feel free to see the attached for reference, that's not something for commit in upstream, but I am going to keep that around in my own plugin tree. -- Michael
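A sketch of the kind of masking query described here, assuming the test table is named test_toast; hiding the toast table's OID keeps the expected output stable across runs:

SELECT regexp_replace(indexrelid::regclass::text, '\d+', '<oid>') AS index_name,
       indisvalid
  FROM pg_index
 WHERE indrelid = (SELECT reltoastrelid FROM pg_class WHERE relname = 'test_toast')
 ORDER BY 1;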
Michael Paquier <michael@paquier.xyz> writes: > On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote: >> It strikes me to wonder whether we could improve matters by teaching >> isolationtester to watch for particular values in a connected backend's >> pg_stat_activity.wait_event_type/wait_event columns. Those columns >> didn't exist when isolationtester was designed, IIRC, so it's not >> surprising that they're not used in the current design. But we could >> use them perhaps to detect that a backend has arrived at some state >> that's not a heavyweight-lock-wait state. > Interesting idea. So that would be basically an equivalent of > PostgresNode::poll_query_until but for the isolation tester? No, more like the existing isolationtester wait query, which watches for something being blocked on a heavyweight lock. Right now, that one depends on a bespoke function pg_isolation_test_session_is_blocked(), but it used to be a query on pg_stat_activity/pg_locks. > In short > we gain a meta-command that runs a SELECT query that waits until the > query defined in the command returns true. The polling interval may > be tricky to set though. I think it'd be just the same as the polling interval for the existing wait query. We'd have to have some way to mark a script step to say what to check to decide that it's blocked ... regards, tom lane
On Tue, Mar 10, 2020 at 12:09:42PM +0900, Michael Paquier wrote: > On Mon, Mar 09, 2020 at 08:04:27AM +0100, Julien Rouhaud wrote: >> Agreed. > > Thanks for checking the patch. And applied as 61d7c7b. Regarding the isolation tests, let's brainstorm on what we can do, but I am afraid that it is too late for 13. -- Michael
On Tue, Mar 10, 2020 at 12:09:12AM -0400, Tom Lane wrote: > Michael Paquier <michael@paquier.xyz> writes: > > On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote: > >> It strikes me to wonder whether we could improve matters by teaching > >> isolationtester to watch for particular values in a connected backend's > >> pg_stat_activity.wait_event_type/wait_event columns. Those columns > >> didn't exist when isolationtester was designed, IIRC, so it's not > >> surprising that they're not used in the current design. But we could > >> use them perhaps to detect that a backend has arrived at some state > >> that's not a heavyweight-lock-wait state. > > > Interesting idea. So that would be basically an equivalent of > > PostgresNode::poll_query_until but for the isolation tester? > > No, more like the existing isolationtester wait query, which watches > for something being blocked on a heavyweight lock. Right now, that > one depends on a bespoke function pg_isolation_test_session_is_blocked(), > but it used to be a query on pg_stat_activity/pg_locks. Ah interesting indeed! > > In short > > we gain a meta-command that runs a SELECT query that waits until the > > query defined in the command returns true. The polling interval may > > be tricky to set though. > > I think it'd be just the same as the polling interval for the existing > wait query. We'd have to have some way to mark a script step to say > what to check to decide that it's blocked ... So basically we could just change pg_isolation_test_session_is_blocked() to also return the wait_event_type and wait_event, and adding something like step "<name>" { SQL } [ cancel on "<wait_event_type>" "<wait_event>" ] to the step definition should be enough. I'm attaching a POC patch for that. On my laptop, the full test now completes in about 400ms. FTR the REINDEX TABLE CONCURRENTLY case is eventually locked on a virtualxid, I'm not sure if that could lead to too early cancellation.
On Tue, Mar 10, 2020 at 02:53:36PM +0100, Julien Rouhaud wrote: > So basically we could just change pg_isolation_test_session_is_blocked() to > also return the wait_event_type and wait_event, and adding something like Hmm. I think that Tom has in mind the reasons behind 511540d here. > step "<name>" { SQL } [ cancel on "<wait_event_type>" "<wait_event>" ] > > to the step definition should be enough. I'm attaching a POC patch for that. > On my laptop, the full test now complete in about 400ms. Not much a fan of that per the lack of flexibility, but we have a single function to avoid a huge performance impact when using CLOBBER_CACHE_ALWAYS, so we cannot really use a SQL-based logic either... > FTR the REINDEX TABLE CONCURRENTLY case is eventually locked on a virtualxid, > I'm not sure if that's could lead to too early cancellation. WaitForLockersMultiple() is called three times in this case, but your test case is waiting on a lock to be released for the old index which REINDEX CONCURRENTLY would like to drop at the beginning of step 5, so this should work reliably here. > + TupleDescInitEntry(tupdesc, (AttrNumber) 3, "wait_even", > + TEXTOID, -1, 0); Guess who is missing a 't' here. pg_isolation_test_session_is_blocked() is not documented and it is only used internally in the isolation test suite, so breaking its compatibility should be fine in practice.. Now you are actually changing it so as we get a more complex state of the blocked session, so I think that we should use a different function name, and a different function. Like pg_isolation_test_session_state? -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Mar 10, 2020 at 02:53:36PM +0100, Julien Rouhaud wrote:
>> So basically we could just change pg_isolation_test_session_is_blocked()
>> to also return the wait_event_type and wait_event, and adding something
>> like

> Hmm.  I think that Tom has in mind the reasons behind 511540d here.

Yeah, that history suggests that we need to be very protective of the
performance of the wait-checking query, especially in
CLOBBER_CACHE_ALWAYS builds.  That being the case, I'm hesitant to
consider changing the test function to return a tuple.  That'll add
quite a lot of overhead due to the cache lookups involved, or so my gut
says.

I'm also finding the proposed semantics (issue a cancel if wait state X
is reached) to be odd and special-purpose.  I was envisioning something
more like "if wait state X is reached, consider the session to be
blocked, the same as if it had reached a heavyweight-lock wait".  Then
isolationtester would move on to issue another step, which is where I'd
envision putting the cancel for that particular test usage.

So that idea leads to thinking that the wait-state specification is an
input to pg_isolation_test_session_is_blocked, not an output.  We could
re-use Julien's ideas about the isolation spec syntax by making it be,
roughly,

    step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]

and then those items would need to be passed as parameters of the
prepared query.  Or maybe we should use two different prepared queries
depending on whether there's a BLOCKED IF spec.  We probably don't need
lock-wait detection if we're expecting a wait-state-based block, so
maybe we should invent a separate backend function "is this process
waiting with this type of wait state" and use that to check the state
of a step that has this type of annotation.

Just eyeing the proposed test case, I'm wondering whether this will
actually be sufficiently fine-grained.  It seems like "REINDEX has
reached a wait on a virtual XID" is not really all that specific; it
could match on other situations, such as blocking on a concurrent tuple
update.  Maybe it's okay given the restrictive context that we don't
expect anything to be happening that the isolation test didn't ask for.

I'd like to see an attempt to rewrite some of the existing
timeout-dependent test cases to use this facility instead of long
timeouts.  If we could get rid of the timeouts in the deadlock tests,
that'd go a long way towards showing that this idea is actually any
good.

			regards, tom lane
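For reference only, the "is this process waiting with this type of wait
state" check described above could be approximated in plain SQL as
below, with the expected type and event passed as parameters; an actual
implementation would presumably be a dedicated backend function, for
the performance reasons already discussed:

    -- sketch: report whether backend $1 currently shows the expected
    -- wait_event_type ($2) and wait_event ($3) in pg_stat_activity
    SELECT EXISTS (
        SELECT 1
          FROM pg_catalog.pg_stat_activity
         WHERE pid = $1
           AND wait_event_type = $2
           AND wait_event = $3) AS in_expected_wait_state;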
On 2020-Mar-11, Tom Lane wrote:
> We could re-use Julien's ideas about the isolation spec syntax by
> making it be, roughly,
>
> 	step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]
>
> and then those items would need to be passed as parameters of the
> prepared query.

I think for test readability's sake, it'd be better to put the BLOCKED
IF clause ahead of the SQL, so you can write it on the same line and let
the SQL flow to the next one:

    STEP "long_select" BLOCKED IF "lwlock" "ClogControlLock"
    { select foo from pg_class where ... some more long clauses ... }

otherwise I think a step would require more lines to write.

> I'd like to see an attempt to rewrite some of the existing
> timeout-dependent test cases to use this facility instead of
> long timeouts.  If we could get rid of the timeouts in the
> deadlock tests, that'd go a long way towards showing that this
> idea is actually any good.

+1.  Those long timeouts are annoying enough that infrastructure to make
a run shorter in normal circumstances might be sufficient justification
for this patch ...

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Mar 11, 2020 at 05:52:54PM -0300, Alvaro Herrera wrote:
> On 2020-Mar-11, Tom Lane wrote:
>> We could re-use Julien's ideas about the isolation spec syntax by
>> making it be, roughly,
>>
>> 	step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]
>>
>> and then those items would need to be passed as parameters of the
>> prepared query.
>
> I think for test readability's sake, it'd be better to put the BLOCKED
> IF clause ahead of the SQL, so you can write it on the same line and
> let the SQL flow to the next one:
>
>     STEP "long_select" BLOCKED IF "lwlock" "ClogControlLock"
>     { select foo from pg_class where ... some more long clauses ... }
>
> otherwise I think a step would require more lines to write.

I prefer this version.

>> I'd like to see an attempt to rewrite some of the existing
>> timeout-dependent test cases to use this facility instead of
>> long timeouts.  If we could get rid of the timeouts in the
>> deadlock tests, that'd go a long way towards showing that this
>> idea is actually any good.
>
> +1.  Those long timeouts are annoying enough that infrastructure to
> make a run shorter in normal circumstances might be sufficient
> justification for this patch ...

+1.  A patch does not seem to be that complicated.  Now, isn't it too
late for v13?
--
Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes:
> +1.  A patch does not seem to be that complicated.  Now, isn't it too
> late for v13?

I think we've generally given new tests more slack than new features so
far as schedule goes.  If the patch ends up being complicated/invasive,
I might vote to hold it for v14, but let's see it first.

			regards, tom lane
On Wed, Mar 11, 2020 at 05:52:54PM -0300, Alvaro Herrera wrote:
> On 2020-Mar-11, Tom Lane wrote:
>
> > We could re-use Julien's ideas about the isolation spec syntax by
> > making it be, roughly,
> >
> > 	step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]
> >
> > and then those items would need to be passed as parameters of the
> > prepared query.
>
> I think for test readability's sake, it'd be better to put the BLOCKED
> IF clause ahead of the SQL, so you can write it on the same line and
> let the SQL flow to the next one:
>
>     STEP "long_select" BLOCKED IF "lwlock" "ClogControlLock"
>     { select foo from pg_class where ... some more long clauses ... }
>
> otherwise I think a step would require more lines to write.
>
> > I'd like to see an attempt to rewrite some of the existing
> > timeout-dependent test cases to use this facility instead of
> > long timeouts.  If we could get rid of the timeouts in the
> > deadlock tests, that'd go a long way towards showing that this
> > idea is actually any good.
>
> +1.  Those long timeouts are annoying enough that infrastructure to
> make a run shorter in normal circumstances might be sufficient
> justification for this patch ...

I'm not familiar with those tests, so I'm probably missing something,
but it looks like all isolation tests that set up a timeout do so to
test server-side features (deadlock detection, statement and lock
timeout).  I'm not sure how adding a client-side facility to detect
locks earlier is going to help reduce the server-side timeouts.

For the REINDEX CONCURRENTLY failure test, the problem that needs to be
solved isn't detecting that the command is blocked, as it's already
getting blocked on a heavyweight lock, but being able to reliably cancel
a specific query as early as possible, which AFAICS isn't possible with
the current isolationtester:

- either we reliably cancel the query using a statement timeout, but we
  make the test slow for everyone;

- or we send a blind pg_cancel_backend() (roughly sketched below),
  hoping that we don't catch anything else (and also making the test
  slower than required, to make sure the query isn't canceled too
  early).

So we would actually only need something like this to make it work:

    step "<name>" [ CANCEL IF BLOCKED ] { <SQL> }
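To make the second option concrete, this is the kind of blind
cancellation a spec has to fall back on today; illustration only, not
taken from any posted patch, and the query-text filter is just an
assumption for the example:

    -- blindly cancel whatever other session looks like it is running the
    -- REINDEX, and hope nothing else matches
    SELECT pg_cancel_backend(pid)
      FROM pg_stat_activity
     WHERE query ILIKE '%reindex%'
       AND pid <> pg_backend_pid();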
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Wed, Mar 11, 2020 at 05:52:54PM -0300, Alvaro Herrera wrote:
>> On 2020-Mar-11, Tom Lane wrote:
>>> I'd like to see an attempt to rewrite some of the existing
>>> timeout-dependent test cases to use this facility instead of
>>> long timeouts.

>> +1.  Those long timeouts are annoying enough that infrastructure to
>> make a run shorter in normal circumstances might be sufficient
>> justification for this patch ...

> I'm not familiar with those tests, so I'm probably missing something,
> but it looks like all isolation tests that set up a timeout do so to
> test server-side features (deadlock detection, statement and lock
> timeout).  I'm not sure how adding a client-side facility to detect
> locks earlier is going to help reduce the server-side timeouts.

The point is that those timeouts have to be set long enough for even a
very slow machine to reach a desired state before the timeout happens;
on faster machines the test is just uselessly sleeping for a long time,
because of the fixed timeout.  My thought was that maybe the tests could
be recast as "watch for session to reach $expected_state and then do the
next thing", allowing them to be automatically adaptive to the machine's
speed.  This might require some rather subtle test redesign and/or
addition of more infrastructure (to allow recognition of the desired
state and/or taking an appropriate next action).  I'm prepared to
believe that not much can be done about timeouts.spec in particular, but
it seems to me that the long delays in the deadlock tests are not
inherent in what we need to test.

> For the REINDEX CONCURRENTLY failure test, the problem that needs to be
> solved isn't detecting that the command is blocked, as it's already
> getting blocked on a heavyweight lock, but being able to reliably
> cancel a specific query as early as possible, which AFAICS isn't
> possible with the current isolationtester:

Right, it's the same thing of needing to wait till the backend has
reached a particular state before you do the next thing.

> So we would actually only need something like this to make it work:
>     step "<name>" [ CANCEL IF BLOCKED ] { <SQL> }

I continue to resist the idea of hard-wiring this feature to query
cancel as the action-to-take.  That will more or less guarantee that
it's not good for anything but this one test case.  I think that the
feature should have the behavior of "treat this step as blocked once
it's reached state X", and then you make the next step in the
permutation be one that issues a query cancel.  (Possibly, using
pg_stat_activity and pg_cancel_backend for that will be painful enough
that we'd want to invent separate script syntax that says "send a cancel
to session X".  But that's a separate discussion.)

			regards, tom lane
On Fri, Mar 13, 2020 at 10:12:20AM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > I'm not familiar with those tests, so I'm probably missing something,
> > but it looks like all isolation tests that set up a timeout do so to
> > test server-side features (deadlock detection, statement and lock
> > timeout).  I'm not sure how adding a client-side facility to detect
> > locks earlier is going to help reduce the server-side timeouts.
>
> The point is that those timeouts have to be set long enough for even a
> very slow machine to reach a desired state before the timeout happens;
> on faster machines the test is just uselessly sleeping for a long time,
> because of the fixed timeout.  My thought was that maybe the tests
> could be recast as "watch for session to reach $expected_state and then
> do the next thing", allowing them to be automatically adaptive to the
> machine's speed.  This might require some rather subtle test redesign
> and/or addition of more infrastructure (to allow recognition of the
> desired state and/or taking an appropriate next action).  I'm prepared
> to believe that not much can be done about timeouts.spec in particular,
> but it seems to me that the long delays in the deadlock tests are not
> inherent in what we need to test.

Ah, I see.  I'll try to see if that could help the deadlock tests, but
such a feature would certainly let us get rid of the two pg_sleep(5)
calls in tuplelock-update.

It seems that for all the possibly interesting cases, what we want to
wait on is a heavyweight lock, which is already what isolationtester
detects.  Maybe we could simply implement something like

    step "<name>" [ WAIT UNTIL BLOCKED ] { <SQL> }

without any change to the blocking detection function?

> > For the REINDEX CONCURRENTLY failure test, the problem that needs to
> > be solved isn't detecting that the command is blocked, as it's
> > already getting blocked on a heavyweight lock, but being able to
> > reliably cancel a specific query as early as possible, which AFAICS
> > isn't possible with the current isolationtester:
>
> Right, it's the same thing of needing to wait till the backend has
> reached a particular state before you do the next thing.
>
> > So we would actually only need something like this to make it work:
> >     step "<name>" [ CANCEL IF BLOCKED ] { <SQL> }
>
> I continue to resist the idea of hard-wiring this feature to query
> cancel as the action-to-take.  That will more or less guarantee that
> it's not good for anything but this one test case.  I think that the
> feature should have the behavior of "treat this step as blocked once
> it's reached state X", and then you make the next step in the
> permutation be one that issues a query cancel.  (Possibly, using
> pg_stat_activity and pg_cancel_backend for that will be painful enough
> that we'd want to invent separate script syntax that says "send a
> cancel to session X".  But that's a separate discussion.)

I agree.  A new step option to kill a session rather than executing SQL
would go perfectly with the above new active-wait-for-blocking-state
feature.
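As a reminder of what that heavyweight-lock detection amounts to, a
crude SQL approximation is below, with $1 the step's backend PID and $2
the array of test session PIDs; the real
pg_isolation_test_session_is_blocked() implements this in C for the
performance reasons mentioned upthread:

    -- sketch: backend $1 counts as blocked once one of the other test
    -- sessions (array $2) holds the heavyweight lock it is waiting for
    SELECT pg_catalog.pg_blocking_pids($1) && $2 AS blocked_by_test_session;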
Julien Rouhaud <rjuju123@gmail.com> writes:
> It seems that for all the possibly interesting cases, what we want to
> wait on is a heavyweight lock, which is already what isolationtester
> detects.  Maybe we could simply implement something like
>     step "<name>" [ WAIT UNTIL BLOCKED ] { <SQL> }
> without any change to the blocking detection function?

Um, isn't that the existing built-in behavior?  I could actually imagine
some uses for the reverse option: *don't* wait for it to become blocked,
but just immediately continue with issuing the next step.

			regards, tom lane