Thread: pgsql: Fix deadlock hazard in CREATE INDEX CONCURRENTLY
Fix deadlock hazard in CREATE INDEX CONCURRENTLY

Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are
supposed to be able to work in parallel, as evidenced by fixes in commit
c3d09b3bd23f specifically to support this case.  In reality, one of the
sessions would be aborted by a mysterious "deadlock detected" error.

Jeff Janes diagnosed that this is because of leftover snapshots used for
system catalog scans -- this was broken by 8aa3e47510b9 keeping track of
(registering) the catalog snapshot.  To fix the deadlocks, it's enough
to de-register that snapshot prior to waiting.

Backpatch to 9.4, which introduced MVCC catalog scans.

Include an isolationtester spec that 8 out of 10 times reproduces the
deadlock with the unpatched code for me (Álvaro).

Author: Jeff Janes
Diagnosed-by: Jeff Janes
Reported-by: Jeremy Finzel
Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/54eff5311d7c8e3d309774713b91e78067d2ad42

Modified Files
--------------
src/backend/commands/indexcmds.c             |  3 +++
src/test/isolation/expected/multiple-cic.out | 19 +++++++++++++
src/test/isolation/isolation_schedule        |  1 +
src/test/isolation/specs/multiple-cic.spec   | 40 ++++++++++++++++++++++++++++
4 files changed, 63 insertions(+)
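[Editor's sketch] The deadlock mechanism described in the commit message can be
modeled in a few lines. This is a hypothetical toy model, not PostgreSQL source:
each CREATE INDEX CONCURRENTLY session must wait until no other session holds a
registered snapshot, so a session that keeps its leftover catalog snapshot
registered while waiting is itself a snapshot holder, and two such sessions wait
on each other. All names here are illustrative.

```python
# Toy model (not PostgreSQL code) of the CIC deadlock and its fix.

class CICSession:
    def __init__(self, name, deregister_before_wait):
        self.name = name
        self.snapshots = {"catalog"}        # leftover from catalog scans
        self.deregister_before_wait = deregister_before_wait

    def begin_wait(self):
        # the fix: de-register the catalog snapshot before waiting
        if self.deregister_before_wait:
            self.snapshots.discard("catalog")

def deadlocks(s1, s2):
    s1.begin_wait()
    s2.begin_wait()
    # s1 waits on s2 iff s2 still holds a snapshot, and vice versa;
    # a deadlock is the cycle s1 -> s2 -> s1
    return bool(s1.snapshots) and bool(s2.snapshots)

print(deadlocks(CICSession("s1", False), CICSession("s2", False)))  # True: unpatched
print(deadlocks(CICSession("s1", True),  CICSession("s2", True)))   # False: patched
```

With the fix applied to both sessions, neither holds a snapshot while waiting,
so the wait-for cycle never forms.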
Hi,

On 2018-01-03 02:20:14 +0000, Alvaro Herrera wrote:
> Fix deadlock hazard in CREATE INDEX CONCURRENTLY
>
> Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are
> supposed to be able to work in parallel, as evidenced by fixes in commit
> c3d09b3bd23f specifically to support this case.  In reality, one of the
> sessions would be aborted by a mysterious "deadlock detected" error.
>
> Jeff Janes diagnosed that this is because of leftover snapshots used for
> system catalog scans -- this was broken by 8aa3e47510b9 keeping track of
> (registering) the catalog snapshot.  To fix the deadlocks, it's enough
> to de-register that snapshot prior to waiting.
>
> Backpatch to 9.4, which introduced MVCC catalog scans.
>
> Include an isolationtester spec that 8 out of 10 times reproduces the
> deadlock with the unpatched code for me (Álvaro).

The isolation test doesn't appear to be fully stable:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=baiji&dt=2018-01-03%2003%3A00%3A01

Greetings,

Andres Freund
Hello,

Andres Freund wrote:
> On 2018-01-03 02:20:14 +0000, Alvaro Herrera wrote:
> > Include an isolationtester spec that 8 out of 10 times reproduces the
> > deadlock with the unpatched code for me (Álvaro).
>
> The isolation test doesn't appear to be fully stable:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=baiji&dt=2018-01-03%2003%3A00%3A01

Yeah, thanks for reporting that.  It's annoying, but obvious in
hindsight -- the second process occasionally gets stuck waiting for the
first one to release its snapshot.  I can reproduce this about half the
time by adding a 100ms sleep in ProcessUtilitySlow right after the
DefineIndex call.

I can make the isolation test produce the output where s2 waits (without
the sleep) by introducing a third session s3 that locks s2; releasing
the lock in s3 then causes both s1 and s2 to complete.  I think the
completions would always be reported in that order, because of the way
isolationtester verifies suspension.

However, this means that the test will have to be removed in 9.4 and
9.5, because isolationtester is not smart enough there.

I suppose the other option would be to add an alternate expected file
for the test.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera wrote:
> However this means that the test will get removed in 9.4 and 9.5 because
> isolationtester is not smart enough there.
>
> I suppose the other option would be to add an alternate expected file
> for the test.

Actually, so far only 9.6 and up have failed.  Maybe the old
isolationtester is different enough that the other thing doesn't happen.

I'm more inclined now to add the alternate file instead of the other
patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Alvaro Herrera wrote:
>> However this means that the test will get removed in 9.4 and 9.5 because
>> isolationtester is not smart enough there.
>>
>> I suppose the other option would be to add an alternate expected file
>> for the test.

> Actually, so far only 9.6 and up have failed.  Maybe the old
> isolationtester is different enough that the other thing doesn't happen.

> I'm more inclined now to add the alternate file instead of the other
> patch.

Meh.  I'd rather have the more stable test going forward; I think
alternate expected-files too easily hide unexpected behavior.  We could
try leaving 9.4/9.5 alone and see if it's true that it doesn't fail
there.  If not, I wouldn't mind losing the test in those branches
--- it's mainly intended to catch future breakage, after all.

			regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Actually, so far only 9.6 and up have failed.  Maybe the old
> > isolationtester is different enough that the other thing doesn't happen.
>
> > I'm more inclined now to add the alternate file instead of the other
> > patch.
>
> Meh.  I'd rather have the more stable test going forward; I think
> alternate expected-files too easily hide unexpected behavior.  We could
> try leaving 9.4/9.5 alone and see if it's true that it doesn't fail
> there.  If not, I wouldn't mind losing the test in those branches
> --- it's mainly intended to catch future breakage, after all.

Makes sense.  Pushed to 9.6 and up.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> Meh.  I'd rather have the more stable test going forward; I think
>> alternate expected-files too easily hide unexpected behavior.  We could
>> try leaving 9.4/9.5 alone and see if it's true that it doesn't fail
>> there.  If not, I wouldn't mind losing the test in those branches
>> --- it's mainly intended to catch future breakage, after all.

> Makes sense.  Pushed to 9.6 and up.

Some of the buildfarm machines still don't like this.  It looks
like the buildfarm script is only capturing the postmaster log
and not regression.diffs, making it hard to diagnose :-(

			regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Tom Lane wrote:
> >> Meh.  I'd rather have the more stable test going forward; I think
> >> alternate expected-files too easily hide unexpected behavior.  We could
> >> try leaving 9.4/9.5 alone and see if it's true that it doesn't fail
> >> there.  If not, I wouldn't mind losing the test in those branches
> >> --- it's mainly intended to catch future breakage, after all.
>
> > Makes sense.  Pushed to 9.6 and up.
>
> Some of the buildfarm machines still don't like this.  It looks
> like the buildfarm script is only capturing the postmaster log
> and not regression.diffs, making it hard to diagnose :-(

Actually, three of these recent failures are showing the diff:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=protosciurus&dt=2018-01-03%2018%3A39%3A46
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=castoroides&dt=2018-01-03%2017%3A31%3A24
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=woodlouse&dt=2018-01-03%2015%3A42%3A03

It proves me wrong about the ordering in which the steps report
completion being consistent:

***************
*** 20,24 ****
  unlck

  t
- step s1i: <... completed>
  step s2i: <... completed>
--- 20,24 ----
  unlck

  t
  step s2i: <... completed>
+ step s1i: <... completed>

Again, this could be solved by just including an alternate file, or we
could go a bit further and report all completed steps in a single line
rather than each in its own line.  This would require patching
isolationtester back to 9.6, but it should be a small fix ... will look
into it after pushing another patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
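[Editor's sketch] The "single line" idea floated in the message above can be
illustrated with a toy model: if isolationtester reported every step that
completed at the same moment as one canonically ordered line, the two
completion orders seen on the buildfarm would produce identical output. The
report format and step names here are illustrative, not isolationtester's
actual output.

```python
# Toy model of order-insensitive completion reporting.

def completion_report(steps):
    # sort so the report is independent of wakeup order
    return "steps completed: " + " ".join(sorted(steps))

order_seen_locally = completion_report(["s1i", "s2i"])
order_seen_on_buildfarm = completion_report(["s2i", "s1i"])
assert order_seen_locally == order_seen_on_buildfarm
print(order_seen_locally)  # steps completed: s1i s2i
```

The tradeoff, as the thread goes on to discuss, is that a canonicalized report
can no longer prove anything about the finishing order in tests where that
order matters.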
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> Some of the buildfarm machines still don't like this.

> It proves me wrong about the ordering in which the steps report
> completion being consistent:

> ***************
> *** 20,24 ****
>   unlck
>
>   t
> - step s1i: <... completed>
>   step s2i: <... completed>
> --- 20,24 ----
>   unlck
>
>   t
>   step s2i: <... completed>
> + step s1i: <... completed>

> Again, this could be solved by just including an alternate file, or we
> could go a bit further and report all completed steps in a single line
> rather than each in its own line.  This would require patching
> isolationtester back to 9.6, but it should be a small fix ... will look
> into it after pushing another patch.

No, I think that's probably a bad idea, because it would mean that in
cases where you do care about the finishing order (which is all of
them up to now), the test output would fail to prove that you got the
expected behavior.

At this point I'm on board with using an alternate expected file.
We could revert the test back to your original version and make
the pre-9.6 branches look the same, which would be good.

			regards, tom lane
Tom Lane wrote:
> No, I think that's probably a bad idea, because it would mean that in
> cases where you do care about the finishing order (which is all of
> them up to now), the test output would fail to prove that you got the
> expected behavior.

Hmm ... I was thinking that we would check all the steps that are
unlocked when running a single one, so the only test that would be
affected would be deadlock-hard, which has this:

step s8a1: LOCK TABLE a1; <waiting ...>
step s8a1: <... completed>
step s7a8: <... completed>
error in steps s8a1 s7a8: ERROR:  deadlock detected

TBH I'm surprised that this one is never reported in the other order
while the same doesn't hold for the new test, but there you have it.

> At this point I'm on board with using an alternate expected file.
> We could revert the test back to your original version and make
> the pre-9.6 branches look the same, which would be good.

I reverted the test change, so the tests are now the same everywhere;
I hadn't touched 9.4 and 9.5 previously.  The failures in 9.6, 10 and
master had been accumulating fast, and we didn't have a single failure
in 9.4 and 9.5, so I'm hoping this means the older branches just cannot
fail in this way.  But we shall see.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 3, 2018 at 4:31 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> step s8a1: LOCK TABLE a1; <waiting ...>
> step s8a1: <... completed>
> step s7a8: <... completed>
> error in steps s8a1 s7a8: ERROR:  deadlock detected
>
> TBH I'm surprised that this is never reported in the other order but
> that this doesn't hold for the new test, but there you have it.

That test was quite unstable at first (although not on the machine
where I developed it, sigh).  But now session s7 sets deadlock_timeout
= '100s' and session s8 sets deadlock_timeout = '10s', which is
apparently sufficient insulation against timing variations.  The
original commit used 10ms and 10s, and that wasn't good enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
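[Editor's sketch] The reason the asymmetric deadlock_timeout settings stabilize
the test can be put in one line: a blocked backend only runs the deadlock
detector once its own deadlock_timeout expires, so the session with the much
smaller timeout reliably detects the deadlock first and becomes the one that
reports the error, regardless of timing jitter. A toy model of that selection
rule (session names taken from the test; the function is illustrative):

```python
# Toy model: the session whose deadlock_timeout fires first while blocked
# runs the deadlock detector and reports the "deadlock detected" error.

def detecting_session(deadlock_timeouts):
    """Return the session whose deadlock check fires first (toy model)."""
    return min(deadlock_timeouts, key=deadlock_timeouts.get)

# s7 at 100s vs s8 at 10s: s8 always detects, making the output stable.
print(detecting_session({"s7": 100.0, "s8": 10.0}))  # s8
```

With the original 10ms/10s pair, the 10ms check could fire before the second
session had even blocked, which is consistent with the instability described
above.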
So we're still not out of the woods with that CREATE INDEX CONCURRENTLY
deadlock test.  Buildfarm member okapi has failed most (not all) of its
runs since that patch went in, but only in the 9.4 branch.  The failures
look like

*** /home/data/buildfarm/root.x86_64/REL9_4_STABLE/pgsql.build/src/test/isolation/expected/multiple-cic.out	Thu Feb 22 02:35:05 2018
--- /home/data/buildfarm/root.x86_64/REL9_4_STABLE/pgsql.build/src/test/isolation/results/multiple-cic.out	Thu Feb 22 02:44:48 2018
***************
*** 14,19 ****
--- 14,20 ----
           WHERE unlck();

  step s1i: <... completed>
+ error in steps s2i s1i: ERROR:  deadlock detected
  s1

as though the bug hadn't been fixed at all.  I have no idea what to make
of this.  I'd easily believe a 9.4-only bug, but why is okapi the only
critter showing it?  It doesn't seem to be a particularly unusual
platform or build configuration.  I thought maybe it was a timing issue,
but fooling about with altering the timing by injecting sleeps hasn't
allowed me to reproduce it here.

I do notice that okapi is running a not-very-new icc version with -O3,
opening the possibility that there's a compiler bug or at least an
undesirable optimization going on.  It might be interesting to find out
whether the failure is still reproducible at lower -O levels.

			regards, tom lane