Thread: Deadlock while doing VACUUM??
I've been experimenting with Matthew T. O'Connor's pg_avd (auto vacuum daemon), which is actually quite an interesting piece of software that so far seems to function pretty well. I've ported it to PG 7.2.4, which is the version I'm running on my system, and experimenting with it has uncovered what I can only consider a potentially serious bug in PG. When a heavy INSERT or UPDATE load on a table is occurring (lots of quick INSERTs or UPDATEs within a single transaction), a VACUUM ANALYZE (or just straight VACUUM) has a really good chance (10% or so) of causing either the INSERT/UPDATE or the VACUUM to fail with a "deadlock detected" error. Usually the INSERT/UPDATE is what fails. I have verified this with manual VACUUM and VACUUM ANALYZE commands issued on the command line while a perl script is generating a heavy load of inserts and/or updates, so this isn't pg_avd's fault. A straight ANALYZE on the table doesn't seem to trigger the problem, while a straight VACUUM does. Statistics collection is enabled -- it's required for pg_avd to run -- but I suspect that this is irrelevant (what does a straight VACUUM do with the statistics tables, if anything?). Even dropping the table and all of its indexes and recreating it does nothing to solve this problem, so I don't have any reason to suspect corruption of the table itself. Corruption elsewhere is a possibility, I suppose, but I haven't noticed any strangeness elsewhere. The only other thing of note is that a REALLY LONG running INSERT ... SELECT (with the source being a temporary table) is running concurrently on a different table in a different database. That, too, is running within a transaction, but I don't have reason to believe that it's having an effect. When it completes I'll do more testing to determine if it really did have an effect. I've searched the archives for references to this and have found nothing relevant (all references I found are to older implementations in which VACUUM locks the table). Have I managed to uncover something new here? I don't have 7.3.x (or later) installed so I can't test this myself on that or CVS tip, but I may have to if nobody else has the time. Anyone have any suggestions on what to do about this? This problem pretty much kills pg_avd's usefulness, at least on 7.2.x. :-( I'll be happy to supply the Perl script I'm using to do the inserts, if that'll help people track this down... -- Kevin Brown kevin@sysexperts.com
Kevin Brown <kevin@sysexperts.com> writes: > When a heavy INSERT or UPDATE load on a table is occurring (lots of > quick INSERTs or UPDATEs within a single transaction), a VACUUM > ANALYZE (or just straight VACUUM) has a really good chance (10% or so) > of causing either the INSERT/UPDATE or the VACUUM to fail with a > "deadlock detected" error. I was unable to replicate this in CVS tip, using "pgbench -c 10 -t 1000" as the load generator. regards, tom lane
> Even dropping the table and all of its indexes and recreating it does > nothing to solve this problem, so I don't have any reason to suspect > corruption of the table itself. Corruption elsewhere is a > possibility, I suppose, but I haven't noticed any strangeness > elsewhere. This sounds very similiar to problems I was having under the thread: Re: [HACKERS] BUG: Vacuum Analyze - datumGetSize: Invalid typLen 0 Dropping the table, indexes, and recreating it didn't help me either. I had to drop the database! What kind of platform are you on? 64-bit? I've noticed that 7.4-snapshot fixed a huge amount of bugs, but the vacuum analyze bug is still there. -- Thomas
Tom Lane wrote: > Kevin Brown <kevin@sysexperts.com> writes: > > When a heavy INSERT or UPDATE load on a table is occurring (lots of > > quick INSERTs or UPDATEs within a single transaction), a VACUUM > > ANALYZE (or just straight VACUUM) has a really good chance (10% or so) > > of causing either the INSERT/UPDATE or the VACUUM to fail with a > > "deadlock detected" error. > > I was unable to replicate this in CVS tip, using "pgbench -c 10 -t 1000" > as the load generator. I guess I should mention that I was doing VACUUM on the individual tables, not the entire database. I have no idea if that makes any difference. I'll see about compiling and installing CVS tip for testing. I'll also see about doing some testing on 7.3.2. Can anyone here independently verify what I've found on 7.2.4? You'll probably have to run a process that does continuous INSERTs on a table, and another that does continuous VACUUMs on the same table at the same time. -- Kevin Brown kevin@sysexperts.com
Tom Lane wrote: > Kevin Brown <kevin@sysexperts.com> writes: > > When a heavy INSERT or UPDATE load on a table is occurring (lots of > > quick INSERTs or UPDATEs within a single transaction), a VACUUM > > ANALYZE (or just straight VACUUM) has a really good chance (10% or so) > > of causing either the INSERT/UPDATE or the VACUUM to fail with a > > "deadlock detected" error. > > I was unable to replicate this in CVS tip, using "pgbench -c 10 -t 1000" > as the load generator. I finally got 7.3.2 installed, and confirmed that the problem does not exist on that version. So this is something that's limited to the 7.2.x tree. Which, I guess, means that it's not going to get fixed for that tree (I assume that 7.2.x is effectively end-of-lifed)... On 7.3.2, a concurrent VACUUM appears to slow inserts down A LOT, but it won't deadlock them. -- Kevin Brown kevin@sysexperts.com
Kevin Brown <kevin@sysexperts.com> writes: > I finally got 7.3.2 installed, and confirmed that the problem does not > exist on that version. So this is something that's limited to the > 7.2.x tree. Which, I guess, means that it's not going to get fixed > for that tree (I assume that 7.2.x is effectively end-of-lifed)... I dug through the CVS logs, and could not find any entry between 7.2 and 7.3 that seemed like it might fix such a problem. If we knew what the problem was, maybe we could put together a back-patch. I'm not personally eager to spend more time on 7.2.*, though. regards, tom lane