Thread: vacuumdb parallel has a deadlock detected in 9.5.4
Hi friends,

There is a "deadlock detected" error when I run vacuumdb in parallel; if I do not use parallel mode, there is no deadlock.

[postgres@vdedu1 ~]$ vacuumdb -fazev -j 10
SELECT datname FROM pg_database WHERE datallowconn ORDER BY 1;
vacuumdb: vacuuming database "mydb"
SELECT c.relname, ns.nspname FROM pg_class c, pg_namespace ns WHERE relkind IN ('r', 'm') AND c.relnamespace = ns.oid ORDER BY c.relpages DESC;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_proc;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_depend;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_attribute;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_description;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_collation;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_statistic;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_class;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_operator;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_rewrite;
VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_type;
ERROR:  deadlock detected
DETAIL:  Process 19840 waits for AccessShareLock on relation 1249 of database 16387; blocked by process 19841.
         Process 19841 waits for RowExclusiveLock on relation 2608 of database 16387; blocked by process 19840.
         Process 19840: VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_depend;
         Process 19841: VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_attribute;
HINT:  See server log for query details.
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_depend;
vacuumdb: vacuuming of database "mydb" failed: ERROR:  deadlock detected
DETAIL:  Process 19840 waits for AccessShareLock on relation 1249 of database 16387; blocked by process 19841.
Process 19841 waits for RowExclusiveLock on relation 2608 of database 16387; blocked by process 19840.
HINT:  See server log for query details.
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_proc;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_attribute;
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_description;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_collation;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_statistic;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_operator;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_rewrite;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_type;
[postgres@vdedu1 ~]$ LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
LOG:  could not send data to client: Broken pipe
STATEMENT:  VACUUM (FULL, VERBOSE, ANALYZE) pg_catalog.pg_class;
ERROR:  canceling statement due to user request
FATAL:  connection to client lost
LOG:  could not send data to client: Broken pipe
ERROR:  canceling statement due to user request
FATAL:  connection to client lost
[postgres@vdedu1 ~]$ cat /proc/v
version      vmallocinfo  vmstat
[postgres@vdedu1 ~]$ cat /proc/version
Linux version 2.6.32-504.el6.x86_64 (mockbuild@c6b9.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) #1 SMP Wed Oct 15 04:27:16 UTC 2014
[postgres@vdedu1 ~]$ vacuumdb --version
vacuumdb (PostgreSQL) 9.5.4
[postgres@vdedu1 ~]$ psql -c "select version();"
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
(1 row)

[postgres@vdedu1 ~]$

2016-09-28
TT
On Wed, Sep 28, 2016 at 09:46:22PM +0800, huang wrote:
> Hi friends,
>
> There is a error deadlock detected in vacuumdb parallel .
> but if I do not use parallel , it has not deadlock detected .

man vacuumdb says about the -j option:

>> Note that using this mode together with the -f (FULL) option might cause
>> deadlock failures if certain system catalogs are processed in parallel.

So, while I understand it's not pretty, it's well documented.

Best regards,
depesz
hubert depesz lubaczewski wrote:
> On Wed, Sep 28, 2016 at 09:46:22PM +0800, huang wrote:
> > Hi friends,
> >
> > There is a error deadlock detected in vacuumdb parallel .
> > but if I do not use parallel , it has not deadlock detected .
>
> man vacuumdb says about -j option:
>
> >> Note that using this mode together with the -f (FULL) option might cause
> >> deadlock failures if certain system catalogs are processed in parallel.
>
> so, while I understand it's not pretty, it's well documented.

Yeah.  However, I think this behavior is rather unhelpful.  Maybe we should try to fix it somehow, but I'm not sure how.  We could say that pg_catalog tables can only be processed one at a time, instead of trying to parallelize them?  For example: have vacuumdb fill two lists of tables, one for pg_catalog and one for tables in other schemas.  Each worker chooses a table from the pg_catalog list first if that list is not empty and no other worker is doing one of those; otherwise it takes one from the other list.

I think that'd fix it while not hurting parallelizability too badly.  I'm not volunteering, though.  Patches welcome.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
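[Editor's note: the two-list scheduling policy described above can be sketched in a few lines. This is a hypothetical Python model — the function name and data structures are illustrative, not vacuumdb's actual code:]

```python
def next_table(catalog_pending, other_pending, catalog_in_progress):
    """Pick the next table for an idle vacuum worker.

    Policy: take a pg_catalog table only when the catalog list is
    non-empty and no other worker is currently vacuuming a catalog
    table; otherwise fall back to the list of ordinary tables.
    Returning None means the worker must wait (only catalog tables
    remain, and one of them is already being vacuumed).
    """
    if catalog_pending and not catalog_in_progress:
        return catalog_pending.pop(0)
    if other_pending:
        return other_pending.pop(0)
    return None
```

Since at most one worker holds a catalog table at a time, two VACUUM FULL commands can never end up waiting on each other's catalog locks in opposite order, which is what produced the pg_depend/pg_attribute deadlock reported above.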
Hi:

On Wed, Sep 28, 2016 at 10:39 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> > There is a error deadlock detected in vacuumdb parallel .
>> > but if I do not use parallel , it has not deadlock detected .
>>
>> man vacuumdb says about -j option:
>>
>> >> Note that using this mode together with the -f (FULL) option might cause
>> >> deadlock failures if certain system catalogs are processed in parallel.
>>
>> so, while I understand it's not pretty, it's well documented.
>
> Yeah.  However I think this behavior is rather unhelpful.  Maybe we
> should try to fix it somehow, but I'm not sure how.  We could say that
> pg_catalog tables can only be processed one at a time, instead of trying
> to paralelize them?  For example: have vacuumdb fill two lists of
> tables, one for pg_catalog and one for tables in other schemas.  Each
> worker chooses a table from the pg_catalog list first if it's not empty
> and there's no other worker doing one of these, otherwise one from the
> other table.
>
> I think that'd fix it, while not destroying paralelizability too badly.

I would propose another behaviour, which I think can solve the problem without introducing more complex code: add a couple of flags to vacuum only catalog tables / only non-catalog tables (I believe this can be served by include/exclude-schema options too, which would be even more useful for other things). This way one can do a full parallel vacuum of non-catalog objects followed by a serial one on catalog objects (which should not be too slow on 'normal' databases). I propose this order because, IIRC, a normal vacuum updates catalog tables, so vacuuming them afterwards tidies them up.

Francisco Olarte
At 2016-09-29 15:57:58, "Francisco Olarte" <folarte@peoplecall.com> wrote:
>I would propose another behaviour, which I think can solve the problem
>without introducing more complex code. Put a couple of flags to vacuum
>only catalog tables / non catalog tables ( I believe this can be
>served by include/exclude schemas too, which will be even more useful
>for other things ). This way one can do a full paralell vacuum of
>non-catalog objects followed by a serial one on catalog objects (
>which should not be too slow on 'normal' databases )
>
>Francisco Olarte

hi,

I just hope that parallel vacuumdb can finish its work quickly. Our database has a lot of tables; they all need vacuuming, but it takes too long, so I used vacuumdb in parallel mode.

thanks to everyone. :)

Best regards,
TT
28-09-2016
Francisco Olarte wrote:

> I would propose another behaviour, which I think can solve the problem
> without introducing more complex code. Put a couple of flags to vacuum
> only catalog tables / non catalog tables ( I believe this can be
> served by include/exclude schemas too, which will be even more useful
> for other things ). This way one can do a full paralell vacuum of
> non-catalog objects followed by a serial one on catalog objects (
> which should not be too slow on 'normal' databases )

OK, that sounds easier to implement.  Are you going to submit a patch?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello all:

On Thu, Sep 29, 2016 at 7:10 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Francisco Olarte wrote:
>
>> I would propose another behaviour, which I think can solve the problem
>> without introducing more complex code. Put a couple of flags to vacuum
>> only catalog tables / non catalog tables ( I believe this can be
>> served by include/exclude schemas too, which will be even more useful
>> for other things ). This way one can do a full paralell vacuum of
>> non-catalog objects followed by a serial one on catalog objects (
>> which should not be too slow on 'normal' databases )
>
> OK, that sounds easier to implement.  Are you going to submit a patch?

After looking at the vacuumdb source, I'm trying to develop and submit a patch implementing the following:

Two new flags: --include-schema=NAME, --exclude-schema=NAME

Two new flags: --system-schemas / --non-system-schemas, as aliases for --include-schema=pg_catalog and --exclude-schema=pg_catalog

The intended behaviour is:

- If a table list (--table) is specified, include/exclude schema will be ignored (potentially with a warning, or maybe an error if more knowledgeable people steer me that way).

- If both include and exclude are specified, include will take preference (with a warning or an error?).

- Otherwise, the existing code that enumerates tables for parallel vacuum will be augmented to either use only tables in the included schemas or skip tables in the excluded schemas. If the table list is empty after enumeration, an error will be produced (think of --include-schema=oops-I-mistyped-it). The filtering will be done client-side and will be case-sensitive against the schema name returned by the catalog queries (no case folding, to keep it simple).

- After that, vacuumdb will proceed as it would if the table list had been given on the command line, in either serial or parallel mode.
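[Editor's note: the client-side filtering described above is simple enough to sketch. This is a hypothetical Python model of the include-wins, case-sensitive behaviour — the names are illustrative, not taken from the patch:]

```python
def filter_tables(tables, include_schemas=(), exclude_schemas=()):
    """Filter (schema, relname) pairs client-side.

    Matching is case-sensitive against the schema name as returned by
    the catalog query (no case folding).  If both option lists are
    given, include takes preference; an empty result is an error,
    covering cases like --include-schema=oops-I-mistyped-it.
    """
    if include_schemas:
        kept = [(s, t) for s, t in tables if s in include_schemas]
    elif exclude_schemas:
        kept = [(s, t) for s, t in tables if s not in exclude_schemas]
    else:
        kept = list(tables)
    if not kept:
        raise ValueError("no tables match the given schema options")
    return kept
```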
This does not intend to solve the parallel vacuum deadlocks, but it lets people avoid them by just doing:

    vacuumdb -f --exclude-schema=pg_catalog -j 20
    vacuumdb -f --include-schema=pg_catalog

And vacuuming the catalogs is not going to benefit that much from parallelism, as it is normally fast.

Any comments?

Francisco Olarte.
On 01-10-2016 12:57, Francisco Olarte wrote:
> On Thu, Sep 29, 2016 at 7:10 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Francisco Olarte wrote:
>>
>>> I would propose another behaviour, which I think can solve the problem
>>> without introducing more complex code. Put a couple of flags to vacuum
>>> only catalog tables / non catalog tables ( I believe this can be
>>> served by include/exclude schemas too, which will be even more useful
>>> for other things ). This way one can do a full paralell vacuum of
>>> non-catalog objects followed by a serial one on catalog objects (
>>> which should not be too slow on 'normal' databases )
>>
>> OK, that sounds easier to implement.  Are you going to submit a patch?
>
> After looking at vacuumdb source I'm trying to develop and submit a
> patch implementing the following:
>
> Two new flags --include-schema=NAME, --exclude-schema=NAME
>
> Two new flags --system-schemas --non-system-schemas, as alias to
> --include-schema=pg_catalog and --exclude-schema=pg_catalog

This is another feature.  Also, it will *sometimes* solve the problem.  Why don't you implement your parallel + serialization idea or Alvaro's two-lists idea?  I'm not against adding schema options; I just think it's not the right way to solve the deadlock problem.

-- 
Euler Taveira                Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Hi Euler:

On Sun, Oct 2, 2016 at 5:33 AM, Euler Taveira <euler@timbira.com.br> wrote:
> This is another feature.

Yes, one which allows an easier workaround for the problem. Personally I do not need it, because I do not vacuum in parallel and I can easily write a dozen lines of Perl that achieve the same thing, but I feel it can help people like the OP.

> Also, it will *sometimes* solve the problem.

No, it does not solve the problem (parallel vacuum of system catalogs can still deadlock). What it does is let you transform a deadlock-prone operation (full parallel database-wide vacuum) into a deadlock-free one (full parallel non-system vacuum followed by a serial system vacuum) without too much effort.

> Why don't you implement your parallel + serialization idea or Alvaro's
> two lists idea?

Well, I want a servant, not a nanny. If I ask vacuumdb to do a full parallel system-catalog vacuum, I want it to do it (although I WILL tolerate a warning, and MAY tolerate a confirmation request).

Besides, I proposed the schema solution because I thought parallel vacuum would need some code building a table list, and the -t option would need code to process tables from a list. Code examination revealed I was correct in both. Adding the options is a simple matter of adding some simple code: a couple of options, a couple of string lists to hold their data, a couple of ifs in the table-scanning loop used when no table list is given, and a modification to make that loop run when schema options are given (presently it runs only when no -t options are present and parallel mode is requested; I need to activate it when schema options are present too, in either serial or parallel mode). The working code remains untouched.

I could refactor a bit to make several runs easier, have a built-in list of system tables, and many other things, but that is a deeper modification which I do not presently feel prepared to attack. I may do it in the future, but I feel anyone needing to run full parallel vacuums has a big system and should have this kind of ops scripted, so it is not going to be that useful. Bear in mind my work is not specifically in Postgres like yours; for me it's just a tool. I've been using it since before it had SQL, and I have some experience with C and libpq, but I do not feel comfortable adding complex code even to a simple client program.

> I'm not against adding schema options, I just think it's
> not the right way to solve the deadlock problem.

As I said, it does not solve the problem; it just builds an easier path around it than the present ones (you can do exactly the same by querying the catalog in a script and emitting two vacuumdb commands with table lists).

Francisco Olarte.
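[Editor's note: the scripted workaround mentioned above — query the catalog, then emit two vacuumdb commands with table lists — can be sketched in Python. This is a hypothetical helper; the function name is mine, and the commands are only built here, not executed:]

```python
def build_vacuum_commands(tables, dbname, jobs=10):
    """Split (schema, relname) pairs -- e.g. the result of
    SELECT nspname, relname FROM pg_class c JOIN pg_namespace n
    ON c.relnamespace = n.oid WHERE relkind IN ('r', 'm') --
    into the two-pass workaround: a parallel FULL vacuum of the
    non-catalog tables, then a serial FULL vacuum of pg_catalog.
    """
    catalog = [f"{s}.{t}" for s, t in tables if s == "pg_catalog"]
    others = [f"{s}.{t}" for s, t in tables if s != "pg_catalog"]
    parallel_cmd = ["vacuumdb", "-f", "-j", str(jobs)]
    for t in others:
        parallel_cmd += ["-t", t]
    serial_cmd = ["vacuumdb", "-f"]
    for t in catalog:
        serial_cmd += ["-t", t]
    return parallel_cmd + [dbname], serial_cmd + [dbname]
```

Each returned list can be passed to subprocess.run, executing the parallel pass first and the serial catalog pass second, matching the ordering argued for earlier in the thread.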