Thread: BUG #11033: 'pg_dump -a' much slower than 'pg_dump'
The following bug has been logged on the website: Bug reference: 11033 Logged by: Joe Van Dyk Email address: joe@tanga.com PostgreSQL version: 9.3.5 Operating system: Ubuntu 12.04 Description: I've got a small database that takes about 0.8 seconds to dump with 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. I've narrowed it down to the foreign key constraints in the database, if those are removed, then 'pg_dump -a' becomes fast again. I can't come up with an artificial test case. I can reproduce it using our company's db schema. I don't want to publicly post that schema to the mailing list, but I'd be happy to send the schema that shows the problem to someone privately. I can reproduce the problem in 9.3.4 and 9.4.beta1.
This is what I'm seeing, as far as timing. $ time pg_dump tanga_dev > /dev/null real 0m1.182s user 0m0.260s sys 0m0.020s $ time pg_dump -a tanga_dev > /dev/null pg_dump: NOTICE: there are circular foreign-key constraints among these table(s): pg_dump: products pg_dump: You might not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints. pg_dump: Consider using a full dump instead of a --data-only dump to avoid this problem. <snip a few more of these> real 0m11.390s user 0m10.653s sys 0m0.032s On Thu, Jul 24, 2014 at 5:10 PM, <joe@tanga.com> wrote: > The following bug has been logged on the website: > > Bug reference: 11033 > Logged by: Joe Van Dyk > Email address: joe@tanga.com > PostgreSQL version: 9.3.5 > Operating system: Ubuntu 12.04 > Description: > > I've got a small database that takes about 0.8 seconds to dump with > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. > > I've narrowed it down to the foreign key constraints in the database, if > those are removed, then 'pg_dump -a' becomes fast again. > > I can't come up with an artificial test case. I can reproduce it using our > company's db schema. I don't want to publicly post that schema to the > mailing list, but I'd be happy to send the schema that shows the problem to > someone privately. > > I can reproduce the problem in 9.3.4 and 9.4.beta1. > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs >
joe@tanga.com writes: > I've got a small database that takes about 0.8 seconds to dump with > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. > I've narrowed it down to the foreign key constraints in the database, if > those are removed, then 'pg_dump -a' becomes fast again. Given that and the warning messages shown in your followup, a plausible guess is that pg_dump is wasting a lot of time vainly searching for ways to break the circular dependency loops that arise from trying to order the tables in a way that allows them to be restored with foreign key constraints active. However: (a) it seems like it'd take a heck of a lot of FK constraints for that to become a dominant factor, and (b) if that were the explanation, seems like it should not be that hard to make an artificial test case. So I'm thinking there's some other contributing factor you haven't shown us. > I can't come up with an artificial test case. I can reproduce it using our > company's db schema. I don't want to publicly post that schema to the > mailing list, but I'd be happy to send the schema that shows the problem to > someone privately. I'd be willing to take a look, but it's not clear that there is any easy fix, and TBH I'm not sure that this case is worth spending a lot of time on. As the warning messages are telling you, a --data-only dump is of limited value when you've got circular foreign key references. What are you intending to do with the dump anyway? regards, tom lane
On Thu, Jul 24, 2014 at 6:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > joe@tanga.com writes: > > I've got a small database that takes about 0.8 seconds to dump with > > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. > > > I've narrowed it down to the foreign key constraints in the database, if > > those are removed, then 'pg_dump -a' becomes fast again. > > Given that and the warning messages shown in your followup, a plausible > guess is that pg_dump is wasting a lot of time vainly searching for ways > to break the circular dependency loops that arise from trying to order > the tables in a way that allows them to be restored with foreign key > constraints active. However: (a) it seems like it'd take a heck of a lot > of FK constraints for that to become a dominant factor, and (b) if that > were the explanation, seems like it should not be that hard to make an > artificial test case. So I'm thinking there's some other contributing > factor you haven't shown us. > I don't have an unreasonable amount of foreign key constraints, as far as I can tell. For an example, on one of my slower systems, adding two foreign key constraints caused 'pg_dump -a' to get 5 seconds slower (goes from 15 seconds to 20) -- the constraints are on tables that have 12 and 5 rows. I'll send you a test db shortly. Thanks for looking into it! > > > I can't come up with an artificial test case. I can reproduce it using > our > > company's db schema. I don't want to publicly post that schema to the > > mailing list, but I'd be happy to send the schema that shows the problem > to > > someone privately. > > I'd be willing to take a look, but it's not clear that there is any easy > fix, and TBH I'm not sure that this case is worth spending a lot of time > on. As the warning messages are telling you, a --data-only dump is of > limited value when you've got circular foreign key references. What > are you intending to do with the dump anyway? > This is used for running automated tests that modify the database. Before the tests are run, I use 'pg_dump -a' to generate the initial set of data that the tests use. Then, before each test, I disable all the triggers, truncate all the tables, load the data-only dump into the database, then re-enable the triggers. As far as I can tell, this is the fastest way to revert a database to a known state, it's faster than dropping/recreating databases. Joe > > regards, tom lane >
On Fri, Jul 25, 2014 at 12:10:52AM +0000, joe@tanga.com wrote: > The following bug has been logged on the website: > > Bug reference: 11033 > Logged by: Joe Van Dyk > Email address: joe@tanga.com > PostgreSQL version: 9.3.5 > Operating system: Ubuntu 12.04 > Description: > > I've got a small database that takes about 0.8 seconds to dump with > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. > > I've narrowed it down to the foreign key constraints in the database, if > those are removed, then 'pg_dump -a' becomes fast again. > > I can't come up with an artificial test case. I can reproduce it using our > company's db schema. I don't want to publicly post that schema to the > mailing list, but I'd be happy to send the schema that shows the problem to > someone privately. > > I can reproduce the problem in 9.3.4 and 9.4.beta1. Wow, that is certainly odd. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 2014-10-11 17:26:59 -0400, Bruce Momjian wrote: > On Fri, Jul 25, 2014 at 12:10:52AM +0000, joe@tanga.com wrote: > > The following bug has been logged on the website: > > > > Bug reference: 11033 > > Logged by: Joe Van Dyk > > Email address: joe@tanga.com > > PostgreSQL version: 9.3.5 > > Operating system: Ubuntu 12.04 > > Description: > > > > I've got a small database that takes about 0.8 seconds to dump with > > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. > > > > I've narrowed it down to the foreign key constraints in the database, if > > those are removed, then 'pg_dump -a' becomes fast again. > > > > I can't come up with an artificial test case. I can reproduce it using our > > company's db schema. I don't want to publicly post that schema to the > > mailing list, but I'd be happy to send the schema that shows the problem to > > someone privately. > > > > I can reproduce the problem in 9.3.4 and 9.4.beta1. > > Wow, that is certainly odd. There've been a couple cases of that where the dependency resolution gets more complex for data only dumps because of the added dependencies that try to get the order right to not violate foreign keys. IIRC Tom fixed a couple performance problems recently. Yep, 51fc6133488a80a1310972b8a0ad20aca13f5b02. Joe, can you check with some more recent version? Unfortunately that fix got committed after 9.3.5. 9.4 beta3 should be fine. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Oct 11, 2014 at 2:47 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-10-11 17:26:59 -0400, Bruce Momjian wrote: > > On Fri, Jul 25, 2014 at 12:10:52AM +0000, joe@tanga.com wrote: > > > The following bug has been logged on the website: > > > > > > Bug reference: 11033 > > > Logged by: Joe Van Dyk > > > Email address: joe@tanga.com > > > PostgreSQL version: 9.3.5 > > > Operating system: Ubuntu 12.04 > > > Description: > > > > > > I've got a small database that takes about 0.8 seconds to dump with > > > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. > > > > > > I've narrowed it down to the foreign key constraints in the database, > if > > > those are removed, then 'pg_dump -a' becomes fast again. > > > > > > I can't come up with an artificial test case. I can reproduce it using > our > > > company's db schema. I don't want to publicly post that schema to the > > > mailing list, but I'd be happy to send the schema that shows the > problem to > > > someone privately. > > > > > > I can reproduce the problem in 9.3.4 and 9.4.beta1. > > > > Wow, that is certainly odd. > > There've been a couple cases of that where the dependency resolution > gets more complex for data only dumps because of the added dependencies > that try to get the order right to not violate foreign keys. IIRC Tom > fixed a couple performance problems recently. > Yep, 51fc6133488a80a1310972b8a0ad20aca13f5b02. > > Joe, can you check with some more recent version? Unfortunately that fix > got committed after 9.3.5. 9.4 beta3 should be fine. > I've been using the 9.3-stable branch for the past few months (as a result of needing this fix), this problem has gone away as far as I can tell. Joe
On Wed, Oct 15, 2014 at 11:58 PM, Joe Van Dyk <joe@tanga.com> wrote: > On Sat, Oct 11, 2014 at 2:47 PM, Andres Freund <andres@2ndquadrant.com> > wrote: > >> On 2014-10-11 17:26:59 -0400, Bruce Momjian wrote: >> > On Fri, Jul 25, 2014 at 12:10:52AM +0000, joe@tanga.com wrote: >> > > The following bug has been logged on the website: >> > > >> > > Bug reference: 11033 >> > > Logged by: Joe Van Dyk >> > > Email address: joe@tanga.com >> > > PostgreSQL version: 9.3.5 >> > > Operating system: Ubuntu 12.04 >> > > Description: >> > > >> > > I've got a small database that takes about 0.8 seconds to dump with >> > > 'pg_dump' and about 11 seconds to dump with 'pg_dump -a'. >> > > >> > > I've narrowed it down to the foreign key constraints in the database, >> if >> > > those are removed, then 'pg_dump -a' becomes fast again. >> > > >> > > I can't come up with an artificial test case. I can reproduce it >> using our >> > > company's db schema. I don't want to publicly post that schema to the >> > > mailing list, but I'd be happy to send the schema that shows the >> problem to >> > > someone privately. >> > > >> > > I can reproduce the problem in 9.3.4 and 9.4.beta1. >> > >> > Wow, that is certainly odd. >> >> There've been a couple cases of that where the dependency resolution >> gets more complex for data only dumps because of the added dependencies >> that try to get the order right to not violate foreign keys. IIRC Tom >> fixed a couple performance problems recently. >> Yep, 51fc6133488a80a1310972b8a0ad20aca13f5b02. >> >> Joe, can you check with some more recent version? Unfortunately that fix >> got committed after 9.3.5. 9.4 beta3 should be fine. >> > > I've been using the 9.3-stable branch for the past few months (as a result > of needing this fix), this problem has gone away as far as I can tell. > > I forgot this never got posted to the list -- this is what Tom wrote back in July: "I've not entirely worked out just what's creating the performance issue, but it seemed like contributing factors included (a) there were long chains of FK references (not loops, just table A refers to table B refers to table C yadda yadda), and (b) there were a lot of plain old objects, such as functions. The reference chains seemed to be creating a combinatorial explosion in the number of times the search reached the plain old objects, but I've still not quite wrapped my head around exactly why. It might be that there's an easier/better fix available than this one." Joe