Thread: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
From: Henrik Cednert (Filmlance) [mailto:henrik.cednert@filmlance.se]
Sent: Tuesday, November 21, 2017 9:29 AM
To: pgsql-performance@lists.postgresql.org
Subject: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
Hello
We use a system in film production called DaVinci Resolve. It uses a pgsql database when you work in a collaborative workflow and multiple people share projects. Previously it was using pgsql 8.4, but for a new major upgrade they recommend an upgrade to 9.5 — probably also due to some macOS limitation/support, since 9.x is required for macOS >10.11.
They (BlackMagic Design) provide three tools for the migration.
1. One for dumping everything from the old 8.4 database
2. One for upgrading from 8.4 to 9.5
3. One for restoring the backup in step 1 in 9.5
All that went smoothly, and working in the systems is also as smooth as before, maybe even a bit better/faster.
What's not working smoothly is my daily pg_dump, though. I don't have a reference for what's a big and what's a small database, since I'm no db guy and don't really maintain or work with databases on a daily basis; this is pretty much the only system we use that has a db like this. Below is a list of what we dump.
930M Nov 18 13:31 filmserver03_2017-11-18_132043_dailies_2017_01.backup
2.2K Nov 18 13:20 filmserver03_2017-11-18_132043_postgres.backup
522K Nov 18 13:20 filmserver03_2017-11-18_132043_resolve.backup
23G Nov 18 19:37 filmserver03_2017-11-18_132043_resolve_2017_01.backup
5.1G Nov 18 20:54 filmserver03_2017-11-18_132043_resolve_2017_02.backup
10G Nov 18 23:34 filmserver03_2017-11-18_132043_resolve_filmserver02.backup
516K Nov 18 23:35 filmserver03_2017-11-18_132043_temp_backup_test.backup
1.9G Nov 19 00:05 filmserver03_2017-11-18_132043_temp_dev_resolve14.backup
The last pg_dump with 8.4 took 212 minutes and 49 seconds. And now with 9.5 the very same pg_dump takes 644 minutes and 40 seconds. So it takes about three times as long now and I have no idea why. Nothing in the system or hardware other than the pgsql upgrade has changed.
I dump the db's with a custom script and this is the line I use to get the DB's:
DATABASES=$(${BINARY_PATH}/psql --user=postgres -w --no-align --tuples-only --command="SELECT datname from pg_database WHERE NOT datistemplate")
After that I iterate over them with a for loop and dump with:
${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs --format=custom --verbose --file=${pg_dump_filename}_${database}.backup ${database} | tee -a ${log_pg_dump}_${database}.log
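(Roughly, the whole thing is a loop along these lines — a sketch; the path and filename variables are placeholders, only the two commands quoted above are from the real script:)
#!/bin/bash
# Sketch of the dump loop; BINARY_PATH and the backup/log paths are assumed.
BINARY_PATH="/Library/PostgreSQL/9.5/bin"
pg_dump_filename="/backups/filmserver03_$(date +%Y-%m-%d_%H%M%S)"
log_pg_dump="/backups/logs/pg_dump"
DATABASES=$(${BINARY_PATH}/psql --user=postgres -w --no-align --tuples-only --command="SELECT datname FROM pg_database WHERE NOT datistemplate")
for database in ${DATABASES}; do
    # --verbose writes to stderr, hence 2>&1 so tee actually captures it
    ${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs --format=custom --verbose --file=${pg_dump_filename}_${database}.backup ${database} 2>&1 | tee -a ${log_pg_dump}_${database}.log
done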
When observing the system during the dump it LOOKS like it did in 8.4: pg_dump is using 100% of one core and, from what I can see, it does this throughout the operation. But it's still sooooo much slower. I read about the parallel option in pg_dump for 9.5, but sadly I cannot dump like that, because the application in question can (probably) not import that format on its own and I would have to use pg_restore or something. Which in theory is fine, but sometimes one of the artists has to import the db backup. So I need to keep it simple.
The system is:
MacPro 5,1
2x2.66 GHz Quad Core Xeon
64 GB RAM
macOS 10.11.6
PostgreSQL 9.5.4
DB on a 6 disk SSD RAID
I hope I got all the info needed. Really hope someone with more expertise and skills than me can point me in the right direction.
Cheers and thanks
--
Henrik Cednert
cto | compositor
According to the pg_dump command in your script, you are dumping your databases in custom format:
--format=custom
These backups can only be restored using pg_restore (or something that wraps pg_restore).
So, you can safely add the parallel option. It should not affect your restore procedure.
Regards,
Igor Neyman
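(One caveat worth noting here: pg_dump's parallel option, -j/--jobs, only works with the directory output format, not with --format=custom. A rough sketch of a parallel dump and its matching restore — job count and file names are illustrative, reusing the script's variables:)
# Parallel dump needs the directory format; -j/--jobs sets the worker count.
${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --format=directory --jobs=4 --file=${pg_dump_filename}_${database}.dir ${database}
# pg_restore can parallelize restores of both custom and directory archives.
${BINARY_PATH}/pg_restore --host=localhost --user=postgres --no-password --jobs=4 --dbname=${database} ${pg_dump_filename}_${database}.dir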
From: Henrik Cednert (Filmlance) [mailto:henrik.cednert@filmlance.se]
Sent: Tuesday, November 21, 2017 11:27 AM
To: pgsql-performance@lists.postgresql.org
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
Ahh! Nice catch Igor. Thanks. =)
Will try and see if resolve can read that back in.
Still very curious about the 3x slowdown in 9.5 pg_dump though.
--
Henrik Cednert
cto | compositor
Filmlance International
Basically, you are dumping 40GB of data.
I'd say even 212 minutes under the 8.4 version was too slow.
What kind of RAID is it? RAID1/RAID10/RAID5?
Regards,
Igor Neyman
From: Henrik Cednert (Filmlance) [mailto:henrik.cednert@filmlance.se]
Sent: Tuesday, November 21, 2017 11:37 AM
To: pgsql-performance@lists.postgresql.org
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
RAID6. Doing disk tests I get 1000MB/sec write and 1200MB/sec read.
--
Henrik Cednert
cto | compositor
Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype [ cednert ]
_________________________________________________________________________________________________
Okay, I was kind of wrong about 40GB. That's the size of your compressed backup files, not the size of your databases.
Maybe your dbs are "bloated"?
You could try VACUUM FULL on your databases when there is no other activity.
Igor Neyman
From: Henrik Cednert (Filmlance) [mailto:henrik.cednert@filmlance.se]
Sent: Tuesday, November 21, 2017 11:48 AM
To: pgsql-performance@lists.postgresql.org
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
I VACUUM every Sunday so that is done already. =/
Not sure I have the proper params, though, since I'm not used to db's and have just followed others' how-tos, but these are the lines in my script for that:
${BINARY_PATH}/vacuumdb --analyze --host=localhost --username=postgres --echo --verbose --no-password ${database} | tee -a ${log_pg_optimize}_${database}.log
${BINARY_PATH}/reindexdb --host=localhost --username=postgres --no-password --echo ${database} | tee -a ${log_pg_optimize}_${database}.log
--
Henrik Cednert
cto | compositor
Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype [ cednert ]
_______________________________________________________________________________________________
To do a vacuum full you need to add the --full option to your vacuumdb command:
${BINARY_PATH}/vacuumdb --full --analyze --host=localhost --username=postgres --echo --verbose --no-password ${database} | tee -a ${log_pg_optimize}_${database}.log
Just be aware that "vacuum full" locks tables, unlike plain analyze. So, like I said, no other activity during this process.
Regards,
Igor
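(Before reaching for VACUUM FULL, a quick sanity check is to compare the on-disk database sizes against the backup sizes; a sketch, in the same psql invocation style as the script upthread:)
${BINARY_PATH}/psql --user=postgres -w --no-align --tuples-only --command="SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database WHERE NOT datistemplate ORDER BY pg_database_size(datname) DESC"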
"Henrik Cednert (Filmlance)" <henrik.cednert@filmlance.se> writes: > The last pg_dump with 8.4 took 212 minutes and 49 seconds.And now with 9.5 the very same pg_dump takes 644 minutes and40 seconds. To it takes about three times as long now and I have no idea to why. Nothing in the system or hardware otherthan the pgsql upgrade have change. Can you get a profile of where the machine is spending its time during the dump run? On Linux I'd recommend "perf", but on macOS, hmm ... You could use Activity Monitor, but as far as I can see that just captures short-duration snapshots, which might not be representative of a 10-hour run. XCode's Instruments feature would probably be better about giving a full picture, but it has a steep learning curve. regards, tom lane
On Tue, Nov 21, 2017 at 12:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Can you get a profile of where the machine is spending its time during the dump run? On Linux I'd recommend "perf", but on macOS, hmm ... You could use Activity Monitor, but as far as I can see that just captures short-duration snapshots, which might not be representative of a 10-hour run. Xcode's Instruments feature would probably be better about giving a full picture, but it has a steep learning curve.
macOS's "sample" is pretty easy to use and produces text format output that is easy to email.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> macOS's "sample" is pretty easy to use and produces text format output that is easy to email.
Ah, good idea. But note that only traces one process, so you'd need to first determine whether it's pg_dump or the backend that's eating most of the CPU. Or sample both of them.
regards, tom lane
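(A sketch of what that could look like — durations, output paths, and the backend PID are placeholders; sample also accepts a partial process name instead of a PID:)
# Sample pg_dump itself for 30 seconds:
sample pg_dump 30 -file /tmp/pg_dump.sample.txt
# Find the backend serving the dump, then sample it too:
psql --user=postgres --command="SELECT pid, query FROM pg_stat_activity"
sample <backend_pid> 30 -file /tmp/backend.sample.txt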
"Henrik Cednert (Filmlance)" <henrik.cednert@filmlance.se> writes: > I'm not sure if I can attach screenshots here. Trying, screenshot from instruments after running for a few mins. It looks like practically all of pg_dump's time is going into deflate(), ie zlib. I don't find that terribly surprising in itself, but it offers no explanation for why you'd see a slowdown --- zlib isn't even our code, nor has it been under active development for a long time, so presumably 8.4 and 9.5 would have used the same version. Perhaps you were doing the 8.4 dump without compression enabled? regards, tom lane
On 22 Nov 2017, at 04:48, Henrik Cednert (Filmlance) <henrik.cednert@filmlance.se> wrote:
Hi Tom
I'm honestly not sure about anything. =) I use the exact same flags as with 8.4 for the dump:
${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs --format=custom --verbose --file=${pg_dump_filename}_${database}.backup ${database}
So unless the default behaviour has changed in 9.x, I'd say I don't use compression. I will try to force it to no compression and see if it's different.
Sadly the Instruments session stopped recording when I logged out of the system yesterday. Doh. =/
Cheers
--
Henrik Cednert
cto | compositor
Filmlance International
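(For reference: a custom-format dump is compressed by default when pg_dump is built with zlib, and the -Z/--compress flag controls it, with 0 disabling compression. A sketch of the no-compression test, with the database name picked from the list upthread:)
time ${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs --format=custom --compress=0 --file=test_nocompress.backup resolve_2017_02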
> On Nov 21, 2017, at 10:18 PM, Henrik Cednert (Filmlance) <henrik.cednert@filmlance.se> wrote:
>
> What's the normal way to deal with compression? Dump uncompressed and use something that threads better to compress the dump?
I would say most likely your zlib is screwed up somehow, like maybe it didn't get optimized right by the C compiler or something else sucks w/ the compression settings. The CPU should easily blast away at that faster than disks can read.
I did do some studies of this previously some years ago, and I found gzip -6 offered the best ratio between size reduction and CPU time out of a very wide range of formats, but at the time xz was also not yet available.
If I were you I would first pipe the uncompressed output through a separate compression command; then you can experiment with the flags and threads, and you already get another separate process for the kernel to put on other CPUs as an automatic bonus for multi-core with minimal work.
After that, xz is GNU standard now and has xz -T for cranking up some threads, with little extra effort for the user. But it can be kind of slow, so you'll probably need to lower the compression level somewhat, depending a bit on some time testing. I would try it on some medium-sized DB table, a bit over the size of system RAM, instead of dumping this great big DB, in order to benchmark a couple of times until it looks happy.
Matthew
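(A sketch of that piping approach, assuming a reasonably recent xz — 5.2+ for the -T threading option; the level and thread count are things you'd tune:)
# Emit an uncompressed custom-format archive on stdout, compress separately;
# xz -T0 uses all cores, -2 keeps the compression level cheap.
${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs --format=custom --compress=0 ${database} | xz -T0 -2 > ${pg_dump_filename}_${database}.backup.xz
Restoring then means decompressing first (xz -d) before handing the archive to pg_restore or to the application's import.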
On 22 Nov 2017, at 13:17, Henrik Cednert (Filmlance) <henrik.cednert@filmlance.se> wrote:
Hello
I've run it with all the different compression levels on one of the smaller db's now. Not passing any flag at all is, as I've seen hinted on some page on the internet, the same as level 6.
I do, somewhat, share the opinion that something is up with zlib. But at the same time I haven't touched it since the 8.4 installation, so it's a mystery how it could've failed on its own. The only thing performed was an upgrade from 8.4 to 9.5. But I cannot really say exactly what that upgrade touched and what it didn't. Will investigate further.
Level    File size (bytes)   real        user        sys
0        6205982696          0m38.218s   0m3.558s    0m17.309s
1        1391475419          4m3.725s    3m54.132s   0m5.565s
2        1344563403          4m18.574s   4m9.466s    0m5.417s
3        1267601394          5m23.373s   5m14.339s   0m5.462s
4        1241632684          6m19.501s   6m10.148s   0m5.655s
5        1178377949          9m18.449s   9m9.733s    0m5.169s
6        1137727582          13m28.424s  13m19.842s  0m5.036s
7        1126257786          16m39.392s  16m30.094s  0m5.724s
8        1111804793          30m37.135s  30m26.785s  0m6.660s
9        1112194596          33m40.325s  33m27.122s  0m6.498s
default  1140261276          13m18.178s  13m9.417s   0m5.242s
(default = no compression flag passed to pg_dump)
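(A loop along these lines would reproduce numbers like the above — the database name and output path are placeholders:)
for z in $(seq 0 9); do
    echo "COMPRESSION LEVEL: ${z}"
    time ${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs --format=custom --compress=${z} --file=/tmp/test_z${z}.backup some_small_db
    stat -f %z /tmp/test_z${z}.backup   # file size in bytes (BSD/macOS stat)
done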
Hi,
On 2017-11-22 02:32:45 -0800, Matthew Hall wrote:
> I would say most likely your zlib is screwed up somehow, like maybe it didn't get optimized right by the C compiler or something else sucks w/ the compression settings. The CPU should easily blast away at that faster than disks can read.
Huh? Zlib compresses at a few 10s of MB/s.
Greetings,
Andres Freund
On Nov 22, 2017, at 5:06 AM, Henrik Cednert (Filmlance) <henrik.cednert@filmlance.se> wrote:
> When investigating the zlib lead I looked at the 8.4 installation and the 9.5 installation. 9.5 includes zlib.h (/Library/PostgreSQL//9.5/include/zlib.h), but 8.4 doesn't. But that's a header file and I have no idea how that really works and if that's the one used by pgres9.5 or not. The version in it says 1.2.8 and that's what Instruments is showing when I monitor pg_dump while running.
>
> Guess I'll have to install Instruments in a dev env and do a pg_dump with 8.4 to see the difference. Tedious. =/
I would also check the library linkages of the pg_dump binaries. See if one thing is using an embedded zlib and the other a system zlib. Then you could imagine one didn't get compiled with the best-performing CFLAGS, etc.
Matthew.
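(On macOS that check would look something like this — install paths assumed from the thread:)
# Which shared libraries does each pg_dump link against? Look for the libz line
# to see whether each resolves to a bundled zlib or to /usr/lib/libz.1.dylib.
otool -L /Library/PostgreSQL/8.4/bin/pg_dump
otool -L /Library/PostgreSQL/9.5/bin/pg_dump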
Hello,
I had this behavior when the upgraded pg 9.5 had ssl mode on by default.
So I deactivated ssl mode in postgresql.conf. That's all.
Regards,
Patrick
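(To verify what the server is actually doing, something like this works — note that in 9.5 changing ssl requires a restart, not just a reload; the data directory path is a placeholder:)
psql --user=postgres --command="SHOW ssl"
# To disable: set "ssl = off" in postgresql.conf, then restart, e.g.:
pg_ctl restart -D /path/to/data_directory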
I confess I don't do dumps or any backups much, other than file system snapshots.
But when I do, I don't like how long it takes.
I confess my database is big, I have about 200 GB. But still, dumping it should not take 48 hours (and running) while the system is 75% idle and reads are at 4.5 MB/s when the system sustains over 100 MB/s during processing of table scan and hash join queries.
Something is wrong with the dump thing. And no, it's not SSL or whatever, I am doing it on a local system with local connections. Version 9.5 something.
regards,
-Gunther
On 22 Nov 2017, at 22:07, Patrick KUI-LI <pkuili@hipay.com> wrote:
> Hello,
>
> I had this behavior when the upgraded pg 9.5 had ssl mode on by default.
> So I deactivated ssl mode in postgresql.conf. That's all.
>
> Regards,
> Patrick
Hello
And you just uncommented the 'ssl = off' line in the config for this? Is this default behaviour different from 8.4? Is there a 'show running config' for pgsql?
I tried that in the test VM and it didn't really give me a significant difference.
COMPRESSION LEVEL: 6, SSL ON
real 82m33.744s
user 60m55.069s
sys 3m3.375s
COMPRESSION LEVEL: 6, SSL OFF
real 76m31.083s
user 61m23.282s
sys 1m23.341s
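(On the 'show running config' question: the usual equivalents are SHOW and the pg_settings view, e.g.:)
psql --user=postgres --command="SHOW ssl"
psql --user=postgres --command="SHOW ALL"
# Only the settings changed from their defaults:
psql --user=postgres --command="SELECT name, setting, source FROM pg_settings WHERE source <> 'default'"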
Gunther wrote:
> Something is wrong with the dump thing. And no, it's not SSL or whatever, I am doing it on a local system with local connections. Version 9.5 something.
That's a lot of useful information.
Try to profile where the time is spent, using "perf" or similar.
Do you connect via the network, TCP localhost or UNIX sockets? The last option should be the fastest.
Yours,
Laurenz Albe
On Thu, Dec 7, 2017 at 2:31 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> Try to profile where the time is spent, using "perf" or similar.
>
> Do you connect via the network, TCP localhost or UNIX sockets?
> The last option should be the fastest.
You can use SSL over a local TCP connection. Whether it's the case is the thing.
In my experience, SSL isn't a problem, but compression *is*. With a modern-enough openssl, enabling compression is tough; it's forcefully disabled by default due to the vulnerabilities related to its use that were discovered lately. So chances are, no matter what you configured, compression isn't being used.
I never measured it compared to earlier versions, but pg_dump is indeed quite slow, and the biggest offender is formatting the COPY data to be transmitted over the wire. That's why parallel dump is so useful: you can use all your cores and achieve almost perfect multicore acceleration.
Compression of the archive is also a big overhead. If you want compression but want to keep the overhead to the minimum, set the minimum compression level (1). Something like:
pg_dump -Fd -j 8 -Z 1 -f target_dir yourdb
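(For completeness, the matching restore of such a directory-format dump, also parallelized — names are placeholders:)
# Restore the directory-format archive with 8 parallel workers
pg_restore -j 8 -d yourdb target_dir
Though as noted upthread, this only helps if whoever restores can use pg_restore rather than the application's own import.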