Thread: Optimize pg_dump schema-only
Hi All,

I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade --link". Since this schema-only dump can't take advantage of parallel processing with jobs, I'm looking for any preparation or configuration settings that can improve speed.

9.2 to 9.6
CentOS 6/64bit
512GB

I see only one CPU of 32 doing anything, and it's often at 100%. Disk IO is minimal. Memory use varies, but there is always plenty to spare.

During upgrade I'm running:
Only the upgrade - no other services
work_mem = 50MB
maintenance_work_mem = 2048MB
shared_buffers = 30GB
max_locks_per_transaction = 4096
autovacuum = off
autovacuum_freeze_max_age = 1500000000   # Had previous issues with vacuum (to prevent wraparound)

Truthfully, I thought I had increased work_mem until starting this email. But increasing it is just a guess unless I get advice to do so here. I'm at a knowledge level where I can only guess at the relevance of vacuum, analyze, or any other preparatory actions I can complete before taking postgres offline for upgrade. My feeling is that the bottleneck is the backend and not pg_dump. School me on that if needed, please.

Any advice and explanation is appreciated.

- Senor
On 4/28/19 1:21 PM, senor wrote:
> Hi All,
>
> I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade
> --link". Since this schema only dump can't take advantage of parallel

The above is going to need more explanation or a command line example.

> processing with jobs I'm looking for any preparation or configuration
> settings that can improve speed.
>
> 9.2 to 9.6
> CentOS 6/64bit
> 512GB
>
> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
> is minimal. Memory use varies but always plenty to spare.
>
> During upgrade I'm running:
> Only the upgrade - no other services
> work_mem = 50MB
> maintenance_work_mem = 2048MB
> shared_buffers = 30GB
> max_locks_per_transaction = 4096
> autovacuum = off
> autovacuum_freeze_max_age = 1500000000   #Had previous issues with
> vacuum (to prevent wrap)
>
> Truthfully, I thought I had increased work_mem until starting this
> email. But increasing it is just a guess unless I get advice to do so
> here. I'm at a knowledge level where I can only guess at the relevance
> of vacuum, analyze or any other preparatory actions I can complete
> before taking postgres offline for upgrade. My feeling is that the
> bottleneck is the backend and not pg_dump. School me on that if needed
> please.
>
> Any advice and explanation is appreciated.
>
> - Senor

--
Adrian Klaver
adrian.klaver@aklaver.com
On 4/28/19 3:21 PM, senor wrote:
> Hi All,
>
> I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade
> --link". Since this schema only dump can't take advantage of parallel
> processing with jobs I'm looking for any preparation or configuration
> settings that can improve speed.
>
> 9.2 to 9.6
> CentOS 6/64bit
> 512GB
>
> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
> is minimal. Memory use varies but always plenty to spare.

"pg_dump --schema-only" is single-threaded.

--
Angular momentum makes the world go 'round.
Ron <ronljohnsonjr@gmail.com> writes:
> On 4/28/19 3:21 PM, senor wrote:
>> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
>> is minimal. Memory use varies but always plenty to spare.

> "pg_dump --schema-only" is single-threaded.

Yup. But ... pg_upgrade does have a --jobs option, since 9.5 or so. What it does is to run the dump and restore tasks for different databases of the cluster in parallel. So it won't help much if you've just got one DB with a lotta objects, but I thought it was worth mentioning.

			regards, tom lane
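As an illustration of the --jobs invocation Tom mentions, a minimal sketch (binary and data directory paths here are assumptions for a typical PGDG CentOS install, not taken from the thread):

```shell
# --jobs parallelizes the dump/restore per *database*, so with one large
# database it buys little, but it costs nothing to pass.
/usr/pgsql-9.6/bin/pg_upgrade \
    --old-bindir=/usr/pgsql-9.2/bin \
    --new-bindir=/usr/pgsql-9.6/bin \
    --old-datadir=/var/lib/pgsql/9.2/data \
    --new-datadir=/var/lib/pgsql/9.6/data \
    --link \
    --jobs=8
```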
I know from a previous post to the community that pg_dump --schema-only is single-threaded and that the --jobs option cannot benefit pg_dump in my case (single DB, single schema, 100000+ tables). Using pg_upgrade with the --link option is very fast except for the pg_dump portion of the upgrade, which takes days.

I think I am looking for advice on how to tune the postgres backend for best performance of a single thread servicing whatever queries pg_dump --schema-only makes. Obviously, that's not a common concern for DB tuning. At the moment, all I'm reasonably sure about is that reducing the number of tables should reduce the time needed, but that is not the reality I'm faced with.
Thanks,
Senor
On 4/28/2019 14:08, Ron wrote:
> On 4/28/19 3:21 PM, senor wrote:
>> Hi All,
>>
>> I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade
>> --link". Since this schema only dump can't take advantage of parallel
>> processing with jobs I'm looking for any preparation or configuration
>> settings that can improve speed.
>>
>> 9.2 to 9.6
>> CentOS 6/64bit
>> 512GB
>>
>> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
>> is minimal. Memory use varies but always plenty to spare.
>
> "pg_dump --schema-only" is single-threaded.
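One way to test whether the bottleneck is the backend or pg_dump itself, without taking the cluster offline, is to time a throwaway schema-only dump against the live 9.2 server. A sketch, assuming the 9.6 client binaries are already installed and a placeholder database name:

```shell
# Run the 9.6 pg_dump (the one pg_upgrade will invoke) against the live
# 9.2 cluster; discard the output so only the catalog queries and dump
# logic are measured.
time /usr/pgsql-9.6/bin/pg_dump --schema-only \
    --host=localhost --dbname=mydb > /dev/null

# Meanwhile, from another session, watch what the dump's backend is doing:
psql -d mydb -c "SELECT pid, state, query FROM pg_stat_activity
                 WHERE application_name = 'pg_dump';"
```

If `time` reports the wall clock dominated by backend-side query time (pg_dump's CPU staying low while the server process runs hot), that would support the "backend is the bottleneck" theory from the first message.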
senor <frio_cervesa@hotmail.com> writes:
> I know from a previous post to the community that pg_dump --schema-only
> is single threaded and the --jobs option cannot benefit pg_dump in my
> case (single DB, single schema, 100000+ tables). Using pg_upgrade with
> the --link option is very fast except for the pg_dump portion of the
> upgrade which takes days.

One simple question is whether the source server is the last available minor release (9.2.24 I believe). If not, you may be missing performance fixes that would help. pg_dump -s on 100K tables should not take "days", so I'm guessing you're hitting some O(N^2) behavior somewhere, and it might be something we fixed.

Likewise make sure that pg_dump is the newest available in the destination release series.

			regards, tom lane
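Checking both ends of Tom's suggestion is quick. A sketch, assuming standard PGDG binary locations:

```shell
# Minor version of the running source server:
psql -At -c "SELECT version();"

# Version of the pg_dump binary that pg_upgrade will invoke:
/usr/pgsql-9.6/bin/pg_dump --version
```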
I'm afraid it is 9.2.4. I'm using pg_upgrade from 9.6, and that is using pg_dump from 9.6.

I noticed on 2 installations with similar table numbers (~200,000), schema, and hardware that one was done in hours and the other didn't finish over the weekend. Keeping tabs on pg_stat_activity indicated pg_dump was still processing and nothing else was running.

Would you say that updating to 9.2.24 would be beneficial before upgrading to 9.6? An update is pretty quick and could be worth the time if there aren't additional requirements prior to starting the upgrade.

Thank you.
Senor

On 4/28/2019 18:19, Tom Lane wrote:
> senor <frio_cervesa@hotmail.com> writes:
>> I know from a previous post to the community that pg_dump --schema-only
>> is single threaded and the --jobs option cannot benefit pg_dump in my
>> case (single DB, single schema, 100000+ tables). Using pg_upgrade with
>> the --link option is very fast except for the pg_dump portion of the
>> upgrade which takes days.
>
> One simple question is whether the source server is the last available
> minor release (9.2.24 I believe). If not, you may be missing performance
> fixes that would help. pg_dump -s on 100K tables should not take "days",
> so I'm guessing you're hitting some O(N^2) behavior somewhere, and it
> might be something we fixed.
>
> Likewise make sure that pg_dump is the newest available in the destination
> release series.
>
> 			regards, tom lane
senor <frio_cervesa@hotmail.com> writes:
> I'm afraid it is 9.2.4.

That's pretty old :-(

> I noticed on 2 installations with similar table numbers (~200,000),
> schema and hardware that one was done in hours and the other didn't
> finish over the weekend. Keeping tabs on pg_stat_activity indicated
> pg_dump was still processing and nothing else running.

Hmm ... but without knowing what's different between the two, that's not much help.

> Would you say that updating to 9.2.24 would be beneficial before
> upgrading to 9.6? An update is pretty quick and could be worth the time
> if there aren't additional requirements prior to starting the upgrade.

That was my reasoning. It might not help, but it's a simple thing to do and it might help.

			regards, tom lane
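For the record, a minor-version update on CentOS 6 is just a package update plus a restart, since minor releases are binary-compatible with the existing data directory. A sketch, assuming the PGDG 9.2 packages and the stock init script (package and service names may differ on your system):

```shell
# No dump/restore needed for a minor release; binaries are swapped in place.
yum update postgresql92 postgresql92-server postgresql92-libs
service postgresql-9.2 restart
```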