Thread: Optimize pg_dump schema-only

Optimize pg_dump schema-only

From
senor
Date:
Hi All,

I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade 
--link". Since this schema only dump can't take advantage of parallel 
processing with jobs I'm looking for any preparation or configuration 
settings that can improve speed.

9.2 to 9.6
CentOS 6/64bit
512GB

I see only one CPU of 32 doing anything and it's often at 100%. Disk IO 
is minimal. Memory use varies but always plenty to spare.

During upgrade I'm running:
     Only the upgrade - no other services
     work_mem = 50MB
     maintenance_work_mem = 2048MB
     shared_buffers = 30GB
     max_locks_per_transaction = 4096
     autovacuum = off
     autovacuum_freeze_max_age = 1500000000  #Had previous issues with 
vacuum (to prevent wrap)

Truthfully, I thought I had increased work_mem until starting this 
email. But increasing it is just a guess unless I get advice to do so 
here. I'm at a knowledge level where I can only guess at the relevance 
of vacuum, analyze or any other preparatory actions I can complete 
before taking postgres offline for upgrade. My feeling is that the 
bottleneck is the backend and not pg_dump. School me on that if needed 
please.

Any advice and explanation is appreciated.

- Senor


Re: Optimize pg_dump schema-only

From
Adrian Klaver
Date:
On 4/28/19 1:21 PM, senor wrote:
> Hi All,
> 
> I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade
> --link". Since this schema only dump can't take advantage of parallel

The above is going to need more explanation or a command line example.

> processing with jobs I'm looking for any preparation or configuration
> settings that can improve speed.
> 
> 9.2 to 9.6
> CentOS 6/64bit
> 512GB
> 
> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
> is minimal. Memory use varies but always plenty to spare.
> 
> During upgrade I'm running:
>       Only the upgrade - no other services
>       work_mem = 50MB
>       maintenance_work_mem = 2048MB
>       shared_buffers = 30GB
>       max_locks_per_transaction = 4096
>       autovacuum = off
>       autovacuum_freeze_max_age = 1500000000  #Had previous issues with
> vacuum (to prevent wrap)
> 
> Truthfully, I thought I had increased work_mem until starting this
> email. But increasing it is just a guess unless I get advice to do so
> here. I'm at a knowledge level where I can only guess at the relevance
> of vacuum, analyze or any other preparatory actions I can complete
> before taking postgres offline for upgrade. My feeling is that the
> bottleneck is the backend and not pg_dump. School me on that if needed
> please.
> 
> Any advice and explanation is appreciated.
> 
> - Senor
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Optimize pg_dump schema-only

From
Ron
Date:
On 4/28/19 3:21 PM, senor wrote:
> Hi All,
>
> I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade
> --link". Since this schema only dump can't take advantage of parallel
> processing with jobs I'm looking for any preparation or configuration
> settings that can improve speed.
>
> 9.2 to 9.6
> CentOS 6/64bit
> 512GB
>
> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
> is minimal. Memory use varies but always plenty to spare.

"pg_dump --schema-only" is single-threaded.


-- 
Angular momentum makes the world go 'round.



Re: Optimize pg_dump schema-only

From
Tom Lane
Date:
Ron <ronljohnsonjr@gmail.com> writes:
> On 4/28/19 3:21 PM, senor wrote:
>> I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
>> is minimal. Memory use varies but always plenty to spare.

> "pg_dump --schema-only" is single-threaded.

Yup.  But ...

pg_upgrade does have a --jobs option, since 9.5 or so.  What it does
is to run the dump and restore tasks for different databases of the
cluster in parallel.  So it won't help much if you've just got one
DB with a lotta objects, but I thought it was worth mentioning.

            regards, tom lane



Re: Optimize pg_dump schema-only

From
senor
Date:

I know from a previous post to the community that pg_dump --schema-only is single threaded and the --jobs option cannot benefit pg_dump in my case (single DB, Single schema, 100000+ tables). Using pg_upgrade with the --link option is very fast except for the pg_dump portion of the upgrade which takes days.

I think I am looking for advice on how to optimize the postgres process for best performance of a single thread making whatever queries pg_dump --schema-only does. Obviously, that's not a common concern for DB tuning.  At the moment, all I'm reasonably sure about is that a reduction in the number of tables should reduce the time needed but that is not the reality I'm faced with.
Thanks,
Senor

On 4/28/2019 14:08, Ron wrote:
On 4/28/19 3:21 PM, senor wrote:
Hi All,

I'm looking for advice for optimizing the pg_dump portion of "pg_upgrade
--link". Since this schema only dump can't take advantage of parallel
processing with jobs I'm looking for any preparation or configuration
settings that can improve speed.

9.2 to 9.6
CentOS 6/64bit
512GB

I see only one CPU of 32 doing anything and it's often at 100%. Disk IO
is minimal. Memory use varies but always plenty to spare.

"pg_dump --schema-only" is single-threaded.



Re: Optimize pg_dump schema-only

From
Tom Lane
Date:
senor <frio_cervesa@hotmail.com> writes:
> I know from a previous post to the community that pg_dump --schema-only is single threaded and the --jobs option
cannotbenefit pg_dump in my case (single DB, Single schema, 100000+ tables). Using pg_upgrade with the --link option is
veryfast except for the pg_dump portion of the upgrade which takes days. 

One simple question is whether the source server is the last available
minor release (9.2.24 I believe).  If not, you may be missing performance
fixes that would help.  pg_dump -s on 100K tables should not take "days",
so I'm guessing you're hitting some O(N^2) behavior somewhere, and it
might be something we fixed.

Likewise make sure that pg_dump is the newest available in the destination
release series.

            regards, tom lane



Re: Optimize pg_dump schema-only

From
senor
Date:
I'm afraid it is 9.2.4. I'm using pg_upgrade from 9.6 and that is using
pg_dump from 9.6.

I noticed on 2 installations with similar table numbers (~200,000),
schema and hardware that one was done in hours and the other didn't
finish over the weekend. Keeping tabs on pg_stat_activity indicated
pg_dump was still processing and nothing else running.

Would you say that updating to 9.2.24 would be beneficial before
upgrading to 9.6? An update is pretty quick and could be worth the time
if there aren't additional requirements prior to starting the upgrade.

Thank you.
Senor


On 4/28/2019 18:19, Tom Lane wrote:
> senor <frio_cervesa@hotmail.com> writes:
>> I know from a previous post to the community that pg_dump --schema-only is single threaded and the --jobs option
cannotbenefit pg_dump in my case (single DB, Single schema, 100000+ tables). Using pg_upgrade with the --link option is
veryfast except for the pg_dump portion of the upgrade which takes days. 
> One simple question is whether the source server is the last available
> minor release (9.2.24 I believe).  If not, you may be missing performance
> fixes that would help.  pg_dump -s on 100K tables should not take "days",
> so I'm guessing you're hitting some O(N^2) behavior somewhere, and it
> might be something we fixed.
>
> Likewise make sure that pg_dump is the newest available in the destination
> release series.
>
>             regards, tom lane




Re: Optimize pg_dump schema-only

From
Tom Lane
Date:
senor <frio_cervesa@hotmail.com> writes:
> I'm afraid it is 9.2.4.

That's pretty old :-(

> I noticed on 2 installations with similar table numbers (~200,000), 
> schema and hardware that one was done in hours and the other didn't 
> finish over the weekend. Keeping tabs on pg_stat_activity indicated 
> pg_dump was still processing and nothing else running.

Hmm ... but without knowing what's different between the two, that's
not much help.

> Would you say that updating to 9.2.24 would be beneficial before 
> upgrading to 9.6? An update is pretty quick and could be worth the time 
> if there aren't additional requirements prior to starting the upgrade.

That was my reasoning.  It might not help, but it's a simple thing
to do and it might help.

            regards, tom lane