Thread: Ways to speed up dump&reload

Ways to speed up dump&reload

From
"Markus Wollny"
Date:
Hello!

Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 8.0.1 - and though I'm very much looking
forwardto some of the new features, the dump&reload process is worrying me a bit this time. I've got one cluster in
particularwhich is roughly 9GB in size and features some dreadfully large indices of the GiST-type for Tsearch2. 

I have already scheduled a nightshift for this upgrade, but I'm not all too certain that I'll be up and running again
inthe morning, so I wondered if there might be some preparations that would allow for some speed-up during the reload
process?What kind of tuning could be done in postgresql.conf in respect to just this particular workload (COPY and
finallythe CREATE INDEX stuff) with no other concurrent access going on? 

The machine in question features a RAID10 disk array which hosts the DB-cluster, some 2GB RAM and four processors and
I'vetuned postgresql.conf for our ordinary everyday workload with lots and lots of concurrent reads and writes, but
I'vegot the feeling that this configuration might not be the optimum for dump&reload, so it might make some sense to
usesome sort of "maintenance"-configuration. What kind of tuning would you recommend? 

Thank you very much for your advice and lots of cheers to the developers for 8.0!

Kind regards

   Markus

Re: Ways to speed up dump&reload

From
Csaba Nagy
Date:
Hi Markus,

Have you considered installing slony and replicate from the old version
to the new one ? It can do that, and when the replica is up to date, the
switchover will take minutes I guess.
Note that I have never done that so please ask others about that too, or
research for yourself ;-)

Cheers,
Csaba.


On Mon, 2005-02-21 at 07:11, Markus Wollny wrote:
> Hello!
>
> Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 8.0.1 - and though I'm very much looking
forwardto some of the new features, the dump&reload process is worrying me a bit this time. I've got one cluster in
particularwhich is roughly 9GB in size and features some dreadfully large indices of the GiST-type for Tsearch2. 
>
> I have already scheduled a nightshift for this upgrade, but I'm not all too certain that I'll be up and running again
inthe morning, so I wondered if there might be some preparations that would allow for some speed-up during the reload
process?What kind of tuning could be done in postgresql.conf in respect to just this particular workload (COPY and
finallythe CREATE INDEX stuff) with no other concurrent access going on? 
>
> The machine in question features a RAID10 disk array which hosts the DB-cluster, some 2GB RAM and four processors and
I'vetuned postgresql.conf for our ordinary everyday workload with lots and lots of concurrent reads and writes, but
I'vegot the feeling that this configuration might not be the optimum for dump&reload, so it might make some sense to
usesome sort of "maintenance"-configuration. What kind of tuning would you recommend? 
>
> Thank you very much for your advice and lots of cheers to the developers for 8.0!
>
> Kind regards
>
>    Markus
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq


Re: Ways to speed up dump&reload

From
Oleg Bartunov
Date:
Markus,

have you read updated http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html
about regprocedure_7.4.patch.gz and regprocedure_update.sql ?

Oleg
On Mon, 21 Feb 2005, Markus Wollny wrote:

> Hello!
>
> Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 8.0.1 - and though I'm very much looking
forwardto some of the new features, the dump&reload process is worrying me a bit this time. I've got one cluster in
particularwhich is roughly 9GB in size and features some dreadfully large indices of the GiST-type for Tsearch2. 
>
> I have already scheduled a nightshift for this upgrade, but I'm not all too certain that I'll be up and running again
inthe morning, so I wondered if there might be some preparations that would allow for some speed-up during the reload
process?What kind of tuning could be done in postgresql.conf in respect to just this particular workload (COPY and
finallythe CREATE INDEX stuff) with no other concurrent access going on? 
>
> The machine in question features a RAID10 disk array which hosts the DB-cluster, some 2GB RAM and four processors and
I'vetuned postgresql.conf for our ordinary everyday workload with lots and lots of concurrent reads and writes, but
I'vegot the feeling that this configuration might not be the optimum for dump&reload, so it might make some sense to
usesome sort of "maintenance"-configuration. What kind of tuning would you recommend? 
>
> Thank you very much for your advice and lots of cheers to the developers for 8.0!
>
> Kind regards
>
>   Markus
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: Ways to speed up dump&reload

From
"Magnus Hagander"
Date:
> Hello!
>
> Finally the time has come for us to upgrade our PostgreSQL
> 7.4.5-servers to 8.0.1 - and though I'm very much looking
> forward to some of the new features, the dump&reload process
> is worrying me a bit this time. I've got one cluster in
> particular which is roughly 9GB in size and features some
> dreadfully large indices of the GiST-type for Tsearch2.
>
> I have already scheduled a nightshift for this upgrade, but
> I'm not all too certain that I'll be up and running again in
> the morning, so I wondered if there might be some
> preparations that would allow for some speed-up during the
> reload process? What kind of tuning could be done in
> postgresql.conf in respect to just this particular workload
> (COPY and finally the CREATE INDEX stuff) with no other
> concurrent access going on?

If you'll be around to babysit the system during the reload, turn off
fsync during the load.
If you have the RAM, consider temporarily moving the WAL logs to a
ramdrive.
Increase sort_mem and maintenance_sort_mem, since you will only have a
single connection.
If you're on hyperthreading CPUs, disable hyperthreading. Since you load
in a single connectino, only one CPU will be used.

Be very careful to remember to turn this back on after the load!

Not sure if there is something specific you can do for tsearch, but this
should help with the general stuff

//Magnus

Re: Ways to speed up dump&reload

From
"Markus Wollny"
Date:
Hello Oleg,

Yes, thanks - I have installed this patch as soon as it became available. Concerning my largest database, I haven't yet
migratedthat to tsearch2 - it's still tsearch1-based, so I intend to drop ts1 before the last dump, reload the database
withPostgreSQL 8.0.1 and deploy tsearch2 after that. I was thinking more in terms of what would be the optimum settings
forthings like workmem during reload and creation of the indices - as there's not much else going on at this point of
time,it might be worthwhile to use some settings which allocate all available resources to just this task. 

Kind regards

   Markus

-----Original Message-----
From:    Oleg Bartunov [mailto:oleg@sai.msu.su]
Sent:    Mon 2/21/2005 11:15
To:    Markus Wollny
Cc:    pgsql-general@postgresql.org
Subject:    Re: [GENERAL] Ways to speed up dump&reload
Markus,

have you read updated http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html
about regprocedure_7.4.patch.gz and regprocedure_update.sql ?

Oleg


Re: Ways to speed up dump&reload

From
"Markus Wollny"
Date:
Hi!

Thanks very much, this was exactly the kind of advice I was hoping for! I'll give the WAL to ramdisk thing a try; fsync
isoff by default anyways - the data in this cluster is not so valuable as not to risk one day of rollback for the
performancegain of having fsync turned off and I'm doing nightly dumps anyway AND the machine as both a dedicated UPS
andredundant PSUs, so the likelyhood of things turning snafu is considerably small. But having WAL in RAM might
actuallyspeed up the whole lot quite considerably.  

Thank you!

Kind regards

   Markus


-----Original Message-----
From:    Magnus Hagander [mailto:mha@sollentuna.net]
Sent:    Mon 2/21/2005 13:04
To:    Markus Wollny; pgsql-general@postgresql.org
Cc:
Subject:    RE: [GENERAL] Ways to speed up dump&reload
> Hello!
>
> Finally the time has come for us to upgrade our PostgreSQL
> 7.4.5-servers to 8.0.1 - and though I'm very much looking
> forward to some of the new features, the dump&reload process
> is worrying me a bit this time. I've got one cluster in
> particular which is roughly 9GB in size and features some
> dreadfully large indices of the GiST-type for Tsearch2.
>
> I have already scheduled a nightshift for this upgrade, but
> I'm not all too certain that I'll be up and running again in
> the morning, so I wondered if there might be some
> preparations that would allow for some speed-up during the
> reload process? What kind of tuning could be done in
> postgresql.conf in respect to just this particular workload
> (COPY and finally the CREATE INDEX stuff) with no other
> concurrent access going on?

If you'll be around to babysit the system during the reload, turn off
fsync during the load.
If you have the RAM, consider temporarily moving the WAL logs to a
ramdrive.
Increase sort_mem and maintenance_sort_mem, since you will only have a
single connection.
If you're on hyperthreading CPUs, disable hyperthreading. Since you load
in a single connectino, only one CPU will be used.

Be very careful to remember to turn this back on after the load!

Not sure if there is something specific you can do for tsearch, but this
should help with the general stuff

//Magnus




Re: Ways to speed up dump&reload

From
Christopher Browne
Date:
Quoth Markus.Wollny@computec.de ("Markus Wollny"):
> Finally the time has come for us to upgrade our PostgreSQL
> 7.4.5-servers to 8.0.1 - and though I'm very much looking forward to
> some of the new features, the dump&reload process is worrying me a
> bit this time. I've got one cluster in particular which is roughly
> 9GB in size and features some dreadfully large indices of the
> GiST-type for Tsearch2.
>
> I have already scheduled a nightshift for this upgrade, but I'm not
> all too certain that I'll be up and running again in the morning, so
> I wondered if there might be some preparations that would allow for
> some speed-up during the reload process? What kind of tuning could
> be done in postgresql.conf in respect to just this particular
> workload (COPY and finally the CREATE INDEX stuff) with no other
> concurrent access going on?

You might consider using Slony-I to minimize the downtime.

I haven't done an upgrade using, but know that others have, and I've
done the same using eRServer...

The idea is that you set up an 8.0.1 backend, and set up replication
well ahead of time.  Replicate from the 7.4 system to the 8.0 one.

It might take several days to get replication up to date if the
databases are large enough, but once they are relatively in sync, they
should stay there pretty easily.

Switching versions is then as easy as using MOVE SET to switch the
origin from the 7.4 system to the 8.0 one.  That ought to take mere
seconds, so you wouldn't need a long time to do the upgrade.  See the
URL below for more information...
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/versionupgrade.html
"Did you  ever walk in a  room and forget  why you walked in?  I think
that's how dogs spend their lives." -- Sue Murphy

Re: Ways to speed up dump&reload

From
Christopher Browne
Date:
You might consider using Slony-I to minimize the downtime.

I haven't done an upgrade using, but know that others have, and I've
done the same using eRServer...

The idea is that you set up an 8.0.1 backend, and set up replication
well ahead of time.  Replicate from the 7.4 system to the 8.0 one.

It might take several days to get replication up to date if the
databases are large enough, but once they are relatively in sync, they
should stay there pretty easily.

Switching versions is then as easy as using MOVE SET to switch the
origin from the 7.4 system to the 8.0 one.  That ou
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/versionupgrade.html
The quickest way to a man's heart is through his chest, with an axe.

Re: Ways to speed up dump&reload

From
"Markus Wollny"
Date:
Hi!

Thanks, I'll try Slony-I next time - I currently lack the time to test it beforehand on my non-production system; I'd
bea bit worried about the additional load a second PG-instance as replication slave would impose on the machine, so at
themoment I'm more confident going along the same tracks I've gone before during the previous switches since 7.1. 

Kind regards

   Markus




Re: Ways to speed up dump&reload

From
Vivek Khera
Date:
max out your checkpoint_segments.  128 or 256 is good if you have
plenty of spare space for the pg_xlog directory.  you will also want
to increase checkpoint_timeout to something large.  I like 900 seconds
personally.  if you can put pg_xlog on a separate physical RAID you're
best off.

increase the amount of ram your index operations can use (sort_mem or
work_mem depending on PG version) to a very large number and then
reset that to a normal value when you're done.

then just let pg_dump + pg_restore do its work.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/