LARGE db dump/restore for upgrade question

From: Philip Crotwell
Date:
Hi

I have a very large database of seismic data. It is about 27 Gb now, and
growing at about the rate of 1 Gb every 3-4 days. I am running
postgres 7.1.2. I might possibly try to upgrade to 7.2 when it comes out,
but I don't know if it will be possible for me to do 7.3 due to
the pg_dump/pg_restore problem. In a little over a year the database will
probably pass the halfway point on my raid and so it will physically be
impossible to upgrade. Most of the space is probably taken up by large
objects, which I am hoping will make a solution at least a little bit
easier.

I am trying a pg_dump right now, and in the first 25 minutes it dumped
54Mb, which means that a full dump will take about 200 hours! I would
guess the restore would take about the same amount of time, so I would be
looking at 17 DAYS of downtime to upgrade! Maybe it will speed up later in
the dump, I don't know. And in about 3 months or so it will take me twice
that amount of time. Also, this is on a 4 processor Sun E450 with an A1000
hardware raid, so it's not that I am using old slow hardware. Just for
comparison, a file system dump to tape took 6 hours, and that was back
when I only had a software raid!
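
For the curious, a dump command along these lines is what I am
experimenting with (the output path here is made up; I believe -Fc and -b
select pg_dump's custom archive format and include the large objects):

date ; time pg_dump -Fc -b -f /raid/dumps/sceppdata.dump sceppdata ; date

At 54Mb per 25 minutes, 27Gb works out to roughly 12,500 minutes, which is
where the 200 hour figure comes from.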

So, my question is, is it likely that one day postgres will no longer
require dump/restores for upgrades? I would assume that there will always
be a need to tweak the internal layout of files, but I wonder if there
isn't a way to do this "in place" or at least to allow a file system move
of the large objects without requiring them to be dumped as well?
Even better would be if each new version of postgres could read the
immediately previous version's tables, and could convert them in the
background. Maybe just dreaming here. :)

Could something related to making the upgrade less painful for very large
databases be added to the ToDo list even if it isn't a high priority?

Not that I am complaining, postgres seems to handle this data volume quite
well, and it is certainly worth every dollar I didn't pay for it. :)

Any suggestion on how to prepare for the next upgrade would be
appreciated.

thanks,
Philip



Re: LARGE db dump/restore for upgrade question

From: Joseph Shraibman
Date:

Philip Crotwell wrote:
> Hi
>
> I have a very large database of seismic data. It is about 27 Gb now, and
> growing at about the rate of 1 Gb every 3-4 days. I am running
<snip>

Out of curiosity, how long does it take you to vacuum that?

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com


RE: LARGE db dump/restore for upgrade question

From: "Andrew Snow"
Date:
> Any suggestion on how to prepare for the next upgrade would be
> appreciated.

I think it has to be said that if you want decent performance on excessively
large (50GB+) databases, you're going to need excessively good hardware to
operate them on.  Buy a 3ware IDE RAID controller (www.hypermicro.com) and
stripe a couple of Seagate 80GB IDE drives (the new ones have an awesome
transfer rate).  Turn off fsync, and use softupdates or even async I/O.
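
For example, something along these lines (just a sketch, assuming the
fsync setting in 7.1's postgresql.conf; the usual caveat applies that
running without fsync can lose data if the machine crashes):

# disable fsync, then restart the postmaster
echo "fsync = false" >> $PGDATA/postgresql.conf
pg_ctl restart -D $PGDATA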

I am guessing though you've already done all of the above.  You might have
to go to beefier hardware; for PCs that means DDR SDRAM/Rambus, and a 64 bit
PCI 3ware card with 4 or 8 hard drives.


Hope that helps


Andrew



Re: LARGE db dump/restore for upgrade question

From: Philip Crotwell
Date:
Hi

Just did a vacuum, took almost 4 hours. The interesting thing about this
is that there are only two small tables that have updates, everything else
has been just inserts. I would have thought that a vacuum of a database
shouldn't take very long if there aren't a lot of "deleted" rows.

Another problem is that the size of the database is now 52Gb!!! I
assume that it is WAL files growing out of control? I thought this was
fixed in 7.1.2? As soon as I start a second vacuum the size drops back
down to 27Gb.

thanks,
Philip


pooh 19>date ; time vacuumdb sceppdata ; date
Wed Aug 15 09:09:19 EDT 2001
VACUUM
0.03u 0.08s 3:49:07.53 0.0%
Wed Aug 15 12:58:29 EDT 2001
pooh 20>
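
Next time I may try vacuuming only the two tables that actually see
updates, something like the following (the table name here is made up; I
think vacuumdb's -t switch, or a plain VACUUM of a single table from psql,
limits the work to just that table):

vacuumdb -t eventaccess sceppdata
# or, equivalently, from psql:
psql -c "VACUUM VERBOSE eventaccess;" sceppdata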



On Tue, 14 Aug 2001, Joseph Shraibman wrote:

>
>
> Philip Crotwell wrote:
> > Hi
> >
> > I have a very large database of seismic data. It is about 27 Gb now, and
> > growing at about the rate of 1 Gb every 3-4 days. I am running
> <snip>
>
> Out of curiosity, how long does it take you to vacuum that?
>
> --
> Joseph Shraibman
> jks@selectacast.net
> Increase signal to noise ratio.  http://www.targabot.com
>
>



viewing/restoring old rows

From: David Wright
Date:
My understanding of pgSQL is that, when a row is updated or deleted, the
old row is marked as such, but maintained in the DB until I vacuum.
(This certainly would ease the implementation of transactions.)
Please correct me if I am mistaken. Assuming for now this is true...

Is there some way to view the old rows or even restore one?

And doesn't this mean a timestamp must be automagically associated with
each row? How do I access these timestamps?
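
The closest thing I have found so far is the per-row system columns, which
appear to hold transaction IDs rather than timestamps (the table and
database names below are just placeholders):

psql -c "SELECT xmin, xmax, * FROM some_table LIMIT 5;" mydb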


Re: LARGE db dump/restore for upgrade question

From: wsheldah@lexmark.com
Date:

If there were lots of inserts, I would guess it would need to re-analyze the
tables to update its statistics on them, so the query optimizer can make good
choices.
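
The statistics update would come from VACUUM ANALYZE rather than plain
VACUUM; vacuumdb can run that with its analyze switch (-z, if I remember
right), something like:

vacuumdb -z sceppdata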

--Wes




Philip Crotwell <crotwell%seis.sc.edu@interlock.lexmark.com> on 08/15/2001
01:08:19 PM

To:   Joseph Shraibman <jks%selectacast.net@interlock.lexmark.com>
cc:   pgsql-general%postgresql.org@interlock.lexmark.com (bcc: Wesley
      Sheldahl/Lex/Lexmark)
Subject:  Re: [GENERAL] LARGE db dump/restore for upgrade question



Hi

Just did a vacuum, took almost 4 hours. The interesting thing about this
is that there are only two small tables that have updates, everything else
has been just inserts. I would have thought that a vacuum of a database
shouldn't take very long if there aren't a lot of "deleted" rows.






Re: LARGE db dump/restore for upgrade question

From: Micah Yoder
Date:
> I am trying a pg_dump right now, and in the first 25 minutes it dumped
> 54Mb, which means that a full dump will take about 200 hours! I would
> guess the restore would take about the same amount of time, so I would be
> looking at 17 DAYS of downtime to upgrade! Maybe it will speed up later in

Thinking out loud here ...

Perhaps a slightly modified (or even custom written) pg_dump that dumps your
large tables incrementally.  Since you said that the large tables only have
inserts, you should be able to run most of this while the DB is up.  Have it
keep track of the timestamp of the last row dumped for each table.  Then it
could continue from there, appending to the same file, next time you run it.
If you keep the dump, then you won't have to do another large pg_dump next
time you upgrade, assuming you never do updates on those tables.
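
A very rough sketch of the kind of thing I mean, assuming the big table
has a column recording the insert time (all of the names here are
invented):

# last_dumped holds the timestamp of the newest row already dumped
LAST=`cat last_dumped`
psql -A -t -c "SELECT * FROM waveform WHERE inserted_at > '$LAST'" sceppdata >> waveform.incremental
psql -A -t -c "SELECT max(inserted_at) FROM waveform" sceppdata > last_dumped

The restore side would need a matching script to load the flat file back
in, of course.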

Of course, you'll still have to wait a while for it to restore after every
upgrade.  The other suggestion about fast hardware should help there.

If this doesn't make any sense, sorry, it's 3 in the morning....

> Not that I am complaining, postgres seems to handle this data volume quite
> well, and it is certainly worth every dollar I didn't pay for it. :)

What a ringing endorsement....

--
Like to travel?                        http://TravTalk.org
Micah Yoder Internet Development       http://yoderdev.com