Thread: Updating large postgresql database with blobs
Hello,

I didn't get any response on the GENERAL list so I'm escalating this ...

We have several independent database servers with ~50GB+ databases running postgres 8.0.x. We are planning to upgrade these databases to postgres 8.2.x over the weekend.

We plan to use the following steps to upgrade each server:

1. Dump the 8.0.x database cluster using 8.2.x pg_dumpall
% ./pg_dumpall > pgdumpall_backup.sql

2. Dump the 8.0.x database, including large objects, in compressed custom format using 8.2.x pg_dump
% ./pg_dump -Fc -b -Z9 dbname > pgdump_lobs_backup

Restoring database
1. Initialize the 8.2.x database
% initdb -D /data/pgdata

2. Restore the template1 database from the cluster dump
% ./psql -d template1 < pgdumpall_backup.sql

3. Drop database dbname, otherwise restoring will give an error about the existing dbname
% dropdb dbname

4. Create a fresh dbname
% createdb -O dbowner dbname

5. Restore the database with large objects
% ./pg_restore -v -Fc -d dbname -e -U dbowner < pgdump_lobs_backup

Some of the problems we have are:
1. We are not sure if all of the data will be available after a dump/restore with the above process.
2. The dump and restore process is far too slow to complete over the weekend (it takes approx 1GB/hr to dump on a dual G5 PPC 2GHz with 1GB RAM and RAID 1 disks).

What is the fastest way to upgrade postgres for large databases that have binary objects?

Thanks for all your help.
CAJ CAJ wrote:
<snip>
> What is the fastest way to upgrade postgres for large databases that
> have binary objects?

Your procedure dumps and restores the databases twice. This seems less than sound. My prediction is that you could get a 50% speed improvement by fixing that ...

The only thing you really need pg_dumpall for is the global tables. I would just use pg_dumpall -g to get those, and then use pg_dump -F c + pg_restore for each actual database.

Another thing is to make sure that pg_dump/pg_restore are not competing with postgres for access to the same disk(s). One way to do that is to run them from a different machine - they don't have to be run on the server machine - of course then the network can become a bottleneck, so YMMV.

cheers

andrew
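A minimal sketch of that single-pass approach, assuming the 8.2.x binaries are the ones on the PATH and reusing the names from the original post (dbname, dbowner, /data/pgdata):

On the 8.0.x server:
% pg_dumpall -g > globals.sql                # roles and other globals only
% pg_dump -Fc -Z9 dbname > dbname.dump       # custom format; large objects included

On the new 8.2.x server:
% initdb -D /data/pgdata
% psql -d template1 < globals.sql
% createdb -O dbowner dbname
% pg_restore -v -Fc -d dbname -U dbowner < dbname.dump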
> > What is the fastest way to upgrade postgres for large databases that
> > have binary objects?
>
> Your procedure dumps and restores the databases twice. This seems less
> than sound. My prediction is that you could get a 50% speed improvement
> by fixing that ...
Thanks for the response. This would be wonderful if I can get my process right. My assumption (probably incorrect) is that pg_dump has to be executed twice on a database with blobs: once to get the data and once to get the blobs (using the -b flag).
> The only thing you really need pg_dumpall for is the global tables. I
> would just use pg_dumpall -g to get those, and then use pg_dump -F c +
> pg_restore for each actual database.
This makes sense :) I assume that running pg_dump with -b will get all of the data including the blobs?
> Another thing is to make sure that pg_dump/pg_restore are not competing
> with postgres for access to the same disk(s). One way to do that is to
> run them from a different machine - they don't have to be run on the
> server machine - of course then the network can become a bottleneck, so
> YMMV.
We are using separate servers for dump and restore.
Thanks again for your suggestions. This helps immensely.
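For what it's worth, the dump side can be driven from a machine other than the database server by pointing pg_dump at it over the network; dbhost below is a placeholder for the 8.0.x server's hostname:

% pg_dump -h dbhost -U dbowner -Fc -Z9 dbname > dbname.dump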
CAJ CAJ wrote:
> Thanks for the response. This would be wonderful if I can get my process
> right. My assumption (probably incorrect) is that pg_dump has to be
> executed twice on a database with blobs: once to get the data and once
> to get the blobs (using the -b flag).

Why do you assume that? The pg_dump manual says:

  -b
  --blobs
    Include large objects in the dump. This is the default behavior except
    when --schema, --table, or --schema-only is specified, so the -b switch
    is only useful to add large objects to selective dumps.

IOW, in most cases you will just get the large objects for free, and don't even need to use this flag to get them. No need to run twice.

cheers

andrew
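As a quick sanity check that a given custom-format dump really carried the large objects, the dump's table of contents can be listed without restoring anything; in 8.x-era dumps the large-object data shows up as BLOBS entries (exact labels may vary between versions):

% pg_restore -l dbname.dump | grep -i blob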