Thread: pg_restore

pg_restore

From
Mike Orr
Date:
I'm converting a MySQL webapp to PostgreSQL. I have a backup server
which is refreshed twice daily with mysqldump/mysql and has a
continuously-running copy of the webapp. I want to replicate this with
pg_dump/pg_restore.  Ideally I'd like to restore just a few tables,
without stopping the webapp and closing/reopening its connections. Is
this possible with pg_restore?

MySQL simply locks the tables, drops/recreates them, loads the data,
and unlocks the tables. Other connections have to wait but they don't
have to be closed/reopened. The PostgreSQL manual recommends restoring
into an empty database using template0, which would require first
closing the other connections and then dropping the database. It would
also take unnecessary time to recreate the database and tables that
aren't changing. So I'm wondering if there's a less obtrusive way to
refresh the data.

The tables to be updated do have 1:many relationships. How would I
restore all of them at once? List them all as '-t' args to the same
pg_restore command?

--
Mike Orr <sluggoster@gmail.com>

Re: pg_restore

From
John R Pierce
Date:
On 03/30/11 1:56 PM, Mike Orr wrote:
> MySQL simply locks the tables, drops/recreates them, loads the data,
> and unlocks the tables. Other connections have to wait but they don't
> have to be closed/reopened. The PostgreSQL manual recommends restoring
> into an empty database using template0, which would require first
> closing the other connections and then dropping the database. It would
> also take unnecessary time to recreate the database and tables that
> aren't changing. So I'm wondering if there's a less obtrusive way to
> refresh the data.

it's a backup server, right?  so no one is accessing it, are they?

rather than using pg_dump -Fc | pg_restore, you can use pg_dump | psql
... and you can tell pg_dump in this mode to only dump specified tables.
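
for example, something along these lines (table and database names here
are just made up for illustration):

   pg_dump --clean -t incidents -t entries -t locator -t events webapp \
       | psql -h backuphost webapp

--clean makes pg_dump emit drop-and-recreate statements for just those
tables, which is roughly the drop/recreate/reload flow you described.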

however, you might look at PITR and/or WAL log shipping rather than
dump/restore.  this would only ship the new data, and when you play back
the WAL on the backup server you can bring it up to whatever point in
time you want.
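
in config terms that's roughly this on the master (a sketch only; the
archive path and hostname are made up):

   # postgresql.conf on the master
   wal_level = archive          # 9.0+; not present on 8.x
   archive_mode = on
   archive_command = 'rsync %p backuphost:/var/lib/pgsql/wal_archive/%f'

and on the backup server, a recovery.conf that replays those segments:

   restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'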




Re: pg_restore

From
Scott Marlowe
Date:
On Wed, Mar 30, 2011 at 2:56 PM, Mike Orr <sluggoster@gmail.com> wrote:
> I'm converting a MySQL webapp to PostgreSQL. I have a backup server
> which is refreshed twice daily with mysqldump/mysql and has a
> continuously-running copy of the webapp. I want to replicate this with
> pg_dump/pg_restore.  Ideally I'd like to restore just a few tables,
> without stopping the webapp and closing/reopening its connections. Is
> this possible with pg_restore?
>

Would replication of just those tables give you what you want?  If so
look into slony which allows you to replicate individual tables.
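
A rough slonik sketch of the table-level part, just to give the flavor
(cluster, node, and table names are invented, and this leaves out the
init cluster / store node / store path boilerplate):

   cluster name = webapp;
   node 1 admin conninfo = 'dbname=webapp host=master';
   node 2 admin conninfo = 'dbname=webapp host=backup';

   create set (id = 1, origin = 1, comment = 'webapp tables');
   set add table (set id = 1, origin = 1, id = 1,
                  fully qualified name = 'public.incidents');
   subscribe set (id = 1, provider = 1, receiver = 2, forward = no);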

Re: pg_restore

From
Mike Orr
Date:
On Wed, Mar 30, 2011 at 2:36 PM, John R Pierce <pierce@hogranch.com> wrote:
> On 03/30/11 1:56 PM, Mike Orr wrote:
>>
>> MySQL simply locks the tables, drops/recreates them, loads the data,
>> and unlocks the tables. Other connections have to wait but they don't
>> have to be closed/reopened. The PostgreSQL manual recommends restoring
>> into an empty database using template0, which would require first
>> closing the other connections and then dropping the database. It would
>> also take unnecessary time to recreate the database and tables that
>> aren't changing. So I'm wondering if there's a less obtrusive way to
>> refresh the data.
>
> it's a backup server, right?  so no one is accessing it, are they?

Somebody may be accessing it. I could take the site down for the
duration, but it would be easier not to.

> rather than using pg_dump -Fc | pg_restore, you can use pg_dump | psql ...
> and you can tell pg_dump in this mode to only dump specified tables.

That might be a better solution. I was hoping to use the same pg_dump
file for this that I also use for routine offline backup, but maybe
this is such a special case that a separate dump file would be better.

> however, you might look at PITR and/or WAL log shipping rather than
> dump/restore.  this would only ship the new data, and when you play back
> the WAL on the backup server you can bring it up to whatever point in time
> you want.

I looked at PITR and WAL, but it looks like I can't have the backup
database running and answering queries while it's WAL'ing. I'd have to
log in and switch it to normal mode and start the webapp, and that's
what I may not be able to do if the backup server were needed.

The backup server exists in case there's something like an earthquake
at the main data center. Something that could cause a regional
communication blackout and prevent the sysadmins from accessing the
backup server remotely. The site is a data-sharing tool for emergency
responders across the country. They may be in another region working
on an unrelated incident, and need the website. So the backup site has
to be already running and loaded with data -- we can't depend on a
sysadmin being able to log in remotely to turn it on.

--
Mike Orr <sluggoster@gmail.com>

Re: pg_restore

From
John R Pierce
Date:
On 03/30/11 4:08 PM, Mike Orr wrote:
> I looked at PITR and WAL, but it looks like I can't have the backup
> database running and answering queries while it's WAL'ing. I'd have to
> log in and switch it to normal mode and start the webapp, and that's
> what I may not be able to do if the backup server were needed.
>

look at the 'warm standby' stuff.  just realize that this backup HAS to
be read-only.
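
in 9.0 terms a warm standby is basically just a recovery.conf like this
on the backup box (paths made up), with hot_standby left off so it
doesn't answer queries until you promote it:

   standby_mode = 'on'
   restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'
   trigger_file = '/tmp/promote_me'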


> The backup server exists in case there's something like an earthquake
> at the main data center. Something that could cause a regional
> communication blackout and prevent the sysadmins from accessing the
> backup server remotely. The site is a data-sharing tool for emergency
> responders across the country. They may be in another region working
> on an unrelated incident, and need the website. So the backup site has
> to be already running and loaded with data -- we can't depend on a
> sysadmin being able to log in remotely to turn it on


distributed failover is tricky stuff.  the standby *could* monitor the
active server and initiate automatic failover if it goes down, but you
don't want a spontaneous failover when communications between the servers
are merely disrupted, or you end up with two masters.



Re: pg_restore

From
"Joshua D. Drake"
Date:
> > however, you might look at PITR and/or WAL log shipping rather than
> > dump/restore.  this would only ship the new data, and when you play back
> > the WAL on the backup server you can bring it up to whatever point in
> > time you want.
>
> I looked at PITR and WAL, but it looks like I can't have the backup
> database running and answering queries while it's WAL'ing. I'd have to
> log in and switch it to normal mode and start the webapp, and that's
> what I may not be able to do if the backup server were needed.

You can if you are using 9.x; take a look at the HS (Hot Standby) stuff.
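
A minimal 9.0 sketch (hostnames are placeholders, and you also need a
replication entry in pg_hba.conf on the primary):

   # postgresql.conf, primary
   wal_level = hot_standby
   max_wal_senders = 3

   # postgresql.conf, standby
   hot_standby = on

   # recovery.conf, standby
   standby_mode = 'on'
   primary_conninfo = 'host=primary.example.com user=replication'

With that, the standby stays up and answers read-only queries while it
continuously streams and replays WAL from the primary.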

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


Re: pg_restore

From
Marco Colombo
Date:
On 03/31/2011 01:08 AM, Mike Orr wrote:
> That might be a better solution. I was hoping to use the same pg_dump
> file for this that I also use for routine offline backup, but maybe
> this is such a special case that a separate dump file would be better.

Why don't you post the exact mysqldump/mysql commands you are using? It
would be easier to provide equivalent pg_dump/psql commands.

Meanwhile, have a look at the pg_restore man page, especially the -t
and/or -L options. Maybe -c as well. Oh, and -1.
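
For example, starting from a custom-format dump (file and database names
are only examples):

   pg_restore -l backup.dump > items.list
   # edit items.list down to just the tables you want refreshed, then:
   pg_restore -d webapp -c -1 -L items.list backup.dump

-c drops and recreates the selected objects before loading them, and -1
wraps the whole restore in a single transaction.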

.TM.

Re: pg_restore

From
Merlin Moncure
Date:
On Wed, Mar 30, 2011 at 3:56 PM, Mike Orr <sluggoster@gmail.com> wrote:
> I'm converting a MySQL webapp to PostgreSQL. I have a backup server
> which is refreshed twice daily with mysqldump/mysql and has a
> continuously-running copy of the webapp. I want to replicate this with
> pg_dump/pg_restore.  Ideally I'd like to restore just a few tables,
> without stopping the webapp and closing/reopening its connections. Is
> this possible with pg_restore?
>
> MySQL simply locks the tables, drops/recreates them, loads the data,
> and unlocks the tables. Other connections have to wait but they don't
> have to be closed/reopened. The PostgreSQL manual recommends restoring
> into an empty database using template0, which would require first
> closing the other connections and then dropping the database. It would
> also take unnecessary time to recreate the database and tables that
> aren't changing. So I'm wondering if there's a less obtrusive way to
> refresh the data.
>
> The tables to be updated do have 1:many relationships. How would I
> restore all of them at once? List them all as '-t' args to the same
> pg_restore command?

pg_dump and pg_restore can do it.  pg_dump -Fc is just like a regular
dump, but you can pull specific tables back out of it by name at restore
time.  or you can take your regular backup and also pg_dump just the
specific tables, with the proper options set, in a small script.

because pg is transactional, you can do this any time of day
without blocking users (although there may be a small performance
hit).
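
i.e. something like this (names invented), reusing the regular backup
file:

   pg_dump -Fc webapp > webapp.dump
   pg_restore -d webapp -c --single-transaction -t incidents webapp.dump

repeat per table, or build a -L list as suggested upthread if you want to
refresh a whole set of related tables in one shot.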

hs/sr (hot standby / streaming replication) is also something to look at.

merlin

Re: pg_restore

From
Mike Orr
Date:
On Fri, Apr 1, 2011 at 2:39 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Mar 30, 2011 at 3:56 PM, Mike Orr <sluggoster@gmail.com> wrote:
>> I'm converting a MySQL webapp to PostgreSQL. I have a backup server
>> which is refreshed twice daily with mysqldump/mysql and has a
>> continuously-running copy of the webapp. I want to replicate this with
>> pg_dump/pg_restore.  Ideally I'd like to restore just a few tables,
>> without stopping the webapp and closing/reopening its connections. Is
>> this possible with pg_restore?
>>
>> MySQL simply locks the tables, drops/recreates them, loads the data,
>> and unlocks the tables. Other connections have to wait but they don't
>> have to be closed/reopened. The PostgreSQL manual recommends restoring
>> into an empty database using template0, which would require first
>> closing the other connections and then dropping the database. It would
>> also take unnecessary time to recreate the database and tables that
>> aren't changing. So I'm wondering if there's a less obtrusive way to
>> refresh the data.
>>
>> The tables to be updated do have 1:many relationships. How would I
>> restore all of them at once? List them all as '-t' args to the same
>> pg_restore command?
>
> pg_dump and pg_restore can do it.  pg_dump -Fc is just like a regular
> dump, but you can pull specific tables back out of it by name at restore
> time.  or you can take your regular backup and also pg_dump just the
> specific tables, with the proper options set, in a small script.
>
> because pg is transactional, you can do this any time of day
> without blocking users (although there may be a small performance
> hit).

I'm tentatively going with a psql script:

"""
\set QUIET on
-- stop at the first error instead of continuing in an aborted transaction
\set ON_ERROR_STOP on
\timing off
BEGIN;
-- empty the changing tables, then reload them from the dump files
TRUNCATE incidents, entries, locator, events;
\i /tmp/sync-pg/hotline.sql
\i /tmp/sync-pg/locator.sql
\i /tmp/sync-pg/events.sql
COMMIT;
-- refresh planner statistics after the bulk reload
ANALYZE;
"""

And a shell script that does the saving, transfer, and loading as
discrete steps, so I can keep the same script on both hosts and run the
steps individually for testing or together for production (a rough
sketch follows the usage text below):

"""
usage: daily-pg [-h] [-d] [-t HOSTNAME] [-l] [-k]

Synchronize Postgres data to the backup server. Options:
  -d: Dump the data to /tmp/sync-pg
  -t HOSTNAME: Rsync the data to the specified host
  -l: Load the data from /tmp/sync-pg
  -k: Keep the dump directory after loading (otherwise delete it)
  -h: Print this help message and exit

If no options are given, do nothing.
"""

Dumping also compresses the files, and loading uncompresses them if
they're compressed.
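
The wrapper itself is just a getopts loop, roughly like this (a sketch
only; the actual dump and load commands are placeholders for the
pg_dump/psql invocations above):

"""
#!/bin/sh
# daily-pg -- sync Postgres data to the backup server (sketch)
DIR=/tmp/sync-pg
DUMP=no TARGET= LOAD=no KEEP=no

usage() { echo "usage: daily-pg [-h] [-d] [-t HOSTNAME] [-l] [-k]"; }

while getopts dt:lkh opt; do
    case $opt in
        d) DUMP=yes ;;
        t) TARGET=$OPTARG ;;
        l) LOAD=yes ;;
        k) KEEP=yes ;;
        h) usage; exit 0 ;;
        *) usage; exit 1 ;;
    esac
done

if [ "$DUMP" = yes ]; then
    mkdir -p "$DIR"
    # placeholder: pg_dump -a -t ... webapp | gzip > $DIR/whatever.sql.gz
    echo "dumping into $DIR"
fi

if [ -n "$TARGET" ]; then
    rsync -a "$DIR/" "$TARGET:$DIR/"
fi

if [ "$LOAD" = yes ]; then
    gunzip -f "$DIR"/*.gz 2>/dev/null
    # placeholder: psql webapp -f sync.psql   (the psql script above)
    echo "loading from $DIR"
    if [ "$KEEP" = no ]; then
        rm -rf "$DIR"
    fi
fi
"""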

--
Mike Orr <sluggoster@gmail.com>