Thread: Re: pg_Restore

Re: pg_Restore

From
bhanu udaya
Date:
Hello,
 
Greetings !
I am new to postgres and recently migrated from Oracle. I am using PostgreSQL version 9.2.
I am trying to restore a 9.5 GB database (1 GB dump file) which has 500 schemas with 1 lakh (100,000) rows in each schema. Taking the data dump with pg_dump takes around 40 minutes, but restoring that dump with pg_restore takes hours. I have used the configuration parameters below:
 
shared_buffers = 1024MB   
work_mem = 512MB
maintenance_work_mem = 512MB
wal_buffers = 100MB
 
I have used command to restore as below:
pg_Restore -d newdb -j2 e:\file.dmp
 
My machine configurations are as below:
Windows, Core i5 with 4 GB RAM.
 
Thanks for your help.
 
Thanks and Regards
Radha Krishna

Re: pg_Restore

From
François Beausoleil
Date:
Le 2013-01-20 à 23:10, bhanu udaya a écrit :

> I am new to postgres and recently got migrated from oracle. I am using postgresql 9.2 version.

Welcome, and good choice for the version.

> I am trying to restore 9.5G database (1GB dumpfile) which has 500 schemas with 1 lakh rows in each schema. Could take
> the data dump using pg_dump and it takes around 40 minutes. I tried to use pg_restore to restore this dump, but it takes
> hours to restore the dump. I have used the configurations parameters as below:
>
> shared_buffers = 1024MB
> work_mem = 512MB
> maintenance_work_mem = 512MB
> wal_buffers = 100MB
>
> I have used command to restore as below:
> pg_Restore -d newdb -j2 e:\file.dmp
>
> My machine configurations are as below:
> Windows Core I5 with 4GB Ram.

Other settings you can change during the initial restore / load phase:

fsync = off # ONLY DURING INITIAL DATA LOAD!

checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9

wal_level = minimal # You'll need to do a full base backup if you use this

Read this section of the manual: http://www.postgresql.org/docs/current/static/runtime-config-wal.html

Have a great day!
François Beausoleil

Re: pg_Restore

From
bhanu udaya
Date:

 Hello,
Greetings !
Thank you for the prompt reply. I have changed the settings as listed below:
> > shared_buffers = 1024MB
> > work_mem = 512MB
> > maintenance_work_mem = 512MB
> > wal_buffers = 100MB
> fsync = off # ONLY DURING INITIAL DATA LOAD!
> checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
> checkpoint_timeout = 30min
> checkpoint_completion_target = 0.9
> wal_level = minimal # You'll need to do a full base backup if you use this
But I still have the same problem. It has been almost 1 hour now and the restoration is still going on. After every test case execution we would like to refresh the database, and the refresh should complete in less than 10 minutes. Is this achievable with the kind of configuration I listed in my earlier email?
 
Kindly help with how to speed up this restoration process.
 
Thanks and Regards
Radha Krishna
 
 
> Subject: Re: [GENERAL] pg_Restore
> From: francois@teksol.info
> Date: Sun, 20 Jan 2013 23:19:44 -0500
> CC: pgsql-general@postgresql.org
> To: udayabhanu1984@hotmail.com
>
>
> Le 2013-01-20 à 23:10, bhanu udaya a écrit :
>
> > I am new to postgres and recently got migrated from oracle. I am using postgresql 9.2 version.
>
> Welcome, and good choice for the version.
>
> > I am trying to restore 9.5G database (1GB dumpfile) which has 500 schemas with 1 lakh rows in each schema. Could take the data dump using pg_dump and it takes around 40 minutes. I tried to use pg_restore to restore this dump, but it takes hours to restore the dump. I have used the configurations parameters as below:
> >
> > shared_buffers = 1024MB
> > work_mem = 512MB
> > maintenance_work_mem = 512MB
> > wal_buffers = 100MB
> >
> > I have used command to restore as below:
> > pg_Restore -d newdb -j2 e:\file.dmp
> >
> > My machine configurations are as below:
> > Windows Core I5 with 4GB Ram.
>
> Other settings you can change during the initial restore / load phase:
>
> fsync = off # ONLY DURING INITIAL DATA LOAD!
>
> checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
> checkpoint_timeout = 30min
> checkpoint_completion_target = 0.9
>
> wal_level = minimal # You'll need to do a full base backup if you use this
>
> Read this section of the manual: http://www.postgresql.org/docs/current/static/runtime-config-wal.html
>
> Have a great day!
> François Beausoleil

Re: pg_Restore

From
Magnus Hagander
Date:


On Jan 21, 2013 7:17 AM, "bhanu udaya" <udayabhanu1984@hotmail.com> wrote:
>
>
>  Hello,
> Greetings !
> Thank you for the prompt reply. I have changed the settings as listed below:
> > > shared_buffers = 1024MB
> > > work_mem = 512MB
> > > maintenance_work_mem = 512MB
> > > wal_buffers = 100MB
> > fsync = off # ONLY DURING INITIAL DATA LOAD!
> > checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
> > checkpoint_timeout = 30min
> > checkpoint_completion_target = 0.9
> > wal_level = minimal # You'll need to do a full base backup if you use this
> But, have same problem. It is almost 1 hour now, the restoration is still going on.  After every test case execution, we would like to refresh the database and expected refresh should be completed in less than 10 minutes. Is this achievable with the kind of configuration I have listed in my earlier email.
>  
> Kindly help , as how to speed up this restoration process.
>

Try running pg_restore with the -1 option. If that doesn't help, try -j 4 or something like that (you'll have to remove the first option then; you can't use both at once).

But it's going to be pushing it anyway. Your scenario is going to create thousands of files (assuming you have multiple tables in each of your schemas, as is normal), and that's just not something NTFS does very fast. Once the files are there, I bet loading the data is reasonably fast, since it can't be all that big.
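For reference, the two invocations would look like this (a sketch, reusing the dump path and database name from the earlier message):

    pg_restore -1 -d newdb e:\file.dmp     # single transaction, less per-object commit overhead
    pg_restore -j 4 -d newdb e:\file.dmp   # parallel restore; cannot be combined with -1

Note that -j requires a custom-format or directory-format dump, and the job count is usually set near the number of CPU cores.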

/Magnus

Re: pg_Restore

From
bhanu udaya
Date:
Hello,
Greetings !
I tried all the options below. It takes approximately 1 hour 30 minutes to restore a 9 GB database. This much time is not affordable, as the execution of the test cases takes only 10% of that time, and waiting 1 hour 30 minutes after every test case execution is a lot for the team. Kindly let me know if we can reduce the database restoration time.
 
Thanks and Regards
Radha Krishna
 

Date: Mon, 21 Jan 2013 08:15:47 +0100
Subject: Re: [GENERAL] pg_Restore
From: magnus@hagander.net
To: udayabhanu1984@hotmail.com
CC: francois@teksol.info; pgsql-general@postgresql.org


On Jan 21, 2013 7:17 AM, "bhanu udaya" <udayabhanu1984@hotmail.com> wrote:
>
>
>  Hello,
> Greetings !
> Thank you for the prompt reply. I have changed the settings as listed below:
> > > shared_buffers = 1024MB
> > > work_mem = 512MB
> > > maintenance_work_mem = 512MB
> > > wal_buffers = 100MB
> > fsync = off # ONLY DURING INITIAL DATA LOAD!
> > checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
> > checkpoint_timeout = 30min
> > checkpoint_completion_target = 0.9
> > wal_level = minimal # You'll need to do a full base backup if you use this
> But, have same problem. It is almost 1 hour now, the restoration is still going on.  After every test case execution, we would like to refresh the database and expected refresh should be completed in less than 10 minutes. Is this achievable with the kind of configuration I have listed in my earlier email.
>  
> Kindly help , as how to speed up this restoration process.
>

Try running pg_restore with the -1 option. If that doesn't help, try -j 4 or something like that (you'll have to remove the first option then; you can't use both at once).

But it's going to be pushing it anyway. Your scenario is going to create thousands of files (assuming you have multiple tables in each of your schemas as is normal), and that's just not something ntfs does very fast. Once the files are there, I bet loading the data is reasonably fast since it can't be all that big....

/Magnus

Re: pg_Restore

From
Raghavendra
Date:

On Mon, Jan 21, 2013 at 3:01 PM, bhanu udaya <udayabhanu1984@hotmail.com> wrote:
Hello,
Greetings !
I tried with all the below options. It approximatly takes 1 hour 30 minutes for restoring a 9GB database.  This much time can not be affordable as the execution of test cases take only 10% of this whole time and waiting 1 hour 30 minutes after every test case execution is alot for the team.  Kindly let me know if we can reduce the database restoration time .

 
On Linux, the settings below work well with the -j option of pg_restore. Since this is Windows, give it another try with the options below (as already suggested in this thread).

shared_buffers= 1024MB
work_mem= 512MB
maintenance_work_mem = 1GB
checkpoint_segments=(between 128 and 256)
checkpoint_timeout=(default is 5min, raise to 1h)
autovacuum=off
track_counts=off
fsync=off
full_page_writes=off
synchronous_commit=off
bgwriter_delay=(default 200ms, change to 50ms)

---
Regards,
Raghavendra
EnterpriseDB Corporation

 
 
Thanks and Regards
Radha Krishna
 

Date: Mon, 21 Jan 2013 08:15:47 +0100
Subject: Re: [GENERAL] pg_Restore
From: magnus@hagander.net
To: udayabhanu1984@hotmail.com
CC: francois@teksol.info; pgsql-general@postgresql.org



On Jan 21, 2013 7:17 AM, "bhanu udaya" <udayabhanu1984@hotmail.com> wrote:
>
>
>  Hello,
> Greetings !
> Thank you for the prompt reply. I have changed the settings as listed below:
> > > shared_buffers = 1024MB
> > > work_mem = 512MB
> > > maintenance_work_mem = 512MB
> > > wal_buffers = 100MB
> > fsync = off # ONLY DURING INITIAL DATA LOAD!
> > checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
> > checkpoint_timeout = 30min
> > checkpoint_completion_target = 0.9
> > wal_level = minimal # You'll need to do a full base backup if you use this
> But, have same problem. It is almost 1 hour now, the restoration is still going on.  After every test case execution, we would like to refresh the database and expected refresh should be completed in less than 10 minutes. Is this achievable with the kind of configuration I have listed in my earlier email.
>  
> Kindly help , as how to speed up this restoration process.
>

Try running pg_restore with the -1 option. If that doesn't help, try -j 4 or something like that (you'll have to remove the first option then; you can't use both at once).

But it's going to be pushing it anyway. Your scenario is going to create thousands of files (assuming you have multiple tables in each of your schemas as is normal), and that's just not something ntfs does very fast. Once the files are there, I bet loading the data is reasonably fast since it can't be all that big....

/Magnus


Re: pg_Restore

From
Albe Laurenz
Date:
bhanu udaya wrote:
> I tried with all the below options. It approximatly takes 1 hour 30 minutes for restoring a 9GB
> database.  This much time can not be affordable as the execution of test cases take only 10% of this
> whole time and waiting 1 hour 30 minutes after every test case execution is alot for the team.  Kindly
> let me know if we can reduce the database restoration time .

I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?

Maybe that is faster.

Yours,
Laurenz Albe


Re: pg_Restore

From
dinesh kumar
Date:
Hi Bhanu,

Yes, the approach below is the faster one to follow.

I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?

Maybe that is faster.


If you are taking the dump from one cluster and restoring it into another cluster, make sure pg_restore uses the parallel option "-j", follow the parameters Raghav suggested, and tune wal_buffers to some value between 32 MB and 64 MB. Also, if possible, keep your dump file on a different partition than PGDATA, which can improve the I/O balance.

Thanks. 

Best Regards,
Dinesh

Re: pg_Restore

From
Chris Travers
Date:


On Mon, Jan 21, 2013 at 3:39 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
bhanu udaya wrote:
> I tried with all the below options. It approximatly takes 1 hour 30 minutes for restoring a 9GB
> database.  This much time can not be affordable as the execution of test cases take only 10% of this
> whole time and waiting 1 hour 30 minutes after every test case execution is alot for the team.  Kindly
> let me know if we can reduce the database restoration time .

I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?

Also for test cases, my preferred way is to put every test case in a transaction that cannot commit.  This way the tests are safe to run on production environments.  See pgTAP for one possibility here if you are testing stored procedures.  (Application code we run through wrappers that filter out commits.) 
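In its simplest form, each test case run is wrapped like this (a sketch):

    BEGIN;
    -- run the test case's statements here
    -- verify results with SELECTs / assertions
    ROLLBACK;  -- never COMMIT, so the database is left unchanged
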

But the template approach is also a good one if you cannot guarantee that the tests always roll back.

Best Wishes,
Chris Travers

Re: pg_Restore

From
Chris Ernst
Date:
On 01/20/2013 11:17 PM, bhanu udaya wrote:
> I am trying to restore 9.5G database (1GB dumpfile) which has 500
> schemas with 1 lakh rows in each schema. Could take the data dump using
> pg_dump and it takes around 40 minutes. I tried to use pg_restore to
> restore this dump, but it takes hours to restore the dump. I have used
> the configurations parameters as
....
> But, have same problem. It is almost 1 hour now, the restoration is
> still going on.  After every test case execution, we would like to
> refresh the database and expected refresh should be completed in less
> than 10 minutes. Is this achievable with the kind of configuration I
> have listed in my earlier email.

Probably not what you want to hear, but I think this is a completely
unrealistic expectation.  If it takes 40 minutes for pg_dump, I would
expect pg_restore to take at least as long, and likely significantly
longer (assuming both are done on similar hardware).

pg_dump only has to read the schema(s) and data and write them to a
file.  pg_restore has to read the file, write the schema and data into
a new database, *AND* re-create all of the indexes, analyze, check
referential integrity, and so on.

So if the dump is taking 40 minutes, I would expect the restore to take
somewhere in the 60-90 minute range, depending on the number and
complexity of the indexes.

    - Chris


Re: pg_Restore

From
bhanu udaya
Date:
Hello,
Thanks a lot for all your replies.
I tried all the settings suggested, but it did not help; pg_restore is very slow and does not finish in less than 1 1/2 hours. Can you please let me know the procedure for the template approach? Will it restore the data also? Please update. I need the database (metadata + data) to be restored after certain Java executions.
 
Ex: I have a 9.5 GB database (with 500 schemas + data). This is treated as the base database, and it needs to be restored every time after certain transactions.
 
Thanks for your reply.
 
Thanks and Regards
Radha Krishna

 

Date: Mon, 21 Jan 2013 04:16:19 -0800
Subject: Re: [GENERAL] pg_Restore
From: chris.travers@gmail.com
To: laurenz.albe@wien.gv.at
CC: udayabhanu1984@hotmail.com; magnus@hagander.net; francois@teksol.info; pgsql-general@postgresql.org



On Mon, Jan 21, 2013 at 3:39 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
bhanu udaya wrote:
> I tried with all the below options. It approximatly takes 1 hour 30 minutes for restoring a 9GB
> database.  This much time can not be affordable as the execution of test cases take only 10% of this
> whole time and waiting 1 hour 30 minutes after every test case execution is alot for the team.  Kindly
> let me know if we can reduce the database restoration time .

I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?

Also for test cases, my preferred way is to put every test case in a transaction that cannot commit.  This way the tests are safe to run on production environments.  See pgTAP for one possibility here if you are testing stored procedures.  (Application code we run through wrappers that filter out commits.) 

But the template approach is also a good one if you cannot guarantee that the tests always roll back.

Best Wishes,
Chris Travers

Re: pg_Restore

From
Albe Laurenz
Date:
bhanu udaya wrote:
> Can you please let me know the procedure for Template. Will it restore the data also
> .

It will create a complete copy of an existing database.

The procedure is
CREATE DATABASE newdb TEMPLATE olddb;

Nobody may be connected to olddb for this to work.
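A minimal refresh script could look like this (a sketch; note that in 9.2 the pg_stat_activity column is "pid"):

    -- disconnect everyone else from the template source first
    SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE datname = 'olddb' AND pid <> pg_backend_pid();

    DROP DATABASE IF EXISTS newdb;
    CREATE DATABASE newdb TEMPLATE olddb;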

Yours,
Laurenz Albe


Re: pg_Restore

From
Alban Hertroys
Date:
On 21 January 2013 16:10, bhanu udaya <udayabhanu1984@hotmail.com> wrote:
Can you please let me know the procedure for Template.

As they say, Google is your friend.

The basic principle is this: you create a read-only (template) version of your sample database and use that as a template for the creation of new ones. Of course, now you have another copy of the database stored, which takes up another 9.5 GB of disk space, but that's not much on modern systems (oh wait, Windows and 4 GB of memory?...)

pg_dump would probably be quite a lot faster on a less limited system: databases like plenty of memory and fast RAID arrays for disks.

Re: pg_Restore

From
Alan Hodgson
Date:
On Monday, January 21, 2013 08:40:05 PM bhanu udaya wrote:
> Hello,Thanks alot for all your replies. I tried all settings suggested, it
> did not work. pg_restore is very slow. It does not come out less than 1 1/2
> hour. Can you please let me know the procedure for Template. Will it
> restore the data also . Please update. I need the database (metadata +
> data) to be restored back after certain Java executions.  Ex:- I have 9.5
> gb database (with 500 schemas + data). This is treated as base database and
> it needs to be restored every time after certain transactions.

Don't use pg_restore; do the backups/restorations outside of PostgreSQL:

- run on a server with a snapshot-capable volume manager, and use snapshots for quick restores
- just try using rsync from a backup copy of the base data directory

(either of the above requires PostgreSQL to not be running during the
restoration)
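On Linux, for example, the rsync variant could be scripted like this (a sketch; the directory paths are hypothetical):

    # stop the server, restore the data directory from a pristine copy, restart
    pg_ctl -D /var/lib/pgsql/data stop -m fast
    rsync -a --delete /backups/base-data/ /var/lib/pgsql/data/
    pg_ctl -D /var/lib/pgsql/data start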



Re: pg_Restore

From
bhanu udaya
Date:
Hello,
Thanks all for the great help. The template is the best option seen so far; it is a little quicker than pg_restore, but it still takes 40 minutes.
 
I am using Windows with 4 GB RAM.
 
Thanks and Regards
Radha Krishna
 
> From: laurenz.albe@wien.gv.at
> To: udayabhanu1984@hotmail.com; chris.travers@gmail.com
> CC: magnus@hagander.net; francois@teksol.info; pgsql-general@postgresql.org
> Subject: RE: [GENERAL] pg_Restore
> Date: Mon, 21 Jan 2013 15:19:45 +0000
>
> bhanu udaya wrote:
> > Can you please let me know the procedure for Template. Will it restore the data also
> > .
>
> It will create a complete copy of an existing database
>
> The procedure is
> CREATE DATABASE newdb TEMPLATE olddb;
>
> Nobody may be connected to olddb for this to work.
>
> Yours,
> Laurenz Albe

Re: pg_Restore

From
bhanu udaya
Date:
Hello All,
 
Can we achieve this template restore or pg_restore in less than 20 minutes' time? Any more considerations? Kindly reply.
 
Thanks and Regards
Radha Krishna
 

From: udayabhanu1984@hotmail.com
To: laurenz.albe@wien.gv.at; chris.travers@gmail.com
CC: magnus@hagander.net; francois@teksol.info; pgsql-general@postgresql.org
Subject: RE: [GENERAL] pg_Restore
Date: Mon, 21 Jan 2013 21:45:32 +0530

Hello,
Thanks all for the great help. Template is very good option so far seen. It comes little quickly than pg_restore. But, this also takes 40 minutes time.
 
I am using Windows with 4GB Ram.  
 
Thanks and Regards
Radha Krishna
 
> From: laurenz.albe@wien.gv.at
> To: udayabhanu1984@hotmail.com; chris.travers@gmail.com
> CC: magnus@hagander.net; francois@teksol.info; pgsql-general@postgresql.org
> Subject: RE: [GENERAL] pg_Restore
> Date: Mon, 21 Jan 2013 15:19:45 +0000
>
> bhanu udaya wrote:
> > Can you please let me know the procedure for Template. Will it restore the data also
> > .
>
> It will create a complete copy of an existing database
>
> The procedure is
> CREATE DATABASE newdb TEMPLATE olddb;
>
> Nobody may be connected to olddb for this to work.
>
> Yours,
> Laurenz Albe

Re: pg_Restore

From
Adrian Klaver
Date:
On 01/21/2013 08:46 AM, bhanu udaya wrote:
> Hello All,
>
> Can we achieve this template or pg_Restore in less than 20 minutes time.
> Any more considerations. Kindly reply.

Seems to me this is where Point-in-Time Recovery (PITR) might be helpful.

http://www.postgresql.org/docs/9.2/static/continuous-archiving.html

> Thanks and Regards



--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_Restore

From
Eduardo Morras
Date:
On Mon, 21 Jan 2013 22:16:00 +0530
bhanu udaya <udayabhanu1984@hotmail.com> wrote:

> Hello All,
>  Can we achieve this template or pg_Restore in less than 20 minutes time. Any more considerations. Kindly reply.
> Thanks and Regards
> Radha Krishna

Perhaps you can use the ntfs snapshot feature.

a) Create the db with the data you want
b) Shut down PostgreSQL completely
c) Go to the data folder, properties, make a recovery point
d) Start PostgreSQL
e) Work with it
f) Stop PostgreSQL
g) Go to the data folder, properties, recover the previous version
h) Go to d)

I don't know exactly the name of the NTFS feature; if I remember well it's "recovery point", but I haven't used Windows for a while.

Perhaps other folders should be set to recover.

It's easy with this "feature" to shoot your own feet (yes, in plural). If something goes wrong, recover from the dump.

HTH


Re: pg_Restore

From
"C. Bensend"
Date:
> Don't know exactly the name of the ntfs property, if i remember well it's
> recover point, but haven't used Windows for a while.

I think you're talking about shadow copies.  :)

Benny


--
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"                          -- George Carlin



Re: pg_Restore

From
John R Pierce
Date:
On 1/30/2013 5:12 AM, C. Bensend wrote:
>> >Don't know exactly the name of the ntfs property, if i remember well it's
>> >recover point, but haven't used Windows for a while.
> I think you're talking about shadow copies.


the UI and usability of NTFS shadow copies are quite frustrating. You can
create them, but you can't actually access them directly; they are only
available to software like backup programs that knows how to use them.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast