Thread: Creating a hot copy of PostgreSQL database

Creating a hot copy of PostgreSQL database

From
Shreesha
Date:
Hello All,
I am looking for a way to take a hot copy of a PostgreSQL database, either periodically or with a manual trigger. To do this, I need to make sure that all active transactions are halted for some time.
In MySQL I could do this with FLUSH TABLES WITH READ LOCK. Once the backup is taken, I could UNLOCK TABLES to continue the normal course of database operations.

How can I do this in a PostgreSQL database?

Any information in this regard would be highly appreciated.

Thanks,
Shreesha.

P.S.
I happened to explore the Backup and Restore options provided by PostgreSQL.
a) Taking an SQL dump would be a slow operation, and our system wouldn't be able to tolerate the performance hit caused by it.
b) A file system level cache wouldn't be feasible, as I don't want my database server to be shut down for the backup and restore operation.
c) In order to archive the WAL logs, I would have to halt the database anyway, which brings me back to the same problem.

Re: Creating a hot copy of PostgreSQL database

From
"ktm@rice.edu"
Date:
On Sun, Jul 20, 2014 at 12:28:27PM -0700, Shreesha wrote:
> I was looking for an option of taking the hot copy of the pgsql database
> periodically or with manual trigger. In order to accomplish the same, I
> need to make sure that all the active transactions are halted for some time.
> [...]
> How can I do this in PostgreSQL database?

Hi Shreesha,

I think the following excerpt from the section in the documentation on
"File System Level Backup" may be what you need. This is for version 9.3:

If simultaneous snapshots are not possible, one option is to shut down the
database server long enough to establish all the frozen snapshots. Another
option is to perform a continuous archiving base backup (Section 24.3.2)
because such backups are immune to file system changes during the backup.
This requires enabling continuous archiving just during the backup process;
restore is done using continuous archive recovery (Section 24.3.4).
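For reference, enabling continuous archiving for the duration of the backup comes down to a few postgresql.conf settings. A hypothetical example (the archive path is a placeholder; `wal_level = archive` is the 9.x spelling, later releases use `replica`):

```
# postgresql.conf -- hypothetical continuous-archiving settings
wal_level = archive            # 'replica' on PostgreSQL 9.6 and later
archive_mode = on              # changing this requires a server restart
archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'
```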

Regards,
Ken


Re: Creating a hot copy of PostgreSQL database

From
Albe Laurenz
Date:
Shreesha wrote:
> I was looking for an option of taking the hot copy of the pgsql database periodically or with manual
> trigger. [...]
> b) File system level cache wouldn't be feasible as I don't want my the database server to be shutdown
> for taking the database backup and restore operation.

What do you mean by that? How can you use the cache for creating a copy?

> c) In order to accomplish the archival of WAL logs, I would anyway want to halt the database which
> brings me to the same problem again.

I don't see your point c).
The database server will archive WAL segments automatically in any way you configure, and there
is no need to stop or halt the database for that.

To create a copy of a PostgreSQL database cluster, take an online backup and restore it
to some point in time after the end of the backup using the WAL archives.

I am sure that you can automate this if necessary.
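For illustration, the restore side of this is driven by a recovery.conf file placed in the restored data directory. A hypothetical 9.x-era example (the archive path and target time are placeholders):

```
# recovery.conf -- hypothetical point-in-time recovery settings (PostgreSQL 9.x)
restore_command = 'cp /mnt/wal_archive/%f "%p"'
recovery_target_time = '2014-07-21 12:00:00'   # any time after the backup ended
```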

Yours,
Laurenz Albe

Re: Creating a hot copy of PostgreSQL database

From
Shreesha
Date:
@Albe Laurenz:
'cache' was a typo. I meant a file system level backup, as described here: http://www.postgresql.org/docs/9.3/static/backup-file.html
It would be really helpful if you could give the steps for automating the copy of a PostgreSQL database cluster.

To give a clear picture of what I am currently trying to do, please see below.
The system uses a MySQL database, and currently we do the backup ourselves with a module that does the following:
1) LOCK TABLES, which internally creates a global read lock on all the tables.
2) FLUSH TABLES
3) Iterate through all the tables in each database and copy their files to the destination location.
4) UNLOCK TABLES.
Repeat these steps for every database.

We are migrating from MySQL to PostgreSQL, and I was trying to figure out the corresponding alternative in the pg database.
I wanted to know if there is any way I can reuse the current module for creating the database backup. So my concern is:
Is there a way to create a global read lock on all the tables in a database and ensure that all active transactions are halted until an UNLOCK? If so, how?

Even after going through the documentation, I couldn't find anything concrete on this. I appreciate your help.



On Mon, Jul 21, 2014 at 12:38 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> To create a copy of a PostgreSQL database cluster, take an online backup and restore it
> to some point in time after the end of the backup using the WAL archives.
>
> I am sure that you can automate this if necessary.



--
~Shreesha.

Re: Creating a hot copy of PostgreSQL database

From
Payal Singh
Date:
Re-using a backup script for MySQL to make Postgres backups might not be a good idea. Besides, it shouldn't take much
effort to create a new script for a filesystem level backup anyway. Of course, it depends on what filesystem you're
using. If your filesystem provides a snapshot option, then it is as simple as:
1. issue pg_start_backup('label');
2. take a snapshot of the filesystem
3. issue pg_stop_backup();

If not, you can look at other backup options, such as pg_basebackup with PITR, or pg_dump.
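As an illustration, the snapshot-based steps above might be scripted roughly as follows (a sketch, not a tested script: the connection user, the label, and the ZFS snapshot command are assumptions that depend on your setup):

```shell
#!/bin/sh
# Sketch: filesystem-snapshot backup of a running PostgreSQL cluster.
LABEL="snap_$(date +%Y%m%d)"

do_backup() {
    # 1. tell PostgreSQL a base backup is starting (true = immediate checkpoint)
    psql -U postgres -c "SELECT pg_start_backup('$LABEL', true);"
    # 2. take the filesystem snapshot -- the command depends on your filesystem,
    #    e.g. on ZFS (hypothetical pool/dataset name):
    zfs snapshot tank/pgdata@"$LABEL"
    # 3. tell PostgreSQL the backup is done
    psql -U postgres -c "SELECT pg_stop_backup();"
}

# Guarded so the sketch is safe to source; set RUN_LIVE=1 to actually run it.
if command -v psql >/dev/null 2>&1 && [ -n "$RUN_LIVE" ]; then
    do_backup
fi
```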

On Mon, Jul 21, 2014 at 02:39:32PM -0700, Shreesha wrote:
> The system is using MySQL database and currently, we are doing backup
> ourselves with the help of a module which does the following:
> 1) LOCK TABLES which will internally create a global read lock for all the
> tables.
> 2) FLUSH TABLES
> 3) Iterate through all the tables in each database and create a copy of
> those files in destination location
> 4) UNLOCK the tables.
> Repeat these steps for every database.


Re: Creating a hot copy of PostgreSQL database

From
Daniel Staal
Date:
--As of July 21, 2014 2:39:32 PM -0700, Shreesha is alleged to have said:

> The system is using MySQL database and currently, we are doing backup
> ourselves with the help of a module which does the following: 
> 1) LOCK TABLES which will internally create a global read lock for all
> the tables. 
> 2) FLUSH TABLES
> 3) Iterate through all the tables in each database and create a copy of
> those files in destination location
> 4) UNLOCK the tables.
> Repeat these steps for every database.

--As for the rest, it is mine.

I understand that you are thinking 'copy the actual database files', but
might I suggest looking at pg_dump?  It does essentially the above - the
'directory' format would even output one file per table.  It does not lock
the database while it's in progress, but it does use Postgres's standard
concurrency control.  (Meaning that it essentially outputs a point-in-time
snapshot of the database: while writes, deletes, etc. will work while
pg_dump is being run, the backup will contain only the data as of the point
the backup started.)  If you need further data integrity you can even use
the `--serializable-deferrable` switch.  (Though note that in nearly all
cases it's overkill; read the docs.)
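For example, a directory-format dump might be taken like this (a sketch; the database name, output path, and job count are assumptions):

```shell
#!/bin/sh
# Sketch: directory-format pg_dump, one file per table, dumped in parallel.
OUT_DIR="/var/backups/pgsql/mydb_$(date +%Y%m%d)"

run_dump() {
    pg_dump -F directory -j 4 -f "$OUT_DIR" mydb
    # Restore later (possibly just one table) with pg_restore, e.g.:
    #   pg_restore -d mydb -t some_table "$OUT_DIR"
}

# Guarded so the sketch is safe to source; set RUN_LIVE=1 to actually run it.
if command -v pg_dump >/dev/null 2>&1 && [ -n "$RUN_LIVE" ]; then
    run_dump
fi
```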

The output would be more portable and easier to manipulate as well: you could
do things like restore only one table, or even restore to other databases
(from the SQL format).  It would also be a platform-independent dump, while
your procedure above might not survive even a minor difference in compiler
between the original and the restored database.  (I'm not sure how sensitive
Postgres is to that sort of thing, but the bare files are not guaranteed to be
usable by anything except the exact binary that wrote them.)

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------


Re: Creating a hot copy of PostgreSQL database

From
Shreesha
Date:
Thanks for your thoughts Daniel. Really appreciate it. :-)

Though taking a pg_dump backup and restoring it sounds feasible, it is a CPU-intensive operation, and the backup process takes a long time. It was the same with mysqldump earlier.
Because of this performance concern, we ended up writing a new module that does the copy operation.

I was wondering if pg_basebackup would be of any help to me in this regard.
Can you please shed some light on how it can be used for recovery? Any examples (apart from the one in the documentation), especially involving the 'xlog switch', would really help. I didn't find many usage examples of this utility on the internet, which is why I thought of posting here.

Thanks.


On Mon, Jul 21, 2014 at 5:47 PM, Daniel Staal <DStaal@usa.net> wrote:
> I understand that you are thinking 'copy the actual database files', but
> might I suggest looking at pg_dump?  [...]


--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice



--
~Shreesha.

Re: Creating a hot copy of PostgreSQL database

From
Albe Laurenz
Date:
Shreesha wrote:
> I was wondering if pg_basebackup will be of any help for me in this regard.
> Can you please shed some light on how this can be used for recovery? Any examples(apart from the one
> in the documentation) on this usage especially with the 'xlog switch' would really help. I didn't find
> much usage examples of this utility over the internet. That's why thought of posting it here.

Yes, pg_basebackup will help.
Your script would call pg_basebackup with the -x flag so that you get a copy
of the database cluster that is ready for recovery (i.e., it contains all the
necessary WAL segments).

The server can then be started, and after recovery is completed, it will be
ready for operation. Do *not* remove the backup_label file before you start
the server.

I don't have such a script ready, but writing one should not be a problem.
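For illustration, such a script might be as small as this (a sketch: the backup directory and connection user are hypothetical, and `-x` is the 9.x spelling; newer releases use `-X fetch` instead):

```shell
#!/bin/sh
# Sketch: self-contained base backup with pg_basebackup.
BACKUP_DIR="/var/backups/pgsql/base_$(date +%Y%m%d_%H%M%S)"

run_backup() {
    # -x includes the WAL needed for recovery, -P shows progress;
    # the result is a complete copy of the cluster's data directory.
    pg_basebackup -D "$BACKUP_DIR" -x -P -U postgres
    # To use the copy, point a server's data directory at $BACKUP_DIR and
    # start it; leave the backup_label file in place so recovery runs first.
}

# Guarded so the sketch is safe to source; set RUN_LIVE=1 to actually run it.
if command -v pg_basebackup >/dev/null 2>&1 && [ -n "$RUN_LIVE" ]; then
    run_backup
fi
```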

Yours,
Laurenz Albe