Thread: tablespaces a priority for 7.5?
Is support for tablespaces a priority feature for 7.5? I believe there has been significant development in this area and it seems that postgres' file structure opens it up nicely to support it. What are the chances this will be completed? In my opinion, it really is a critical feature to support and administer enterprise databases. All the major databases currently support this and it is a compelling enough reason drive big users from away from using postgres for their enterprise/large databases. It really is a database administrator's feature. Brian
In RAID era tablespaces are not such important regarding performance. But for backup/restore - the ability to backup/restore selected tablespaces while leaving other tablespaces is a big thing. The whole point here is: it is assumed that backup/restore of tablespaces can hapen quite quickly and as simple as to copy tablespace files from one location to another(even while database is on - WAL can be used to handle this) - this is compared to dump. For example, index, tempoarary data tablespaces can be lost - not a big deal. Undo(rollback) tablespaces - in a way can be lost as well. While system data tablespace (table structure, stored procedures, etc) - at no cost should be lost. The same way application can be devided in "critical" and "not critical" tablespaces and their backups maintained accordingly. For example, it may not be a big deal to lose year 1996 tables while year 2004 tables should be online. Laimis > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Maguire > Sent: 21. janúar 2004 16:06 > To: pgsql-general@postgresql.org > Subject: [GENERAL] tablespaces a priority for 7.5? > > > Is support for tablespaces a priority feature for 7.5? I > believe there has been significant development in this area > and it seems that postgres' file structure opens it up nicely > to support it. What are the chances this will be completed? > > In my opinion, it really is a critical feature to support and > administer enterprise databases. All the major databases > currently support this and it is a compelling enough reason > drive big users from away from using postgres for their > enterprise/large databases. It really is a database > administrator's feature. > > > Brian > > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > http://archives.postgresql.org
I agree that RAID provides similar performance benifits especially with striping io benifits, however it is powerful andideal to have both options. For example you may have a set of tables that are read-only for reporting and another setmostly write only. You could have they resting on different raid configurations ideal for each situtation. I also agree there are several admin benifits in the areas of backup. You can also more easily create different frequency/schedulesof backup for certain critical tables to a different schedule than other less important tables. Theflexibility of easily growing your database beyond the current disk because of size limitations can be a life savior. brian -----Original Message----- From: lnd@hnit.is [mailto:lnd@hnit.is] Sent: Wed 1/21/2004 12:22 PM To: Brian Maguire Cc: Subject: RE: [GENERAL] tablespaces a priority for 7.5? In RAID era tablespaces are not such important regarding performance. But for backup/restore - the ability to backup/restore selected tablespaces while leaving other tablespaces is a big thing. The whole point here is: it is assumed that backup/restore of tablespaces can hapen quite quickly and as simple as to copy tablespace files from one location to another(even while database is on - WAL can be used to handle this) - this is compared to dump. For example, index, tempoarary data tablespaces can be lost - not a big deal. Undo(rollback) tablespaces - in a way can be lost as well. While system data tablespace (table structure, stored procedures, etc) - at no cost should be lost. The same way application can be devided in "critical" and "not critical" tablespaces and their backups maintained accordingly. For example, it may not be a big deal to lose year 1996 tables while year 2004 tables should be online. Laimis > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Maguire > Sent: 21. janúar 2004 16:06 > To: pgsql-general@postgresql.org > Subject: [GENERAL] tablespaces a priority for 7.5? > > > Is support for tablespaces a priority feature for 7.5? I > believe there has been significant development in this area > and it seems that postgres' file structure opens it up nicely > to support it. What are the chances this will be completed? > > In my opinion, it really is a critical feature to support and > administer enterprise databases. All the major databases > currently support this and it is a compelling enough reason > drive big users from away from using postgres for their > enterprise/large databases. It really is a database > administrator's feature. > > > Brian > > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > http://archives.postgresql.org
> The whole point here is: it is assumed that backup/restore of tablespaces can > hapen quite quickly and as simple as to copy tablespace files from one > location to another(even while database is on - WAL can be used to handle > this) - this is compared to dump. This is not going to happen. We intend to provide tablespaces in the form of a simple management scheme for table files that are scattered across multiple directories (typically on different filesystems). That doesn't make it any safer to copy files behind the database's back. regards, tom lane
On Wed, 2004-01-21 at 09:05, Brian Maguire wrote: > In my opinion, it really is a critical feature to support and administer > enterprise databases. All the major databases currently support this > and it is a compelling enough reason drive big users from away from > using postgres for their enterprise/large databases. It really is a > database administrator's feature. It seems to me that the lack of point-in-time recovery is a much bigger roadblock against big users. :(
> On Behalf Of Cott Lang > It seems to me that the lack of point-in-time recovery is a > much bigger roadblock against big users. :( Meaning incremental (hot)-backups? Or as protection against DROP/TRUNCATE/DELETE ALL TABLE/SCHEMA/DATABASE? With a WAL it should be doable in some 7.x version, all ingredients are there. Possibly someone suceeded in doing it already? Having a baseline backup and saved WAL logs, shouldn't it be possible to recover? Laimis
On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote: > Meaning incremental (hot)-backups? > Or as protection against DROP/TRUNCATE/DELETE ALL TABLE/SCHEMA/DATABASE? > > With a WAL it should be doable in some 7.x version, all ingredients are > there. > > Possibly someone suceeded in doing it already? Having a baseline backup and > saved WAL logs, shouldn't it be possible to recover? Incremental pg_dumps would be a huge step in the right direction! However, unless I am mistaken, a baseline backup would need to be taken cold because you cannot take a consistent online backup of the data files without using file system snapshots or split mirrors, and even that's questionable.
> Incremental pg_dumps would be a huge step in the right direction! For big people (meaning bid databases) - not shure if pg_dump is the right direction. Pg_dump must be quite slow also compact. Raw file copy is the way to go: quick and simple, virtually no configuration is required, no possible pg_dump bugs - the latest quite important(of course, no DB version and OS changes are possible, but not really needed - this is recovery, not migration). > However, unless I am mistaken, a baseline backup would need > to be taken cold because you cannot take a consistent online > backup of the data files without using file system snapshots > or split mirrors, and even that's questionable. That's doable: depends what you have in WAL logs. In short: a baseline full hot database backup is taken while database is running, then when recovering WAL logs are put on top of this baseline backup. WAL logs must actually account for a lot: table, index changes, etc. From my understanding, this is done not at SQL level, but at pages level. as far as I remember the method - a full hot database backup must be taken at page level as well, i.e. simple OS file copy-utility may not suite for the purpose. Regards, Laimis > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Cott Lang > Sent: 22. janúar 2004 15:00 > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] tablespaces a priority for 7.5? > > > On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote: > > > Meaning incremental (hot)-backups? > > Or as protection against DROP/TRUNCATE/DELETE ALL > > TABLE/SCHEMA/DATABASE? > > > > With a WAL it should be doable in some 7.x version, all ingredients > > are there. > > > > Possibly someone suceeded in doing it already? Having a baseline > > backup and saved WAL logs, shouldn't it be possible to recover? > > Incremental pg_dumps would be a huge step in the right direction! > > However, unless I am mistaken, a baseline backup would need > to be taken cold because you cannot take a consistent online > backup of the data files without using file system snapshots > or split mirrors, and even that's questionable. > > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 8: explain analyze is your friend >
On Jan 22, 2004, at 10:00 AM, Cott Lang wrote: > On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote: > >> Meaning incremental (hot)-backups? >> Or as protection against DROP/TRUNCATE/DELETE ALL >> TABLE/SCHEMA/DATABASE? >> >> With a WAL it should be doable in some 7.x version, all ingredients >> are >> there. >> >> Possibly someone suceeded in doing it already? Having a baseline >> backup and >> saved WAL logs, shouldn't it be possible to recover? > > Incremental pg_dumps would be a huge step in the right direction! > > However, unless I am mistaken, a baseline backup would need to be taken > cold because you cannot take a consistent online backup of the data > files without using file system snapshots or split mirrors, and even > that's questionable. > pg_dump always takes a consistent dump - things won't change underneath it. So you're backup won't have anything that changed after pg_dump started.. thus where incremental would come in. -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Cott Lang <cott@internetstaff.com> writes: > However, unless I am mistaken, a baseline backup would need to be taken > cold because you cannot take a consistent online backup of the data > files without using file system snapshots or split mirrors, and even > that's questionable. No, it wouldn't. All you need is to archive WAL beginning from the last checkpoint record before you begin to take the baseline backup. The baseline backup would not be consistent in itself --- but replaying WAL from the previous checkpoint to any time later than the end of the backup would bring it to a consistent state. Missed updates in the backup are essentially damage that would get repaired by replay. There are a few trivial things that still have to be done before this can be a reality (one I can think of is that WAL really needs to have entries for file creation/deletion), but it's not that far off in terms of the base mechanisms. Writing the management software is the main task. There is a lot more info available in the pghackers archives. regards, tom lane
> > However, unless I am mistaken, a baseline backup would need to be taken > > cold because you cannot take a consistent online backup of the data > > files without using file system snapshots or split mirrors, and even > > that's questionable. > > > > pg_dump always takes a consistent dump - things won't change underneath > it. > So you're backup won't have anything that changed after pg_dump > started.. thus where incremental would come in. Sorry, I'm referring to two entirely different things there. :) 1. Being able to do incremental pg_dumps would be a big plus, because you could take much quicker dumps and thus do it more regularly. I have a 50GB database I dump every 3 hours that takes 35 minutes to dump. :( 2. The rest was my hypothesizing on what might be necessary for point-in-time recovery, which pg_dump isn't going to allow even if you can perform incremental dumps.
On Thu, 2004-01-22 at 09:04, Tom Lane wrote: > No, it wouldn't. All you need is to archive WAL beginning from the last > checkpoint record before you begin to take the baseline backup. The > baseline backup would not be consistent in itself --- but replaying WAL > from the previous checkpoint to any time later than the end of the > backup would bring it to a consistent state. Missed updates in the > backup are essentially damage that would get repaired by replay. I will experiment with this. I have plenty of databases to wreak havoc on. :) > There are a few trivial things that still have to be done before this > can be a reality (one I can think of is that WAL really needs to have > entries for file creation/deletion), but it's not that far off in terms > of the base mechanisms. Writing the management software is the main > task. Being able to write WAL logs to two locations would be handy. Is it currently possible to have the logs not be recycled and occasionally deleted? Is there any hope for support for all of this in 7.5? I dread knowing that at some point, I may be forced kicking and screaming back to Oracle because of this. :) thanks!
Cott Lang <cott@internetstaff.com> writes: > 2. The rest was my hypothesizing on what might be necessary for > point-in-time recovery, which pg_dump isn't going to allow even if you > can perform incremental dumps. Right. There seems to be some confusion about that in this thread, so just for the record: pg_dump has nothing to do with our plans for point-in-time recovery (or incremental backup, which is essentially the same thing). To do these, you would take a *physical* dump of the database directory as a baseline, and thereafter copy WAL segments off to tape or whatever you are using as archive media. Recovery would consist of restoring the physical baseline dump, and then replaying WAL against it up to whatever point in time you wanted to recover to. You would, therefore, need to keep a continuous sequence of WAL files back to the time of your most recent baseline backup. regards, tom lane
Tom Lane said: > To do these, you would take a *physical* dump of the > database directory as a baseline, and thereafter copy WAL segments off > to tape or whatever you are using as archive media. Recovery would > consist of restoring the physical baseline dump, and then replaying WAL > against it up to whatever point in time you wanted to recover to. You > would, therefore, need to keep a continuous sequence of WAL files back > to the time of your most recent baseline backup. This is similar to Oracle... With Oracle you have the option of EITHER exporting the database (equivalent to doing pg_dump) OR taking the database offline and carrying out a file system level copy of the database files. Recovery can be accomplished by either restoring the backup database files or creating a new database by importing the most recent export/dump file. After that you apply the archive logs (equivalent to WAL segments) to bring your database up to date. One caveat for Oracle (at least in 8.1.x) is that DDL statements are not recorded in the archive logs, and can screw things up. Best to take a new export/dump after making DDL changes! The difference between Oracle and Postgres appears to be that posgres requires a file system level copy of the database instead of being able to make use of a dump file for this type of recovery. Is that correct? John Sidney-Woollett
Cott Lang <cott@internetstaff.com> writes: > Being able to write WAL logs to two locations would be handy. Is it > currently possible to have the logs not be recycled and occasionally > deleted? The current thought is to add some sort of hook that allows a user-defined action to be taken at the point where a WAL file would normally get recycled. Presumably this action would involve copying the WAL file to someplace else (eg a tape), after which it could get recycled. If you've got any thoughts on exactly how to manage this, let's take up a discussion on pgsql-hackers. > Is there any hope for support for all of this in 7.5? People are thinking about it, but I'm not seeing a lot of work getting done. (Partly my fault, since this is one of the things I'm supposed to be working on...) regards, tom lane
"John Sidney-Woollett" <johnsw@wardbrook.com> writes: > With Oracle you have the option of EITHER exporting the database > (equivalent to doing pg_dump) OR taking the database offline and carrying > out a file system level copy of the database files. The planned PITR feature would not require you to take anything offline. The whole concept of an "offline" database is an Oracle-ism that I see no value in emulating. > One caveat for Oracle (at least in 8.1.x) is that DDL statements are not > recorded in the archive logs, and can screw things up. Best to take a new > export/dump after making DDL changes! Well, we're ahead of them on that... > The difference between Oracle and Postgres appears to be that posgres > requires a file system level copy of the database instead of being able to > make use of a dump file for this type of recovery. Is that correct? I dunno what a "dump file" would equate to in Postgres terms, but yeah, we're envisioning using ordinary filesystem tools (tar, say) as the mechanism for handling a baseline backup. regards, tom lane
Tom Lane said: > The planned PITR feature would not require you to take anything offline. > The whole concept of an "offline" database is an Oracle-ism that I see > no value in emulating. > I dunno what a "dump file" would equate to in Postgres terms, but yeah, > we're envisioning using ordinary filesystem tools (tar, say) as the > mechanism for handling a baseline backup. (I think) that was the point I was trying to make. Oracle lets you backup while up and running (using export) OR when offline (copying db files)... However, it looks like postgres will require a file system level backup to recover from. Or will you be able to build a new db from the dump file, and then apply the WAL segments? John Sidney-Woollett
> With Oracle you have the option of EITHER exporting the database > (equivalent to doing pg_dump) OR taking the database offline and carrying > out a file system level copy of the database files. OR putting the tablespaces in hot backup mode and carrying out a file system level copy of the database files, which is the option I certainly prefer. :) > Recovery can be accomplished by either restoring the backup > database files or creating a new database by importing the most > recent export/dump file. > After that you apply the archive logs (equivalent to WAL segments) > to bring your database up to date. You can't do that with an export in Oracle. Archive logs are only useful to apply to a cold or hot backup of the data files. Export with Postgres works about the same as with Oracle. It's the lack of well-supported hot backups in Postgres that are the limitation. > One caveat for Oracle (at least in 8.1.x) is that DDL statements > are not recorded in the archive logs, and can screw things up. > Best to take a new export/dump after making DDL changes! I've not heard that one. That would make hot standby databases a serious maintenance problem. :)
It's interesting to know what pg WAL's are: If they are page level logs, then you theoretically can have a file system backup - which is fast, reliable, no issues like pg_dump has, i.e. who is first/last regarding stored procedures/views/triggers, to have users precreated, etc, etc. If the are SQL statements which are REPLAYD after pg_dump restore: well, then file system backup has no chance. The issue is that you are missing index pages and probably much more. Regards, Laimis > -----Original Message----- > From: John Sidney-Woollett [mailto:johnsw@wardbrook.com] > Sent: 22. janúar 2004 17:53 > To: Laimutis Nedzinskas > Cc: johnsw@wardbrook.com > Subject: RE: [GENERAL] tablespaces a priority for 7.5? > > > lnd@hnit.is said: > >> (I think) that was the point I was trying to make. Oracle lets you > >> backup while up and running (using export) OR when offline > (copying > >> db files)... > > > > Not neccesseraly offline. In oracle you take tablespace in > backup mode > > and just use OCOPY(I believe you must watch out that file copy is > > performed at OS block sizes not smaller than db page size - > OCOPY does > > that) to copy files. > > REDO logs (i.e. WAL logs) will bring copied files in synch. > > Or one can use RMAN tool which is the same as to copy files > just it makes > > book-keeping for backups and discards empty pages from > backuped file. > > You obviously know Oracle better than me - I stand corrected! :) > > Actually I was trying to figure out whether the WAL segments > could be applied to a database rebuilt using a file generated > by pg_dump. I suspect not, and doesn't that mean that the > postmaster needs to be stopped to carry out the file system > level copy of the pg files? > > John >
"John Sidney-Woollett" <johnsw@wardbrook.com> writes: > Tom Lane said: >> I dunno what a "dump file" would equate to in Postgres terms, but yeah, >> we're envisioning using ordinary filesystem tools (tar, say) as the >> mechanism for handling a baseline backup. > However, it looks like postgres will require a file system level backup to > recover from. Or will you be able to build a new db from the dump file, > and then apply the WAL segments? What dump file? I'm trying to say that we have no such concept and no intention of inventing one. regards, tom lane
> On Wed, 2004-01-21 at 09:05, Brian Maguire wrote: > > > In my opinion, it really is a critical feature to support and administer > > enterprise databases. All the major databases currently support this > > and it is a compelling enough reason drive big users from away from > > using postgres for their enterprise/large databases. It really is a > > database administrator's feature. > > It seems to me that the lack of point-in-time recovery is a much bigger > roadblock against big users. :( This is certainly my feeling.
speaking as someone currently migrating enterprise stuff to postgres.... point-in-time is definitely the biggest issue. this is the main thing we are having to look reproducing using some form of replication to an alternative server. -----Original Message----- From: Rick Gigger [mailto:rick@alpinenetworking.com] Sent: 22 January 2004 19:09 To: Cott Lang; pgsql-general@postgresql.org Subject: Re: [GENERAL] tablespaces a priority for 7.5? > On Wed, 2004-01-21 at 09:05, Brian Maguire wrote: > > > In my opinion, it really is a critical feature to support and administer > > enterprise databases. All the major databases currently support this > > and it is a compelling enough reason drive big users from away from > > using postgres for their enterprise/large databases. It really is a > > database administrator's feature. > > It seems to me that the lack of point-in-time recovery is a much bigger > roadblock against big users. :( This is certainly my feeling. ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ________________________________________________________________________ This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________ ________________________________________________________________________ This e-mail has been scanned for all viruses by Star Internet. The service is powered by MessageLabs. For more information on a proactive anti-virus service working around the clock, around the globe, visit: http://www.star.net.uk ________________________________________________________________________