Thread: Upgrading a database dump/restore
Not to cause any arguments, but this is sort of a standard discussion that gets brought up periodically, and I was wondering if there has been any "softening" of the attitudes against an "in place" upgrade, or any movement toward not having to dump and restore for upgrades.

I am aware that this is a difficult problem, and I understand that if there is a radical restructuring of the database then a dump/restore is justified, but wouldn't it be a laudable goal *not* to require this with each new release?

Can't we use some release as a standard whose binary format "shall not be changed"? I know the arguments about "predicting the future" and all, but standards and stability are important too. I'm not saying it should never ever change or never ever require a dump/restore, but make format changes, as policy, difficult to get past the group, and make not requiring d/r the norm.

The issue is that as disks get bigger and bigger, databases get bigger and bigger, and this process becomes more and more onerous. If you haven't noticed, data transmission speeds are not accelerating at the rate disk space is growing.

I am currently building a project that will have a huge number of records, 1/2tb of data. I can't see how I would ever be able to upgrade PostgreSQL on this system.
Mark Woodward wrote:
> I am currently building a project that will have a huge number of records,
> 1/2tb of data. I can't see how I would ever be able to upgrade PostgreSQL
> on this system.

Slony will help you upgrade (and downgrade, for that matter) with no downtime at all, pretty much. Of course, you do need double the resources ....

Your other suggestion of setting the on-disk format in high-viscosity jello, if not in concrete, seems doomed to failure. Cool features that you and other people want occasionally rely on format changes.

Of course, you don't have to upgrade every release. Many people (including me) don't.

cheers

andrew
> Mark Woodward wrote:
>> I am currently building a project that will have a huge number of records,
>> 1/2tb of data. I can't see how I would ever be able to upgrade PostgreSQL
>> on this system.
>
> Slony will help you upgrade (and downgrade, for that matter) with no
> downtime at all, pretty much. Of course, you do need double the
> resources ....
>
> Your other suggestion of setting the on-disk format in high-viscosity
> jello, if not in concrete, seems doomed to failure. Cool features that
> you and other people want occasionally rely on format changes.

I disagree with the "all or nothing" attitude; I'm generally a pragmatist. It is unreasonable to expect that things will never change, but by the same token, never attempting to standardize or enforce some level of stability is equally unreasonable. From an enterprise DB perspective, a d/r of a database is a HUGE process and one that isn't taken lightly. I just think that some restraint in this area would pay off well.

> Of course, you don't have to upgrade every release. Many people
> (including me) don't.
On Oct 5, 2006, at 15:46, Mark Woodward wrote:
> Not to cause any arguments, but this is sort of a standard discussion that
> gets brought up periodically, and I was wondering if there has been any
> "softening" of the attitudes against an "in place" upgrade, or any movement
> toward not having to dump and restore for upgrades.
>
> I am aware that this is a difficult problem, and I understand that if there
> is a radical restructuring of the database then a dump/restore is
> justified, but wouldn't it be a laudable goal *not* to require this with
> each new release?
>
> Can't we use some release as a standard whose binary format "shall not be
> changed"? I know the arguments about "predicting the future" and all, but
> standards and stability are important too. I'm not saying it should never
> ever change or never ever require a dump/restore, but make format changes,
> as policy, difficult to get past the group, and make not requiring d/r the
> norm.
>
> The issue is that as disks get bigger and bigger, databases get bigger and
> bigger, and this process becomes more and more onerous. If you haven't
> noticed, data transmission speeds are not accelerating at the rate disk
> space is growing.
>
> I am currently building a project that will have a huge number of records,
> 1/2tb of data. I can't see how I would ever be able to upgrade PostgreSQL
> on this system.

Indeed. The main issue for me is that the dumping and replication setups require at least 2x the space of one db. That's 2x the hardware, which equals 2x $$$. If there were some tool which modified the storage while postgres is down, that would save lots of people lots of money.

-M
> Indeed. The main issue for me is that the dumping and replication
> setups require at least 2x the space of one db. That's 2x the
> hardware, which equals 2x $$$. If there were some tool which modified
> the storage while postgres is down, that would save lots of people
> lots of money.

It's time and money. Stopping a database and starting it with new software is a lot faster than dumping the data out (disallowing updates or inserts) and restoring it, which can take hours or days *and* twice the hardware.
On Thu, Oct 05, 2006 at 04:39:22PM -0400, Mark Woodward wrote:
> > Indeed. The main issue for me is that the dumping and replication
> > setups require at least 2x the space of one db. That's 2x the
> > hardware, which equals 2x $$$. If there were some tool which modified
> > the storage while postgres is down, that would save lots of people
> > lots of money.
>
> It's time and money. Stopping a database and starting it with new software
> is a lot faster than dumping the data out (disallowing updates or inserts)
> and restoring it, which can take hours or days *and* twice the hardware.

In that case there should be people willing to fund the development. There have been a few people (even in the last few weeks) who say they're looking into it; perhaps they need a "helping hand"? Someone got as far as handling catalog updates, I believe.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
"Mark Woodward" <pgsql@mohawksoft.com> writes: > Not to cause any arguments, but this is sort a standard discussion that > gets brought up periodically and I was wondering if there has been any > "softening" of the attitudes against an "in place" upgrade, or movement to > not having to dump and restore for upgrades. Whenever someone actually writes a pg_upgrade, we'll institute a policy to restrict changes it can't handle. But until we have a credible upgrade tool it's pointless to make any such restriction. ("Credible" means "able to handle system catalog restructurings", IMHO --- without that, you'd not have any improvement over the current rules for minor releases.) regards, tom lane
Well, there is a TODO item (somewhere only we know...).

Administration
* Allow major upgrades without dump/reload, perhaps using pg_upgrade
  http://momjian.postgresql.org/cgi-bin/pgtodo?pg_upgrade

pg_upgrade is resisting being born, but that discussion seems to seed *certain* fundamentals for a future upgrade tool. It reached pgfoundry, at least in name :)

g.-

On 10/5/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
> > Not to cause any arguments, but this is sort of a standard discussion that
> > gets brought up periodically, and I was wondering if there has been any
> > "softening" of the attitudes against an "in place" upgrade, or any
> > movement toward not having to dump and restore for upgrades.
>
> Whenever someone actually writes a pg_upgrade, we'll institute a policy
> to restrict changes it can't handle. But until we have a credible
> upgrade tool it's pointless to make any such restriction. ("Credible"
> means "able to handle system catalog restructurings", IMHO --- without
> that, you'd not have any improvement over the current rules for minor
> releases.)
>
> 			regards, tom lane

--
Guido Barosio
-----------------------
http://www.globant.com
guido.barosio@globant.com
Martijn van Oosterhout wrote:
> On Thu, Oct 05, 2006 at 04:39:22PM -0400, Mark Woodward wrote:
>>> Indeed. The main issue for me is that the dumping and replication
>>> setups require at least 2x the space of one db. That's 2x the
>>> hardware, which equals 2x $$$. If there were some tool which modified
>>> the storage while postgres is down, that would save lots of people
>>> lots of money.
>> It's time and money. Stopping a database and starting it with new software
>> is a lot faster than dumping the data out (disallowing updates or inserts)
>> and restoring it, which can take hours or days *and* twice the hardware.
>
> In that case there should be people willing to fund the development.
> There have been a few people (even in the last few weeks) who say
> they're looking into it; perhaps they need a "helping hand"?

There are still people who are working on it :-). I'm working on a catalog conversion prototype -> it will generate a "helping hand" request early ;-).

Zdenek
> "Mark Woodward" <pgsql@mohawksoft.com> writes: >> Not to cause any arguments, but this is sort a standard discussion that >> gets brought up periodically and I was wondering if there has been any >> "softening" of the attitudes against an "in place" upgrade, or movement >> to >> not having to dump and restore for upgrades. > > Whenever someone actually writes a pg_upgrade, we'll institute a policy > to restrict changes it can't handle. But until we have a credible > upgrade tool it's pointless to make any such restriction. ("Credible" > means "able to handle system catalog restructurings", IMHO --- without > that, you'd not have any improvement over the current rules for minor > releases.) IMHO, *before* any such tool *can* be written, a set of rules must be enacted regulating catalog changes. If there are no rules and no process by which changes get approved, requiring a "was is" conversion strategy, then the tools has to change with every major version, which will, of course, put it at risk of losing support in the long term. Like I said, I understand the reluctance to do these things, it isn't an easy thing to do. Designing and planning for the future is, however, the hallmark of a good engineer.
"Mark Woodward" <pgsql@mohawksoft.com> writes: >> Whenever someone actually writes a pg_upgrade, we'll institute a policy >> to restrict changes it can't handle. > IMHO, *before* any such tool *can* be written, a set of rules must be > enacted regulating catalog changes. That one is easy: there are no rules. We already know how to deal with catalog restructurings --- you do the equivalent of a pg_dump -s and reload. Any proposed pg_upgrade that can't cope with this will be rejected out of hand, because that technology was already proven five years ago. The issues that are actually interesting have to do with the contents of user tables and indexes, not catalogs. regards, tom lane
> "Mark Woodward" <pgsql@mohawksoft.com> writes: >>> Whenever someone actually writes a pg_upgrade, we'll institute a policy >>> to restrict changes it can't handle. > >> IMHO, *before* any such tool *can* be written, a set of rules must be >> enacted regulating catalog changes. > > That one is easy: there are no rules. We already know how to deal with > catalog restructurings --- you do the equivalent of a pg_dump -s and > reload. Any proposed pg_upgrade that can't cope with this will be > rejected out of hand, because that technology was already proven five > years ago. > > The issues that are actually interesting have to do with the contents > of user tables and indexes, not catalogs. It is becomming virtually impossible to recreate databases. Data storage sizes are increasing faster than the transimssion speeds of the media on which they are stored or the systems by which they are connected. The world is looking at a terabyte as merely a "very large" database these days. tens of terabytes are not far from being common. Dumping out a database is bad enough, but that's only the data, and that can takes (mostly) only hours. Recreating a large database with complex indexes can take days or hours for the data, hours per index, it adds up. No one could expect that this could happen by 8.2, or the release after that, but as a direction for the project, the "directors" of the PostgreSQL project must realize that the dump/restore is becomming like the old locking vacuum problem. It is a *serious* issue for PostgreSQL adoption and arguably a real design flaw. If the barrier to upgrade it too high, people will not upgrade. If people do not upgrade, then older versions will have to be supported longer or users will have to be abandoned. If users are abandoned and there are critical bugs in previous versions of PostgreSQL, then user who eventually have to migrate their data, they will probably not use PostgreSQL in an attempt to avoid repeating this situation. While the economics of open source/ free software are different, there is still a penalty for losing customers, and word of mouth is a dangerous thing. Once or twice in the customers product usage history can you expect to get away with this sort of inconvenience, but if every new major version requres a HUGE process, then the TCO of PostgreSQL gets very high indeed. If it is a data format issue, maybe there should be a forum for a "next gen" version of the current data layout that is extensible without restructuring. This is not something that a couple people can go off and do and submit a patch, it is something that has to be supported and promoted from the core team, otherwise it won't happen. We all know that. The question is whether or not you all think it is worth doing. I've done consulting work for some very large companies that everyone has heard of. These sorts of things matter.
On Mon, Oct 09, 2006 at 11:50:10AM -0400, Mark Woodward wrote:
> > That one is easy: there are no rules. We already know how to deal with
> > catalog restructurings --- you do the equivalent of a pg_dump -s and
> > reload. Any proposed pg_upgrade that can't cope with this will be
> > rejected out of hand, because that technology was already proven five
> > years ago.

<snip>

> Dumping out a database is bad enough, but that's only the data, and that
> takes (mostly) only hours. Recreating a large database with complex
> indexes can take days: hours for the data, hours per index -- it adds up.

I think you missed the point of the email you replied to. *Catalog* changes are quick and (relatively) easy. Even with 10,000 tables, it would only take a few moments to rewrite the entire catalog to a new version.

> If it is a data format issue, maybe there should be a forum for a "next
> gen" version of the current data layout that is extensible without
> restructuring. This is not something that a couple of people can go off
> and do and submit a patch; it is something that has to be supported and
> promoted by the core team, otherwise it won't happen. We all know that.

Actually, the data format is not the issue either. The tuple structure hasn't changed that often. What has changed is the internal format of a few types, but postgresql could support both the old and the new types simultaneously. There has already been a statement from core members that if someone comes up with a tool to handle the catalog upgrade, they'd be willing to keep the code for older types around under the original OID, so the server would still be able to read the older version.

> The question is whether or not you all think it is worth doing. I've done
> consulting work for some very large companies that everyone has heard of.
> These sorts of things matter.

People are working on it; someone even got as far as dealing with most catalog upgrades. The hard part is going to be making sure that even if the power fails halfway through an upgrade your data will still be readable...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
> On Mon, Oct 09, 2006 at 11:50:10AM -0400, Mark Woodward wrote:
>>> That one is easy: there are no rules. We already know how to deal with
>>> catalog restructurings --- you do the equivalent of a pg_dump -s and
>>> reload. Any proposed pg_upgrade that can't cope with this will be
>>> rejected out of hand, because that technology was already proven five
>>> years ago.
>
> <snip>
>
>> Dumping out a database is bad enough, but that's only the data, and that
>> takes (mostly) only hours. Recreating a large database with complex
>> indexes can take days: hours for the data, hours per index -- it adds up.
>
> I think you missed the point of the email you replied to. *Catalog*
> changes are quick and (relatively) easy. Even with 10,000 tables, it
> would only take a few moments to rewrite the entire catalog to a new
> version.
>
>> If it is a data format issue, maybe there should be a forum for a "next
>> gen" version of the current data layout that is extensible without
>> restructuring. This is not something that a couple of people can go off
>> and do and submit a patch; it is something that has to be supported and
>> promoted by the core team, otherwise it won't happen. We all know that.
>
> Actually, the data format is not the issue either. The tuple structure
> hasn't changed that often. What has changed is the internal format of a
> few types, but postgresql could support both the old and the new types
> simultaneously. There has already been a statement from core members
> that if someone comes up with a tool to handle the catalog upgrade,
> they'd be willing to keep the code for older types around under the
> original OID, so the server would still be able to read the older version.

That's good to know.

>> The question is whether or not you all think it is worth doing. I've done
>> consulting work for some very large companies that everyone has heard of.
>> These sorts of things matter.
>
> People are working on it; someone even got as far as dealing with most
> catalog upgrades. The hard part is going to be making sure that even if
> the power fails halfway through an upgrade your data will still be
> readable...

Well, I think that any *real* DBA understands and accepts that issues like power failure and hardware failure create situations where "suboptimal" conditions exist. :-) Stopping the database and copying the pg directory addresses this problem; upon failure, a simple "mv bkdir pgdir" gets you started again. A system on a good UPS and reliable hardware is exactly the sort of deployment that would benefit most from an "in place" upgrade. There is no universal panacea with zero risk; one can only mitigate risk.

That being said, in-place migration should be the "preferred" method of upgrade, with new versions not being released until they can migrate cleanly. A dump/restore should be a last resort. Don't you think?
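The "stop, copy, and rename back on failure" safety net Mark describes, sketched roughly; the paths are placeholders and the postmaster is assumed to be stopped (an illustration, not a recommended tool):

import os
import shutil

PGDATA = "/var/lib/pgsql/data"         # assumed data directory location
BACKUP = "/var/lib/pgsql/data.bak"     # sibling directory on the same volume

def take_backup():
    # With the server stopped, a plain file-level copy is a consistent backup.
    shutil.copytree(PGDATA, BACKUP)

def roll_back_failed_upgrade():
    # Discard the damaged tree and put the backup in its place --
    # the "mv bkdir pgdir" step from the mail.
    shutil.rmtree(PGDATA)
    os.rename(BACKUP, PGDATA)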
Martijn van Oosterhout <kleptog@svana.org> writes:
> The hard part is going to be making sure that even if the power fails
> halfway through an upgrade your data will still be readable...

I think we had that problem solved too in principle: build the new catalogs in a new $PGDATA directory alongside the old one, and hard-link the old user table files into that directory as you go. Then pg_upgrade never needs to change the old directory tree at all. This gets a bit more complicated in the face of tablespaces but still seems doable. (I suppose it wouldn't work in Windows for lack of hard links, but anyone trying to run a terabyte database on Windows deserves to lose anyway.)

The stuff that needed rethinking in the old pg_upgrade code, IIRC, had to do with management of transaction IDs and old WAL log.

			regards, tom lane
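A rough sketch of the hard-link scheme Tom outlines: new catalogs in a fresh data directory, with the old user table files hard-linked in so the old tree is never modified. The paths, the assumption that database OIDs and relfilenodes carry over unchanged, and the omission of multi-segment files are simplifications for illustration, not how a real pg_upgrade would necessarily do it:

import os

OLD_PGDATA = "/var/lib/pgsql/8.1/data"   # existing cluster (placeholder path)
NEW_PGDATA = "/var/lib/pgsql/8.2/data"   # freshly initdb'ed cluster with new catalogs

def link_user_relations(db_oid, relfilenodes):
    """Hard-link one database's user table/index files into the new cluster."""
    old_dir = os.path.join(OLD_PGDATA, "base", str(db_oid))
    new_dir = os.path.join(NEW_PGDATA, "base", str(db_oid))
    os.makedirs(new_dir, exist_ok=True)
    for relfilenode in relfilenodes:     # the list would come from the old catalogs
        src = os.path.join(old_dir, str(relfilenode))
        dst = os.path.join(new_dir, str(relfilenode))
        os.link(src, dst)                # hard link: no data copied, old file untouched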
Mark,

> No one could expect that this could happen by 8.2, or the release after
> that, but as a direction for the project, the "directors" of the
> PostgreSQL project must realize that the dump/restore is becoming like
> the old locking vacuum problem. It is a *serious* issue for PostgreSQL
> adoption and arguably a real design flaw.

"directors"? (looks around) Nobody here but us chickens, boss.

If you're really interested in pg_upgrade, you're welcome to help out. Gavin Sherry, Zdenek, and Jonah Harris are working on it (the last separately, darn it).

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Josh Berkus wrote:
> Mark,
>
>> No one could expect that this could happen by 8.2, or the release after
>> that, but as a direction for the project, the "directors" of the
>> PostgreSQL project must realize that the dump/restore is becoming like
>> the old locking vacuum problem. It is a *serious* issue for PostgreSQL
>> adoption and arguably a real design flaw.
>
> "directors"? (looks around) Nobody here but us chickens, boss.

Action, Action!.. no wait, I mean CUT!!!!

Mark, if you really want this, join one of the many teams who have tried to do it and help them.

Joshua D. Drake

--
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
            http://www.commandprompt.com/
> Mark,
>
>> No one could expect that this could happen by 8.2, or the release after
>> that, but as a direction for the project, the "directors" of the
>> PostgreSQL project must realize that the dump/restore is becoming like
>> the old locking vacuum problem. It is a *serious* issue for PostgreSQL
>> adoption and arguably a real design flaw.
>
> "directors"? (looks around) Nobody here but us chickens, boss.
>
> If you're really interested in pg_upgrade, you're welcome to help out.
> Gavin Sherry, Zdenek, and Jonah Harris are working on it (the last
> separately, darn it).

This is the most frustrating thing: I *want* to do these things, but I can't find any companies that are willing to pay me to do it, and having kids, I don't have the spare time to do it. I *have* a recommendations system already, but I can't even find the time to do the NetFlix Prize thing.
Mark Woodward wrote:
>> Mark,
>>
>>> No one could expect that this could happen by 8.2, or the release after
>>> that, but as a direction for the project, the "directors" of the
>>> PostgreSQL project must realize that the dump/restore is becoming like
>>> the old locking vacuum problem. It is a *serious* issue for PostgreSQL
>>> adoption and arguably a real design flaw.
>> "directors"? (looks around) Nobody here but us chickens, boss.
>>
>> If you're really interested in pg_upgrade, you're welcome to help out.
>> Gavin Sherry, Zdenek, and Jonah Harris are working on it (the last
>> separately, darn it).
>
> This is the most frustrating thing: I *want* to do these things, but I
> can't find any companies that are willing to pay me to do it, and having
> kids, I don't have the spare time to do it.

Well, that pretty much sums it up, doesn't it? If the people/users who want this feature want it badly enough, they will cough up the money to get it developed. If not... then it likely won't happen, because for most users in-place upgrades really aren't a big deal.

Joshua D. Drake

--
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
            http://www.commandprompt.com/
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: TL> (I suppose it wouldn't work in Windows for lack of hard links, but TL> anyone trying to run a terabyte database on Windows deserves to TL> lose anyway.) Windows has hard links on NTFS, they are just rarely used. /Benny
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Benny Amorsen
> Sent: 10 October 2006 13:02
> To: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Upgrading a database dump/restore
>
> >>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> TL> (I suppose it wouldn't work in Windows for lack of hard links, but
> TL> anyone trying to run a terabyte database on Windows deserves to
> TL> lose anyway.)
>
> Windows has hard links on NTFS, they are just rarely used.

We use them in PostgreSQL to support tablespaces.

Regards, Dave.
> > TL> (I suppose it wouldn't work in Windows for lack of hard links, but
> > TL> anyone trying to run a terabyte database on Windows deserves to
> > TL> lose anyway.)
> >
> > Windows has hard links on NTFS, they are just rarely used.
>
> We use them in PostgreSQL to support tablespaces.

No, we don't. We use NTFS junctions, which are the equivalent of directory *symlinks*. Not hardlinks. Different thing.

//Magnus
> -----Original Message-----
> From: Magnus Hagander [mailto:mha@sollentuna.net]
> Sent: 10 October 2006 13:23
> To: Dave Page; Benny Amorsen; pgsql-hackers@postgresql.org
> Subject: RE: [HACKERS] Upgrading a database dump/restore
>
> > > TL> (I suppose it wouldn't work in Windows for lack of hard links, but
> > > TL> anyone trying to run a terabyte database on Windows deserves to
> > > TL> lose anyway.)
> > >
> > > Windows has hard links on NTFS, they are just rarely used.
> >
> > We use them in PostgreSQL to support tablespaces.
>
> No, we don't. We use NTFS junctions, which are the equivalent of
> directory *symlinks*. Not hardlinks. Different thing.

They are? Oh well, you live and learn :-)

/D
Benny Amorsen wrote:
>>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> TL> (I suppose it wouldn't work in Windows for lack of hard links, but
> TL> anyone trying to run a terabyte database on Windows deserves to
> TL> lose anyway.)
>
> Windows has hard links on NTFS, they are just rarely used.

And MS provides a command line utility to create them. See
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/fsutil_hardlink.mspx?mfr=true

I imagine there is also a library call that can be made to achieve the same effect.

cheers

andrew
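The library call is CreateHardLink in kernel32. A quick ctypes sketch (Windows/NTFS only; the paths below are made-up placeholders):

import ctypes

kernel32 = ctypes.windll.kernel32

def create_hard_link(new_name, existing_file):
    # BOOL CreateHardLinkW(LPCWSTR lpFileName, LPCWSTR lpExistingFileName,
    #                      LPSECURITY_ATTRIBUTES lpSecurityAttributes)
    if not kernel32.CreateHardLinkW(new_name, existing_file, None):
        raise ctypes.WinError()

create_hard_link(r"C:\pgdata.new\base\1\1234", r"C:\pgdata.old\base\1\1234")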
Hi, Mark,

Mark Woodward wrote:
>> People are working on it; someone even got as far as dealing with most
>> catalog upgrades. The hard part is going to be making sure that even if
>> the power fails halfway through an upgrade your data will still be
>> readable...
>
> Well, I think that any *real* DBA understands and accepts that issues like
> power failure and hardware failure create situations where "suboptimal"
> conditions exist. :-) Stopping the database and copying the pg directory
> addresses this problem; upon failure, a simple "mv bkdir pgdir" gets
> you started again.

But when people have enough bandwidth and disk space to copy the pg directory, they also have enough to create and store a bzip2-compressed dump of the database.

Or did I miss something?

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
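For reference, the compressed dump Markus means is just pg_dump piped through bzip2 -- the equivalent of "pg_dump mydb | bzip2 > mydb.sql.bz2", with "mydb" a placeholder name -- or, as a subprocess pipeline:

import subprocess

with open("mydb.sql.bz2", "wb") as out:
    dump = subprocess.Popen(["pg_dump", "mydb"], stdout=subprocess.PIPE)
    # bzip2 -c compresses its stdin and writes the result to stdout.
    subprocess.run(["bzip2", "-c"], stdin=dump.stdout, stdout=out, check=True)
    dump.stdout.close()
    if dump.wait() != 0:
        raise RuntimeError("pg_dump failed")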
On Oct 11, 2006, at 7:57 AM, Markus Schaber wrote:
> Hi, Mark,
>
> Mark Woodward wrote:
>>> People are working on it; someone even got as far as dealing with most
>>> catalog upgrades. The hard part is going to be making sure that even if
>>> the power fails halfway through an upgrade your data will still be
>>> readable...
>>
>> Well, I think that any *real* DBA understands and accepts that issues like
>> power failure and hardware failure create situations where "suboptimal"
>> conditions exist. :-) Stopping the database and copying the pg directory
>> addresses this problem; upon failure, a simple "mv bkdir pgdir" gets
>> you started again.
>
> But when people have enough bandwidth and disk space to copy the pg
> directory, they also have enough to create and store a bzip2-compressed
> dump of the database.
>
> Or did I miss something?

Not necessarily. "Copying" a directory on most modern unix systems can be accomplished by snapshotting the filesystem. In this case, you only pay the space and performance cost for blocks that are changed between the time of the snap and the time it is discarded. An actual copy of the database is often too large to juggle (which is why we write stuff straight to tape libraries).

The real problem with a "dump" of the database is that you want to be able to quickly switch back to a known working copy in the event of a failure. A dump is the furthest possible thing from a working copy, as one has to rebuild the database (indexes, etc.) and in doing so, you (1) spend the better part of a week running pg_restore and (2) ANALYZE stats change, so your system's behavior changes in hard-to-understand ways.

Best regards,

Theo

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Theo Schlossnagle <jesus@omniti.com> writes:
> The real problem with a "dump" of the database is that you want to be
> able to quickly switch back to a known working copy in the event of a
> failure. A dump is the furthest possible thing from a working copy,
> as one has to rebuild the database (indexes, etc.) and in doing so,
> you (1) spend the better part of a week running pg_restore and (2)
> ANALYZE stats change, so your system's behavior changes in
> hard-to-understand ways.

Seems like you should be looking into maintaining a hot spare via PITR, if your requirement is for a bit-for-bit clone of your database.

			regards, tom lane
On Oct 11, 2006, at 9:36 AM, Tom Lane wrote:
> Theo Schlossnagle <jesus@omniti.com> writes:
>> The real problem with a "dump" of the database is that you want to be
>> able to quickly switch back to a known working copy in the event of a
>> failure. A dump is the furthest possible thing from a working copy,
>> as one has to rebuild the database (indexes, etc.) and in doing so,
>> you (1) spend the better part of a week running pg_restore and (2)
>> ANALYZE stats change, so your system's behavior changes in
>> hard-to-understand ways.
>
> Seems like you should be looking into maintaining a hot spare via PITR,
> if your requirement is for a bit-for-bit clone of your database.

The features in 8.2 that allow for that look excellent. Prior to that, it is a bit clunky. But we do this already.

However, PITR and a second machine don't help during upgrades so much. It doesn't allow for an easy rollback. I'd like an in-place upgrade that is "supposed" to work. And then I'd do:

Phase 1 (confidence):
  clone my filesystems
  upgrade the clones
  run regression tests to obtain confidence in a flawless upgrade
  drop the clones
Phase 1 (abort):
  drop clones

Phase 2 (upgrade):
  snapshot the filesystems
  upgrade the base
Phase 2 (abort):
  rollback to snapshots
Phase 2 (commit):
  drop the snapshots

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
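Phase 2 of that procedure, sketched against a ZFS-style snapshot tool; the dataset name, the zfs command usage, and the run_in_place_upgrade hook are illustrative assumptions (any snapshot-capable filesystem would do, and the server is assumed to be stopped around the snapshot and rollback steps):

import subprocess

DATASET = "tank/pgdata"               # hypothetical filesystem holding $PGDATA
SNAP = DATASET + "@pre_upgrade"

def zfs(*args):
    subprocess.run(["zfs", *args], check=True)

def run_in_place_upgrade():
    # Placeholder for the hypothetical in-place upgrade tool being discussed.
    pass

# Phase 2 (upgrade): snapshot, then upgrade the base in place.
zfs("snapshot", SNAP)
try:
    run_in_place_upgrade()
except Exception:
    zfs("rollback", SNAP)             # Phase 2 (abort): back to the pre-upgrade state
    raise
else:
    zfs("destroy", SNAP)              # Phase 2 (commit): snapshot no longer needed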
Theo,

Would you be able to help me, Zdenek & Gavin in working on a new pg_upgrade?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
What type of help did you envision? The answer is likely yes.

On Oct 11, 2006, at 5:02 PM, Josh Berkus wrote:
> Theo,
>
> Would you be able to help me, Zdenek & Gavin in working on a new
> pg_upgrade?
>
> --
> --Josh
>
> Josh Berkus
> PostgreSQL @ Sun
> San Francisco

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Theo,

> What type of help did you envision? The answer is likely yes.

I don't know, whatever you have available. Design advice, at the very least.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
On Oct 11, 2006, at 5:06 PM, Josh Berkus wrote:
>> What type of help did you envision? The answer is likely yes.
>
> I don't know, whatever you have available. Design advice, at the very
> least.

Absolutely. I might be able to contribute some coding time as well. Testing time too.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
-----Original Message-----

I think we had that problem solved too in principle: build the new catalogs in a new $PGDATA directory alongside the old one, and hard-link the old user table files into that directory as you go. Then pg_upgrade never needs to change the old directory tree at all. This gets a bit more complicated in the face of tablespaces but still seems doable. (I suppose it wouldn't work in Windows for lack of hard links, but anyone trying to run a terabyte database on Windows deserves to lose anyway.)

			regards, tom lane

-----

FYI: Windows NTFS has always supported hard links. It was symlinks it didn't support until recently (now it has both). And there isn't any reason terabyte databases shouldn't work as well on Windows as on Linux, other than limitations in PostgreSQL itself.