Thread: PG_UPGRADE status?
I know Tom Lane has done some work on pg_upgrade -- the last message was
on 8/2/99, and left the thread hanging.

What is the current status of pg_upgrade in 6.5.x?

I ask because the presence of a working pg_upgrade drastically reduces
the work necessary to get postgresql upgrading working properly in an
RPM environment. In particular, the upgrade from RedHat 6.0 to RedHat
6.1 is going to be from postgresql 6.4.2 to 6.5.1. I do not foresee
anyone successfully upgrading a RedHat 5.x installation to 6.1, as other
things will break -- although I could be entirely wrong.

If pg_upgrade is hopelessly broken in 6.5.x, that's ok -- just means a
little more work.

Lamar Owen
WGCR Internet Radio
> I know Tom Lane has done some work on pg_upgrade -- the last message was
> on 8/2/99, and left the thread hanging.
>
> What is the current status of pg_upgrade in 6.5.x?
>
> I ask because the presence of a working pg_upgrade drastically reduces
> the work necessary to get postgresql upgrading working properly in an
> RPM environment. In particular, the upgrade from RedHat 6.0 to RedHat
> 6.1 is going to be from postgresql 6.4.2 to 6.5.1. I do not foresee
> anyone successfully upgrading a RedHat 5.x installation to 6.1, as other
> things will break -- although I could be entirely wrong.

pg_upgrade will not work in converting from <= 6.4.* to 6.5.* because
the on-disk data format changed in 6.5. Hopefully, 6.6 will allow
pg_upgrade for 6.5.* databases. We try not to change the on-disk
format, but sometimes we have to. MVCC required it for 6.5.*.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
> pg_upgrade will not work in converting from <= 6.4.* to 6.5.* because
> the on-disk data format changed in 6.5. Hopefully, 6.6 will allow
> pg_upgrade for 6.5.* databases. We try not to change the on-disk
> format, but sometimes we have to. MVCC required it for 6.5.*.

Ok, that answers my question. It would be nice to be able to say:

	pg_upgrade --source-pgdata=/var/lib/pgsql-old --pgdata=/var/lib/pgsql

and have any version PostgreSQL database converted to the newest, but
maybe that's a pipe dream. It sure would make upgrades easier on
everybody, not just RedHatters -- such as those who have large amounts
of large objects.

If I were a better C coder, and had more experience with the various
versions' on-disk formats, I'd be happy to try to tackle it myself.
But I'm not that great of a C coder, nor do I know the data structures
well enough. Oh well.

Thanks much!

Lamar Owen
WGCR Internet Radio
> Bruce Momjian wrote:
> > pg_upgrade will not work in converting from <= 6.4.* to 6.5.* because
> > the on-disk data format changed in 6.5. Hopefully, 6.6 will allow
> > pg_upgrade for 6.5.* databases. We try not to change the on-disk
> > format, but sometimes we have to. MVCC required it for 6.5.*.
>
> Ok, that answers my question. It would be nice to be able to say:
>	pg_upgrade --source-pgdata=/var/lib/pgsql-old --pgdata=/var/lib/pgsql
> and have any version PostgreSQL database converted to the newest, but
> maybe that's a pipe dream. It sure would make upgrades easier on
> everybody, not just RedHatters -- such as those who have large amounts
> of large objects.
>
> If I were a better C coder, and had more experience with the various
> versions' on-disk formats, I'd be happy to try to tackle it myself.
> But I'm not that great of a C coder, nor do I know the data structures
> well enough. Oh well.

You would have to convert tons of rows of data in raw format. Seems
like dump/reload would be easier.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
> Lamar Owen wrote:
> > If I were a better C coder, and had more experience with the various
> > versions' on-disk formats, I'd be happy to try to tackle it myself.
> > But I'm not that great of a C coder, nor do I know the data structures
> > well enough. Oh well.
>
> You would have to convert tons of rows of data in raw format. Seems
> like dump/reload would be easier.

For normal situations, it is. However, in an RPM upgrade that occurs as
part of an OS upgrade (say, from RedHat 6.0 to RedHat 6.1), NO daemons
can be run during a package upgrade. That doesn't seem too bad until you
realize just what an RPM upgrade does....

The nastiness gets nastier: the RPM upgrade procedure (currently) deletes
the old package contents after installing the new package contents,
removing the backend version that can read the database. You run
rpm -Uvh postgresql*.rpm across major versions, and you lose data
(technically, you don't lose the data per se, you just lose the ability
to read it...). And you possibly lose a postgresql user as a result. I
know -- it happened to me with mission-critical data. Fortunately, I had
been doing pg_dumpall's, so it wasn't too bad -- but it sure caught me
off-guard! (Admittedly, I was quite a newbie at the time....)

I am working around that -- backing up (using an extremely restrictive
set of commands, because this script MIGHT be running under a floppy
install image...) the executables and libraries necessary to run the
older version BEFORE the newer executables are brought in, backing up
the older version's PGDATA, running the older postmaster against the
older PGDATA with the older backend on a different port DURING the
startup of the NEWER version's init, initdb with the newer version's
backend, running the newer postmaster WHILE the older one is running,
then piping the output of the older pg_dumpall into a newer psql -e
template1 session. Then I have to verify the integrity of the
transferred data, stop the older postmaster... etc. Piece of cake?
Not quite.

Why not let the user do all that? Because most users can't fathom doing
all of that.

You can see how pg_upgrade would be useful in such a scenario, no? I'm
not complaining, just curious. With pg_upgrade, during the startup
script for the new version, I detect the version of the PGDATA I am
running with; if it's an older version, I first make a backup and then
pg_upgrade PGDATA. Simpler, with less likelihood of failure, IMHO. If I
need to do an initdb first, not a problem -- I'm already going to have
that in there for the case of a fresh install.

Lamar Owen
WGCR Internet Radio
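For concreteness, the dump-and-pipe stage described above, run from the
new version's startup script, would come down to something roughly like
the sketch below. It is only an illustration of the idea, not the
actual RPM scripting: the alternate port, the pgsql-old locations for
the saved 6.4.2 binaries and data, and the exact initdb/PGPORT
invocations are assumptions made for the example.

	# Sketch only: old binaries and data assumed saved aside before the
	# new package's files were installed.
	OLDBIN=/usr/lib/pgsql-old      # saved 6.4.2 postmaster, pg_dumpall, libraries
	OLDDATA=/var/lib/pgsql-old     # backed-up 6.4.2 PGDATA
	NEWDATA=/var/lib/pgsql         # 6.5.1 PGDATA

	su -l postgres -c "$OLDBIN/postmaster -D $OLDDATA -p 5433 &"   # old server, alternate port
	su -l postgres -c "initdb --pgdata=$NEWDATA"                   # new-format cluster
	su -l postgres -c "postmaster -i -D $NEWDATA &"                # new server, default port
	# Pipe the old dump straight into the new server, then verify the
	# transfer and shut the old postmaster down.
	su -l postgres -c "PGPORT=5433 $OLDBIN/pg_dumpall | psql -e template1"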
Lamar Owen <lamar.owen@wgcr.org> writes:
> [ messiness required to upgrade versions by piping data from a
>   pg_dumpall to a psql talking to the new version ]

It'd be considerably less messy, and safer, if you were willing to
stick the pg_dump output into a file rather than piping it on the fly.
Then (a) you wouldn't need to run both versions concurrently, and
(b) you'd have a dump backup if something went wrong during the install.

If you compressed the dump file, which is easy enough, it'd probably
also take less disk space than doing it the other way. A compressed
dump should usually be a good deal smaller than the database
equivalent; if you do an on-the-fly transfer then the peak usage is
two full on-disk copies of the database...

> You can see how pg_upgrade would be useful in such a scenario, no?

pg_upgrade is hardly a magic panacea --- if the on-disk formats are at
all different, then you really have little choice short of a dump under
the old version and reload under the new. At most pg_upgrade might help
automate that process a little more.

We may have lost the option of pg_upgrade-like upgrades anyway.
I'm still waiting to hear Vadim's opinion about whether pg_upgrade
can be made safe under MVCC.

			regards, tom lane
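Spelled out, the file-based route looks roughly like the following.
Again only a sketch: the dump file name and the choice of gzip are
purely illustrative assumptions.

	# Dump under the old version, compressed, before it is replaced.
	su -l postgres -c 'pg_dumpall | gzip > $HOME/pg_dumpall.sql.gz'
	# ...stop the old postmaster, install 6.5.1, initdb, start the new
	# postmaster...
	# Restore from the compressed dump; the file doubles as a backup if
	# anything goes wrong during the install.
	su -l postgres -c 'gunzip -c $HOME/pg_dumpall.sql.gz | psql -e template1'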
> Bruce Momjian wrote:
> > Lamar Owen wrote:
> > > If I were a better C coder, and had more experience with the various
> > > versions' on-disk formats, I'd be happy to try to tackle it myself.
> > > But I'm not that great of a C coder, nor do I know the data structures
> > > well enough. Oh well.
> >
> > You would have to convert tons of rows of data in raw format. Seems
> > like dump/reload would be easier.
>
> For normal situations, it is. However, in an RPM upgrade that occurs as
> part of an OS upgrade (say, from RedHat 6.0 to RedHat 6.1), NO daemons
> can be run during a package upgrade. That doesn't seem too bad until you
> realize just what an RPM upgrade does....

Wow, doing a database upgrade inside an automated RPM. That's quite a
task. From your description, running pg_dumpall and psql to load the
data is a real chore in an automated system.

Considering the changes in alignment of row elements, and index table
changes, it would be quite difficult to write a program to convert that
data from one format to another. Not impossible, but quite hard.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > [ messiness required to upgrade versions by piping data from a
> >   pg_dumpall to a psql talking to the new version ]
>
> It'd be considerably less messy, and safer, if you were willing to
> stick the pg_dump output into a file rather than piping it on the fly.
> Then (a) you wouldn't need to run both versions concurrently, and
> (b) you'd have a dump backup if something went wrong during the install.

Pipe or file, both versions have to be installed at the same time, so,
either way, it's messy. But you are right that putting it in a file
(which is the way I manually update now) is a little less hairy. But
not by much.

> > You can see how pg_upgrade would be useful in such a scenario, no?
>
> We may have lost the option of pg_upgrade-like upgrades anyway.
> I'm still waiting to hear Vadim's opinion about whether pg_upgrade
> can be made safe under MVCC.

I'm curious as to how difficult it would be to rewrite pg_upgrade to be
substantially more intelligent in its work. Thanks to CVS, we can
access the on-disk formats for any version since creation -- ergo, why
can't a program be written that can understand all of those formats and
convert to the latest and greatest without a backend running? All of
the code to deal with any version is out there in CVS already. It's
just a matter of writing conversion routines that:

0.) Back up PGDATA.
1.) Determine the source PGDATA version.
2.) Load a storage manager (for reading) corresponding to that version.
3.) Load a storage manager (for writing) corresponding to the latest version.
4.) Transfer tuples sequentially from old to new.
5.) Walk the PGDATA hierarchy for each and every database directory,
    then update PG_VERSION and other needed files.

What am I missing (in concept -- I know there are a lot of details that
I'm skimming over)? The hard part is getting storage readers for every
major version -- and there's not been THAT many on-disk format changes,
has there?

Now, I realize that this upgrading would HAVE to be done with no
backends running and no transactions outstanding -- IOW, you only want
the latest version of a tuple anyway. Was this the issue with
pg_upgrade and MVCC, or am I misunderstanding it?

Just the ramblings of a packager trying to make upgrades a little less
painful for the masses.

Lamar Owen
WGCR Internet Radio
Bruce Momjian wrote:
> Lamar Owen wrote:
> > For normal situations, it is. However, in an RPM upgrade that occurs as
> > part of an OS upgrade (say, from RedHat 6.0 to RedHat 6.1), NO daemons
> > can be run during a package upgrade. That doesn't seem too bad until you
> > realize just what an RPM upgrade does....
>
> Wow, doing a database upgrade inside an automated RPM. That's quite a
> task. From your description, running pg_dumpall and psql to load the
> data is a real chore in an automated system.

Oliver Elphick has done this for the Debian packages -- but debs don't
have some of the draconian restrictions RPM's do. In particular, an RPM
that is packaged in the Official Boxed Set CANNOT under any
circumstances ask for input from the user, nor can it output anything
to the user. RPM's that do so get kicked out of the boxed set. And,
frankly, PostgreSQL's position in the boxed set is a Big Win.

> Considering the changes in alignment of row elements, and index table
> changes, it would be quite difficult to write a program to convert that
> data from one format to another. Not impossible, but quite hard.

Reference my message to Tom Lane. Yes, such a program would be hard --
but most of it is already written and available in CVS -- thank God for
CVS! -- all that's needed is to extract the storage managers for each
major version, extract the reading code, etc., to get the on-disk
representation to an intermediate in-memory form, then write it out with
the latest and greatest storage manager (into a different file, of
course, until the upgrade is finished). Unless I badly misunderstand
the way PostgreSQL does things, that should work -- but I may not have
expressed it the same way I see it in my mind.

I'm talking about a stripped-down backend, in essence, whose only
purpose in life is to copy in and copy out -- but which has the unique
ability to read with one storage manager and write with another. You
simply choose which storage manager is used for reading by the version
of the PGDATA tree. Piecing together the right CVS code snippets will
be a challenge.

Lamar Owen
WGCR Internet Radio
Lamar Owen <lamar.owen@wgcr.org> writes:
> Tom Lane wrote:
>> It'd be considerably less messy, and safer, if you were willing to
>> stick the pg_dump output into a file rather than piping it on the fly.
>> Then (a) you wouldn't need to run both versions concurrently, and
>> (b) you'd have a dump backup if something went wrong during the install.

> Pipe or file, both versions have to be installed at the same time, so,
> either way, it's messy.

Er, no, that's the whole point. The easy way to attack this is
	(1) While running old installation, pg_dumpall into a file.
	(2) Shut down old postmaster, blow away old database files.
	(3) Install new version, initdb, start new postmaster.
	(4) Restore from pg_dump output file.

> I'm curious as to how difficult it would be to rewrite pg_upgrade to be
> substantially more intelligent in its work. Thanks to CVS, we can
> access the on-disk formats for any version since creation -- ergo, why
> can't a program be written that can understand all of those formats and
> convert to the latest and greatest without a backend running? All of
> the code to deal with any version is out there in CVS already.

Go for it ;-).

> Now, I realize that this upgrading would HAVE to be done with no
> backends running and no transactions outstanding -- IOW, you only want
> the latest version of a tuple anyway. Was this the issue with
> pg_upgrade and MVCC, or am I misunderstanding it?

The issue with MVCC is that the state of a tuple isn't solely determined
by what is in the disk file for its table; you have to also consult
pg_log to see whether recent transactions have been committed or not.
pg_upgrade doesn't import the old pg_log into the new database (and
can't very easily, since the new database will have its own), so there's
a problem with recent tuples possibly getting lost.

OTOH, it seems to me that this was true in older releases as well
(pg_log has always been critical data), so I guess I'm not clear on
why pg_upgrade worked at all, ever...

			regards, tom lane
> Reference my message to Tom Lane. Yes, such a program would be hard --
> but most of it is already written and available in CVS -- thank God for
> CVS! -- all that's needed is to extract the storage managers for each
> major version, extract the reading code, etc., to get the on-disk
> representation to an intermediate in-memory form, then write it out with
> the latest and greatest storage manager (into a different file, of
> course, until the upgrade is finished). Unless I badly misunderstand
> the way PostgreSQL does things, that should work -- but I may not have
> expressed it the same way I see it in my mind.

Do a cost/benefit analysis on that one. :-)

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> The issue with MVCC is that the state of a tuple isn't solely determined
> by what is in the disk file for its table; you have to also consult
> pg_log to see whether recent transactions have been committed or not.
> pg_upgrade doesn't import the old pg_log into the new database (and
> can't very easily, since the new database will have its own), so there's
> a problem with recent tuples possibly getting lost.
>
> OTOH, it seems to me that this was true in older releases as well
> (pg_log has always been critical data), so I guess I'm not clear on
> why pg_upgrade worked at all, ever...

At the end of pg_upgrade, there are the lines:

	mv -f $OLDDIR/pg_log data
	mv -f $OLDDIR/pg_variable data

	echo "You may remove the $OLDDIR directory with 'rm -r $OLDDIR'."
	exit 0

This is used to get the proper transaction status into the new
installation. Is the VACUUM added to pg_upgrade necessary?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive, 	|  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> pg_upgrade doesn't import the old pg_log into the new database (and
>> can't very easily, since the new database will have its own), so there's
>> a problem with recent tuples possibly getting lost.

> At the end of pg_upgrade, there are the lines:
> 	mv -f $OLDDIR/pg_log data
> 	mv -f $OLDDIR/pg_variable data
> This is used to get the proper transaction status into the new
> installation. Is the VACUUM added to pg_upgrade necessary?

I'm sorry, I had that backwards (knew I shoulda checked the code).

pg_upgrade *does* overwrite the destination pg_log, and what that
means is that incoming tuples in user relations should be fine.
What's at risk is recently-committed tuples in the system relations,
notably the metadata that pg_upgrade has just inserted for those
user relations.

The point of the VACUUM is to try to ensure that everything
in the system relations is marked as certainly committed (or
certainly dead) before we discard the pg_log information.
I don't recall ever hearing from Vadim about whether that
is a trustworthy way of doing it, however.

One thing that occurs to me just now is that we probably need
to vacuum *each* database in the new installation. The patch
I added to pg_dump doesn't do the job because it only vacuums
whichever database was dumped last by pg_dumpall...

			regards, tom lane
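A per-database loop along the following lines would cover every
database rather than just the last one dumped. This is only a sketch of
the idea, not the change that actually went into pg_upgrade; it assumes
a 6.5-era psql whose -q, -t (tuples only), and -c switches behave as
described, and that it runs as the postgres superuser.

	# Sketch: vacuum every database so system-catalog tuples are marked
	# committed before the old pg_log is relied upon.
	for DB in `psql -q -t -c "select datname from pg_database;" template1`
	do
		psql -c "VACUUM;" $DB
	done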
> pg_upgrade *does* overwrite the destination pg_log, and what that
> means is that incoming tuples in user relations should be fine.
> What's at risk is recently-committed tuples in the system relations,
> notably the metadata that pg_upgrade has just inserted for those
> user relations.
>
> The point of the VACUUM is to try to ensure that everything
> in the system relations is marked as certainly committed (or
> certainly dead) before we discard the pg_log information.
> I don't recall ever hearing from Vadim about whether that
> is a trustworthy way of doing it, however.
>
> One thing that occurs to me just now is that we probably need
> to vacuum *each* database in the new installation. The patch
> I added to pg_dump doesn't do the job because it only vacuums
> whichever database was dumped last by pg_dumpall...

I see what you are saying now. pg_upgrade basically replaces the system
tables, but keeps the user data and pg_log. So, if you do initdb, and
create your user table, then recover the user data tables and pg_log,
and if pg_log has a transaction marked as aborted that has the same
number as one of the user create table statements, it would not see the
table. I see why the vacuum is needed.

I wrote pg_upgrade as an attempt to do upgrades without dumping. I
heard so little about it when it was introduced, I thought it was not
really being used. When I disabled it for 6.5, I found out how many
people were using it without incident.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
> At the end of pg_upgrade, there are the lines:
>
> 	mv -f $OLDDIR/pg_log data
> 	mv -f $OLDDIR/pg_variable data
>
> 	echo "You may remove the $OLDDIR directory with 'rm -r $OLDDIR'."
> 	exit 0
>
> This is used to get the proper transaction status into the new
> installation. Is the VACUUM added to pg_upgrade necessary?

You know, up until this message I had the mistaken impression that
pg_upgrade was a C program... Boy, was I wrong. And no wonder it's
hairy. I should have read the source first -- but nooo, I couldn't do
that. Open mouth, insert foot.

I _am_ contemplating a C version that would do far more than just
upgrades. I'm thinking of a pg_repair utility that could rebuild and
repair the on-disk structures. It would also facilitate database
recovery after a crash -- might be a real bear to do right. Comments?

Lamar Owen
Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > Pipe or file, both versions have to be installed at the same time, so,
> > either way, it's messy.
>
> Er, no, that's the whole point. The easy way to attack this is
> (1) While running old installation, pg_dumpall into a file.
> (2) Shut down old postmaster, blow away old database files.
> (3) Install new version, initdb, start new postmaster.
> (4) Restore from pg_dump output file.

Would to God it were that easy! During an RPM upgrade, I have to
observe the following:

1.) The user types rpm -Uvh postgresql*.i386.rpm, or executes an
upgrade from an older RedHat version to a newer RedHat version.

2.) The first rpm's preinstall script starts running. The old version
of that rpm is still installed at this point, BUT I CAN'T EXECUTE ANY
DAEMONS -- the upgrade MIGHT be running in the wicked chroot environment
of the RedHat installer, with its restrictive set of commands. So, I
CANNOT start a postmaster, nor can I be assured that a postmaster is
running -- according to RedHat, since it could be running in the chroot
installer, I can't even run a ps to SEE if postmaster is running
(problems with a chrooted /proc...). Therefore, the preinstall script
CANNOT execute pg_dumpall. I can't even run a standalone backend --
postmaster MIGHT be running.... And I can't test to see if I'm running
in the installer or not... ;-( The only thing I CAN do is check /tmp
for the lock file.

3.) Once the preinstall script is finished, rpm blows in the first
rpm's files. This of course overwrites the previous version.

4.) Once all files are blown in, the postinstall script can run. It
has the same restrictions that the preinstall script does, since the
rpm COULD be running in the chroot installer.

5.) Repeat 2-4 for the remainder of the rpms.

If it weren't for the restrictions, it wouldn't be too hard. I think I
have it mostly solved -- I just have to clean up some code and do
testing. I'm using a two-stage plan -- the preinstall of the main
package (which only contains clients, client libraries, and
documentation) detects whether an old version of PGDATA is there or
not. If it is, a backup of the PGDATA tree is performed. The hard part
there is making sure a backend isn't running -- I haven't figured out
how to reliably detect a running postmaster without /proc or ps. The
lock file would seem to be a reliable flag -- but what if the last
invocation of postmaster crashed for some reason, left the lockfile,
and the user, on the next boot, decides to upgrade versions of
RedHat.... Stage two is performed in the server package's startup
script (/etc/rc.d/init.d/postgresql) -- it detects the backup, cleans
up PGDATA, initdb's, dumps the data from the old PGDATA (with the old
binaries), and restores the data with the new binaries.

> > convert to the latest and greatest without a backend running? All of
> > the code to deal with any version is out there in CVS already.
>
> Go for it ;-).

For some reason, I just KNEW you'd say that :-). Given six months of
spare time, I probably could. But in the meantime, people's databases
are getting farkled by rpm upgrades, so I have to solve the problem.

> > the latest version of a tuple anyway. Was this the issue with
> > pg_upgrade and MVCC, or am I misunderstanding it?
>
> The issue with MVCC is that the state of a tuple isn't solely determined
> by what is in the disk file for its table; you have to also consult
> pg_log to see whether recent transactions have been committed or not.
> pg_upgrade doesn't import the old pg_log into the new database (and
> can't very easily, since the new database will have its own), so there's
> a problem with recent tuples possibly getting lost.

The behavior I'm describing for pg_upgrade (let me name my program
something different, for clarity: pg_data_uprev) is to take an old
PGDATA tree and convert it to the new format in a blank, non-initdb'ed
tree, producing a consistent new-format PGDATA tree. Thus, there are no
existing files at all to worry with. Visualize a filter:
old-PGDATA -> pg_data_uprev -> new-PGDATA, with no backends involved
at all.

Lamar Owen
WGCR Internet Radio
> You know, up until this message I had the mistaken impression that
> pg_upgrade was a C program... Boy, was I wrong. And no wonder it's
> hairy. I should have read the source first -- but nooo, I couldn't do
> that. Open mouth, insert foot.

Yes, a quick few-hour hack to do a quick upgrade. Worked better than I
thought it would.

> I _am_ contemplating a C version that would do far more than just
> upgrades. I'm thinking of a pg_repair utility that could rebuild and
> repair the on-disk structures. It would also facilitate database
> recovery after a crash -- might be a real bear to do right. Comments?

A bear.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> 2.) The first rpm's preinstall script starts running. The old version
> of that rpm is still installed at this point, BUT I CAN'T EXECUTE ANY
> DAEMONS -- the upgrade MIGHT be running in the wicked chroot environment
> of the RedHat installer, with its restrictive set of commands. So, I
> CANNOT start a postmaster, nor can I be assured that a postmaster is
> running -- according to RedHat, since it could be running in the chroot
> installer, I can't even run a ps to SEE if postmaster is running
> (problems with a chrooted /proc...). Therefore, the preinstall script
> CANNOT execute pg_dumpall. I can't even run a standalone backend --
> postmaster MIGHT be running.... And I can't test to see if I'm running
> in the installer or not... ;-( The only thing I CAN do is check /tmp
> for the lock file.

This seems almost impossible to handle. I have enough trouble writing
PostgreSQL C code when I have total control over the environment.

BTW, you can check for a running backend by trying to telnet to the
5432 port, or trying to do a connection to the unix domain socket.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
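In a script, that check could be approximated along these lines. A
sketch only: it uses psql for the connection attempt (its exit status
is easier to test than telnet's output) and assumes the default port
and Unix-socket location.

	# Probe for a listening postmaster by attempting a trivial connection.
	if psql -q -c "select 1;" template1 >/dev/null 2>&1
	then
		echo "a postmaster answered on the default port/socket"
	else
		echo "no postmaster answered (none running, or connection refused)"
	fi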
>> I _am_ contemplating a C version that would do far more than just
>> upgrades. I'm thinking of a pg_repair utility that could rebuild and
>> repair the on-disk structures. It would also facilitate database
>> recovery after a crash -- might be a real bear to do right. Comments?

> A bear.

Indeed, but also an incredibly valuable contribution if you can pull
it off. If you want to tackle this task, don't let us discourage you!

			regards, tom lane
Lamar Owen <lamar.owen@wgcr.org> writes:
> 2.) The first rpm's preinstall script starts running. The old version
> of that rpm is still installed at this point, BUT I CAN'T EXECUTE ANY
> DAEMONS -- the upgrade MIGHT be running in the wicked chroot environment
> of the RedHat installer, with its restrictive set of commands. So, I
> CANNOT start a postmaster, nor can I be assured that a postmaster is
> running -- according to RedHat, since it could be running in the chroot
> installer, I can't even run a ps to SEE if postmaster is running
> (problems with a chrooted /proc...). Therefore, the preinstall script
> CANNOT execute pg_dumpall.

chroot? Where are you chrooted to? It would seem from your description
that neither the preinstall nor postinstall scripts can even see the
/usr/local/pgsql directory tree, which would make it impossible to do
anything --- and would be an incredibly stupid way to design an
installer system, so I have to assume I'm misreading what you wrote.

Also, if the pre/postinstall scripts cannot contact existing processes,
then there is no hope of killing/restarting any kind of daemon process,
not just Postgres in particular. The restrictions you claim are there
would make RPMs unusable for upgrading *anything* that has a
continuously running server process. Is Red Hat really that far out
in left field?

> I can't even run a standalone backend --
> postmaster MIGHT be running.... And I can't test to see if I'm running
> in the installer or not... ;-( The only thing I CAN do is check /tmp
> for the lock file.

chroot would generally imply that you can't see the regular /tmp dir,
either.

			regards, tom lane
Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > DAEMONS -- the upgrade MIGHT be running in the wicked chroot environment
> > of the RedHat installer, with its restrictive set of commands. So, I
>
> chroot? Where are you chrooted to? It would seem from your description
> that neither the preinstall nor postinstall scripts can even see the
> /usr/local/pgsql directory tree, which would make it impossible to do
> anything --- and would be an incredibly stupid way to design an
> installer system, so I have to assume I'm misreading what you wrote.

I think you are misreading what I wrote, which is not at all surprising
-- it took me awhile to grok it. No, during the installation of a
version of RedHat Linux, the installer (which boots off of either a
floppy set or a virtual El Torito image on CD) installs all the RPM's
to the new root filesystem under chroot to that new root filesystem.
Thus, the real root is /dev/fd0 or whatever the El Torito image's /dev
entry is. The new root is mounted in a directory off of the real root,
and the rpm is installed with a chroot to the new, very incomplete
root. Fortunately, PostgreSQL gets installed down the list quite a
ways, as P is after the halfway point.

To add to the confusion, there IS no /usr/local/pgsql -- RedHat has
munged the installation around to conform to the FSSTND for Linux --
meaning that the PostgreSQL binaries go in /usr/bin, the libraries go
in /usr/lib, the templates and other libraries that would ordinarily go
in PGLIB go in /usr/lib/pgsql, and PGDATA is /var/lib/pgsql. The goal
is a read-only /usr, but they are a little ways from that. And that is
OK, as RPM keeps a database of what file belongs to what package.

> Also, if the pre/postinstall scripts cannot contact existing processes,
> then there is no hope of killing/restarting any kind of daemon process,
> not just Postgres in particular. The restrictions you claim are there
> would make RPMs unusable for upgrading *anything* that has a
> continuously running server process.

The restrictions are only on RPM's that ship as part of the Official
Boxed Set. RPM's are designed to be totally self-contained --
dependencies are rigorously specified (such as the PostgreSQL RPM's
dependency upon chkconfig to set the init sequence number), and
assumptions are nil. I can do very little in the pre and post scripts
-- making an offline backup of PGDATA and the essential executables and
libraries needed to restore the old PGDATA is the extent of it. Of
course, I then have to contend with the user who upgrades with
postmaster running....

To summarize: RPM's that ship as part of the RedHat Official Boxed Set
(OBS) (which PostgreSQL does) must contend with two very different
installation environments:

1.) The chroot installer at initial operating system install time, and
its OS upgrade alter ego;

2.) The environment of rpm -U, whether initiated by the user or by
proxy (such as AutoRPM), which is an entirely NORMAL environment where
you can do anything you want.

Other RPM's that do not ship as part of the OBS do not have the
restrictions of 1. However, being in the OBS is a very desirable place,
as it assures that ALL RedHat users have the opportunity to use
PostgreSQL -- and, in fact, PostgreSQL is the ONLY RDBMS RedHat is
shipping, giving us tremendous exposure.

> Is Red Hat really that far out
> in left field?

If you want to call it left field, yes, they are.
RPM's are the HTML of the package managers -- the author has little to
no control over presentation -- that is, package installation order,
or, for that matter, whether the install-time scripts even get run
(rpm --noscripts, anyone...). It is a very _different_ environment.

> chroot would generally imply that you can't see the regular /tmp dir,
> either.

The mounted root /tmp is visible BECAUSE of the chroot in the installer
-- but Bruce's suggestion of connecting to port 5432 is a better idea.
Although, in the installer, I can't do that either... ;-( I guess I
need to first detect whether we're in the installer or not. And RedHat
doesn't want me to be able to do that. Catch 22.

Thanks -- the discussion is helping me find holes in my strategy.

Lamar Owen
WGCR Internet Radio
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >> pg_upgrade doesn't import the old pg_log into the new database (and
> >> can't very easily, since the new database will have its own), so there's
> >> a problem with recent tuples possibly getting lost.
>
> > At the end of pg_upgrade, there are the lines:
> > 	mv -f $OLDDIR/pg_log data
> > 	mv -f $OLDDIR/pg_variable data
> > This is used to get the proper transaction status into the new
> > installation. Is the VACUUM added to pg_upgrade necessary?
>
> I'm sorry, I had that backwards (knew I shoulda checked the code).
>
> pg_upgrade *does* overwrite the destination pg_log, and what that
> means is that incoming tuples in user relations should be fine.
> What's at risk is recently-committed tuples in the system relations,
> notably the metadata that pg_upgrade has just inserted for those
> user relations.
>
> The point of the VACUUM is to try to ensure that everything
> in the system relations is marked as certainly committed (or
> certainly dead) before we discard the pg_log information.
> I don't recall ever hearing from Vadim about whether that
> is a trustworthy way of doing it, however.
>
> One thing that occurs to me just now is that we probably need
> to vacuum *each* database in the new installation. The patch
> I added to pg_dump doesn't do the job because it only vacuums
> whichever database was dumped last by pg_dumpall...

I have modified pg_upgrade to vacuum all databases, as you suggested.

	copy pg_shadow from stdin;
	\.
->	VACUUM;
	\connect template1 postgres
	create database test;
	\connect test postgres
	\connect - postgres
	CREATE TABLE "t1" (

I left your vacuum in there to vacuum the last database. This should
help.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026