Thread: Upgrading rant.
It's that time of year again, when I remind everyone just how difficult life in the trenches with PostgreSQL can be, when the life in the trenches involves upgrades. If you don't want to read about it, then please hit DELETE in your e-mail (or nntp) client. I'll not get too vehement this time (well, I'll try not to), but I am continuously bombarded with requests for help in upgrading. These requests really bother me, particularly since I believe PostgreSQL is the finest Open Source database, period.

I have attempted to help in this by building some older PostgreSQL versions on more modern Red Hat distributions; alas and alack, some relatively recent versions of PostgreSQL will simply not build on more recent Red Hat. Case in point: I tried to help out a fellow (Tony) who upgraded from Red Hat 7.1 (I believe) to Red Hat 8.0. Red Hat 8.0 includes PostgreSQL 7.2.2, and Red Hat 7.1 has PostgreSQL 7.1.something. Good thing he didn't go back to Red Hat 7.0 (PG 7.0....). So I figured I'd roll a 7.1.3 RPMset for him to install onto Red Hat 8. It was very bad. It simply would not build -- I guess it's the gcc 3 stuff. He can't downgrade! (Really! Red Hat 8 upgrades more than just PostgreSQL -- the upgrade wiped out libraries that PostgreSQL 7.1 on the previous Red Hat needed to function. The RPM installation of PostgreSQL 7.1 from the previous Red Hat would NOT reinstall.) So this man is up the creek, without a paddle, in a chicken-wire canoe WRT his 7.1 data. This is unacceptable.

THIS DOESN'T HAPPEN WITH MySQL. I'm more than a little perturbed that, as MySQL adds features, it doesn't make you upgrade your database each release: it simply doesn't allow the features your database doesn't support. You can then migrate each table as you need the new features. While he really should have read our documentation and been a little more careful, we shouldn't be so anal about upgradability, either.

I know it's been hashed to death, but the problem isn't going away anytime soon. I'm afraid that this is going to become a key feature, and one we are missing, but our primary Open Source competition isn't missing. And I _know_ some are just going to fingerpoint and blame Red Hat. Any such replies I will rather quickly redirect to /dev/null, as it isn't Red Hat's fault we can't do a sane upgrade. Others are going to handwave and say 'we're so advanced we can't upgrade', and still others are going to say 'Oracle can't do it; why should we?' These replies also will meet the bottomless bit bucket -- I'm not interested in arguing whether we should allow good upgrading or not, so don't bother trying to convince me upgrades aren't important, or that 'dump/initdb/restore IS an upgrade!' I am interested in sane discussion of how to make it happen. Red Hat at least has a data dumper, but even at that it isn't an easy task to upgrade. (source.redhat.com/rhdb)

I believe, as I have said before, that the biggest problem preventing easy upgrades is our tight coupling of system catalog data with user data, in the same tablespace. If the system catalog data were less tightly coupled, then it might be an easier job. I also know, from the last time this was discussed, that drawing the line between 'system' and 'user' data is very difficult due to our highly extensible nature. I thought the last time through would be the _last_ time through -- but I also thought I'd be able to build older PostgreSQL packages for newer dists, which would prevent much of this problem. (OS upgrade hosed your PostgreSQL?
Here's packages for your old PostgreSQL built for your shiny new OS!)

In my opinion, upgrading shouldn't be something a user should have to even think about. It should just happen. Kind of like 'space reuse should just happen' too.... The postmaster should have a fallback mode for when it starts up in a PGDATA where PGVERSION is less than the postmaster version. This would take care of ninety percent or more of upgrades -- the user can dump and restore later if need be, or a migration tool can be written, or... this is where I'd like to see more discussion and less of a back-burner approach. And I'd love to see someone who has the time to do so (not me) grab this bull by the horns and make it happen. (Yes, I realize the use of certain of our extensibility features will be impossible to upgrade cleanly, but that's what you get when you allow embedded C code in the backend.) I'm talking about the majority of cases where a user simply has some relational data (no custom functions, types, or operators) that is critical to their business that they need to move over QUICKLY. (dump/restore is anything but quick.) And some are going to do this upgrade on their production server, regardless of how many times we tell people not to do so. Not every entity who uses PostgreSQL has on staff a professional DBA and an extra server to do the migration with.

MySQL is even touting the ability to quickly upgrade, at this point (January 2003 Linux Magazine article on MySQL 4). I'm sorry, but that just gets under my skin.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
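For concreteness, the version check behind Lamar's proposed fallback mode might look something like the sketch below; the PGVERSION contents and the version-reporting command are assumptions for illustration, not the postmaster's actual startup code:

	# Compare the data directory's version with the server binary's
	# major version. (Hypothetical wrapper logic; a real fallback mode
	# would live inside the postmaster, and pg_config's availability
	# and output format vary by release.)
	DATAVER=$(cat "$PGDATA/PGVERSION")                        # e.g. "7.1"
	BINVER=$(pg_config --version | awk '{print $2}' | cut -d. -f1,2)
	if [ "$DATAVER" != "$BINVER" ]; then
	    echo "PGDATA is version $DATAVER but server is $BINVER" >&2
	    # this is where a read-old-format fallback would engage,
	    # instead of the postmaster simply refusing to start
	fi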
Lamar Owen <lamar.owen@wgcr.org> writes:
> So I figured I'd roll a 7.1.3 RPMset for him to install onto Red Hat 8. It
> was very bad. It simply would not build -- I guess it's the gcc 3
> stuff.

If you don't know *exactly* why it doesn't build, I don't think you
should be blaming us for it.

(FWIW, I just tried to build 7.1 branch on RHL 8.0 --- the main problem
seems to be that 7.1's configure isn't expecting multiline output from
"gcc --version", and it produces a PG_VERSION_STR definition that's
missing a trailing quote. There are some other issues, none that look
very difficult to fix, all indicative of incompatible changes in either
gcc or system include files.)

> THIS DOESN'T HAPPEN WITH MySQL.

Oh? Do they have a crystal ball that lets them predict incompatible
future platform changes?

(I'm not very sure I believe your assertion above, anyway. We are
painfully aware of our own compatibility issues, but I wonder how many
people on this list pay close enough attention to MySQL to know what
their version-to-version compatibility track record *really* is.)

> I thought the last time through would be the _last_ time through --
> but I also thought I'd be able to build older PostgreSQL packages for
> newer dists, which would prevent much of this problem.

You could do so if you were willing to track the platform changes. 7.1
isn't hopelessly broken for RHL 8.0, but it's definitely suffering bit
rot. Someone would have to expend effort on updating obsolete branches,
if we wanted them to keep working in the face of incompatible platform
changes.

> And I'd love to see someone who has the time to do so (not me) grab
> this bull by the horns and make it happen.

Well, this is exactly the issue: someone would have to put substantial
amounts of time into update mechanisms and/or maintenance of obsolete
versions, as opposed to features, performance improvements, or bug
fixes.

Personally, I feel that if we weren't working as hard as we could on
features/performance/bugfixes, the upgrade issue would be moot because
there wouldn't *be* any reason to upgrade. So I'm not planning to
redirect my priorities. But this is an open source project and every
developer gets to set their own priorities. If you can persuade someone
to put their time into that, go for it.

> And I _know_ some are just going to fingerpoint and blame Red Hat. Any such
> replies I will rather quickly redirect to /dev/null, as it isn't Red Hat's
> fault we can't do a sane upgrade.

I think you're wasting your time trying to hold us to a higher standard
of backwards-compatibility than is maintained by the OSes and tools we
must sit on top of.

			regards, tom lane
On Thu, Jan 02, 2003 at 07:26:06PM -0500, Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > replies I will rather quickly redirect to /dev/null, as it isn't Red Hat's
> > fault we can't do a sane upgrade.
>
> I think you're wasting your time trying to hold us to a higher standard
> of backwards-compatibility than is maintained by the OSes and tools we
> must sit on top of.

Case in point: even though the Debian upgrade scripts have occasionally
given me a near-heart attack by claiming that they didn't successfully
upgrade when they did, I've never had this problem. Is this because
Oliver is smarter than you? Or Debian is 'superior'? No, it's because
_incremental upgradability_ is _the_ design goal for the Debian
distribution. Lots of other stuff may work better on RedHat (auto
hardware detection, etc.) but this is the design case for Debian, so the
facilities are mostly there for Oliver to use to do incremental,
rollbackable upgrades.

What does that mean for PostgreSQL? Perhaps Tom's right: you can't fix
it in the program if the underlying system doesn't support it.

Ross
Tom Lane wrote:
> Well, this is exactly the issue: someone would have to put substantial
> amounts of time into update mechanisms and/or maintenance of obsolete
> versions, as opposed to features, performance improvements, or bug
> fixes.
>
> Personally, I feel that if we weren't working as hard as we could on
> features/performance/bugfixes, the upgrade issue would be moot because
> there wouldn't *be* any reason to upgrade. So I'm not planning to
> redirect my priorities. But this is an open source project and every
> developer gets to set their own priorities. If you can persuade someone
> to put their time into that, go for it.

Do not underestimate the upgrade issue. I think it is huge and a LOT of
people have problems with it. Personally, I never understood why the
dump/restore needed to happen in the first place.

Can't the data and index file format be more rigidly defined and stuck
to? Can't there just be some BKI process to add new data entries? I had
the same issues with 7.1 and 7.2,
On Fri, 2003-01-03 at 13:45, mlw wrote:
> Tom Lane wrote:
>
> > Personally, I feel that if we weren't working as hard as we could on
> > features/performance/bugfixes, the upgrade issue would be moot because
> > there wouldn't *be* any reason to upgrade.

What about the standard Microsoft reason for upgrades - the bug fixes ;)

> > So I'm not planning to
> > redirect my priorities. But this is an open source project and every
> > developer gets to set their own priorities. If you can persuade someone
> > to put their time into that, go for it.
>
> Do not underestimate the upgrade issue.

Very true! If upgrading is hard, users will surely expect us to keep
maintaining all non-upgradable old versions for the foreseeable
future ;(

> I think it is huge and a LOT of
> people have problems with it. Personally, I never understood why the
> dump/restore needed to happen in the first place.
>
> Can't the data and index file format be more rigidly defined and stuck
> to?

I don't think the main issues are with file _formats_ but rather with
system file structures - AFAIK it is a fundamental design decision
(arguably a design flaw ;( ) that we use system tables straight from
page cache via C structure pointers, even though there seems to be a
layer called "storage manager" which should hide the on-disk format
completely.

> Can't there just be some BKI process to add new data entries? I had
> the same issues with 7.1 and 7.2,

--
Hannu Krosing <hannu@tm.ee>
Hannu Krosing wrote:
> I don't think the main issues are with file _formats_ but rather with
> system file structures - AFAIK it is a fundamental design decision
> (arguably a design flaw ;( ) that we use system tables straight from
> page cache via C structure pointers, even though there seems to be a
> layer called "storage manager" which should hide the on-disk format
> completely.

I don't think that is a big issue. No one is saying the file format
should "change" or be used any differently, just that the structures be
more rigid, and anyone wishing to make a change had better also have an
upgrade strategy.

Perhaps there could be a review/document phase for 7.4 where the
structures are cleaned up, checked, and perhaps have a couple "reserved"
entries added (as governed by efficiency), and have one last "speak now,
or forever hold your peace" and cast them in stone. I think, as
professional standards go, this is probably LONG overdue.
On Thursday 02 January 2003 19:26, Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > So I figured I'd roll a 7.1.3 RPMset for him to install onto Red Hat 8.
> > It was very bad. It simply would not build -- I guess it's the gcc 3
> > stuff.
> If you don't know *exactly* why it doesn't build, I don't think you
> should be blaming us for it.

Who did I blame? (no one). I just made a statement -- which you pretty
much agreed with later (gcc --version being multiline IS a gcc 3
problem, no? :-)). No blame at all. None of the message was a blame game
of any kind, although you seem to take it personally every time I bring
up upgrading. But, then again, I take it personally when someone e-mails
me ranting (and sometimes cussing) about my stupid program's not
upgrading (and PostgreSQL isn't 'my' program!).

> > THIS DOESN'T HAPPEN WITH MySQL.
> Oh? Do they have a crystal ball that lets them predict incompatible
> future platform changes?

No, they just allow for the old format, while making a new format. At
least that's the way it looks from reading the docs and a cursory
install run. The Linux Magazine article states it in a quite pointed
way -- I'll grab that magazine when I get back home and post a quote if
you'd like.

> (I'm not very sure I believe your assertion above, anyway. We are
> painfully aware of our own compatibility issues, but I wonder how many
> people on this list pay close enough attention to MySQL to know what
> their version-to-version compatibility track record *really* is.)

I pay close enough attention that I was asked by a publisher to do a
technical review of a MySQL reference book.

> > And I'd love to see someone who has the time to do so (not me) grab
> > this bull by the horns and make it happen.
> Well, this is exactly the issue: someone would have to put substantial
> amounts of time into update mechanisms and/or maintenance of obsolete
> versions, as opposed to features, performance improvements, or bug
> fixes.

Fixing the upgrade 'bug' is a bugfix, IMHO. A _big_ bugfix.

> Personally, I feel that if we weren't working as hard as we could on
> features/performance/bugfixes, the upgrade issue would be moot because
> there wouldn't *be* any reason to upgrade.

However, I'm sure there are people who hesitate to upgrade BECAUSE of
the difficulty, thereby causing them to miss features. And potentially
bugfixes, too.

> So I'm not planning to
> redirect my priorities. But this is an open source project and every
> developer gets to set their own priorities. If you can persuade someone
> to put their time into that, go for it.

Which is why I shake the branches periodically, to see if I can persuade
someone who can do this (in practice this means they either have a great
deal of free time, or they are paid fulltime to work on PostgreSQL).

> I think you're wasting your time trying to hold us to a higher standard
> of backwards-compatibility than is maintained by the OSes and tools we
> must sit on top of.

I think PostgreSQL already sets a higher standard in many ways.
Particularly in the release cycle (we don't 'release early and release
often').
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Hannu Krosing <hannu@tm.ee> writes:
> I don't think the main issues are with file _formats_ but rather with
> system file structures - AFAIK it is a fundamental design decision
> (arguably a design flaw ;( ) that we use system tables straight from
> page cache via C structure pointers,

The system tables are not the problem. pg_upgrade has shown how we can
have cross-version upgrades no matter how much the system catalogs
change (a good thing too, because we cannot freeze the system catalog
layout without bringing development to a standstill). A schema-only
dump and restore is cheap enough that there's no real reason to look
for any other solution.

Changes in the on-disk representation of user tables would be harder to
deal with, but they are also much rarer (AFAIR we've only done that
twice: WAL required additions to page and tuple headers, and then there
were Manfred's space-saving changes in 7.3). And as of 7.3 there is a
version field in page headers, which would in theory allow for a
page-at-a-time update process to work.

There isn't any fundamental reason why we cannot have a pg_upgrade
utility; claiming that there is something wrong with how we handle
catalog changes misses the point. The point is that *someone would have
to do the work*. Unless someone wants to step up and volunteer, there's
little value in discussing it.

			regards, tom lane
On Friday 03 January 2003 15:16, Lamar Owen wrote:
> On Thursday 02 January 2003 19:26, Tom Lane wrote:
> > Lamar Owen wrote:
> > > THIS DOESN'T HAPPEN WITH MySQL.
> > Oh? Do they have a crystal ball that lets them predict incompatible
> > future platform changes?
> No, they just allow for the old format, while making a new format. At
> least that's the way it looks from reading the docs and a cursory install
> run. The Linux Magazine article states it in a quite pointed way -- I'll
> grab that magazine when I get back home and post a quote if you'd like.

Linux Magazine, January 2003, page 28. Article by Jeremy Zawodny. Too
long to quote directly. Relevant section is headed 'Compatibility'. The
short of it: the on-disk format didn't change between MySQL 3.23 and
MySQL 4.0, so you don't need to do a massive export and import process.
You will, however, need to run mysql_fix_privilege_tables to update the
grant tables to use the new privileges features in 4.0.

Also, and I'll quote this one: "Moving to MySQL 4.1 will require you to
either dump and re-import your data or run various ALTER commands to
take advantage of its new features. However, if you don't need those
features, your existing tables will work fine."

It's that last sentence that has me perturbed. It is just so obvious a
feature...
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Tom Lane wrote:
> The system tables are not the problem. pg_upgrade has shown how we
> can have cross-version upgrades no matter how much the system catalogs
> change (a good thing too, because we cannot freeze the system catalog
> layout without bringing development to a standstill). A schema-only
> dump and restore is cheap enough that there's no real reason to look
> for any other solution.
>
> Changes in the on-disk representation of user tables would be harder to
> deal with, but they are also much rarer (AFAIR we've only done that
> twice: WAL required additions to page and tuple headers, and then there
> were Manfred's space-saving changes in 7.3). And as of 7.3 there is a
> version field in page headers, which would in theory allow for a
> page-at-a-time update process to work.
>
> There isn't any fundamental reason why we cannot have a pg_upgrade
> utility; claiming that there is something wrong with how we handle
> catalog changes misses the point. The point is that *someone would
> have to do the work*. Unless someone wants to step up and volunteer,
> there's little value in discussing it.

pg_upgrade does work, assuming there are no changes to the index or heap
file formats. (However, I now need to update it for schemas.) However,
the last time I worked on it for 7.2, no one was really interested in
testing it, so it never got done. In fact, there was a bug in the
handling of clog or wal files, but I didn't find out about it until long
after 7.2 because no one was using it.

Is pg_upgrade too hard to run? Is no one really interested in it?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
Sent: January 03, 2003 6:31 PM

> Tom Lane wrote:
> > The system tables are not the problem. pg_upgrade has shown how we
> > can have cross-version upgrades no matter how much the system catalogs
> > change (a good thing too, because we cannot freeze the system catalog
> > layout without bringing development to a standstill). A schema-only
> > dump and restore is cheap enough that there's no real reason to look
> > for any other solution.
> >
> > Changes in the on-disk representation of user tables would be harder to
> > deal with, but they are also much rarer (AFAIR we've only done that
> > twice: WAL required additions to page and tuple headers, and then there
> > were Manfred's space-saving changes in 7.3). And as of 7.3 there is a
> > version field in page headers, which would in theory allow for a
> > page-at-a-time update process to work.
> >
> > There isn't any fundamental reason why we cannot have a pg_upgrade
> > utility; claiming that there is something wrong with how we handle
> > catalog changes misses the point. The point is that *someone would
> > have to do the work*. Unless someone wants to step up and volunteer,
> > there's little value in discussing it.
>
> pg_upgrade does work, assuming there are no changes to the index or heap
> file formats. (However, I now need to update it for schemas.) However,
> the last time I worked on it for 7.2, no one was really interested in
> testing it, so it never got done. In fact, there was a bug in the
> handling of clog or wal files, but I didn't find out about it until long
> after 7.2 because no one was using it.

I guess the main point here was to have co-existing versions of data in
the /data directory and the server be able to talk to all of them, with
an option to upgrade the data to the most recent format at some point.

If it's done that way, there won't be much of a pain, if at all, to
upgrade the server only and benefit from the general bug fixes and
general new features. The specific features that require more recent
data models will have to be tracked down, and dealt with if attempted
for the older data model.

The other main point was to find somebody to do the dirty job of
implementing, maintaining, and tracking down format changes of that
concurrent data storage "model". pg_upgrade would be the tool to
eventually upgrade the data to the recent model in that case.

> Is pg_upgrade too hard to run? Is no one really interested in it?

Would still be nice to have some GUI to it in Java/Web/*Admin/TclTk etc.
Is pg_upgrade documented anywhere besides man pages and the script
itself?

-s
Serguei Mokhov wrote:
> > pg_upgrade does work, assuming there are no changes to the index or heap
> > file formats. (However, I now need to update it for schemas.) However,
> > the last time I worked on it for 7.2, no one was really interested in
> > testing it, so it never got done. In fact, there was a bug in the
> > handling of clog or wal files, but I didn't find out about it until long
> > after 7.2 because no one was using it.
>
> I guess the main point here was to have co-existing versions of data
> in the /data directory and the server be able to talk to all of them,
> with an option to upgrade the data to the most recent format at some point.
>
> If it's done that way, there won't be much of a pain, if at all, to upgrade
> the server only and benefit from the general bug fixes and general new
> features. The specific features that require more recent data models will
> have to be tracked down, and dealt with if attempted for the older data
> model.

We have to update the system tables for each release, and that is going
to require something like pg_upgrade. I don't see how we can avoid
that. We could enable reading of old heap/index file formats if we
tried hard, but that still requires pg_upgrade to update the system
catalogs.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> There isn't any fundamental reason why we cannot have a pg_upgrade
>> utility; claiming that there is something wrong with how we handle
>> catalog changes misses the point.

> pg_upgrade does work, assuming there are no changes to the index or heap
> file formats.

Does it really work? I had thought that there were some issues
associated with adjusting transaction numbers that couldn't be solved by
pg_upgrade in its present shell-script form; I was anticipating that
pg_upgrade would have to be rewritten as a C program so that it could
get at stuff at the necessary low level. I cannot recall the details
right at the moment though.

> In fact, there was a bug in the
> handling of clog or wal files, but I didn't find out about it until long
> after 7.2 because no one was using it.

This may be what I was recalling. Did you find a bulletproof fix?

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> There isn't any fundamental reason why we cannot have a pg_upgrade
> >> utility; claiming that there is something wrong with how we handle
> >> catalog changes misses the point.
>
> > pg_upgrade does work, assuming there are no changes to the index or heap
> > file formats.
>
> Does it really work? I had thought that there were some issues
> associated with adjusting transaction numbers that couldn't be solved
> by pg_upgrade in its present shell-script form; I was anticipating that
> pg_upgrade would have to be rewritten as a C program so that it could
> get at stuff at the necessary low level. I cannot recall the details
> right at the moment though.

It just needs to be able to create an empty clog file, I think. You
actually gave me the steps, but weren't sure it would work, and it was
post-7.2 anyway, so I never did it. The issue is that once we increment
the transaction log counter, we need to create the proper clog file. I
didn't see it in my testing because my testing db wasn't big enough.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
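As a rough illustration of the "empty clog file" Bruce describes, the segment covering a given next-xid could be created like the sketch below; the constants assume the 7.2-era pg_clog layout (2 status bits per transaction, 8K pages, 32 pages per segment, hex segment file names), and none of this is actual pg_upgrade code:

	NEXTXID=123456789                 # next xid carried over from the old cluster
	SEG=$(( NEXTXID / 1048576 ))      # assumed: 1,048,576 transactions per 256KB segment
	FILE=$(printf "%04X" "$SEG")      # assumed: segment number as a 4-digit hex name
	dd if=/dev/zero of="$PGDATA/pg_clog/$FILE" bs=8192 count=32   # zero-filled placeholder segment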
On Friday 03 January 2003 18:31, Bruce Momjian wrote:
> Tom Lane wrote:
> > There isn't any fundamental reason why we cannot have a pg_upgrade
> > utility; claiming that there is something wrong with how we handle
> > catalog changes misses the point. The point is that *someone would
> > have to do the work*. Unless someone wants to step up and volunteer,
> > there's little value in discussing it.
> pg_upgrade does work, assuming there are no changes to the index or heap
> file formats. (However, I now need to update it for schemas.) However,
> the last time I worked on it for 7.2, no one was really interested in
> testing it, so it never got done. In fact, there was a bug in the
> handling of clog or wal files, but I didn't find out about it until long
> after 7.2 because no one was using it.
> Is pg_upgrade too hard to run? Is no one really interested in it?

It has been considered 'experimental' in the past, Bruce. It needs more
credibility from the development group, as in 'We recommend you try to
use pg_upgrade (after making a backup), then attempt to do a
dump/restore if pg_upgrade doesn't work' (and pg_upgrade needs to be
more robust in its failure modes).

I am very interested in pg_upgrade, as I believe I mentioned the last go
through this topic. But it's the 'red-headed stepchild' utility here.
(I'm red-headed, my mother's red-headed, so no slight meant to those of
fiery follicles.) But it's also all or nothing -- you go the whole way
through.

It's again our tremendous dependence upon the contents of the system
catalogs that does us in. That is one of our greatest strengths, until
you have to upgrade for some reason. Then it becomes our biggest
liability.

And unlike Tom I think it is worthwhile to discuss it periodically, to
remind the group as a whole (whose composition and membership changes
frequently) that there's a problem waiting to be solved.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
OK, taking up the pg_upgrade banner, I think there are two things
missing from the current code:

	1) schema awareness -- easily fixed with some code
	2) need to create clog files to match the incremented xid

I can do 1, and I think Tom can help me with 2. Then folks can test it
and see how it works. It is that last part that made me stop around 7.2,
and 7.3 had a heap format change that guaranteed it wouldn't work.

Also, I think we make index format changes more frequently than Tom
recollects. Tom?

---------------------------------------------------------------------------

Lamar Owen wrote:
> On Friday 03 January 2003 18:31, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > There isn't any fundamental reason why we cannot have a pg_upgrade
> > > utility; claiming that there is something wrong with how we handle
> > > catalog changes misses the point. The point is that *someone would
> > > have to do the work*. Unless someone wants to step up and volunteer,
> > > there's little value in discussing it.
>
> > pg_upgrade does work, assuming there are no changes to the index or heap
> > file formats. (However, I now need to update it for schemas.) However,
> > the last time I worked on it for 7.2, no one was really interested in
> > testing it, so it never got done. In fact, there was a bug in the
> > handling of clog or wal files, but I didn't find out about it until long
> > after 7.2 because no one was using it.
>
> > Is pg_upgrade too hard to run? Is no one really interested in it?
>
> It has been considered 'experimental' in the past, Bruce. It needs more
> credibility from the development group, as in 'We recommend you try to use
> pg_upgrade (after making a backup), then attempt to do a dump/restore if
> pg_upgrade doesn't work' (and pg_upgrade needs to be more robust in its
> failure modes).
>
> I am very interested in pg_upgrade, as I believe I mentioned the last go
> through this topic. But it's the 'red-headed stepchild' utility here. (I'm
> red-headed, my mother's red-headed, so no slight meant to those of fiery
> follicles.) But it's also all or nothing -- you go the whole way through.
>
> It's again our tremendous dependence upon the contents of the system
> catalogs that does us in. That is one of our greatest strengths, until you
> have to upgrade for some reason. Then it becomes our biggest liability.
>
> And unlike Tom I think it is worthwhile to discuss it periodically, to
> remind the group as a whole (whose composition and membership changes
> frequently) that there's a problem waiting to be solved.
> --
> Lamar Owen
> WGCR Internet Radio
> 1 Peter 4:11

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, taking up the pg_upgrade banner, I think there are two things
> missing from the current code:
> 1) schema awareness -- easily fixed with some code
> 2) need to create clog files to match the incremented xid
> I can do 1, and I think Tom can help me with 2.

I was just now wondering whether we really need to do that at all.
We're already vacuuming the user tables before we bring 'em over. What
if we VACUUM FREEZE them instead? Then there are *no* xids of interest
in the tables being brought over, and no need to screw around with the
xid counter in the new installation. That in turn would mean no need to
mess with its pg_clog files. I think we'd still need to advance the
xlog position past the old installation's xlog end, but we have the tool
for that (pg_resetxlog) already.

> Also, I think we make index format changes more frequently than Tom
> recollects. Tom?

Oh? Name one... not that they'd be a critical problem anyway, as we
could easily reconstruct indexes via REINDEX rather than moving them
over, any time we made such a change.

			regards, tom lane
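Spelled out as shell steps, the VACUUM FREEZE idea might look like the sketch below; the paths, the per-database loop, and the pg_resetxlog option spelling are illustrative assumptions, not a tested recipe:

	# freeze every database so no interesting xids remain in user tables
	for db in $(psql -At -c "select datname from pg_database" template1); do
	    psql -c "VACUUM FREEZE" "$db"
	done
	pg_ctl -D "$OLDDATA" stop
	# ... move the user-table files into the new cluster here ...
	# then push the new cluster's xlog position past the old end
	pg_resetxlog -l <old-xlog-end> "$NEWDATA"   # option spelling varies by version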
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Is pg_upgrade too hard to run?

It is hard and dangerous to run (IMHO). One thing I would like to see
that would make it much safer to run is to recast it to operate through
standalone backends. That is, the idea is

	bring down old postmaster
	install new version
	run new version's pg_upgrade -- all work done without a postmaster
	start new postmaster

It's much harder to get this wrong than it is to mess up with the
current situation (where pg_upgrade talks to live postmasters). There
isn't any simple way to lock *everyone* out of the DB and still allow
pg_upgrade to connect via the postmaster, and even if there were, the
DBA could too easily forget to do it.

This would require a nontrivial amount of work (notably, we'd have to be
able to get pg_dump to run against a standalone backend) but I don't
think I'd trust pg_upgrade as a production-grade tool until its
invocation method looks like the above.

			regards, tom lane
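As a shell sketch, that flow would look roughly like the following; the pg_upgrade invocation shown is hypothetical, since the current script actually talks to a live postmaster:

	pg_ctl -D /usr/local/pgsql/data stop     # bring down old postmaster
	# install the new version (rpm -U, make install, ...)
	pg_upgrade -D /usr/local/pgsql/data      # hypothetical: all work via standalone backends
	pg_ctl -D /usr/local/pgsql/data start    # start new postmaster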
On Sat, 2003-01-04 at 02:17, Tom Lane wrote:
> There isn't any simple way to lock *everyone* out of the DB and still
> allow pg_upgrade to connect via the postmaster, and even if there were,
> the DBA could too easily forget to do it.

I tackled this issue in the Debian upgrade scripts. I close the running
postmaster and open a new postmaster using a different port, so that
normal connection attempts will fail because there is no postmaster
running on the normal port.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
========================================
"But because of his great love for us, God, who is rich in mercy, made
us alive with Christ even when we were dead in transgressions - it is
by grace you have been saved."       Ephesians 2:4,5
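A minimal sketch of Oliver's side-port trick, with made-up paths and port number:

	pg_ctl -D "$PGDATA" stop              # stop the postmaster on the normal port
	postmaster -D "$PGDATA" -p 5431 &     # restart on a side port; clients aimed at 5432 can't connect
	pg_dumpall -p 5431 > upgrade.out      # the upgrade scripts get the side port to themselves
	# ... perform the upgrade, then restart on the normal port ...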
On Fri, 3 Jan 2003, Bruce Momjian wrote:
> Is pg_upgrade too hard to run? Is no one really interested in it?

All of my boxes are still on 7.2.3. Does that represent a viable test
base?
> pg_upgrade does work, assuming there are no changes to the index or heap
> file formats. (However, I now need to update it for schemas.) However,
> the last time I worked on it for 7.2, no one was really interested in
> testing it, so it never got done. In fact, there was a bug in the
> handling of clog or wal files, but I didn't find out about it until long
> after 7.2 because no one was using it.
>
> Is pg_upgrade too hard to run? Is no one really interested in it?

I'm interested. I might look into it a bit more this release cycle as it
seems that I don't really have the time for major features like I did
last cycle. Still looking for something straightforward to work on...

Chris
Oliver Elphick <olly@lfix.co.uk> writes:
> On Sat, 2003-01-04 at 02:17, Tom Lane wrote:
>> There isn't any simple way to lock *everyone* out of the DB and still
>> allow pg_upgrade to connect via the postmaster, and even if there were,
>> the DBA could too easily forget to do it.

> I tackled this issue in the Debian upgrade scripts.
> I close the running postmaster and open a new postmaster using a
> different port, so that normal connection attempts will fail because
> there is no postmaster running on the normal port.

That's a good kluge, but still a kluge: it doesn't completely guarantee
that no one else connects while pg_upgrade is trying to do its thing.

I am also concerned about the consequences of automatic background
activities. Even the periodic auto-CHECKPOINT done by current code is
not obviously safe to run behind pg_upgrade's back (it does make WAL
entries). And the auto-VACUUM that we are currently thinking of is even
less obviously safe. I think that someday, running pg_upgrade
standalone will become *necessary*, not just a good safety feature.

			regards, tom lane
On Sat, 2003-01-04 at 09:53, Tom Lane wrote:
> Oliver Elphick <olly@lfix.co.uk> writes:
> > On Sat, 2003-01-04 at 02:17, Tom Lane wrote:
> >> There isn't any simple way to lock *everyone* out of the DB and still
> >> allow pg_upgrade to connect via the postmaster, and even if there were,
> >> the DBA could too easily forget to do it.
>
> > I tackled this issue in the Debian upgrade scripts.
>
> > I close the running postmaster and open a new postmaster using a
> > different port, so that normal connection attempts will fail because
> > there is no postmaster running on the normal port.
>
> That's a good kluge, but still a kluge: it doesn't completely guarantee
> that no one else connects while pg_upgrade is trying to do its thing.
>
> I am also concerned about the consequences of automatic background
> activities. Even the periodic auto-CHECKPOINT done by current code
> is not obviously safe to run behind pg_upgrade's back (it does make
> WAL entries). And the auto-VACUUM that we are currently thinking of
> is even less obviously safe. I think that someday, running pg_upgrade
> standalone will become *necessary*, not just a good safety feature.
>
> 			regards, tom lane

I thought there was talk of adding a "single user"/admin-only mode. That
is, where only the administrator can connect to the database.

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
Tom Lane writes:
> This would require a nontrivial amount of work (notably, we'd have to
> be able to get pg_dump to run against a standalone backend) but I don't
> think I'd trust pg_upgrade as a production-grade tool until its
> invocation method looks like the above.

I would recommend requiring users to do the schema dump before upgrading
the binaries, so they'd do

	pg_dumpall -s > schemadump
	pg_ctl stop
	[upgrade binaries]
	pg_upgrade --option=schemadump
	pg_ctl start

--
Peter Eisentraut   peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> This would require a nontrivial amount of work (notably, we'd have to
>> be able to get pg_dump to run against a standalone backend) but I don't
>> think I'd trust pg_upgrade as a production-grade tool until its
>> invocation method looks like the above.

> I would recommend requiring users to do the schema dump before upgrading
> the binaries, so they'd do

Hm. I think we'd need to have the new version's pg_upgrade (we cannot
be sure that an older version would be aware of what information must be
dumped), and the newer pg_dump is also likely to be a better bet than
the old (we've seen several times in the past that this allows us to get
around problems in a prior version of pg_dump). But we'd need access to
the old version's postmaster/postgres executable, since by assumption
the newer one will not run in the old data directory.

I recall Lamar complaining often about the need for multiple executable
versions to be available simultaneously. Can we perhaps alter the
packaging to make this easier? I'm envisioning for example installing
the postmaster/postgres executable as "postgres.7.5", with symlinks from
"postmaster" and "postgres"; then we need not overwrite "postgres.7.4"
and that could be invoked by pg_upgrade.

(I'm just brainstorming here, this isn't necessarily a serious proposal.
Better ideas anyone?)

			regards, tom lane
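To make the brainstorm concrete, the packaging might lay things out like this (names are illustrative only, per Tom's caveat that this isn't a serious proposal):

	install -m 755 postgres /usr/bin/postgres.7.5
	ln -sf postgres.7.5 /usr/bin/postgres
	ln -sf postgres.7.5 /usr/bin/postmaster
	# /usr/bin/postgres.7.4 from the previous package survives the upgrade,
	# so pg_upgrade can still run the old executable against the old data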
Tom Lane wrote:
> Oliver Elphick <olly@lfix.co.uk> writes:
> > On Sat, 2003-01-04 at 02:17, Tom Lane wrote:
> >> There isn't any simple way to lock *everyone* out of the DB and still
> >> allow pg_upgrade to connect via the postmaster, and even if there were,
> >> the DBA could too easily forget to do it.
>
> > I tackled this issue in the Debian upgrade scripts.
>
> > I close the running postmaster and open a new postmaster using a
> > different port, so that normal connection attempts will fail because
> > there is no postmaster running on the normal port.
>
> That's a good kluge, but still a kluge: it doesn't completely guarantee
> that no one else connects while pg_upgrade is trying to do its thing.

I was thinking about using GUC:

	#max_connections = 32
	#superuser_reserved_connections = 2

Set both of those to 1, and you lock out everyone but the super-user.
In fact, we can specify those on postmaster start with
-c max_connections=1, etc.

> I am also concerned about the consequences of automatic background
> activities. Even the periodic auto-CHECKPOINT done by current code
> is not obviously safe to run behind pg_upgrade's back (it does make
> WAL entries). And the auto-VACUUM that we are currently thinking of
> is even less obviously safe. I think that someday, running pg_upgrade
> standalone will become *necessary*, not just a good safety feature.

Yes, certainly we are in major hack mode with pg_upgrade.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> That's a good kluge, but still a kluge: it doesn't completely guarantee
>> that no one else connects while pg_upgrade is trying to do its thing.

> I was thinking about using GUC:
> #max_connections = 32
> #superuser_reserved_connections = 2
> Set both of those to 1, and you lock out everyone but the super-user.

You're missing the point: I don't want to lock out everyone but the
super-user, I want to lock out everyone, period. Superusers are just as
likely to screw up pg_upgrade as anyone else.

BTW:

	$ postmaster -N 1 -c superuser_reserved_connections=1
	postmaster: superuser_reserved_connections must be less than max_connections.
	$

			regards, tom lane
On Sat, 2003-01-04 at 22:37, Tom Lane wrote:
> You're missing the point: I don't want to lock out everyone but the
> super-user, I want to lock out everyone, period. Superusers are just
> as likely to screw up pg_upgrade as anyone else.
>
> BTW:
>
> $ postmaster -N 1 -c superuser_reserved_connections=1
> postmaster: superuser_reserved_connections must be less than max_connections.
> $

Well, first, let me say that the above just seems wrong. I can't think
of any valid reason why reserved shouldn't be allowed to equal max.

I also assumed that pg_upgrade would be attempting to connect as the
superuser. Therefore, if you only allow a single connection from the
superuser and pg_upgrade is using it, that would seem fairly hard to
mess things up. On top of that, that's also the risk of someone being a
superuser. They will ALWAYS have the power to hose things. Period. As
such, I don't consider that to be a valid argument.

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
Bruce Momjian wrote:
> pg_upgrade does work, assuming there are no changes to the index or heap
> file formats. (However, I now need to update it for schemas.) However,
> the last time I worked on it for 7.2, no one was really interested in
> testing it, so it never got done. In fact, there was a bug in the
> handling of clog or wal files, but I didn't find out about it until long
> after 7.2 because no one was using it.
>
> Is pg_upgrade too hard to run? Is no one really interested in it?

As far as I've seen, it is a cool idea, but I can't trust it. I have the
USA TIGER census data in a database; it is over 60G with indexes, 30G+
of just data. Do you know how long that will take to dump and restore?
Making one index on some of the tables takes 20 minutes.

IMHO:

(1) The PostgreSQL core development team has to commit to an in-place
upgrade. I think the in-place upgrade strategy is very important, and it
will take an effort and commitment from the core development team. I
doubt seriously it can be done in a robust and safe way if the feature
is not a stated design goal.

(2) Upgrade HAS HAS HAS to be foolproof. No one is going to use it if
you say, back up your data just in case. It should be as trustworthy as
PostgreSQL itself. If it can't be that, it is not a valid tool. Anything
less will not be used by professionals and wouldn't be worth the effort.

(3) It should be able to span more than one version. If I upgrade from
two versions back, it should work. It should not balk.

The above is simply my opinion, and probably not possible with previous
versions, but moving forward, it should be, if it is a priority. If it
is not a priority, then it is not worth doing. Again, just my opinion.
> -----Original Message-----
> From: mlw [mailto:pgsql@mohawksoft.com]
> Sent: 05 January 2003 16:36
> To: Bruce Momjian
> Cc: Tom Lane; Hannu Krosing; Lamar Owen; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Upgrading rant.
>
> (2) Upgrade HAS HAS HAS to be foolproof.

Agreed.

> No one is going to use it if you say, back up your data just in case.

I think we would be irresponsible to _not_ say that, no matter how good
we thought the code was.

> It should be as trustworthy as PostgreSQL itself. If it can't be that,
> it is not a valid tool. Anything less will not be used by professionals
> and wouldn't be worth the effort.

Agreed again. But I still back up my PostgreSQL boxes, and would advise
anyone else to as well. Frankly I'm glad I do, especially having had yet
another Fujitsu disk die the other day.

Regards, Dave.
mlw wrote:
> I have the USA TIGER census data in a database; it is over 60G with
> indexes, 30G+ of just data. Do you know how long that will take to dump
> and restore? Making one index on some of the tables takes 20 minutes.

Oh, come on. How many tigers are there in the USA? Certainly not 30G+
worth. ;-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Greg Copeland wrote:
> On Sat, 2003-01-04 at 22:37, Tom Lane wrote:
> > You're missing the point: I don't want to lock out everyone but the
> > super-user, I want to lock out everyone, period. Superusers are just
> > as likely to screw up pg_upgrade as anyone else.
> >
> > BTW:
> >
> > $ postmaster -N 1 -c superuser_reserved_connections=1
> > postmaster: superuser_reserved_connections must be less than max_connections.
> > $
>
> Well, first, let me say that the above just seems wrong. I can't think
> of any valid reason why reserved shouldn't be allowed to equal max.
>
> I also assumed that pg_upgrade would be attempting to connect as the
> superuser. Therefore, if you only allow a single connection from the
> superuser and pg_upgrade is using it, that would seem fairly hard to
> mess things up. On top of that, that's also the risk of someone being a
> superuser. They will ALWAYS have the power to hose things. Period. As
> such, I don't consider that to be a valid argument.

That was my feeling too. If you can't trust the other admins, it is
hard for us to trust them either.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, taking up the pg_upgrade banner, I think there are two things
> > missing from the current code:
> >
> > 1) schema awareness -- easily fixed with some code
> > 2) need to create clog files to match the incremented xid
> >
> > I can do 1, and I think Tom can help me with 2.
>
> I was just now wondering whether we really need to do that at all.
> We're already vacuuming the user tables before we bring 'em over.
> What if we VACUUM FREEZE them instead? Then there are *no* xids of
> interest in the tables being brought over, and no need to screw around
> with the xid counter in the new installation. That in turn would mean
> no need to mess with its pg_clog files. I think we'd still need to
> advance the xlog position past the old installation's xlog end, but we
> have the tool for that (pg_resetxlog) already.

VACUUM FREEZE. Interesting idea. Did we have that in 7.2? I never
thought of using it. Good idea.

Why do we have to do WAL? Do we have WAL log IDs in the tuple headers?
I don't remember. I don't see them in a quick look.

> > Also, I think we make index format changes more frequently than Tom
> > recollects. Tom?
>
> Oh? Name one... not that they'd be a critical problem anyway, as we
> could easily reconstruct indexes via REINDEX rather than moving them
> over, any time we made such a change.

I remember fixing index problems, and asking folks to rebuild indexes
_after_ we fixed them, so I thought they had a new format. I guess they
were just broken indexes that had to be rebuilt to get the fix.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> VACUUM FREEZE. Interesting idea. Did we have that in 7.2? I never
> thought of using it. Good idea.

IIRC, it was new in 7.2 --- but pg_upgrade goes back further than that.
I am not sure if this idea just escaped us before, or if there's a hole
in it. We need to study it carefully.

> Why do we have to do WAL? Do we have WAL log IDs in the tuple headers?

Not in tuple headers, but in page headers. The buffer manager will
choke if the LSN fields point past the end of WAL.

			regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> ... On top of that, that's also the risk of someone being a
>> superuser. They will ALWAYS have the power to hose things. Period. As
>> such, I don't consider that to be a valid argument.

> That was my feeling too. If you can't trust the other admins, it is
> hard for us to trust them either.

Sigh. It's not about trust: it's about whether pg_upgrade can enforce
or at least check its assumptions. I don't feel that it's a
production-grade tool as long as it has to cross its fingers that the
DBA made no mistakes.

Also, if the previous example had no impact on you, try this one:

	$ postmaster -N 1 -c superuser_reserved_connections=0 &
	$ pg_dumpall
	pg_dump: [archiver (db)] connection to database "regression" failed:
	FATAL:  Sorry, too many clients already
	pg_dumpall: pg_dump failed on regression, exiting
	$

-N 1 *will* cause problems.

			regards, tom lane
On Saturday 04 January 2003 21:12, Peter Eisentraut wrote:
> I would recommend requiring users to do the schema dump before upgrading
> the binaries, so they'd do

Nice theory. Won't work in RPM practice. I can't require the user to do
_anything_. Due to the rules of RPMs, I can't even ask the user to do
anything. All I have been able to do is prevent the upgrade from
happening -- but that has its cost, too, as then older library versions
have to be held back just for PostgreSQL, taking up the hapless user's
disk space.

While I _can_ version the binaries as Tom mentioned (and as I had
thought about before), in an OS upgrade environment (where my RPMset
lives, more than in command-line rpm invocation) I can't force the older
set to be kept in a runnable form. It is very possible that the
supporting libc shared libraries will be removed by the OS upgrade --
the old binaries may not even run when it is critical that they do run.
In-place post-binary-upgrade migration is the only way this is going to
work properly in the environment I have to live with. That's why
dump/restore is such a pain, as Tom remembers.

If I can get older versions building again on newer systems, that will
help buy some breathing room from my point of view.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> It is very possible that the supporting libc shared libraries
> will be removed by the OS upgrade -- the old binaries may not even run when
> it is critical that they do run.

Urgh, that's a mess.

> If I can get older versions building again on newer systems, that will help
> buy some breathing room from my point of view.

Worst-case, we could include a back-rev postmaster binary in new
distributions. However, that still requires us to keep the back revs
buildable on newer OS releases, which evidently is less than trivial in
the Linux world :-(

			regards, tom lane
On Sunday 05 January 2003 23:10, Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > It is very possible that the supporting libc shared libraries
> > will be removed by the OS upgrade -- the old binaries may not even run
> > when it is critical that they do run.
> Urgh, that's a mess.

Yah, it is a mess. I've been playing in that mudpit for three years....

> > If I can get older versions building again on newer systems, that will
> > help buy some breathing room from my point of view.
> Worst-case, we could include a back-rev postmaster binary in new
> distributions.

While I have in the past suggested this, I'm not happy with the idea,
even though it may be the best short-term solution.

> However, that still requires us to keep the back revs
> buildable on newer OS releases, which evidently is less than trivial
> in the Linux world :-(

That is the wonderful result of having so many rapidly developing
applications, including our own. We have, I'm sure, given a few ulcers
to Linux distributors too.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
In article <200301032331.h03NVfc22523@candle.pha.pa.us>,
pgman@candle.pha.pa.us (Bruce Momjian) wrote:
> Is pg_upgrade too hard to run? Is no one really interested in it?

As an end-user, I'm very interested in pg_upgrade, but I think it's kind
of a chicken-and-egg problem.

Without much of a guarantee that it's fail-safe, I'm not inclined to
test it. As kind of a hamstrung DBA, I don't have a lot of time to do
mock upgrading of my postgres installations. I would imagine there are
plenty of DBAs in the user community in this position: wanting a good
upgrade utility but not having the time/inclination to do testing on
what is available.

For commercial products, QA is usually internal. One of the slight
drawbacks to the free software community is that more of the burden of
QA is placed on the at-large community of users.

All this being said, if an outline were posted of what steps to take to
test pg_upgrade and ensure that everything were working properly, I'd be
more inclined to take the plunge. As it is, I'd have to do a fair amount
of research on my own to make sure nothing broke. And that is the step I
don't have time for.

-tfo
I think you are on to something here. Clearly dump/reload works, and
testing pg_upgrade is time-consuming, so people aren't as inclined to
jump into testing. It isn't quite like testing a bugfix or new feature.

---------------------------------------------------------------------------

Thomas O'Connell wrote:
> In article <200301032331.h03NVfc22523@candle.pha.pa.us>,
> pgman@candle.pha.pa.us (Bruce Momjian) wrote:
>
> > Is pg_upgrade too hard to run? Is no one really interested in it?
>
> As an end-user, I'm very interested in pg_upgrade, but I think it's kind
> of a chicken-and-egg problem.
>
> Without much of a guarantee that it's fail-safe, I'm not inclined to
> test it. As kind of a hamstrung DBA, I don't have a lot of time to do
> mock upgrading of my postgres installations. I would imagine there are
> plenty of DBAs in the user community in this position: wanting a good
> upgrade utility but not having the time/inclination to do testing on
> what is available.
>
> For commercial products, QA is usually internal. One of the slight
> drawbacks to the free software community is that more of the burden of
> QA is placed on the at-large community of users.
>
> All this being said, if an outline were posted of what steps to take to
> test pg_upgrade and ensure that everything were working properly, I'd be
> more inclined to take the plunge. As it is, I'd have to do a fair amount
> of research on my own to make sure nothing broke. And that is the step I
> don't have time for.
>
> -tfo

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Exactly. I've got something that works and is, in fact, the recommended
method for upgrading, currently. For me to switch, I'd need something in
which the developers were confident enough to recommend. And even to
test, I'd need something more than what is available right now.

-tfo

In article <200301061836.h06Ia4m27264@candle.pha.pa.us>,
pgman@candle.pha.pa.us (Bruce Momjian) wrote:
> I think you are on to something here. Clearly dump/reload works, and
> testing pg_upgrade is time-consuming, so people aren't as inclined to
> jump into testing. It isn't quite like testing a bugfix or new feature.
On Fri, 03 Jan 2003 15:37:56 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>The system tables are not the problem. [...]
>
>Changes in the on-disk representation of user tables would be harder to
>deal with, but they are also much rarer (AFAIR we've only done that
>twice: WAL required additions to page and tuple headers, and then there
>were Manfred's space-saving changes in 7.3).

So I'm the bad guy? ;-)

AFAICS handling the page and tuple format changes doesn't need much more than what I hacked together yesterday afternoon:

#include <access/htup.h>

typedef struct HeapTupleHeader72Data
{
    Oid             t_oid;       /* OID of this tuple -- 4 bytes */
    CommandId       t_cmin;      /* insert CID stamp -- 4 bytes each */
    CommandId       t_cmax;      /* delete CommandId stamp */
    TransactionId   t_xmin;      /* insert XID stamp -- 4 bytes each */
    TransactionId   t_xmax;      /* delete XID stamp */
    ItemPointerData t_ctid;      /* current TID of this or newer tuple */
    int16           t_natts;     /* number of attributes */
    uint16          t_infomask;  /* various infos */
    uint8           t_hoff;      /* sizeof() tuple header */
    /* ^ - 31 bytes - ^ */
    bits8           t_bits[1];   /* bit map of NULLs */
} HeapTupleHeader72Data;

typedef HeapTupleHeader72Data *HeapTupleHeader72;

/*
** Convert a pre-7.3 heap tuple header to 7.3 format.
**
** On entry ht points to a heap tuple header in 7.2 format,
** which will be converted to the new format in place.
** If compact is true, the size of the heap tuple header
** (t_hoff) is reduced, otherwise enough padding bytes are
** inserted to keep the old length.
**
** The return value is the new size.
*/
Size
HeapTupleHeader_To73Format(HeapTupleHeader ht, bool compact)
{
    HeapTupleHeaderData newdata;
    Oid                 oid;
    HeapTupleHeader72   ht72;
    int                 len;

    ht72 = (HeapTupleHeader72) ht;
    oid = ht72->t_oid;
    MemSet(&newdata, 0, sizeof(HeapTupleHeaderData));

    /* copy fixed fields */
    ItemPointerCopy(&ht72->t_ctid, &newdata.t_ctid);
    newdata.t_natts = ht72->t_natts;
    newdata.t_infomask = ht72->t_infomask;

    HeapTupleHeaderSetXmin(&newdata, ht72->t_xmin);
    if (newdata.t_infomask & HEAP_XMAX_INVALID)
    {
        HeapTupleHeaderSetCmin(&newdata, ht72->t_cmin);
    }/*if*/
    else
    {
        HeapTupleHeaderSetXmax(&newdata, ht72->t_xmax);
    }/*else*/

    if (newdata.t_infomask & HEAP_MOVED)
    {
        HeapTupleHeaderSetXvac(&newdata, ht72->t_cmin);
    }/*if*/
    else
    {
        HeapTupleHeaderSetCmax(&newdata, ht72->t_cmax);
    }/*else*/

    /* move new structure into original position */
    len = offsetof(HeapTupleHeaderData, t_bits);
    memcpy(ht, &newdata, len);

    /* copy bitmap (if there is one) */
    if (ht->t_infomask & HEAP_HASNULL)
    {
        int     bitmaplen = BITMAPLEN(ht->t_natts);
        int     off = offsetof(HeapTupleHeader72Data, t_bits);
        char   *p = (char *) ht;
        int     i;

        Assert(len < off);
        for (i = 0; i < bitmaplen; ++i)
        {
            p[len + i] = p[off + i];
        }/*for*/
        len += bitmaplen;
    }/*if*/

    /* pad rest with 0 */
    Assert(len < ht->t_hoff);
    memset((char *) ht + len, 0, ht->t_hoff - len);

    /* change length, if requested */
    if (compact)
    {
        if (oid != 0)
        {
            len += sizeof(Oid);
        }/*if*/
        ht->t_hoff = MAXALIGN(len);
    }/*if*/

    /* copy oid (if there is one) */
    if (oid != 0)
    {
        ht->t_infomask |= HEAP_HASOID;
        HeapTupleHeaderSetOid(ht, oid);
    }/*if*/

    return ht->t_hoff;
}

#include <storage/bufpage.h>
#include <access/htup.h>

/*
** Convert a pre-7.3 heap page to 7.3 format,
** or leave the page alone, if it is already in 7.3 format.
**
** The page is converted in place.
**
** We should have exclusive access to the page, either per
** LockBufferForCleanup() or because we are running in a standalone
** tool.
*/
void
HeapPage_To73Format(Page page, bool compact)
{
    PageHeader  phdr = (PageHeader) page;
    int         version = PageGetPageLayoutVersion(page);
    Size        size = PageGetPageSize(page);
    int         maxoff = PageGetMaxOffsetNumber(page);
    int         i;

    if (version == PG_PAGE_LAYOUT_VERSION)
    {
        /* already converted */
        return;
    }/*if*/
    Assert(version == 0);

    for (i = 1; i <= maxoff; ++i)
    {
        ItemId          itid = PageGetItemId(page, i);
        HeapTupleHeader ht;
        Size            oldsz;
        Size            newsz;

        /* skip line pointers that are not in use */
        if (!ItemIdIsUsed(itid))
            continue;

        ht = PageGetItem(page, itid);
        oldsz = ht->t_hoff;
        newsz = HeapTupleHeader_To73Format(ht, compact);

        if (newsz < oldsz)
        {
            int         diff = oldsz - newsz;
            ItemOffset  off = ItemIdGetOffset(itid);
            char       *addr;
            int         lng;
            int         j;

            /* move tuple header to the right */
            addr = (char *) ht;
            memmove(addr + diff, addr, newsz);
            itid->lp_off += diff;
            itid->lp_len -= diff;

            /*
            ** Move all tuples that lie to the left of our tuple header.
            ** (Shamelessly copied from PageIndexTupleDelete()).
            */
            addr = (char *) page + phdr->pd_upper;
            lng = (int) (off - phdr->pd_upper);
            if (lng > 0)
                memmove(addr + diff, addr, lng);
            memset(addr, 0, diff);

            /*
            ** Adjust upper free space boundary pointer,
            ** lower is not affected.
            */
            phdr->pd_upper += diff;

            /* Adjust linp entries. */
            for (j = 1; j <= maxoff; ++j)
            {
                ItemId ii = PageGetItemId(page, j);

                if (ii->lp_off < off)
                    ii->lp_off += diff;
            }/*for*/
        }/*if*/
        else
            Assert(newsz == oldsz);
    }/*for*/

    PageSetPageSizeAndVersion(page, size, PG_PAGE_LAYOUT_VERSION);
}

/*
** Convert a pre-7.3 page to 7.3 format,
** or leave the page alone, if it is already in 7.3 format.
**
** The page is converted in place.
**
** We should have exclusive access to the page, either per
** LockBufferForCleanup() or because we are running in a standalone
** tool.
*/
void
Page_To73Format(Page page)
{
    int     version = PageGetPageLayoutVersion(page);
    Size    size = PageGetPageSize(page);

    if (version == PG_PAGE_LAYOUT_VERSION)
    {
        /* already converted */
        return;
    }/*if*/
    Assert(version == 0);

    if (PageGetSpecialSize(page) == 0)
    {
        /*
        ** Heap page.
        ** XXX Sure?
        ** XXX Is there a better way to tell?
        */
        HeapPage_To73Format(page, true);
    }/*if*/
    else
    {
        /*
        ** Not a heap page: no format change, just adjust version
        */
        PageSetPageSizeAndVersion(page, size, PG_PAGE_LAYOUT_VERSION);
    }/*else*/
}

This should handle all format changes I'm aware of:
 . bitmap length
 . overlaying fields
 . optional oid
 . page format version

Am I missing something?

The above code is completely untested -- I've not even run it through a compiler; please consider it a basis for discussion. If there is agreement that we want a 7.2 -> 7.3.x pg_upgrade, I'll put more work into it.

What's missing is mainly a call to Page_To73Format() somewhere. I can think of
 . an input-file-to-output-file converter run by pg_upgrade instead of copying/moving the files (a sketch of this option follows below)
 . an in-place converter run by pg_upgrade after copying/moving the files
 . converting each page during normal operation immediately after it is fetched from disk
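To make the first of those options concrete, here is an equally untested sketch of the file-to-file converter loop. convert_relation_file is a hypothetical name; the sketch assumes it is compiled against the backend headers together with the functions above, and it glosses over error reporting, a partial trailing read, and segmented (1 GB) relation files:

#include <postgres.h>
#include <storage/bufpage.h>

#include <stdio.h>

/* from the code above */
extern void Page_To73Format(Page page);

/*
** Read a pre-7.3 relation file page by page, convert each page
** in place, and write the result to a new file.
** Returns 0 on success, -1 on any error.
*/
int
convert_relation_file(const char *oldpath, const char *newpath)
{
    char    buf[BLCKSZ];
    FILE   *in;
    FILE   *out;
    int     rc = 0;

    if ((in = fopen(oldpath, "rb")) == NULL)
        return -1;
    if ((out = fopen(newpath, "wb")) == NULL)
    {
        fclose(in);
        return -1;
    }

    while (fread(buf, 1, BLCKSZ, in) == BLCKSZ)
    {
        /* convert the page in the local buffer, then write it out */
        Page_To73Format((Page) buf);
        if (fwrite(buf, 1, BLCKSZ, out) != BLCKSZ)
        {
            rc = -1;
            break;
        }
    }

    fclose(in);
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

pg_upgrade would then call something like this once per relation file instead of copying the file, which also keeps the old data directory intact as a fallback if the conversion fails halfway.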
Servus
 Manfred
On Tue, 07 Jan 2003 11:18:15 +0100, I wrote:
>what I hacked together yesterday afternoon:
[included it twice]

Sorry!

Servus
 Manfred