Thread: (A) native Windows port
Hackers,

as some of you figured already, Katie Ward and I are working fulltime on
PostgreSQL and are actually doing a native Win32 port. This port is not based
on Cygwin, Apache or any other compatibility library but uses 100% native
Windows functionality only.

We already have it far enough to create and drop databases, tables and of
course do the usual stuff (like INSERT, UPDATE, DELETE and SELECT). But there
is still plenty of work, so don't worry, all of you will have a chance to
leave your finger- and/or footprints.

What I want to start today is a discussion about project coordination and
code management. Our proposal is to provide a diff first. I have no clue when
exactly this will happen, but assuming the usual PostgreSQL schedule
behaviour I would say it's measured in weeks :-). A given is that we will
contribute this work under the BSD license.

We will upload the diff to developer.postgresql.org and post a link together
with build instructions to hackers. After some discussion we can create a CVS
branch and apply that patch there. Everyone who wants to contribute to the
Win32 port can then work in that branch. Katie and I will take care that
changes in trunk periodically get merged into the Win32 branch.

This model guarantees that we don't change mainstream PostgreSQL until the
developer community decides to follow this road and choose this
implementation as the PostgreSQL Win32 port. At that point we can merge the
Win32 port into the trunk and ship it with the next release.

As for project coordination, I am willing to set up and maintain a page
similar to the (horribly outdated) ones that I did for Toast and RI.
Summarizing project status, pointing to resources, instructions, maybe a
roadmap, TODO, you name it.

Comments? Suggestions?

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
> -----Original Message-----
> From: Jan Wieck [mailto:JanWieck@Yahoo.com]
> Sent: 26 June 2002 15:45
> To: HACKERS
> Subject: [HACKERS] (A) native Windows port
>
> As for project coordination, I am willing to setup and
> maintain a page similar to the (horribly outdated) ones that
> I did for Toast and RI. Summarizing project status, pointing
> to resources, instructions, maybe a roadmap, TODO, you name it.
>
> Comments? Suggestions?

Great, can't wait to see your work. I can probably sort out an installer
shortly after you have the first code available - that way we can work out
kinks in a binary distribution, as well as hopefully get some more testers
who may not have compilers etc on their Windows boxes.

Let me know if you'd like me to work on that...

Regards, Dave.
Jan Wieck wrote:
> As for project coordination, I am willing to setup and maintain a page
> similar to the (horribly outdated) ones that I did for Toast and RI.
> Summarizing project status, pointing to resources, instructions, maybe a
> roadmap, TODO, you name it.

Great. Please see the roadmap in TODO.detail/win32 for a list of items and
possible approaches.

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
> As for project coordination, I am willing to setup and maintain a page
> similar to the (horribly outdated) ones that I did for Toast and RI.
> Summarizing project status, pointing to resources, instructions, maybe a
> roadmap, TODO, you name it.

I am willing to supply a complete, friendly, powerful and pretty installer
program, based on NSIS:

http://www.winamp.com/nsdn/nsis/index.jhtml

I suggest that pgAdmin is included in the install process. Imagine it - a
win32 person downloads a single .exe, with contents bzip2'd. They run the
installer, it asks them to agree to the license, shows a splash screen, asks
them where to install it, gets them to supply an installation password and
installs pgAdmin. It could set up a folder in their start menu with
start/stop, edit configs, uninstall and run pgAdmin.

It would all work out of the box and would do wonderful things for the
Postgres community.

Chris
On Wednesday 26 June 2002 11:48 pm, Christopher Kings-Lynne wrote:
> I suggest that pgAdmin is included in the install process. Imagine it - a
> win32 person downloads a single .exe, with contents bzip2'd. They run the
> installer, it asks them to agree to license, shows splash screen, asks
> them where to install it, gets them to supply an installation password and
> installs pgadmin. It could set up a folder in their start menu with
> start/stop, edit configs, uninstall and run pgadmin.
>
> It would all work out of the box and would do wonderful things for the
> Postgres community.

I like this idea, but let me just bring one little issue to note: are you
going to handle upgrades, and if so, how? How are you going to do a major
version upgrade?

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
> > It would all work out of the box and would do wonderful things for the
> > Postgres community.
>
> I like this idea, but let me just bring one little issue to note: are you
> going to handle upgrades, and if so, how? How are you going to do a major
> version upgrade?

Well, the easiest way would be to get them to uninstall the old version
first, but I'm sure it can be worked out. Perhaps we shouldn't even
overwrite the old version anyway?

Chris
How does the upgrade work on UNIX? Is there anything available apart from
reading the release notes?

----- Original Message -----
From: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
To: "Lamar Owen" <lamar.owen@wgcr.org>; "Jan Wieck" <JanWieck@Yahoo.com>;
    "HACKERS" <pgsql-hackers@postgresql.org>
Sent: Tuesday, July 02, 2002 12:48 PM
Subject: Re: [HACKERS] (A) native Windows port

> > > It would all work out of the box and would do wonderful things for
> > > the Postgres community.
> >
> > I like this idea, but let me just bring one little issue to note: are
> > you going to handle upgrades, and if so, how? How are you going to do a
> > major version upgrade?
>
> Well, the easiest way would be to get them to uninstall the old version
> first, but I'm sure it can be worked out. Perhaps even we shouldn't
> overwrite the old version anyway?
>
> Chris
Christopher Kings-Lynne wrote:
> > > It would all work out of the box and would do wonderful things for
> > > the Postgres community.
> >
> > I like this idea, but let me just bring one little issue to note: are
> > you going to handle upgrades, and if so, how? How are you going to do a
> > major version upgrade?
>
> Well, the easiest way would be to get them to uninstall the old version
> first, but I'm sure it can be worked out. Perhaps even we shouldn't
> overwrite the old version anyway?

The question is not how to replace some .EXE and .DLL files or modify
something in the registry. The question is what to do with the existing
databases in the case of a catalog version change. You have to dump and
restore.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
On Tuesday 02 July 2002 09:52 am, Jan Wieck wrote:
> Christopher Kings-Lynne wrote:
> > > > It would all work out of the box and would do wonderful things for
> > > > the Postgres community.
> > >
> > > I like this idea, but let me just bring one little issue to note: are
> > > you going to handle upgrades, and if so, how? How are you going to do
> > > a major version upgrade?
> >
> > Well, the easiest way would be to get them to uninstall the old version
> > first, but I'm sure it can be worked out. Perhaps even we shouldn't
> > overwrite the old version anyway?
>
> The question is not how to replace some .EXE and .DLL files or modify
> something in the registry. The question is what to do with the existing
> databases in the case of a catalog version change. You have to dump and
> restore.

Now, riddle me this: we're going to explain the vagaries of
dump/initdb/restore to a typical Windows user, and further explain why the
dump won't necessarily restore because of a bug in the older version's
dump.... The typical Windows user is going to barf when confronted with our
extant 'upgrade' process.

While I really could not care less if PostgreSQL goes to Windows or not, I
am of a mind to support the Win32 effort if it gets an upgrade path done so
that everyone can upgrade sanely. At least the Windows installer can check
for existing database structures and ask what to do -- the RPM install
cannot do this. In fact, the Windows installer *must* check for an existing
database installation, or we're going to get fried by typical Windows users.

And if having a working, usable, Win32 native port gets the subject of good
upgrading higher up the priority list, BY ALL MEANS LET'S SUPPORT WIN32
NATIVELY! :-) (and I despise Win32....)

But it shouldn't be an installer issue -- this is an issue which causes pain
for all of our users, not just Windows or RPM (or Debian) users. Upgrading
(pg_upgrade is a start -- but it's not going to work as written on Windows)
needs to be core functionality. If I can't easily upgrade my database, what
good are new features going to do for me?

Martin O has come up with a 'pg_fsck' utility that, IMHO, holds a great deal
of promise for seamless binary 'in place' upgrading. He has been able to
write code to read multiple versions' database structures -- proving that it
CAN be done. Windows programs such as Lotus Organizer, Microsoft Access,
Lotus Approach, and others, allow you to convert the old to the new as part
of initial startup. This will be a prerequisite for wide acceptance in the
Windows world, methinks.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
> The question is not how to replace some .EXE and .DLL files or modify
> something in the registry. The question is what to do with the existing
> databases in the case of a catalog version change. You have to dump and
> restore.

pg_upgrade?

Otherwise: no upgrades per se, but you can install the new version into a
new directory and then have an automated pg_dump / restore between the old
and the new. This would require a lot of disk space, but I don't see any
other clean way to automate it.
Lamar Owen wrote:
> [...]
>
> And if having a working, usable, Win32 native port gets the subject of
> good upgrading higher up the priority list, BY ALL MEANS LET'S SUPPORT
> WIN32 NATIVELY! :-) (and I despise Win32....)

Hehehe :-)

> [...]
> Martin O has come up with a 'pg_fsck' utility that, IMHO, holds a great
> deal of promise for seamless binary 'in place' upgrading. He has been
> able to write code to read multiple versions' database structures --
> proving that it CAN be done.

Unfortunately it's not the on-disk binary format of files that causes the
big problems. Our dump/initdb/restore sequence is also the solution for
system catalog changes. If we add/remove internal functions, there will be
changes to pg_proc. When the representation of parsetrees changes, there
will be changes to pg_rewrite (dunno how to convert that). Consider adding
another attribute to pg_class. You'd have to add a row in pg_attribute,
possibly (because it likely isn't added at the end) increment the attno for
50% of all pg_attribute entries, and of course insert an attribute in the
middle of all existing pg_class rows ... ewe.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
On Tuesday 02 July 2002 03:14 pm, Jan Wieck wrote:
> Lamar Owen wrote:
> > [...]
> > Martin O has come up with a 'pg_fsck' utility that, IMHO, holds a great
> > deal of promise for seamless binary 'in place' upgrading. He has been
> > able to write code to read multiple versions' database structures --
> > proving that it CAN be done.
>
> Unfortunately it's not the on-disk binary format of files that causes
> the big problems. Our dump/initdb/restore sequence is also the solution
> for system catalog changes.

Hmmm. They get in there via the bki interface, right? Is there an OID issue
with these? Could differential BKI files be possible, with known system
catalog changes that can be applied via a 'patchdb' utility? I know pretty
much how pg_upgrade is doing things now -- and, frankly, it's a little bit
of a kludge.

Yes, I do understand the things a dump/restore does on somewhat of a
detailed level. I know the restore repopulates the entries in the system
catalogs for the restored data, etc, etc.

Currently dump/restore handles the catalog changes. But by what other means
could we upgrade the system catalog in place? Our very extensibility is our
weakness for upgrades. Can it be worked around? Anyone have any ideas?

Improving pg_upgrade may be the ticket -- but if the on-disk binary format
changes (like it has before), then something will have to do the binary
format translation -- something like pg_fsck.

Incidentally, pg_fsck, or a program like it, should be in the core
distribution. Maybe not named pg_fsck, as our database isn't a filesystem,
but pg_dbck, or pg_dbcheck, or pg_dbfix, or similar. Although pg_fsck is
more of a pg_dbdump.

I've seen too many people bitten by upgrades gone awry. The more we can do
in that regard, the better. And the Windows user will likely demand it. I
never thought I'd be grateful for a Win32 native PostgreSQL port... :-)

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
On Thursday 27 June 2002 05:48, Christopher Kings-Lynne wrote:
> I am willing to supply a complete, friendly, powerful and pretty installer
> program, based on NSIS.

Maybe you should contact Dave Page, who wrote pgAdmin2 and the ODBC
installers. Maybe you can both work on the installer.

By the way, when will Dave be added to the main developer list? He wrote 99%
of pgAdmin on his own.

Cheers,
Jean-Michel POURE
On Tue, 2002-07-02 at 21:50, Lamar Owen wrote:
> On Tuesday 02 July 2002 03:14 pm, Jan Wieck wrote:
> > Lamar Owen wrote:
> > > [...]
> > > Martin O has come up with a 'pg_fsck' utility that, IMHO, holds a
> > > great deal of promise for seamless binary 'in place' upgrading. He
> > > has been able to write code to read multiple versions' database
> > > structures -- proving that it CAN be done.
> >
> > Unfortunately it's not the on-disk binary format of files that causes
> > the big problems. Our dump/initdb/restore sequence is also the solution
> > for system catalog changes.
>
> Hmmm. They get in there via the bki interface, right? Is there an OID
> issue with these? Could differential BKI files be possible, with known
> system catalog changes that can be applied via a 'patchdb' utility? I
> know pretty much how pg_upgrade is doing things now -- and, frankly, it's
> a little bit of a kludge.
>
> Yes, I do understand the things a dump/restore does on somewhat of a
> detailed level. I know the restore repopulates the entries in the system
> catalogs for the restored data, etc, etc.
>
> Currently dump/restore handles the catalog changes. But by what other
> means could we upgrade the system catalog in place?
>
> Our very extensibility is our weakness for upgrades. Can it be worked
> around? Anyone have any ideas?

Perhaps we can keep an old postgres binary + old backend around and then use
it in single-user mode to do a pg_dump into our running backend. IIRC Access
does its database upgrade by copying the old database to the new.

Our approach could be like

  $OLD/postgres -D $OLD_DATA <pg_dump_cmds | $NEW/postgres -D NEW_BACKEND

or perhaps, while the old backend is still running:

  pg_dumpall | path_to_new_backend/bin/postgres

I don't think we should assume that we will be able to do an upgrade while
we have less free space than is currently used by the databases (or at least
by the data - indexes can be added later).

Trying to do an in-place upgrade is an interesting CS project, but any
serious DBA will have backups, so they can do

  $ psql < dumpfile

Speeding up COPY FROM could be a good thing (perhaps enabling it to run
without any checks and outside transactions when used for loading dumps).
And home users will have databases small enough that they should have enough
free space to hold both the old and new versions for some time. What we do
need is a more-or-less solid upgrade path using pg_dump.

BTW, how hard would it be to move pg_dump inside the backend (perhaps using
a dynamically loaded function to save space when not used) so that it could
be used like COPY?

  pg> DUMP table [ WITH 'other cmdline options' ] TO stdout ;
  pg> DUMP * [ WITH 'other cmdline options' ] TO stdout ;

----------------
Hannu
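Hannu's two pipelines above can be written out as a shell sketch. All paths,
ports, and install locations below are hypothetical, and since this is only
meant to illustrate the data flow, the script builds and prints the command
lines rather than running them against a real cluster:

```shell
#!/bin/sh
# Sketch of the two upgrade pipelines described above.  Nothing here is
# executed against a live cluster; the pipelines are assembled as strings
# and printed.  All paths and ports are placeholders.
OLD=/usr/local/pgsql-old        # assumed install of the previous release
NEW=/usr/local/pgsql-new        # assumed install of the new release
OLD_DATA=/var/lib/pgsql/data.old
NEW_DATA=/var/lib/pgsql/data.new

# Variant 1: drive the old backend in single-user mode and feed its dump
# output straight into a new single-user backend.
PIPE1="$OLD/bin/postgres -D $OLD_DATA <pg_dump_cmds | $NEW/bin/postgres -D $NEW_DATA"

# Variant 2: dump from the still-running old postmaster (port 5432) into
# a new postmaster brought up on a spare port (5433).
PIPE2="$OLD/bin/pg_dumpall -p 5432 | $NEW/bin/psql -p 5433 template1"

echo "$PIPE1"
echo "$PIPE2"
```

Either way, the old binaries must still be present, which is exactly the
packaging problem raised later in the thread.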
Lamar Owen wrote:
> On Tuesday 02 July 2002 03:14 pm, Jan Wieck wrote:
> > Lamar Owen wrote:
> > > [...]
> > > Martin O has come up with a 'pg_fsck' utility that, IMHO, holds a
> > > great deal of promise for seamless binary 'in place' upgrading. He
> > > has been able to write code to read multiple versions' database
> > > structures -- proving that it CAN be done.
> >
> > Unfortunately it's not the on-disk binary format of files that causes
> > the big problems. Our dump/initdb/restore sequence is also the solution
> > for system catalog changes.
>
> Hmmm. They get in there via the bki interface, right? Is there an OID
> issue with these? Could differential BKI files be possible, with known
> system catalog changes that can be applied via a 'patchdb' utility? I
> know pretty much how pg_upgrade is doing things now -- and, frankly, it's
> a little bit of a kludge.

Sure, if it wasn't a kludge, I wouldn't have written it. ;-) Does everyone
remember my LIKE indexing kludge in gram.y? Until people found a way to get
it into the optimizer, it did its job. I guess that's where pg_upgrade is at
this point. Actually, how can pg_upgrade be improved?

Also, we have committed to making file format changes for 7.3, so it seems
pg_upgrade will not be useful for that release unless we get some binary
conversion tool working.

> Yes, I do understand the things a dump/restore does on somewhat of a
> detailed level. I know the restore repopulates the entries in the system
> catalogs for the restored data, etc, etc.
>
> Currently dump/restore handles the catalog changes. But by what other
> means could we upgrade the system catalog in place?
>
> Our very extensibility is our weakness for upgrades. Can it be worked
> around? Anyone have any ideas?
>
> Improving pg_upgrade may be the ticket -- but if the on-disk binary
> format changes (like it has before), then something will have to do the
> binary format translation -- something like pg_fsck.

Yep.

> Incidentally, pg_fsck, or a program like it, should be in the core
> distribution. Maybe not named pg_fsck, as our database isn't a
> filesystem, but pg_dbck, or pg_dbcheck, or pg_dbfix, or similar. Although
> pg_fsck is more of a pg_dbdump.
>
> I've seen too many people bitten by upgrades gone awry. The more we can
> do in that regard, the better.

I should mention that 7.3 will have pg_depend, which should make our
post-7.3 reload process much cleaner because we will not have dangling
objects as often.

> And the Windows user will likely demand it. I never thought I'd be
> grateful for a Win32 native PostgreSQL port... :-)

Yea, the trick is to get something working that will require minimal change
from release to release.

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
Hannu Krosing wrote:
> > Our very extensibility is our weakness for upgrades. Can it be worked
> > around? Anyone have any ideas?
>
> Perhaps we can keep an old postgres binary + old backend around and then
> use it in single-user mode to do a pg_dump into our running backend.

That brings up an interesting idea. Right now we dump the entire database
out to a file, delete the old database, and load in the file.

What if we could move over one table at a time? Copy out the table, load it
into the new database, then delete the old table and move on to the next.
That would allow us to upgrade having free space for just the largest table.
Another idea would be to record and remove all indexes in the old database.
That certainly would save disk space during the upgrade.

However, the limiting factor is that we don't have a mechanism to have both
databases running at the same time currently. Seems this may be the
direction to head in.

> BTW, how hard would it be to move pg_dump inside the backend (perhaps
> using a dynamically loaded function to save space when not used) so that
> it could be used like COPY ?
>
>   pg> DUMP table [ WITH 'other cmdline options' ] TO stdout ;
>
>   pg> DUMP * [ WITH 'other cmdline options' ] TO stdout ;

Interesting idea, but I am not sure what that buys us. Having pg_dump
separate makes maintenance easier.

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
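The table-at-a-time idea above can be sketched as a loop: copy one table
into the new cluster, drop it from the old cluster, and move on, so the peak
extra disk space is bounded by the largest table. The database name, table
list, and port numbers are invented for illustration (a real script would
read the table list from pg_class), and the commands are only assembled and
printed, never executed:

```shell
#!/bin/sh
# Sketch of a table-at-a-time migration between an old cluster on port
# 5432 and a new one on port 5433.  Table names and the database name
# are placeholders; the per-table commands are printed, not run.
DB=mydb
CMDS=""
for t in accounts orders items; do
    # Copy one table across, then reclaim its space in the old cluster
    # before starting the next, so we never need room for two full copies.
    CMDS="$CMDS
pg_dump -p 5432 -t $t $DB | psql -p 5433 $DB
psql -p 5432 -c 'DROP TABLE $t' $DB"
done
echo "$CMDS"
```

Note that the drop is irreversible, which is why the thread keeps coming
back to wanting a verified restore before the old data goes away.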
On Wed, 2002-07-03 at 17:28, Bruce Momjian wrote:
> Hannu Krosing wrote:
> > > Our very extensibility is our weakness for upgrades. Can it be worked
> > > around? Anyone have any ideas?
> >
> > Perhaps we can keep an old postgres binary + old backend around and
> > then use it in single-user mode to do a pg_dump into our running
> > backend.
>
> That brings up an interesting idea. Right now we dump the entire
> database out to a file, delete the old database, and load in the file.
>
> What if we could move over one table at a time? Copy out the table,
> load it into the new database, then delete the old table and move on to
> the next. That would allow us to upgrade having free space for just
> the largest table. Another idea would be to record and remove all
> indexes in the old database. That certainly would save disk space
> during the upgrade.
>
> However, the limiting factor is that we don't have a mechanism to have
> both databases running at the same time currently.

How so?

AFAIK I can run as many backends as I like (up to some practical limit) on
the same computer at the same time, as long as they use different ports and
different data directories.

> Seems this may be the direction to head in.
>
> > BTW, how hard would it be to move pg_dump inside the backend (perhaps
> > using a dynamically loaded function to save space when not used) so
> > that it could be used like COPY ?
> >
> >   pg> DUMP table [ WITH 'other cmdline options' ] TO stdout ;
> >
> >   pg> DUMP * [ WITH 'other cmdline options' ] TO stdout ;
>
> Interesting idea, but I am not sure what that buys us. Having pg_dump
> separate makes maintenance easier.

Can pg_dump connect to a single-user-mode backend?

--------------------
Hannu
Hannu Krosing wrote:
> > However, the limiting factor is that we don't have a mechanism to have
> > both databases running at the same time currently.
>
> How so ?
>
> AFAIK I can run as many backends as I like (up to some practical limit)
> on the same computer at the same time, as long as they use different
> ports and different data directories.

We don't have an automated system for doing this. Certainly it is done all
the time.

> > Interesting idea, but I am not sure what that buys us. Having pg_dump
> > separate makes maintenance easier.
>
> Can pg_dump connect to a single-user-mode backend?

Uh, no, I don't think so.

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
On Wednesday 03 July 2002 12:09 pm, Bruce Momjian wrote:
> Hannu Krosing wrote:
> > AFAIK I can run as many backends as I like (up to some practical limit)
> > on the same computer at the same time, as long as they use different
> > ports and different data directories.
>
> We don't have an automated system for doing this. Certainly it is done
> all the time.

Good. Dialog. This is better than what I am used to when I bring up
upgrading. :-)

Bruce, pg_upgrade isn't as kludgey as what I have been doing with the RPMset
for these nearly three years.

No, what I envisioned was a standalone dumper that can produce dump output
without having a backend at all. If this dumper knows about the various
binary formats, and knows how to get my data into a form I can then restore
reliably, I will be satisfied. If it can be easily automated, so much the
better. Doing it table by table would be OK as well. I'm looking for a
sequence such as:

----
PGDATA=location/of/data/base
TEMPDATA=location/of/temp/space/on/same/file/system
mv $PGDATA/* $TEMPDATA
initdb -D $PGDATA
pg_dbdump $TEMPDATA | pg_restore {with its associated options, etc}
----

With an rm -rf of $TEMPDATA much further down the pike.....

Keys to this working:

1.) Must not require the old version's executable backend. There are a
number of reasons why this might be, but the biggest is due to the way much
upgrading works in practice -- the old executables are typically gone by the
time the new package is installed.

2.) Uses pg_dbdump of the new version. This dumper can be tailored to
provide the input pg_restore wants to see. The dump-restore sequence has
always had dumped-data version mismatch as its biggest problem -- there have
been issues before where you would have to install the new version of
pg_dump to run against the old backend. This is unacceptable in the real
world of binary packages.

One other usability note: why can't postmaster perform the steps of an
initdb if -D points to an empty directory? It's not that much code, is it?
(I know that one extra step isn't backbreaking, but I'm looking at this from
a rank newbie's point of view -- or at least I'm trying to look at it in
that way, as it's been a while since I was a rank newbie at PostgreSQL) Oh
well, just a random thought.

But I believe a backend-independent data dumper would be very useful in many
contexts, particularly those where a backend cannot be run for whatever
reason, but you need your data (corrupted system catalogs, high system load,
whatever). Upgrading is just one of those contexts.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
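Lamar's sequence can be collected into a single plan. pg_dbdump is the
standalone dumper he proposes and does not exist, and the paths are
placeholders (the pg_restore options are left elided as in his message), so
this sketch only assembles and prints the plan rather than executing it:

```shell
#!/bin/sh
# Sketch of the proposed upgrade sequence.  pg_dbdump is hypothetical
# (the backend-independent dumper proposed above); paths are placeholders.
# TEMPDATA sits on the same filesystem so the mv is a cheap rename.
PGDATA=/var/lib/pgsql/data
TEMPDATA=/var/lib/pgsql/data.old

PLAN="mv $PGDATA/* $TEMPDATA
initdb -D $PGDATA
pg_dbdump $TEMPDATA | pg_restore"
echo "$PLAN"
```

The key property is that only the new package's tools appear in the plan;
the old installation contributes nothing but its data directory.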
Lamar Owen wrote:
> On Wednesday 03 July 2002 12:09 pm, Bruce Momjian wrote:
> > We don't have an automated system for doing this. Certainly it is done
> > all the time.
>
> Good. Dialog. This is better than what I am used to when I bring up
> upgrading. :-)
>
> Bruce, pg_upgrade isn't as kludgey as what I have been doing with the
> RPMset for these nearly three years.
>
> No, what I envisioned was a standalone dumper that can produce dump
> output without having a backend at all. If this dumper knows about the
> various binary formats, and knows how to get my data into a form I can
> then restore reliably, I will be satisfied. If it can be easily
> automated, so much the better. Doing it table by table would be OK as
> well.

The problem with a standalone dumper is that you would have to recode this
for every release, with little testing possible. Having the old backend
active saves us that step. If we get it working, we can use it over and over
again for each release with little work on our part.

> Keys to this working:
>
> 1.) Must not require the old version's executable backend. There are a
> number of reasons why this might be, but the biggest is due to the way
> much upgrading works in practice -- the old executables are typically
> gone by the time the new package is installed.

Oh, that is a problem. We would have to require the old executables.

> 2.) Uses pg_dbdump of the new version. This dumper can be tailored to
> provide the input pg_restore wants to see. The dump-restore sequence has
> always had dumped-data version mismatch as its biggest problem -- there
> have been issues before where you would have to install the new version
> of pg_dump to run against the old backend. This is unacceptable in the
> real world of binary packages.
>
> One other usability note: why can't postmaster perform the steps of an
> initdb if -D points to an empty directory? It's not that much code, is
> it? (I know that one extra step isn't backbreaking, but I'm looking at
> this from a rank newbie's point of view -- or at least I'm trying to look
> at it in that way, as it's been a while since I was a rank newbie at
> PostgreSQL) Oh well, just a random thought.

The issue is that if you have PGDATA pointed to the wrong place, it creates
a new instance automatically. Could be strange for people, but we could
prompt them to run initdb, I guess.

> But I believe a backend-independent data dumper would be very useful in
> many contexts, particularly those where a backend cannot be run for
> whatever reason, but you need your data (corrupted system catalogs, high
> system load, whatever). Upgrading is just one of those contexts.

Yes, but who wants to write one of those for every release? That is where we
get stuck; with our limited resources, it is hard to encourage people to
work on it.

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
On Fri, 2002-07-05 at 17:39, Lamar Owen wrote:
> No, what I envisioned was a standalone dumper that can produce dump
> output without having a backend at all. If this dumper knows about the
> various binary formats, and knows how to get my data into a form I can
> then restore reliably, I will be satisfied. If it can be easily
> automated, so much the better. Doing it table by table would be OK as
> well.
...
> 1.) Must not require the old version's executable backend. There are a
> number of reasons why this might be, but the biggest is due to the way
> much upgrading works in practice -- the old executables are typically
> gone by the time the new package is installed.
>
> 2.) Uses pg_dbdump of the new version. This dumper can be tailored to
> provide the input pg_restore wants to see. The dump-restore sequence has
> always had dumped-data version mismatch as its biggest problem -- there
> have been issues before where you would have to install the new version
> of pg_dump to run against the old backend. This is unacceptable in the
> real world of binary packages.

I concur completely! As a package maintainer, this would remove my biggest
problem.

Oliver Elphick (Debian maintainer)
On Fri, Jul 05, 2002 at 12:39:13PM -0400, Lamar Owen wrote:
> One other usability note: why can't postmaster perform the steps of
> an initdb if -D points to an empty directory? It's not that much
> code, is it? (I know that one extra step isn't backbreaking, but
> I'm looking at this from a rank newbie's point of view -- or at
> least I'm trying to look at it in that way, as it's been a while
> since I was a rank newbie at PostgreSQL) Oh well, just a random
> thought.

Rank newbies shouldn't be protected in this way, partly because if something
goes wrong, _they won't know what to do_. Please, please, don't be putting
automagic, database-destroying functions like that into the postmaster. It's
a sure way to cause a disaster at some point.

A

--
----
Andrew Sullivan                   87 Mowat Avenue
Liberty RMS                       Toronto, Ontario Canada
<andrew@libertyrms.info>          M6K 3E3
                                  +1 416 646 3304 x110
On generic recovery... What is wrong with this strategy... 0. Put the database in single user mode. 1. Dump the Schema, with creation order properly defined, and with all constraints written to a separate file. (IOW, one file contains the bare tables with no index, constraint or trigger stuff, and the other contains all the RI stuff.) 2. Dump the tables (one by one) to text files with "copy" 3. Create a new database in a new location. 4. Feed it the bare table schema 5. Pump in the table data using "copy" from the saved text files 6. Run the RI script to rebuild index, trigger, PKey, FKey, etc. I find that is the most trouble free way to do it with most DBMS systems. An attempted dump from DBMS X.Y and a load to DBMS (X+1).Y is always a pile of trouble waiting to happen -- no matter what the system is.
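The six steps above read naturally as a script. The sketch below only prints the plan rather than executing anything; the database names, file names, the hypothetical constraint-splitting step, and the exact pg_dump/psql invocations are all illustrative assumptions, not a tested procedure.

```shell
# Dry-run sketch of the six-step recovery strategy above. Names, paths,
# and flags are assumptions for illustration; the plan is printed only.
plan_recovery() {
  echo 'pg_dump -s olddb > tables.sql                 # 1. bare schema'
  echo 'split-out-ri tables.sql > ri.sql              # 1. (hypothetical) move index/constraint/trigger DDL aside'
  echo 'psql olddb -c "COPY t1 TO STDOUT" > t1.copy   # 2. per-table data as text'
  echo 'initdb -D /new/pgdata && createdb newdb       # 3. fresh cluster in a new location'
  echo 'psql newdb < tables.sql                       # 4. feed it the bare table schema'
  echo 'psql newdb -c "COPY t1 FROM STDIN" < t1.copy  # 5. pump the data back in'
  echo 'psql newdb < ri.sql                           # 6. rebuild RI, keys, indexes, triggers'
}
plan_recovery
```

Because every step goes through plain text files, nothing depends on the old and new binary formats agreeing, which is the point of the strategy.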
Andrew Sullivan <andrew@libertyrms.info> writes: > On Fri, Jul 05, 2002 at 12:39:13PM -0400, Lamar Owen wrote: >> One other usability note: why can't postmaster perform the steps of >> an initdb if -D points to an empty directory? > Rank newbies shouldn't be protected in this way, partly because if > something goes wrong, _they won't know what to do_. Please, please, > don't be putting automagic, database destroying functions like that > into the postmaster. I agree completely with Andrew, even though an auto-initdb on an empty directory presumably won't destroy any data. What it *does* do is effectively mask a DBA error. We'll be getting panic-stricken support calls/emails saying "all my databases are gone! Postgres sucks!" when the problem is just that PG was restarted with the wrong -D pointer. The existing behavior points that out loud and clear, in a context where the DBA shouldn't have too much trouble figuring out what he did wrong. regards, tom lane
On Saturday 06 July 2002 11:15 am, Tom Lane wrote: > Andrew Sullivan <andrew@libertyrms.info> writes: > > On Fri, Jul 05, 2002 at 12:39:13PM -0400, Lamar Owen wrote: > >> One other usability note: why can't postmaster perform the steps of > >> an initdb if -D points to an empty directory? > > Rank newbies shouldn't be protected in this way, partly because if > > something goes wrong, _they won't know what to do_. > I agree completely with Andrew, even though an auto-initdb on an empty > directory presumably won't destroy any data. Good grief, I was just asking a question. :-) > What it *does* do is > effectively mask a DBA error. This is a satisfactory answer. In the context of the RPM distribution, if the initscript is used the DBA error probability is greatly reduced, thus the initscript can safely initdb. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Sat, 6 Jul 2002, Tom Lane wrote: > Andrew Sullivan <andrew@libertyrms.info> writes: > > On Fri, Jul 05, 2002 at 12:39:13PM -0400, Lamar Owen wrote: > >> One other usability note: why can't postmaster perform the steps of > >> an initdb if -D points to an empty directory? > > > Rank newbies shouldn't be protected in this way, partly because if > > something goes wrong, _they won't know what to do_. Please, please, > > don't be putting automagic, database destroying functions like that > > into the postmaster. > > I agree completely with Andrew, even though an auto-initdb on an empty > directory presumably won't destroy any data. What it *does* do is > effectively mask a DBA error. We'll be getting panic-stricken support > calls/emails saying "all my databases are gone! Postgres sucks!" when > the problem is just that PG was restarted with the wrong -D pointer. The > existing behavior points that out loud and clear, in a context where the > DBA shouldn't have too much trouble figuring out what he did wrong. Okay, I'm sitting on the fence on this one ... but, as DBA for several PgSQL installs on at least a half dozen machines or more, if someone restarts PG with the wrong -D pointer, they haven't set up their machine to live through a reboot ... the first thing any DBA *should* be doing after they have 'initdb'd their system is to add the appropriate start-up scripts for after the reboot ... Also, what is the difference between forgetting where you put it in an initdb or on the first postmaster? Why not put in a 'safety'? If you start up postmaster with -D on a directory that doesn't yet exist, it could prompt the DBA to confirm that this is really what they want to do. Just thoughts ... I'm happy enough with initdb *shrug*
Lamar Owen <lamar.owen@wgcr.org> writes: >> What it *does* do is effectively mask a DBA error. > This is a satisfactory answer. In the context of the RPM distribution, if the > initscript is used the DBA error probability is greatly reduced, thus the > initscript can safely initdb. Fair enough --- if the upper-layer script thinks it has enough safeties in place, let it auto-initdb. I just don't think the postmaster should do that. regards, tom lane
> No, what I envisioned was a standalone dumper that can produce dump output > without having a backend at all. If this dumper knows about the various > binary formats, and knows how to get my data into a form I can then restore > reliably, I will be satisfied. If it can be easily automated so much the > better. Doing it table by table would be ok as well. Unless it dumps binary representation of columns, a standalone dumper would still need to load all the output function shared libs for custom types (or not support custom types which would imho not be good). Andreas
Zeugswetter Andreas SB SD wrote: > > > No, what I envisioned was a standalone dumper that can produce dump output > > without having a backend at all. If this dumper knows about the various > > binary formats, and knows how to get my data into a form I can then restore > > reliably, I will be satisfied. If it can be easily automated so much the > > better. Doing it table by table would be ok as well. > > Unless it dumps binary representation of columns, a standalone dumper > would still need to load all the output function shared libs for custom types > (or not support custom types which would imho not be good). And now we change the internal representation of NUMERIC to a short integer array holding the number in base 10,000 and what exactly does the standalone dumpster do with our data? Another good example: let's add a field to some parsenode struct (was there a release where this didn't happen?). This causes the NodeOut() results to become a little different, which actually changes the textual content of a likely toasted pg_rewrite attribute. Stored compressed and sliced. I am quite a bit familiar with TOAST and the rewrite system. Yet, someone has to help me a little to understand how we can do this conversion in binary on the fly with an external tool. Especially where this conversion results in different raw and compressed sizes of the TOASTed attribute, which has to propagate up into the TOAST reference in the main table ... not to speak of possible required block splits in the toast table and index because of needing one more slice! Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Tue, 2002-07-09 at 00:20, Jan Wieck wrote: > Zeugswetter Andreas SB SD wrote: > > > > > No, what I envisioned was a standalone dumper that can produce dump output > > > without having a backend at all. If this dumper knows about the various > > > binary formats, and knows how to get my data into a form I can then restore > > > reliably, I will be satisfied. If it can be easily automated so much the > > > better. Doing it table by table would be ok as well. > > > > Unless it dumps binary representation of columns, a standalone dumper > > would still need to load all the output function shared libs for custom types > > (or not support custom types which would imho not be good). > > And now we change the internal representation of NUMERIC to a short > integer array holding the number in base 10,000 and what exactly does > the standalone dumpster do with our data? > > Another good example: let's add a field to some parsenode struct (was > there a release where this didn't happen?). This causes the NodeOut() > results to become a little different, which actually changes the textual > content of a likely toasted pg_rewrite attribute. Stored compressed and > sliced. I am quite a bit familiar with TOAST and the rewrite system. > Yet, someone has to help me a little to understand how we can do this > conversion in binary on the fly with an external tool. Especially where > this conversion results in different raw and compressed sizes of the > TOASTed attribute, which has to propagate up into the TOAST reference in > the main table ... not to speak of possible required block splits in the > toast table and index because of needing one more slice! This brings us back to my original proposal : this "external tool" needs to be either a full postgres backend with added DUMP command or something that can use a (possibly single-user) backend as either a library or, yes, a "backend" ;) --------------------- Hannu
> > Keys to this working: > > 1.) Must not require the old version executable backend. There are a number > > of reasons why this might be, but the biggest is due to the way much > > upgrading works in practice -- the old executables are typically gone by the > > time the new package is installed. > > Oh, that is a problem. We would have to require the old executables. Could this be solved with packaging? Meaning, can postmasters from old versions be packaged with a new release strictly for the purpose of upgrading? It is my understanding that the only old executable needed is the postmaster -- is that correct? Perhaps this also requires adding functionality so that pg_dump can run against a single-user postmaster. Example: When PG 7.3 is released, the RPM / deb / setup.exe includes the postmaster binary for v7.2 (perhaps two or three older versions...). An upgrade script is included that does the automatic dump / restore described earlier in this thread. Effectively, you are using old versions of the postmaster as your standalone dumper. I think this could sidestep the problem of having to create / test / maintain a new version of a dumper or pg_upgrade for every release. By default perhaps the postmaster for the previous version of postgres is included, and postmasters from older versions are distributed in separate packages, so if I am still running 6.5.3 and I want to upgrade to 7.3, I have to install the 6.5.3 upgrade package. Or perhaps there is one pg_upgrade rpm package that includes every postmaster since 6.4. This would allow the upgrade script to know that all backends are available to it depending on what it finds in PG_VERSION; it also allows the admin to remove them all easily once they are no longer needed.
On Tue, 2002-07-09 at 01:30, Matthew T. O'Connor wrote: > > Oh, that is a problem. We would have to require the old executables. > > Could this be solved with packaging? Meaning can postmasters from old versions > be packed with a new release strictly for the purpose of upgrading? It is my > understanding that the only old executable needed is the postmaster is that > correct? Perhaps this also requires adding functionality so that pg_dump can > run against a singer user postmaster. > > Example: When PG 7.3 is released, the RPM / deb / setup.exe include the > postmaster binary for v 7.2 (perhaps two or three older versions...). That isn't usable for Debian. A package must be buildable from source; so I would have to include separate (though possibly cut-down) source for n previous packages. It's a horrid prospect and a dreadful kludge of a solution - a maintainer's nightmare. Oliver
On Monday 08 July 2002 03:20 pm, Jan Wieck wrote: > Zeugswetter Andreas SB SD wrote: > > Unless it dumps binary representation of columns, a standalone dumper > > would still need to load all the output function shared libs for custom > > types (or not support custom types which would imho not be good). > And now we change the internal representation of NUMERIC to a short > integer array holding the number in base 10,000 and what exactly does > the standalone dumpster do with our data? What does a standard dump/restore do then as well? Is the restore process complicated by a rebuild of the function(s) involved in custom types? This, IMHO, is a pathological case even for a standard dump/restore. Someone doing this sort of thing is going to have more to do than a simple package upgrade. > Another good example: let's add a field to some parsenode struct (was > there a release where this didn't happen?). This causes the NodeOut() > results to become a little different, which actually changes the textual > content of a likely toasted pg_rewrite attribute. Stored compressed and > sliced. I am quite a bit familiar with TOAST and the rewrite system. > Yet, someone has to help me a little to understand how we can do this > conversion in binary on the fly with an external tool. Especially where > this conversion results in different raw and compressed sizes of the > TOASTed attribute, which has to propagate up into the TOAST reference in > the main table ... not to speak of possible required block splits in the > toast table and index because of needing one more slice! This is more difficult, certainly. Martijn, how does pg_fsck handle such things now? Again, this tool has utility outside upgrading. And I'm talking about dumping the binary down to ASCII to be restored, not binary to binary on the fly. This is the best dialog yet on the issue of upgrading. Keep it coming! :-) -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tue, 2002-07-09 at 17:19, Lamar Owen wrote: > On Monday 08 July 2002 03:20 pm, Jan Wieck wrote: > > Another good example: let's add a field to some parsenode struct (was > > there a release where this didn't happen?). This causes the NodeOut() > > results to become a little different, which actually changes the textual > > content of a likely toasted pg_rewrite attribute. Stored compressed and > > sliced. I am quite a bit familiar with TOAST and the rewrite system. > > Yet, someone has to help me a little to understand how we can do this > > conversion in binary on the fly with an external tool. Especially where > > this conversion results in different raw and compressed sizes of the > > TOASTed attribute, which has to propagate up into the TOAST reference in > > the main table ... not to speak of possible required block splits in the > > toast table and index because of needing one more slice! > > This is more difficult, certainly. Martijn, how does pg_fsck handle such > things now? > > Again, this tool has utility outside upgrading. And I'm talking about dumping > the binary down to ASCII to be restored, not binary to binary on the fly. You seem to be talking about pg_dump + old backend, no ? ;) For me it seems a given that you need old binary-reading code to read old binary format data. The most convenient place to keep it is inside an old backend. It may be possible to migrate simple user table data from old version to new, but it gets complicated real fast for most other things, especially for stuff in system tables. --------------- Hannu
On Tuesday 09 July 2002 11:41 am, Hannu Krosing wrote: > The old postmaster should not be built/distributed. As it is for > _upgrading_ only, you just have to _keep_ it when doing an upgrade, not > build a new "old" one ;) Let me reiterate one thing about this. In the midst of a total OS upgrade, during which PostgreSQL is being upgraded as well (the new OS release includes a 'better' PostgreSQL), you also get library upgrades. If the upgrade is from an old enough version of the OS, the old postmaster/postgres may not even be able to execute AT ALL. Some may say that this is a problem for the vendor. Well I know of one vendor that has thrown up its hands in such disgust over our lack of upgradability that they have now quit supporting even the kludgy semi-automatic upgrade process I did up three years ago. They will refuse to support any mechanism that requires any portion of an old package to remain around. The new package must be self-contained and must be able to upgrade the old data, or they will not accept it. Their statement now is simply that PostgreSQL upgrading is broken; dump before upgrading and complain to the PostgreSQL developers. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tue, 2002-07-09 at 16:41, Hannu Krosing wrote: > On Tue, 2002-07-09 at 13:48, Oliver Elphick wrote: > > On Tue, 2002-07-09 at 01:30, Matthew T. O'Connor wrote: > > > Example: When PG 7.3 is released, the RPM / deb / setup.exe include the > > > postmaster binary for v 7.2 (perhaps two or three older versions...). > > > > That isn't usable for Debian. A package must be buildable from source; > > so I would have to include separate (though possibly cut-down) source > > for n previous packages. It's a horrid prospect and a dreadful kludge > > of a solution - a maintainer's nightmare. > > The old postmaster should not be built/distributed. As it is for > _upgrading_ only, you just have to _keep_ it when doing an upgrade, not > build a new "old" one ;) No, it doesn't work like that. You cannot rely on anything's being left from an old distribution; apt is quite likely to delete it altogether before installing the new version (to enable dependencies to be satisfied). At present I have the preremoval script copy the old binaries to a special location in case they will be needed, but that fails if the version is very old (and doesn't contain that code), and it's a very fragile mechanism. I never have understood why the basic table structure changes so much that it can't be read; just what is involved in getting the ability to read old versions?
On Tue, 2002-07-09 at 17:49, Oliver Elphick wrote: > On Tue, 2002-07-09 at 16:41, Hannu Krosing wrote: > > On Tue, 2002-07-09 at 13:48, Oliver Elphick wrote: > > > On Tue, 2002-07-09 at 01:30, Matthew T. O'Connor wrote: > > > > Example: When PG 7.3 is released, the RPM / deb / setup.exe include the > > > > postmaster binary for v 7.2 (perhaps two or three older versions...). > > > > > > That isn't usable for Debian. A package must be buildable from source; > > > so I would have to include separate (though possibly cut-down) source > > > for n previous packages. It's a horrid prospect and a dreadful kludge > > > of a solution - a maintainer's nightmare. > > > > The old postmaster should not be built/distributed. As it is for > > _upgrading_ only, you just have to _keep_ it when doing an upgrade, not > > build a new "old" one ;) > > No, it doesn't work like that. You cannot rely on anything's being left > from an old distribution; apt is quite likely to delete it altogether > before installing the new version (to enable dependencies to be > satisfied). At present I have the preremoval script copy the old > binaries to a special location in case they will be needed, but that > fails if the version is very old (and doesn't contain that code), and > it's a very fragile mechanism. > > I never have understood why the basic table structure changes so much > that it can't be read; just what is involved in getting the ability to > read old versions? The big change was from 6.x to 7.x where a chunk of data moved from end of page to start of page and the tableoid column was added. Otherwise, the table structure is quite simple. The difficulties with user _data_ come mainly from binary format changes for some types and such. But I still can't see how having a binary dumper that does mostly the work of [ old_backend -c "COPY tablex TO STDOUT" ] will help us here.
IIRC the main difficulties in upgrading have always been elsewhere, like migrating always changing system table data. ---------- Hannu
On Tue, 2002-07-09 at 18:05, Hannu Krosing wrote: > The big change was from 6.x to 7.x where a chunk of data moved from end > of page to start of page and tableoid column was added. Otherways the > table structure is quite simple. The difficulties with user _data_ can > be mainly because of binary format changes for some types and such. > > But I still can't see how will having a binary dumper that does mostly > the work of [ old_backend -c "COPY tablex TO STDOUT" ] help us here. > > IIRC the main difficulties in upgrading have always been elsewhere, like > migrating always changing system table data. The main problem is getting access to the user data after an upgrade. There's no particular problem in having to do an initdb, though it is an inconvenience; the difficulty is simply that any packaged distribution (rpm, deb, xxx) is going to have to replace all the old binaries. So by the time the package is ready to do the database upgrade, it has destroyed the means of dumping the old data. Lamar and I have to jump through hoops to get round this -- small hoops with flaming rags round them! The current upgrade process for PostgreSQL is founded on the idea that people build from source. With binary distributions, half the users wouldn't know what to do with source; they expect (and are entitled to expect) that an upgrade will progress without the need for significant intervention on their part. PostgreSQL makes this really difficult for the package maintainers, and this has a knock-on effect on the reliability of the upgrade process and thus on PostgreSQL itself.
On Tue, 2002-07-09 at 18:30, Oliver Elphick wrote: > On Tue, 2002-07-09 at 18:05, Hannu Krosing wrote: > > The big change was from 6.x to 7.x where a chunk of data moved from end > > of page to start of page and tableoid column was added. Otherways the > > table structure is quite simple. The difficulties with user _data_ can > > be mainly because of binary format changes for some types and such. > > > > But I still can't see how will having a binary dumper that does mostly > > the work of [ old_backend -c "COPY tablex TO STDOUT" ] help us here. > > > > IIRC the main difficulties in upgrading have always been elsewhere, like > > migrating always changing system table data. > > The main problem is getting access to the user data after an upgrade. Can't it be dumped in a pre-upgrade script ? > There's no particular problem in having to do an initdb, though it is an > inconvenience; the difficulty is simply that any packaged distribution > (rpm, deb, xxx) is going to have to replace all the old binaries. So by > the time the package is ready to do the database upgrade, it has > destroyed the means of dumping the old data. Lamar and I have to jump > through hoops to get round this -- small hoops with flaming rags round > them! IMHO, if rpm and apt can't run a pre-install script before deleting the old binaries they are going to replace/upgrade then you should complain to authors of rpm and apt. It seems that they are doing things in the wrong order. The right order should of course be 1) run pre-upgrade (pg_dumpall >dumpfile) 2) upgrade 3) run post-upgrade (initdb; psql < dumpfile) --------------- Hannu
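The three-step ordering Hannu proposes maps onto package-manager hooks roughly as follows. The hook names, paths, and commands in this sketch are assumptions for illustration, and (as the rest of the thread argues) the real %pre environment may not permit step 1 at all; the sequence is only printed, never executed.

```shell
# Sketch of the pre-upgrade / upgrade / post-upgrade ordering proposed
# above. Paths and commands are illustrative assumptions; echoed only.
pre_upgrade()  { echo 'pg_dumpall > /var/lib/pgsql/upgrade.dump'; }
post_upgrade() {
  echo 'initdb -D /var/lib/pgsql/data'
  echo 'psql template1 < /var/lib/pgsql/upgrade.dump'
}

pre_upgrade
echo '(package manager replaces the old binaries here)'
post_upgrade
```

The crux of the disagreement is whether the environment in which pre_upgrade runs still has a working old backend and enough of a system around it to produce the dump.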
On Tuesday 09 July 2002 01:46 pm, Hannu Krosing wrote: > On Tue, 2002-07-09 at 18:30, Oliver Elphick wrote: > > The main problem is getting access to the user data after an upgrade. > Can't it be dumped in pre-upgrade script ? The pre-upgrade script is run in an environment that isn't robust enough to handle that. What if you run out of disk space during the dump? What if a postmaster is running -- and many people stop their postmaster before upgrading their version of PostgreSQL? Besides, at least in the case of the RPM, during OS upgrade time the %pre scriptlet (the one you allude to) isn't running in a system with all the normal tools available. Nor is there a postmaster running. Due to a largish RAMdisk, a postmaster running might cause all manner of problems. And an error in the scriptlet could potentially cause the OS upgrade to abort in midstream -- not a nice thing to do to users, having a package during upgrade abort their OS upgrade when it is a little over halfway through, and in an unbootable state.... No, any dumping of data cannot happen during the %pre script -- too many issues there. > IMHO, if rpm and apt can't run a pre-install script before deleting the > old binaries they are going to replace/upgrade then you should complain > to authors of rpm and apt. Oh, so it's RPM's and APT's problem that we require so many resources during upgrade.... :-) > The right order should of course be > 1) run pre-upgrade (pg_dumpall >dumpfile) > 2) upgrade > 3) run post-upgrade (initdb; psql < dumpfile) All but the first step works fine. The first step is impossible in the environment in which the %pre script runs. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tue, 2002-07-09 at 13:48, Oliver Elphick wrote: > On Tue, 2002-07-09 at 01:30, Matthew T. O'Connor wrote: > > > Oh, that is a problem. We would have to require the old executables. > > > > Could this be solved with packaging? Meaning can postmasters from old versions > > be packed with a new release strictly for the purpose of upgrading? It is my > > understanding that the only old executable needed is the postmaster is that > > correct? Perhaps this also requires adding functionality so that pg_dump can > > run against a singer user postmaster. > > > > Example: When PG 7.3 is released, the RPM / deb / setup.exe include the > > postmaster binary for v 7.2 (perhaps two or three older versions...). > > That isn't usable for Debian. A package must be buildable from source; > so I would have to include separate (though possibly cut-down) source > for n previous packages. It's a horrid prospect and a dreadful kludge > of a solution - a maintainer's nightmare. The old postmaster should not be built/distributed. As it is for _upgrading_ only, you just have to _keep_ it when doing an upgrade, not build a new "old" one ;) -------------- Hannu
Oliver Elphick writes: > I never have understood why the basic table structure changes so much > that it can't be read; just what is involved in getting the ability to > read old versions? The problem in an extensible system such as PostgreSQL is that virtually every feature change is reflected by a change in the structure of the system catalogs. It wouldn't be such a terribly big problem in theory to make the backend handle these changes, but you'd end up with a huge bunch of if (dataVersion == 1) do this; else if (dataVersion == 2) do that; ... which would become slow and unwieldy, and would scare away developers. That would of course be a self-serving scheme, because if the development progress slowed down, you would have to update less frequently. -- Peter Eisentraut peter_e@gmx.net
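Peter's per-version dispatch can be caricatured in a few lines. The version numbers and format descriptions below are invented for illustration, but they show how every catalog or page-format change adds yet another branch that has to be maintained forever.

```shell
# Invented illustration of the "if (dataVersion == 1) do this; else if
# (dataVersion == 2) do that; ..." chain Peter describes: one branch per
# historical on-disk format. Versions and decode actions are made up.
decode_row() {
  case "$1" in
    1) echo "decode: item data at end of page" ;;
    2) echo "decode: item data at start of page, tableoid added" ;;
    3) echo "decode: TOASTed attributes, compressed and sliced" ;;
    *) echo "unknown data version: $1" >&2; return 1 ;;
  esac
}

decode_row 2
```

Each release would append a branch, and every code path touching stored data would need the same treatment, which is exactly the slow, unwieldy growth Peter warns about.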
On Tue, 2002-07-09 at 22:10, Lamar Owen wrote: > On Tuesday 09 July 2002 01:46 pm, Hannu Krosing wrote: > > On Tue, 2002-07-09 at 18:30, Oliver Elphick wrote: > > > The main problem is getting access to the user data after an upgrade. > > > Can't it be dumped in pre-upgrade script ? > > The pre-upgrade script is run in an environment that isn't robust enough to > handle that. What if you run out of disk space during the dump? You can either check beforehand or abort and delete the offending dumpfile. > What if a postmaster is running -- and many people stop their postmaster before > upgrading their version of PostgreSQL? It is quite easy to both check for a running postmaster and start/stop one. > Besides, at least in the case of the RPM, during OS upgrade time the %pre > scriptlet (the one you allude to) isn't running in a system with all the > normal tools available. I don't think that postmaster needs very many normal tools - it should be quite independent, except for compat libs for larger version upgrades > Nor is there a postmaster running. Due to a largish > RAMdisk, a postmaster running might cause all manners of problems. I don't know anything about the largish RAMdisk, what I meant was that postmaster (a 2.7 MB program with ~4 MB RAM footprint) could include the functionality of pg_dump and be runnable in single-user mode for dumping old databases. > And an error in the scriptlet could potentially cause the OS upgrade to abort > in midstream -- not a nice thing to do to users, having a package during > upgrade abort their OS upgrade when it is a little over half through, and in > an unbootable state.... No, any dumping of data cannot happen during the %pre > script -- too many issues there. But is it not the same with _every_ package ? Is there any actual upgrading done in the pre/post scripts or are they generally not to be trusted ?
> > IMHO, if rpm and apt can't run a pre-install script before deleting the > > old binaries they are going to replace/upgrade then you should complain > > to authors of rpm and apt. > > Oh, so it's RPM's and APT's problem that we require so many resources during > upgrade.... :-) As you said: "The pre-upgrade script is run in an environment that isn't robust enough to handle that". Ok, maybe it's the environmental issue then ;) But more seriously - it is a DATAbase upgrade, not a usual program upgrade which has a minuscule data part, usually not more than a configuration file. Postgres, as a very extensible database, has an ability to keep much of its functionality in the database. We already do a pretty good job with pg_dump, but I would still not trust it to do everything automatically and erase the originals. If we start claiming that postgresql can do automatic "binary" upgrades there will be much fun with people who have some application that runs fine on 7.0.3 but barfs on 7.1.2, even if it is due to stricter adherence to SQL99 and the SQL is completely out of the control of rpm/apt. There may even be some lazy people who will think that now is the time to auto-upgrade from 6.x ;/ > > The right order should of course be > > > 1) run pre-upgrade (pg_dumpall >dumpfile) > > 2) upgrade > > 3) run post-upgrade (initdb; psql < dumpfile) > > All but the first step works fine. The first step is impossible in the > environment in which the %pre script runs. Ok. But would it be impossible to move the old postmaster to some other place, or is the environment too fragile even for that ? If we move the old postmaster instead of copying then there will be a lot less issues about running out of disk space :) What we are facing here is a problem similar to trying to upgrade all users' C programs when upgrading gcc. While it would be a good thing, nobody actually tries to do it - we require them to have source code and to do the "upgrade" manually.
That's what I propose - dump all databases in pre-upgrade (if you are concerned about disk usage, run it twice, first to | wc and then to a file) and try to load in post-upgrade. There will still be some things that are impossible to "upgrade", like upgrading a.out "C" functions to an ELF-format backend. Perhaps we will be able to detect what we can actually upgrade and bail out if we find something unupgradable ? ------------------- Hannu
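Hannu's "run it twice" suggestion amounts to measuring the dump before committing it to disk. A minimal sketch of that check follows; dump_cmd is a stand-in replacing pg_dumpall so the sketch runs anywhere, and the df invocation and its column layout are platform assumptions.

```shell
# Sketch of sizing the dump before writing it, per the "first to | wc and
# then to a file" idea above. dump_cmd stands in for pg_dumpall; the df
# parsing assumes POSIX-style output with available KB in column 4.
dump_cmd() { echo 'pretend dump output'; }   # stand-in for: pg_dumpall

# First pass: count bytes without writing anything, round up to KB.
need_kb=$(( ( $(dump_cmd | wc -c) + 1023 ) / 1024 ))
free_kb=$(df -Pk /tmp | awk 'NR==2 {print $4}')

if [ "$need_kb" -lt "$free_kb" ]; then
  echo "ok: need ${need_kb}K, ${free_kb}K free"   # safe to do the real dump
else
  echo "abort: not enough space for the dump" >&2
fi
```

The obvious cost is dumping everything twice; the benefit is never leaving a half-written dumpfile behind when the disk fills up mid-dump.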
On Tuesday 09 July 2002 04:17 pm, Hannu Krosing wrote: > On Tue, 2002-07-09 at 22:10, Lamar Owen wrote: > > The pre-upgrade script is run in an environment that isn't robust enough > > to handle that. What if you run out of disk space during the dump? > You can either check beforehand or abort and delete the offending > dumpfile. And what if you have enough disk space to do the dump, but then that causes the OS upgrade to abort because there wasn't enough space left to finish upgrading (larger packages, perhaps)? The system's hosed, and it's our fault. > > What if a postmaster is running -- and many people stop their postmaster > > before upgrading their version of PostgreSQL? > It is quite easy to both check for a running postmaster and start/stop > one. Not when there is no ps in your path. Or pg_ctl for that matter. Nor is there necessarily a /proc tree waiting to be exploited. We're talking the anaconda environment, which is tailored for OS installation and upgrading. You cannot start a postmaster; you cannot check to see if one is running -- you can't even check to see if you're in the anaconda chroot or not, so that you can use more tools if not in the OS installation mode. Again -- the total OS upgrade path is a big part of this scenario, as far as the RPM's are concerned. The Debian package may or may not have as grievous a structure. The only tool you can really use under the anaconda chroot is busybox, and it may not do what you want it to. > > Besides, at least in the case of the RPM, during OS upgrade time the %pre > > scriptlet (the one you allude to) isn't running in a system with all the > > normal tools available. > I don't think that postmaster needs very many normal tools - it should > be quite independent, except for compat libs for larger version > upgrades The problem there is that you really have no way to tell the system which sets of libraries you want. 
More to the point: RPM dependencies cannot take conditionals and have no concept of if..then. Nor can you tell the system to _install_ the new postgresql instead of _upgrade_ (incidentally, in the RPM context an upgrade is an install of the new version followed by an uninstall of the old one -- if the new one overwrote files, their traces are just wiped from the RPM database; if they weren't overwritten, the files get wiped along with their respective database entries). If I could _force_ no upgrades, it would be much easier -- but I can't. Nor can I be sure the %pre scriptlet will be run -- some people are so paranoid that they use rpm -U --noscripts religiously. Thus, when the old postgresql rpm's database entries (in practice virtually every old executable gets overwritten) are removed, its dependency information is also removed. As the install/upgrade path builds a complete dependency tree of the final installation as part of the process, it knows whether the compat libs are needed or not. If no other program needs them, you don't get them, even if you kept an old backend around that does need them. But you really can't make the -server subpackage Require the compat packages, because you don't necessarily know what they will be named, or anything else they will provide -- if compat libs are even available for the version you're upgrading from. > > Nor is there a postmaster running. Due to a largish > > RAMdisk, a postmaster running might cause all manners of problems. > I don't know anything about the largish RAMdisk, what I meant was that > postmaster (a 2.7 MB program with ~4 MB RAM footprint) could include the > functionality of pg_dump and be runnable in single-user mode for dumping > old databases. If a standalone backend could reliably dump the database without needing networking and many of the other things we take for granted (the install mode is a cut-down single-user mode of sorts, running in a chroot of a sort), then it might be worth looking at. 
> > And an error in the scriptlet could potentially cause the OS upgrade to > > abort in midstream -- not a nice thing to do to users, having a package > But is it not the same with _every_ package ? Is there any actual > upgrading done in the pre/post scripts or are they generally not to be > trusted ? No other package is so *different* as to require such a complicated upgrade process. Some packages do more with their scriptlets than others, but no package does anything near as complicated as dumping a database. > We already do a pretty good job with pg_dump, but I would still not > trust it to do everything automatically and erase the originals. And that's a big problem. We shouldn't have that ambivalence. IOW, I think we need more upgrade testing. I don't think I've seen a cycle yet that didn't have upgrade problems. > If we start claiming that postgresql can do automatic "binary" upgrades > there will be much fun with people who have some application that runs > fine on 7.0.3 but barfs on 7.1.2, even if it is due to stricter > adherence to SQL99 and the SQL is completely out of the control of rpm/apt. That's just us not being backward compatible. I'm impacted by those things, being that I'm running OpenACS here on 7.2.1, when OACS is optimized for 7.1. Certain things are very broken. > There may be even some lazy people who will think that now is the time > to auto-upgrade from 6.x ;/ And why not? If Red Hat Linux can upgrade a whole operating environment from version 2.0 all the way up to 7.3 (which they claim), why can't we? If we can just give people the tools to deal with potential problems after the upgrade, then I think we can do it. Such a tool as an old-version dumper would be a lifesaver to people, I believe. > Ok. But would it be impossible to move the old postmaster to some other > place, or is the environment too fragile even for that ? 
That is what I have done in the past -- the old backend got copied over (the executable), then a special script was run (after upgrade, manually, by the user) that tried to pull a dump using the old backend. It wasn't reliable. The biggest problem is that I have no way of ensuring that the old backend's dependencies stay satisfied -- 'satisfied' meaning that the old glibc stays installed for compatibility. Glibc, after all, is being upgraded out from under us, and I can't stop it or even slow it down. And this could even be that most pathological of cases, where an a.out-based system is being upgraded to an ELF system without a.out kernel support. (point of note: PostgreSQL first appeared in official Red Hat Linux as version 6.2.1, released with Red Hat Linux 5.0, which was ELF/glibc (contrasted to 3.0.3 which was a.out/libc4 and 4.x which was ELF/libc5) -- but I don't know about the Debian situation and its pathology.) > If we move the old postmaster instead of copying then there will be a > lot less issues about running out of disk space :) The disk space issue is with the ASCII dump file itself. Furthermore, what happens if the dumpfile is greater than MAXFILESIZE? Again, wc isn't in the path (because it too is being upgraded out from under us -- nothing is left untouched by the upgrade EXCEPT that install image RAMdisk, which has a very limited set of tools (and a nonstandard kernel to boot)). Networking might or might not be available. Unix domain sockets might or might not be available. But the crux is that the OS upgrade environment is designed to do one thing and one thing alone -- get the OS installed and/or upgraded. General purpose tools just take up space on the install media, a place where space is at a very high premium. > What we are facing here is a problem similar to trying to upgrade all users' > C programs when upgrading gcc. 
> While it would be a good thing, nobody > actually tries to do it - we require them to have source code and to do > the "upgrade" manually. Is that directly comparable? If you have a lot of user functions written in C, then possibly. But I'm not interested in pathological cases -- I'm interested in something that works OK for the majority of users. As long as it works properly for users who aren't sophisticated enough to need the pathological cases handled, then it should be available. Besides, one can always dump and restore if one wants to. And just how well does the venerable dump/restore cycle work in the presence of these pathological cases? Red Hat Linux doesn't claim upgradability in the presence of highly pathological cases (such as rogue software installed from non-RPM sources, or non-Red Hat RPM's installed (particularly Ximian Gnome)). So you have to go through a process with that. But it is something you can recover from after the upgrade is complete. That's what I'm after. I don't hold out hope for a fully automatic upgrade -- it would be nice, but we are too extensible for it to be practical. No -- I want tools to be able to recover my old data without the old-version backend held over from the previous install. And I think this is a very reasonable expectation. > That's what I propose - dump all databases in pre-upgrade (if you are > concerned about disk usage, run it twice, first to | wc and then to a > file) and try to load in post-upgrade. The wc utility isn't in the path in an OS install situation. The df utility isn't in the path, either. You can use python, though. :-) Not that that would be a good thing in this context, however. > There will still be some things that will be impossible to "upgrade", like > upgrading a.out "C" functions for an ELF-format backend. If a user is sophisticated enough to write such, that user is sophisticated enough to take responsibility for the upgrade. I'm not talking about users of that level here. 
But even then, it would be nice to at least get the data back out -- the function can then be rebuilt easily enough from source. > Perhaps we will be able to detect what we can actually upgrade and bail > out if we find something unupgradable ? All of this is alleviated if I can run a utility after the fact to read in my old data, without requiring the old packaged binaries. I don't have to work around ANYTHING. Again I say -- would such a data dumper not be useful in cases of system catalog corruption that prevents a postmaster from starting? I'm talking about a multipurpose utility here, not just something to make my life as RPM maintainer easy. The pg_fsck program is a good beginning to such a program. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tue, 2002-07-09 at 19:09, Lamar Owen wrote: > On Tuesday 09 July 2002 04:17 pm, Hannu Krosing wrote: > > On Tue, 2002-07-09 at 22:10, Lamar Owen wrote: > > > The pre-upgrade script is run in an environment that isn't robust enough > > > to handle that. What if you run out of disk space during the dump? > > > You can either check beforehand or abort and delete the offending > > dumpfile. > > And what if you have enough disk space to do the dump, but then that causes > the OS upgrade to abort because there wasn't enough space left to finish > upgrading (larger packages, perhaps)? The system's hosed, and it's our > fault. What normally happens when you have low amounts of free disk space and attempt to upgrade the system? On FreeBSD (portupgrade) it rolls back any changes it was attempting. I don't know other systems well enough to say. PostgreSQL may require more disk space to upgrade than most packages, but if the tools cannot fail cleanly, that is already a problem that needs to be addressed.
On Tuesday 09 July 2002 06:20 pm, Peter Eisentraut wrote: > The problem in an extensible system such as PostgreSQL is that virtually > every feature change is reflected by a change in the structure of the > system catalogs. It wouldn't be such a terribly big problem in theory to > make the backend handle these changes, but you'd end up with a huge bunch > of > if (dataVersion == 1) > do this; > else if (dataVersion == 2) > do that; Ok, pardon me while I take a moment to braindump here. And Peter, you of all people caused this braindump, so, 'hold on to your hat' :-). You know, it occurs to me that we are indeed an Object RDBMS, but not in the conventional sense. Our whole system is object oriented -- we are extensible by the data and the methods (functions) that operate on that data. In fact, the base system is simply a set of objects, all the way down to the base data types and their functions. So the problem jells down to this: How does one upgrade the method portion of the object, bringing in new object data if necessary, while leaving non-impacted data alone? Is there a way of partitioning the method-dependent object data from the non-object data? This would require a complete system catalog redesign -- or would it? Can such a migration be object-oriented in itself, with the new version inheriting the old version and extending it.... (like I said, I'm braindumping here -- this may not be at all coherent -- but my stream of consciousness rarely is [coherent]). Can our core be written/rewritten in such a way as to be _completely_ object driven? Someone steeped a little better in object theory please take over now.... Or am I totally out in left field here? -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tuesday 09 July 2002 07:19 pm, Rod Taylor wrote: > On Tue, 2002-07-09 at 19:09, Lamar Owen wrote: > > And what if you have enough disk space to do the dump, but then that > > causes the OS upgrade to abort because there wasn't enough space left to > > finish upgrading (larger packages, perhaps)? The system's hosed, and > > it's our fault. > What normally happens when you have low amounts of free diskspace and > attempt to upgrade the system? Anaconda calculates (internally -- it's a Python program) the space required by the upgrade and won't let you proceed if you don't have enough space as reported by the RPM headers. It's impossible to know ahead of time how much space will be required by an ASCII dump of the PostgreSQL database, and thus it cannot be taken into account by that algorithm. As to failing cleanly, work is underway to allow RPM to rollback entire OS upgrades. But again the disk space requirement shoots through the ceiling if you do this. Already RPM can rollback the transaction being done on the RPM database (it's a db3 database system), but rolling back the filesystem is a little different. But anaconda (which doesn't use the command line RPM anymore, it uses librpm to do its own RPM processing) checks beforehand how much space is needed and won't let you overspend disk space during the system upgrade. The command-line RPM will also do this, and won't let you upgrade RPM's if there's not enough disk space, as calculated by reading the RPM header, which has the amount of space the uncompressed package takes (calculated as part of the RPM build process). But if you throw in an unknown increase in space that anaconda/rpm cannot grok, then you cause a situation. Can the ports system take into account the space required for a dumpfile?? :-) -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
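The gap Lamar describes can be sketched in Python (anaconda's real code is far more involved; the package sizes here are made-up numbers, not real RPM header data, and `os.statvfs` stands in for df, which is not in the path during an OS install):

```python
# Sketch of a pre-upgrade space check: the per-package size delta is
# knowable from the RPM headers, but the dump file size ("extra_needed")
# is not -- which is exactly why anaconda's accounting cannot cover it.
import os

def free_bytes(path):
    # stand-in for df
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize   # space available to non-root users

def upgrade_fits(installed_sizes, new_sizes, extra_needed, free):
    # installed_sizes/new_sizes: bytes per package, from the RPM headers;
    # extra_needed: e.g. an estimated ASCII dump size, unknown in advance
    delta = sum(new_sizes) - sum(installed_sizes)
    return free >= delta + extra_needed
```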
[replying to myself] On Tuesday 09 July 2002 07:34 pm, Lamar Owen wrote: > if you do this. Already RPM can rollback the transaction being done on the > RPM database (it's a db3 database system), but rolling back the filesystem > is a little different. As a note of interest, RPM itself is backed by a database, db3. Prior to version 4.x, it was backed by db1. Upgrading between the versions of RPM is simple: install db3 and its dependencies, upgrade RPM, and run 'rpm --rebuilddb' -- which works most of the time, but there are pathological cases..... You are now running db3 instead of db1, if you didn't get bit by a pathological case. :-) -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
> Can the ports system take into account the space required for a dumpfile?? :-) It cheats by keeping a backup of the old version -- it makes an installable package out of the currently installed version. This is removed once the package has been successfully upgraded (including dependencies). On failure, it rolls back any packages (and those that depend on them) to the prior versions it backed up, and continues on trying to upgrade other parts of the system which don't depend on the rolled-back portion. Portupgrade regularly upgrades only part of the system if the ports tree is broken, won't build (architecture issues), couldn't download XYZ item, or has run into other problems. PostgreSQL in this case simply wouldn't get upgraded with everything else -- reporting errors at the end. That said, PostgreSQL also may no longer work after the upgrade -- but I guess that's what the 'test' mode is used to prevent.
On Wed, 2002-07-10 at 00:09, Lamar Owen wrote: > On Tuesday 09 July 2002 04:17 pm, Hannu Krosing > > It is quite easy to both check for a running postmaster and start/stop > > one. > > Not when there is no ps in your path. Or pg_ctl for that matter. Nor is > there necessarily a /proc tree waiting to be exploited. We're talking the > anaconda environment, which is tailored for OS installation and upgrading. > You cannot start a postmaster; you cannot check to see if one is running -- > you can't even check to see if you're in the anaconda chroot or not, so that > you can use more tools if not in the OS installation mode. Again -- the > total OS upgrade path is a big part of this scenario, as far as the RPM's are > concerned. The Debian package may or may not have as grievous a structure. No. I don't have anything like your problems to contend with! I can and do copy the old binaries and libraries in the pre-removal script, which means I have a pretty good chance of accomplishing an upgrade without user intervention. If I had your problems I'd give up!
Oliver Elphick wrote: > > The current upgrade process for PostgreSQL is founded on the idea that > people build from source. With binary distributions, half the users > wouldn't know what to do with source; they expect (and are entitled to > expect) that an upgrade will progress without the need for significant > intervention on their part. PostgreSQL makes this really difficult for > the package maintainers, and this has a knock-on effect on the > reliability of the upgrade process and thus on PostgreSQL itself. I have to object here. The PostgreSQL upgrade process is based on the idea of dump, install, initdb, restore. That has nothing to do with building from source or installing from binaries. The reason this conflicts with these package managers is that they work package by package, instead of looking at the big picture. Who said you can replace package A before running the pre-upgrade script of dependent package B? Somehow this looks like a foreign key violation to me. Oh, I forgot, RI constraints are for documentation purposes only ... Greetings from the MySQL documentation ;-) Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Lamar Owen wrote: > > [replying to myself] > On Tuesday 09 July 2002 07:34 pm, Lamar Owen wrote: > > if you do this. Already RPM can rollback the transaction being done on the > > RPM database (it's a db3 database system), but rolling back the filesystem > > is a little different. > > As a note of interest, RPM itself is backed by a database, db3. Prior to > version 4.x, it was backed by db1. Upgrading between the versions of RPM is > simply -- installing db3 and dependenies, upgrade RPM, and run 'rpm > --rebuilddb' -- which works most of the time, but there are pathological > cases..... > > You now are running db3 instead of db1, if you didn't get bit by a > pathological case. :-) And how big/complex is the db1/3 system catalog we're talking about exactly? How many rewrite rules have to be converted into the new parsetree format during an RPM upgrade? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Wed, 2002-07-10 at 01:09, Lamar Owen wrote: > On Tuesday 09 July 2002 04:17 pm, Hannu Krosing wrote: > > On Tue, 2002-07-09 at 22:10, Lamar Owen wrote: > > > The pre-upgrade script is run in an environment that isn't robust enough > > > to handle that. What if you run out of disk space during the dump? > > > You can either check beforehand or abort and delete the offending > > dumpfile. > ... > > That's what I propose - dump all databases in pre-upgrade (if you are > > concerned about disk usage, run it twice, first to | wc and then to a > > file) and try to load in post-upgrade. > > The wc utility isn't in the path in an OS install situation. The df utility > isn't in the path, either. You can use python, though. :-) Not that that > would be a good thing in this context, however. Why not? The following is wc in python:

#!/usr/bin/python
import sys, string
bytes, words, lines = 0, 0, 0
while 1:
    s = sys.stdin.readline()
    if not s: break
    bytes = bytes + len(s)
    words = words + len(string.split(s))
    lines = lines + 1
sys.stdout.write('%7d %7d %7d\n' % (lines, words, bytes))

And I have written custom postgres table dumpers in python without too much effort (except reverse-engineering the page structure ;) for both 6.x and 7.x database tables, so we could actually use python here too. The basic user_data extractor part is done in about 50 lines - I did not need much else, as I wrote custom datatype converters for the specific table I needed. The generic part (conversions and determining if tuples are live) should also not be too difficult. The only part I can see right away as hard to re-implement in python is TOAST. Still, I guess that the basic db_dump.py app will be somewhere between 500 and 5000 lines long, with possibly the TOAST compression module done as a C-language module, modtoast.so. The only problem with this approach is that it needs maintaining separately from postgres proper. 
OTOH, this may also be a good thing, as a separate reimplementation is the only known working guarantee that we actually know what our page format is ;) as the docs have always been wrong about this. > Again I say -- would such a data dumper not be useful in cases of system > catalog corruption that prevents a postmaster from starting? I'm talking > about a multipurpose utility here, not just something to make my life as RPM > maintainer easy. > > The pg_fsck program is a good beginning to such a program. Where can I find pg_fsck? It is not in recent CVS snapshots. ------------- Hannu
On Wednesday 10 July 2002 03:24 am, Jan Wieck wrote: > Oliver Elphick wrote: > > The current upgrade process for PostgreSQL is founded on the idea that > > people build from source. With binary distributions, half the users > > wouldn't know what to do with source; they expect (and are entitled to > I have to object here. The PostgreSQL upgrade process is based on > the idea of dump, install, initdb, restore. That has nothing to > do with building from source or installing from binaries. Let me interject a minor point here. I recall upgrade cycles where I had to install a newer pg_dump in order to get my data out of the old system, due to bugs in the prior pg_dump. Getting two versions of PostgreSQL to coexist peacefully in a binary-packaged environment is a completely different problem from the typical 'from source' installation path -- which almost implies two versions available concurrently. I believe this is the artifact Oliver was alluding to. I personally have not had the luxury of having two complete installations available at one instant during RPM upgrades. Nor will any users of prepackaged binaries. > The reason this conflicts with these package managers is that > they work package by package, instead of looking at the > big picture. Who said you can replace package A before running > the pre-upgrade script of dependent package B? How does this create the problem? The postgresql-server subpackages of two versions are 'Package A' above. There is no package B. Define 'the big picture' for all possible permutations of installed packages, please. > Somehow this looks > like a foreign key violation to me. Oh, I forgot, RI constraints > are for documentation purposes only ... Greetings from the MySQL > documentation ;-) Is sarcasm really necessary? -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
[cc: trimmed] On Wednesday 10 July 2002 03:42 am, Jan Wieck wrote: > Lamar Owen wrote: > > As a note of interest, RPM itself is backed by a database, db3. Prior to > > version 4.x, it was backed by db1. Upgrading between the versions of RPM > > is simple: install db3 and its dependencies, upgrade RPM, and run 'rpm > > --rebuilddb' -- which works most of the time, but there are pathological > > cases..... > > You are now running db3 instead of db1, if you didn't get bit by a > > pathological case. :-) > And how big/complex is the db1/3 system catalog we're talking > about exactly? Well, on a fully installed system it's about 44MB. The RPM database isn't terribly complicated, but it's not trivial, either. However, unless I am mistaken, the generic db1-to-db3 migration is straightforward. > How many rewrite rules have to be converted into > the new parsetree format during an RPM upgrade? Don't know if anything comparable exists. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Wednesday 10 July 2002 09:11 am, Hannu Krosing wrote: > On Wed, 2002-07-10 at 01:09, Lamar Owen wrote: > > The wc utility isn't in the path in an OS install situation. The df > > utility isn't in the path, either. You can use python, though. :-) Not > > that that would be a good thing in this context, however. > Why not ? > The following is wc in python [snip] > And I have written custom postgres table dumpers in python without too > much effort (except reverse-engineering the page structure ;) for both > 6.x and 7.x database tables, so we could actually use python here too. I'm willing to look into this. However, the dump still has to be pulled with a standalone backend -- no networking availability can be assumed. > The only problem with this approach is that it needs maintaining > separately from postgres proper. OTOH, this may also be a good thing, as > a separate reimplementation is only known working guarantee that we > actually know what our page format is ;) as the docs have always been > wrong about this. Well, I could deal with that. > > The pg_fsck program is a good beginning to such a program. > Where can I fing pg_fsck ? [looking in my bookmarks....] http://svana.org/kleptog/pgsql/pgfsck.html > It is not in recent CVS snapshots. Martijn hasn't submitted it yet (AFAICT) for inclusion. I believe if nothing else it should be in contrib. Contrary to some people's apparent perception, I'm actually fairly flexible on this as long as the basic points can be dealt with. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Wed, 2002-07-10 at 16:20, Lamar Owen wrote: > On Wednesday 10 July 2002 09:11 am, Hannu Krosing wrote: > > On Wed, 2002-07-10 at 01:09, Lamar Owen wrote: > > > The wc utility isn't in the path in an OS install situation. The df > > > utility isn't in the path, either. You can use python, though. :-) Not > > > that that would be a good thing in this context, however. > > > Why not ? > > > The following is wc in python > > [snip] > > > And I have written custom postgres table dumpers in python without too > > much effort (except reverse-engineering the page structure ;) for both > > 6.x and 7.x database tables, so we could actually use python here too. > > I'm willing to look into this. However, the dump still has to be pulled with > a standalone backend -- no networking availability can be assumed. Actually it works on the raw table file ;) I attach code that dumps data from the page file for a table of 4 ints, all NOT NULL, like:

create table fourints(
    i1 int not null,
    i2 int not null,
    i3 int not null,
    i4 int not null
);

The script is meant for quick-and-dirty rescue operations, and requires that one write their own data-field extractor code. I have used it mainly to resurrect accidentally deleted data. It is for the 7.x-style page file layout. ------------------- Hannu
Attachment
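For readers without the attachment, the kind of parsing such a dumper has to do can be sketched. The field widths below follow my reading of the 7.x page header (8-byte LSN, 4-byte startup ID, four 2-byte offsets, then 4-byte line pointers), but every offset, and especially the bit packing of the line pointers, should be treated as an assumption to be checked against src/include/storage/bufpage.h; as Hannu notes, the on-disk layout is platform dependent.

```python
# Illustrative only: walk the line-pointer array of one heap page and
# return (offset, length) pairs for the tuples it points at.  Header
# field widths and the line-pointer bit packing are assumptions, not a
# verified description of the real format.
import struct

def read_line_pointers(page):
    # assumed header: lsn(8) sui(4) lower(2) upper(2) special(2) opaque(2)
    lsn, sui, lower, upper, special, opaque = struct.unpack_from(
        "<QIHHHH", page, 0)
    header_size = 20
    items = []
    for off in range(header_size, lower, 4):
        (lp,) = struct.unpack_from("<I", page, off)
        lp_off = lp & 0x7FFF           # assumed: low 15 bits = offset
        lp_len = (lp >> 17) & 0x7FFF   # assumed: high 15 bits = length
        items.append((lp_off, lp_len))
    return items

# a synthetic one-item "page" built with the same assumptions,
# NOT a real PostgreSQL page:
fake = struct.pack("<QIHHHH", 0, 0, 24, 8192, 8192, 8192) \
       + struct.pack("<I", 100 | (32 << 17))
items = read_line_pointers(fake)
```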
On Wed, 2002-07-10 at 16:15, Lamar Owen wrote: > [cc: trimmed] > > On Wednesday 10 July 2002 03:42 am, Jan Wieck wrote: > > How many rewrite rules have to be converted into > > the new parsetree format during an RPM upgrade? > > Don't know if anything comparable exists. > IMHO the best solution here is to also keep the source code for anything that is usually kept as parse trees or somesuch, so that one can get at it without a full backend running. ------------- Hannu
On Wed, 2002-07-10 at 16:20, Lamar Owen wrote: > On Wednesday 10 July 2002 09:11 am, Hannu Krosing wrote: > > > The only problem with this approach is that it needs maintaining > > separately from postgres proper. OTOH, this may also be a good thing, as > > a separate reimplementation is the only known working guarantee that we > > actually know what our page format is ;) as the docs have always been > > wrong about this. > > Well, I could deal with that. Also, we must be aware that the page format is most likely platform-dependent. ----------- Hannu
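Hannu's caveat is easy to demonstrate in miniature: the same int4 value has different on-disk bytes depending on the host's byte order (and struct padding adds further variation), so a raw-page reader written against files from one platform cannot simply be pointed at files from another.

```python
# Byte-order demonstration: one value, two different on-disk encodings.
import struct

big_endian = struct.pack(">i", 1)     # b"\x00\x00\x00\x01"
little_endian = struct.pack("<i", 1)  # b"\x01\x00\x00\x00"
native = struct.pack("=i", 1)         # whichever this host uses
```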
On Wednesday 10 July 2002 11:48 am, Hannu Krosing wrote: > On Wed, 2002-07-10 at 16:20, Lamar Owen wrote: > > On Wednesday 10 July 2002 09:11 am, Hannu Krosing wrote: > > > And I have written custom postgres table dumpers in python without too > > > much effort (except reverse-engineering the page structure ;) for both > > > 6.x and 7.x database tables, so we could actually use python here too. > > I'm willing to look into this. However, the dump still has to be pulled > > with a standalone backend -- no networking availability can be assumed. > Actually it works on raw table file ;) > the script is meant for quick and dirty resque operations, and requires > that one writes their own data-field extractor code. I have used it > mainly to resurrect accidentally deleted data. > it is for 7.x style pagefile layout Hmmm. This is interesting stuff. I'll have to take a look at it once I'm finished re-learning Fortran 77 for a project I'm doing (34MB of DEC Fortran source that g77 doesn't like very well) for work. I have a hard time switching language gears. Particularly the Fortran 77 -> Python gear... :-) Although at least the fixed-form paradigm stays there in the transition. :-) It's been a very long time since I've done Fortran of this complexity. Actually, I've never done Fortran of _this_ complexity -- this is serious number-crunching stuff that uses all manners of higher math (tensors, even). There is no direct C equivalent to some of the stuff this code is doing -- which is part of the reason g77 is having problems. But I digress. Getting the %pre scriptlet to use a non-sh interpreter is undocumented, but not hard. :-) (actually, I stumbled upon it by accident one time -- that time it was a bug....) Now to see if it can be done consistently in both the anaconda chroot as well as a standard rpm command line invocation. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Wed, 2002-07-10 at 19:56, Lamar Owen wrote: > On Wednesday 10 July 2002 11:48 am, Hannu Krosing wrote: > > On Wed, 2002-07-10 at 16:20, Lamar Owen wrote: > > > On Wednesday 10 July 2002 09:11 am, Hannu Krosing wrote: > > > > And I have written custom postgres table dumpers in python without too > > > > much effort (except reverse-engineering the page structure ;) for both > > > > 6.x and 7.x database tables, so we could actually use python here too. > > > > I'm willing to look into this. However, the dump still has to be pulled > > > with a standalone backend -- no networking availability can be assumed. > > > Actually it works on raw table file ;) > > > the script is meant for quick and dirty resque operations, and requires > > that one writes their own data-field extractor code. I have used it > > mainly to resurrect accidentally deleted data. > > > it is for 7.x style pagefile layout > > Hmmm. This is interesting stuff. I'll have to take a look at it once I'm > finished re-learning Fortran 77 for a project I'm doing (34MB of DEC Fortran > source that g77 doesn't like very well) for work. I have a hard time > switching language gears. Particularly the Fortran 77 -> Python gear... :-) > Although at least the fixed-form paradigm stays there in the transition. :-) > It's been a very long time since I've done Fortran of this complexity. > Actually, I've never done Fortran of _this_ complexity -- this is serious > number-crunching stuff that uses all manners of higher math (tensors, even). > There is no direct C equivalent to some of the stuff this code is doing -- > which is part of the reason g77 is having problems. But I digress. 
Once you understand what the code is doing you can port it to python using Numerical Python (http://www.pfdubois.com/numpy/) and/or Scientific Python (http://starship.python.net/~hinsen/ScientificPython/) to get a head-start in a total conversion to python ;) You may even try using F2PY, the Fortran to Python Interface Generator (http://cens.ioc.ee/projects/f2py2e/). > Getting the %pre scriptlet to use a non-sh interpreter is undocumented, but > not hard. :-) (actually, I stumbled upon it by accident one time -- that > time it was a bug....) Now to see if it can be done consistently in both the > anaconda chroot as well as a standard rpm command line invocation. Actually, if the python dumper can be made to work somewhat reliably it can be run after install/upgrade without too much trouble. -------------- Hannu
On Wednesday 10 July 2002 10:26 am, Hannu Krosing wrote: > Actually, if the python dumper can be made to work somewhat reliably it > can be run after install/upgrade without too much trouble. Yes, yes, of course. My bad -- brain needs oil change... :-) Thanks for the links to the python stuff, particularly the fortran-to-python translator. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
Lamar Owen wrote:
> On Wednesday 10 July 2002 03:24 am, Jan Wieck wrote:
> > The problem why this conflicts with these package managers is
> > that they work package by package, instead of looking at the
> > big picture. Who said you can replace package A before running
> > the pre-upgrade script of dependent package B?
>
> How does this create the problem? The postgresql-server subpackages of two
> versions are 'Package A' above. There is no package B.

Someone was talking about doing a complete OS upgrade and updating
something that the new PG release (which is scheduled for update later)
needs, but which makes the current old release no longer functional. Maybe
I misunderstood something.

> Define 'the big picture' for all possible permutations of installed
> packages, please.

Got me on that. Sure, with all the possible permutations there is always an
unsolvable dependency somewhere. What I think is that, knowing all the
packages that are installed and all that are to be added/removed/updated,
it would be possible to run the pre-install, pre-update and pre-remove
scripts for all packages first. They have to clean up, save info and the
like (dump in our case, maybe install a new version of pg_dump runnable in
the old environment), but NOT disable functionality of any package. Second,
install all binaries. Third, run a second round of scripts for all
packages, finalizing each package's action.

> > Somehow this looks
> > like a foreign key violation to me. Oh, I forgot, RI constraints
> > are for documentation purposes only ... Greetings from the MySQL
> > documentation ;-)
>
> Is sarcasm really necessary?

Really really! I am dependent on it. If I don't get my daily dose of
sarcasm, I become extremely ironic or sometimes cynical.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
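Jan's three-pass ordering can be written out as pseudocode for a
hypothetical package manager (this is a sketch of the proposal, not how rpm
actually sequences a transaction, and the phase names are made up):

```python
def three_pass_upgrade(packages):
    """Run every package's preparation script before ANY binary is
    replaced, so old tools like pg_dump stay usable throughout phase one.
    Purely a sketch of the proposal in the mail above."""
    log = []
    for pkg in packages:
        log.append(("pre", pkg))      # dump data, save state; disable nothing
    for pkg in packages:
        log.append(("install", pkg))  # only now are binaries swapped
    for pkg in packages:
        log.append(("post", pkg))     # finalize: e.g. initdb and restore
    return log
```

The point of the ordering is visible in the log it produces: every "pre"
entry precedes every "install" entry, so no package's preparation step can
find another package already half-upgraded.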
On Wednesday 10 July 2002 04:42 pm, Jan Wieck wrote:
> Lamar Owen wrote:
> > On Wednesday 10 July 2002 03:24 am, Jan Wieck wrote:
> > > The problem why this conflicts with these package managers is
> > > that they work package by package, instead of looking at the
> > > big picture. Who said you can replace package A before running
> > > the pre-upgrade script of dependent package B?
> >
> > The postgresql-server subpackages of
> > two versions are 'Package A' above. There is no package B.
>
> Someone was talking about doing a complete OS upgrade and updating
> something that the new PG release (which is scheduled for update later)
> needs, but which makes the current old release no longer functional.
> Maybe I misunderstood something.

Yes, you misunderstood. The whole release is upgraded, and it's the
database itself that breaks. How is the package manager supposed to know
you had to make a backup copy of an executable in order to cater to the
broken upgrade cycle? (Is that sarcastic enough? :-) ... seeing that you
like your daily dose :-)) The backup executable no longer 'belongs' to any
package as far as the rpm database is concerned.

Suppose the upgrade in question was from PostgreSQL 7.0.3-2 to 7.2.1-5
(the '-2' and '-5' are the release numbers of that particular RPM set -- a
version number for the package independent of the upstream program). The
backend itself belongs to package 'postgresql-server' in both versions.
After checking that post-installation dependencies will be satisfied by
its actions, the upgrade proceeds to install postgresql-server-7.2.1-5,
which has a %pre scriptlet that makes a copy of /usr/bin/postgres and
links, along with libpq and pg_dump for that version, into
/usr/lib/pgsql/backups (IIRC -- it's been a long day and I haven't checked
the accuracy of that detail). Said %pre scriptlet runs, then rpm unpacks
its payload, a cpio archive containing the files of the package.
/usr/bin/postgres is one of the files overwritten in this process.
There could be trigger scripts installed by other packages run at this
time. Then the %post scriptlet is run, which in practice creates a
postgres user and group, chowns a few directories, and runs ldconfig to
get any new shared libraries.

Now the postgresql-server-7.0.3-2 package gets uninstalled. First, the
%preuninst scriptlet runs. Note that a conditional is available to
distinguish between an upgrade 'uninstall' and a real uninstall. Then any
registered triggers are run. Then any non-overwritten files are removed,
and the database entries for 7.0.3-2 are removed. Finally, the %postuninst
scriptlet runs. You now have the new package in place.

During an OS upgrade the dependencies are finagled in such a way that the
'satisfied dependencies' for postgresql-server-7.0.3-2, which is going to
be replaced by postgresql-server-7.2.1-5's, won't be required any more.
Unless another package requires the various shared libraries the 7.0.3-2
backend required, those shared libraries may get 'upgraded' out of the way
-- the scriptlets have no way of communicating to the upgrade process
'hey! hold on to the dependency information for postgresql-server-7.0.3-2,
even though that package is no longer marked as being installed.'

Whew.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
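The sequence Lamar walks through, condensed into one list. This is only a
restatement of the mail above; consult the rpm documentation for the
authoritative ordering and for exactly where triggers fire:

```python
# Upgrade event order as described in this thread (illustrative, not
# authoritative -- check rpm's own docs before relying on it):
UPGRADE_SEQUENCE = [
    "new %pre",            # backup copies made; old binaries still in place
    "unpack new payload",  # /usr/bin/postgres overwritten here
    "triggers",            # other packages' trigger scripts may fire
    "new %post",           # create postgres user/group, chown, ldconfig
    "old %preuninst",      # a conditional tells upgrade from real uninstall
    "old triggers",
    "remove old files and db entries",
    "old %postuninst",     # new package now fully in place
]
```

The trap for the scriptlets follows directly from this ordering: by the
time the old package's scriptlets run, the new payload has already
overwritten the shared binaries.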