Thread: Re: Re: [GENERAL] 7.0 vs. 7.1 (was: latest version?)
> Ok, here goes:

Cool, a list.

> * Location-agnostic installation. Documentation (which I'll be happy to
> contribute) on that. Peter E is already working in this area. Getting
> the installation that 'make install' spits out massaged into an FHS
> compliant setup is the majority of the RPM's spec file.

Well, we certainly don't want to make changes that make things harder or
more confusing for non-RPM installs. How are they affected here?

> * Upgrades that don't require an ASCII database dump for migration. This
> can either be implemented as a program to do a pg_dump of an arbitrary
> version of data, or as a binary migration utility. Currently, I'm
> saving old executables to run under a special environment to pull a dump
> -- but it is far from optimal. What if the OS upgrade behind 99% of the
> upgrades makes it so those old executables can't run due to binary
> incompatibility (say I'm going from RedHat 3.0.3 to RedHat 7 -- 3.0.3,
> IIRC, was a.out, and I know 3.0.3 didn't have PostgreSQL RPMs)? What I
> could actually do to prevent that problem is build all of PostgreSQL's
> 6.1.x, 6.2.x, 6.3.x, 6.4.x, and 6.5.x and include the necessary backend
> executables as part of the RPM.... But I think you see the problem
> there. However, that would in my mind be better than the current
> situation, albeit taking up a lot of space.

I really don't see the issue here. We can compress ASCII dump files, so
the space requirement should not be too bad. Can't you just check to see
if there is enough space, and error out if there is not? If the 2 GB
limit is a problem, can't the split utility drop the files in <2 GB
chunks that can be pasted together in a pipe on reload?

> * A less source-centric mindset. Let's see, how to explain? The
> regression tests are a good example. You need make. You need the source
> installed, configured, and built in the usual location. You need
> portions of contrib. RPM's need to be installable on compiler-crippled
> servers for security. While the demand for regression testing on such a
> box may not be there, it certainly does give the user something to use
> to get standard output for bug reports. As a point, I run PostgreSQL in
> production on a compilerless machine. No compiler == more security.
> And Linux has enough security problems without a compiler being
> available :-(. Oh, and I have no make on that machine either.

Well, no compiler? I can't see how we would do that without making other
OS installs harder. That is really the core of the issue. We can't be
making changes that make things harder for other OS's. Those have to be
isolated in the RPM, or in some other middle layer.

> The documentation as well as many of the examples assume too much, IMHO,
> about the install location and the install methodology.

Well, if we are not specific, things get very confusing for those other
OS's. Being specific about locations makes things easier. Seems we may
need to patch RPM installs to fix that. Certainly a pain, but I see no
other options.

> I think I may have a solution for the library versioning problem.
> Rather than symlink libpq.so -> libpq.so.2 -> libpq.so.2.x, I'll copy
> libpq.so.2.1 to libpq.so.2 and symlink libpq.so to that. A little more
> code for me. There is no real danger of version confusion with RPM's
> versioning and upgrade methodology, as long as you consistently use the
> RPMset. The PostgreSQL version number is readily found from an RPM
> database query, making the so version immaterial.

Oh, that is good.
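[For illustration, the copy-plus-symlink arrangement just described
would amount to something like the following; the library paths are
assumed:]

    # Make the soname-level name a real file instead of a symlink, so
    # an upgrade that replaces libpq.so.2.1 can never leave the middle
    # link dangling:
    cp /usr/lib/libpq.so.2.1 /usr/lib/libpq.so.2
    ln -sf libpq.so.2 /usr/lib/libpq.so
    ldconfig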
> The upgrade issue is the hot trigger for me at this time. It is and has
> been a major drain on my time and effort, as well as Trond's and
> others', to get the RPM upgrade working even remotely smoothly. And I
> am willing to code -- once I know how to go about doing it in the
> backend.

Please give us more information about how the current upgrade is a
problem. We don't hear that much from other OS's. How are RPM's
specific, and maybe we can get a plan for a solution.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
[Since I've rested over the weekend, I hope I don't come across this
morning as an angry old snarl, like some of my previous posts on this
subject unfortunately have been.]

Bruce Momjian wrote:
> > * Location-agnostic installation. Documentation (which I'll be happy to
> > contribute) on that. Peter E is already working in this area. Getting
> > the installation that 'make install' spits out massaged into an FHS
> > compliant setup is the majority of the RPM's spec file.

> Well, we certainly don't want to make changes that make things harder or
> more confusing for non-RPM installs. How are they affected here?

They wouldn't be. Peter E has seemingly done an excellent job in this
area. I say seemingly because I haven't built an RPM from the 7.1 branch
yet, but from what he has posted, he seems to understand the issue. Many
thanks, Peter.

> > * Upgrades that don't require an ASCII database dump for migration. This
> > can either be implemented as a program to do a pg_dump of an arbitrary
> > version of data, or as a binary migration utility. Currently, I'm

> I really don't see the issue here.

At the risk of being redundant, here goes. As I've explained before, the
RPM upgrade environment, thanks to our standing with multiple
distributions as being shipped as a part of the OS, could be run as part
of a general-purpose OS upgrade. In the environment of the general
purpose OS upgrade, the RPM's installation scripts cannot fire up a
backend, nor can they assume one is or is not running, nor can the RPM
installation scripts fathom from the run-time environment whether they
are being run from a command line or from the OS upgrade (except on
Linux Mandrake, which allows such usage).

Thus, if a system administrator upgrades a system, or if an end user who
has a pgaccess-customized data entry system for things as mundane as an
address list or recipe book does so, there is no opportunity to do a
dump. The dump has to be performed _after_ the RPM upgrade.

Now, this is far from optimal, I know. I _know_ that the user should
take pains with their data. I know that there should be a backup. I also
know that a user of PostgreSQL should realize that 'this is just the way
it is done' and do things Our Way. I also know that few new users will
do it 'Our Way'. No other package that I am aware of requires the manual
intervention that PostgreSQL does, with the possible exception of
upgrading to a different filesystem -- but that is something most new
users won't do, and is something that is more difficult to automate.

However, over the weekend, while resting (I did absolutely NO computer
work this weekend -- too close to burnout), I had a brainstorm. A binary
migration tool does not need to be written, if a concession to the needs
of some users who simply want to upgrade can be made. Suppose we can
package old backends (with newer network code to connect to new
clients). Suppose further that postmaster can be made intelligent enough
to fire up old backends for old data, using PG_VERSION as a key. Suppose
a NOTICE can be fired off warning the user that 'The database is running
in Compatibility Mode -- some features may not be available. Please
perform a dump of your data, reinitialize the database, and restore your
data to access new features of version x.y'.

I'm seriously considering doing just that from a higher level. It will
not be nearly as smooth, but it is doable. Of course, that increases
maintenance work, and I know it does.
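[As a rough sketch, the dispatch imagined above could live in a small
wrapper around the postmaster. Everything here -- the paths, the compat
directory layout, the version numbers -- is invented for illustration:]

    #!/bin/sh
    # Hypothetical postmaster wrapper: choose a server binary that
    # matches the on-disk data format, keyed off PGDATA/PG_VERSION.
    PGDATA=${PGDATA:-/var/lib/pgsql/data}
    ver=`cat "$PGDATA/PG_VERSION"`
    case "$ver" in
        7.1)
            exec /usr/bin/postmaster.real "$@"
            ;;
        *)
            echo "NOTICE: data is from version $ver; starting in" >&2
            echo "Compatibility Mode. Please dump, reinitialize," >&2
            echo "and restore to use new features." >&2
            exec /usr/lib/pgsql/compat/$ver/postmaster "$@"
            ;;
    esac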
But I'm trying to find a middle ground here, since providing a true
migration utility (even if it just produces a dump of the old data)
seems out of reach at this time.

We are currently forcing something like what a popular word processing
program once did -- its proprietary file format changed, and it was
coded so that it could not even read the old files. But both the old and
the new versions could read and write an interchange format. People who
blindly upgraded their word processor were hit with a major problem.
There was even a notice in the README -- which could be read after the
program was installed.

While the majority of us use PostgreSQL as a server behind websites and
other clients, there will be a large number of new users who want to use
it for much more mundane tasks. Like address books, or personal
information management, or maybe even tax records. Frontends to
PostgreSQL, thanks to PostgreSQL's advanced features, are likely to span
the gamut -- we already have OnShore TimeSheet for time tracking and
payroll, as one example. And I even see database-backed intranet-style
web scripts being used on a client workstation for these sorts of
things. I personally do just that with my home Linux box -- I have a
number of AOLserver dynamic pages that use PostgreSQL for many mundane
tasks (a multilevel sermon database is one).

While I don't need handholding in the upgrade process, I have provided
support to users that do -- who are astonished at the way we upgrade.
Seamless upgrading won't help me personally -- but it will help
multitudes of users, not just RPM users. As a newbie to PostgreSQL I was
bitten, giving me compassion for those who might be bitten.

> We can compress ASCII dump files, so
> the space requirement should not be too bad.

Space isn't the problem. The procedure is the problem. Even if the user
fails to do it Right, we should at least attempt to help them recover,
IMHO.

> > * A less source-centric mindset. Let's see, how to explain? The
> > regression tests are a good example. You need make. You need the source
[snip]
> > it certainly does give the user something to use
> > to get standard output for bug reports. As a point, I run PostgreSQL in

> Well, no compiler? I can't see how we would do that without making
> other OS installs harder. That is really the core of the issue. We
> can't be making changes that make things harder for other OS's. Those
> have to be isolated in the RPM, or in some other middle layer.

And I've done that in the past with the older serialized regression
tests.

I don't see how executing a shell script instead of executing a make
command would make it any harder for other OS users. I am not trying to
make it harder for other OS users. I _am_ trying to make it easier for
users who are getting funny results from queries to be able to run
regression tests as a standardized way to see where the problem lies.
Maybe there is a hardware issue -- regression testing might be the only
standard way to pinpoint the problem. And telling someone who is having
a problem with prepackaged binaries 'Run the regression tests by
executing the script /usr/lib/pgsql/tests/regress/regress.sh and pipe me
the results' is much easier on a new user than 'Find me a test case
where this blows up, and pipe me a backtrace/dump/whatever'. Plus that
regression output is a known quantity.

Or, to put it in a soundbite: regression testing can be the user's best
bug-zapping friend.
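[A make-free driver of the kind mentioned above could be quite small.
The following is only a sketch: the install location, the database name,
and a one-test-per-line schedule file are all assumptions, not the
actual regression harness:]

    #!/bin/sh
    # Run the serial regression tests against an installed PostgreSQL,
    # using nothing but a shell, psql, and diff.
    TESTDIR=/usr/lib/pgsql/tests/regress
    createdb regression || exit 1
    for t in `cat $TESTDIR/serial_schedule`; do
        psql -q regression < $TESTDIR/sql/$t.sql > /tmp/$t.out 2>&1
        if diff $TESTDIR/expected/$t.out /tmp/$t.out > /dev/null; then
            echo "$t .. ok"
        else
            echo "$t .. FAILED (see /tmp/$t.out)"
        fi
    done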
> > The documentation as well as many of the examples assume too much, IMHO,
> > about the install location and the install methodology.

> Well, if we are not specific, things get very confusing for those other
> OS's. Being specific about locations makes things easier. Seems we may
> need to patch RPM installs to fix that. Certainly a pain, but I see no
> other options.

I can do that, I guess. I currently ship the README.rpm as part of the
package -- but I continue to hear from people who have not read it, but
have read the online docs. I have even put the unpacked source RPM up on
the ftp site so that people can read the README right online.

> Please give us more information about how the current upgrade is a
> problem. We don't hear that much from other OS's. How are RPM's
> specific, and maybe we can get a plan for a solution.

RPM's are expected to 'rpm -U' and you can simply _use_ the new version,
with little to no preparation. At least that is the theory. And it works
for most packages.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen writes:

> In the environment of the general purpose OS upgrade, the RPM's
> installation scripts cannot fire up a backend, nor can they assume one
> is or is not running, nor can the RPM installation scripts fathom from
> the run-time environment whether they are being run from a command
> line or from the OS upgrade (except on Linux Mandrake, which allows
> such usage).

I don't understand why this is so. It seems perfectly possible that some
%preremovebeforeupdate starts a postmaster, runs pg_dumpall, saves the
file somewhere, then the %postinstallafterupdate runs the inverse
operation. Disk space is not a valid objection; you'll never get away
without 2x storage. Security is not a problem either. Are you not
upgrading in proper dependency order or what? Everybody does dump,
remove, install, undump; so can the RPMs.

Okay, so it's not as great as a new KDE starting up and asking "may I
update your configuration files?", but understand that the storage
format is optimized for performance, not for easy processing by external
tools or the like.

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
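[Concretely, this suggestion amounts to something like the scriptlet
below. The %preremovebeforeupdate/%postinstallafterupdate names above
are informal; real RPM scriptlets are %pre and %post, which receive the
installed-instance count in $1, so $1 greater than 1 means an upgrade.
The paths are illustrative, pg_ctl is assumed to exist in the old
version (7.0 or later), and the next message explains why this approach
falls apart under the installer:]

    %pre server
    # On upgrade only, dump the old cluster with the still-installed
    # old binaries before any of their files are replaced.
    if [ "$1" -gt 1 ] && [ -f /var/lib/pgsql/data/PG_VERSION ]; then
        su -l postgres -c "/usr/bin/pg_ctl -w -D /var/lib/pgsql/data start"
        su -l postgres -c "/usr/bin/pg_dumpall > /var/lib/pgsql/pre-upgrade.sql"
        su -l postgres -c "/usr/bin/pg_ctl -D /var/lib/pgsql/data stop"
    fi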
[An explanation of why an RPM cannot dump a database during upgrade
follows. This is a lengthy explanation. If you don't want to read it,
please hit 'Delete' now. -- Also, I have blind copied Hackers, and cc:'d
PORTS, as that is where this discussion belongs, per Bruce's wishes.]

Peter Eisentraut wrote:
> Lamar Owen writes:
> > In the environment of the general purpose OS upgrade, the RPM's
> > installation scripts cannot fire up a backend, nor can they assume one

> I don't understand why this is so. It seems perfectly possible that some
> %preremovebeforeupdate starts a postmaster, runs pg_dumpall, saves the
> file somewhere, then the %postinstallafterupdate runs the inverse
> operation. Disk space is not a valid objection; you'll never get away
> without 2x storage. Security is not a problem either. Are you not
> upgrading in proper dependency order or what? Everybody does dump,
> remove, install, undump; so can the RPMs.

The RedHat installer (anaconda) runs in a terribly picky environment.
There are very few tools in this environment -- after all, this is an
installer we're talking about here. Starting a postmaster is likely to
fail, and fail big.

Further, the anaconda install environment is a chroot -- or, at least,
the environment the RPM scriptlets run in is a chroot -- a chroot into
the active filesystem that is being upgraded. This filesystem likely
contains old libraries, old executables, and other programs that may
have a hard time running under the limited installation kernel and the
limited libraries available to the installer. And since packages are
actively discouraged from probing whether they're running in the
anaconda chroot or not, it is not possible to start a postmaster.

Mandrake allows packages to probe this -- which I personally think is a
bad idea -- packages that need to know this sort of information are
usually packages that would be better off finding a least common
denominator upgrade path that will work the best. A single upgrade path
is much easier to maintain than two upgrade paths. Sure, during a
command line upgrade, I can probe for a postmaster, and even start one
-- but I dare say the majority of PostgreSQL RPM upgrades don't happen
from the command line. Even if I _can_ probe whether I'm in the anaconda
chroot or not, I _still_ have to have an upgrade path in case this _is_
an OS upgrade.

Think about it: suppose I had a postmaster start up, and a pg_dumpall
run during the OS upgrade. Calculating free space is not possible -- you
are in the middle of an OS upgrade, and more packages may be selected
for installation than are already installed -- or an upgrade to an
existing package may take more space than the previous version (XFree86
3.3.6 to XFree86 4.0.1 is a good example) -- you have no way of knowing,
from the RPM installation scripts in the package, how much free space
there will or won't be when the upgrade is complete. And anaconda
doesn't help you out with an ESTIMATED_SPACE_AFTER_INSTALL environment
variable.

And you really can't assume 2x space -- the user may have decided that
this machine that didn't have TeX installed needs TeX installed, and
Emacs, and, while it didn't have GNOME before, it needs it now.....
Sure, the user just got himself into a pickle -- but I'm not about to be
the scapegoat for _his_ pickle. And I can't assume that the /var
partition (where the dataset resides) is separate, or that it even has
enough space -- the user might be dumping to another filesystem, or
maybe onto tape.
And, in the confines of an RPM %pre scriptlet, I have no way of finding
out. Furthermore, I can't accurately predict how much space even a
compressed ASCII dump will take. Calculating the size of the dataset in
PGDATA does not accurately predict the size of the dumpfile. As to using
split or the like to split huge dumpfiles, that is a necessity -- but
the space calculation problem defeats the whole concept of
dump-during-upgrade. I cannot determine how much space I have, and I
cannot determine how much space I need -- and if I overflow the
filesystem during an OS upgrade that is halfway complete (PostgreSQL is
usually upgraded about two thirds of the way through, or so), then I
leave the user with a royally hosed system. I don't want that on my
shoulders, do you? :-)

Therefore, the pg_dumpall _has_ to occur _after_ the new version has
overwritten the old version, and _after_ the OS upgrade is completed --
unless the user has done what they should have done to begin with --
but the fact of the matter is that many users simply won't do it Right.
You can't assume the user is going to be reasonable by your standard --
in fact, you have to assume the opposite -- your standard of reasonable
and the user's standard of reasonable might be totally different things.

Incidentally, I originally attempted doing the dump inside the
preinstall, and found it to be an almost impossible task. The above
problems might be solvable, but then there's this little wrinkle: what
if you _are_ able to predict the space needed and the space available --
and there's not enough space available? The PostgreSQL RPM's are not a
single package, and anaconda has no way of rolling back another part of
an RPMset's installation if one part fails. So you can't just abort
because you failed to dump -- the package that needs the dump is the
server subpackage, and the main package has already finished
installation by that time. And you can't roll it back. And the user has
a hosed PostgreSQL installation as a result.

As to why the package is split: well, it is highly useful to many people
to have a PostgreSQL _client_ installation that accesses a central
database server -- there is no need to have a postmaster and a full
backend when all you need is psql and the libraries and documentation
that go along with psql.

RPM's have to deal with both a very difficult environment and users who
might not be as technically savvy as those who install from source.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
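[For reference, the compress-and-split approach both messages touch on
is mechanically simple; the sticking point above is only the space
estimation. A sketch, with illustrative paths:]

    # Dump, compress, and chop into pieces safely under the 2 GB
    # file-size limit:
    pg_dumpall | gzip | split -b 1024m - /backup/pgdump.gz.

    # On reload, paste the pieces back together in a pipe:
    cat /backup/pgdump.gz.* | gunzip | psql template1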
> > Well, we certainly don't want to make changes that make things harder or
> > more confusing for non-RPM installs. How are they affected here?

> They wouldn't be. Peter E has seemingly done an excellent job in this
> area. I say seemingly because I haven't built an RPM from the 7.1 branch
> yet, but from what he has posted, he seems to understand the issue.
> Many thanks, Peter.

OK, glad that is done.

> > > * Upgrades that don't require an ASCII database dump for migration. This
> > > can either be implemented as a program to do a pg_dump of an arbitrary
> > > version of data, or as a binary migration utility. Currently, I'm

> > I really don't see the issue here.

> At the risk of being redundant, here goes. As I've explained before,
> the RPM upgrade environment, thanks to our standing with multiple
> distributions as being shipped as a part of the OS, could be run as part
> of a general-purpose OS upgrade. In the environment of the general
> purpose OS upgrade, the RPM's installation scripts cannot fire up a
> backend, nor can they assume one is or is not running, nor can the RPM
> installation scripts fathom from the run-time environment whether they
> are being run from a command line or from the OS upgrade (except on
> Linux Mandrake, which allows such usage).

OK, maybe doing it in an RPM is the wrong way to go. If an old version
exists, maybe the RPM is only supposed to install the software in a
saved location, and the user must execute a command after the RPM
install that starts the old postmaster, does the dump, puts the new
PostgreSQL server in place, and reloads the database.

> > Well, no compiler? I can't see how we would do that without making
> > other OS installs harder. That is really the core of the issue. We
> > can't be making changes that make things harder for other OS's. Those
> > have to be isolated in the RPM, or in some other middle layer.

> And I've done that in the past with the older serialized regression
> tests.

> I don't see how executing a shell script instead of executing a make
> command would make it any harder for other OS users. I am not trying to
> make it harder for other OS users. I _am_ trying to make it easier for
> users who are getting funny results from queries to be able to run
> regression tests as a standardized way to see where the problem lies.
> Maybe there is a hardware issue -- regression testing might be the only
> standard way to pinpoint the problem.

You are basically saying that because you can ship without a compiler
sometimes, we are supposed to change the way our regression tests work.
Let's suppose SCO says they don't ship with a compiler, and want us to
change our code to accommodate it. Should we? You can be certain we
would not, and in the RPM case, you get the same answer. If the patch is
trivial, we will work around OS limitations, but we do not redesign code
to work around OS limitations. We expect the OS to get the proper
features. That is what we do with NT: Cygwin provides the needed
features.

> > Please give us more information about how the current upgrade is a
> > problem. We don't hear that much from other OS's. How are RPM's
> > specific, and maybe we can get a plan for a solution.

> RPM's are expected to 'rpm -U' and you can simply _use_ the new version,
> with little to no preparation. At least that is the theory. And it
> works for most packages.

This is the "Hey, other people can do it, why can't you" issue.
We are looking for suggestions from Linux users on how this can be done.
Perhaps running a separate command after the RPM has been installed is
the only way to go.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
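[A run-it-yourself migration command of the sort proposed above might
look roughly like this. The saved-binaries location, the dump path, and
the availability of pg_ctl in the old version (7.0 or later) are all
assumptions:]

    #!/bin/sh
    # Hypothetical post-upgrade migration: dump with the saved old
    # binaries, then load into a freshly initialized new cluster.
    OLD=/usr/lib/pgsql/backup            # old binaries saved by %pre
    DATA=/var/lib/pgsql/data
    DUMP=/var/lib/pgsql/pre-upgrade.sql

    su -l postgres -c "$OLD/bin/pg_ctl -w -D $DATA start"
    su -l postgres -c "$OLD/bin/pg_dumpall > $DUMP"
    su -l postgres -c "$OLD/bin/pg_ctl -D $DATA stop"

    # then: move $DATA aside, initdb a fresh $DATA as user postgres,
    # start the new postmaster, and reload with:
    #     psql -f $DUMP template1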
Bruce Momjian wrote:
> > At the risk of being redundant, here goes. As I've explained before,
> > the RPM upgrade environment, thanks to our standing with multiple
> > distributions as being shipped as a part of the OS, could be run as part

> OK, maybe doing it in an RPM is the wrong way to go. If an old version
> exists, maybe the RPM is only supposed to install the software in a
> saved location, and the user must execute a command after the RPM
> install that starts the old postmaster, does the dump, puts the new
> PostgreSQL server in place, and reloads the database.

That's more or less what's being done now. The RPM's preinstall script
(run before any files are overwritten from the new package) backs up the
required executables from the old installation. The RPM then overwrites
the necessary files, and any old files left over are removed, along with
their records in the RPM database.

A script (actually, two scripts, due to a bug in the first one) is
provided to dump the database using the old executables. Which works OK
as long as the new OS release is executable-compatible with the old
release. Oliver Elphick originally wrote the script for the Debian
packages, and I adapted it to the RPM environment.

However, the dependency upon the new version of the OS being able to run
the old executables could be a killer in the future if executable
compatibility is removed -- after all, an upgrade might not be from the
immediately prior version of the OS.

> You are basically saying that because you can ship without a compiler
> sometimes, we are supposed to change the way our regression tests work.
> Let's suppose SCO says they don't ship with a compiler, and want us to
> change our code to accommodate it. Should we? You can be certain we
> would not, and in the RPM case, you get the same answer.

> If the patch is trivial, we will work around OS limitations, but we do
> not redesign code to work around OS limitations. We expect the OS to
> get the proper features. That is what we do with NT: Cygwin provides
> the needed features.

No, I'm saying that someone running any OS might want to do this:

1.) They have two machines, a development machine and a production
machine. Due to budget constraints, the dev machine is an el cheapo
version of the production machine (for the sake of argument, let's say
dev is a bare-bones Sun Ultra 1 workstation, and production is a
high-end SMP Ultra server, both running the same version of Solaris).

2.) For greater security, the production machine has been severely
crippled WRT development tools -- if a cracker gets in, don't give him
any ammunition. Good procedure to follow for publicly exposed database
servers, like those that sit behind websites. Requiring such a server to
have a development system installed is a misfeature, IMHO.

3.) After compiling and testing PostgreSQL on dev, the user transfers
only the binaries over to production. All is well, at first.

4.) But then the load on production goes up -- and PostgreSQL starts
spitting errors and FATALs. The problem cannot be duplicated on the dev
machine -- it looks like a Solaris SMP issue.

5.) The user decides to run regression on production in parallel mode to
help debug the problem -- but cannot figure out how to do so without
installing make and other development tools on it, when he specifically
did not want those tools on there for security. Serial regression, which
is easily started in a no-make mode, doesn't expose the problem.
All I'm saying is that regression should be _runnable_ in all modes
without needing anything but a shell and the PostgreSQL binary
installation. This is the problem -- it is not OS-specific.

> > RPM's are expected to 'rpm -U' and you can simply _use_ the new version,
> > with little to no preparation. At least that is the theory. And it
> > works for most packages.

> This is the "Hey, other people can do it, why can't you" issue. We are
> looking for suggestions from Linux users on how this can be done.
> Perhaps running a separate command after the RPM has been installed is
> the only way to go.

It's not really an RPM issue -- it's a PostgreSQL issue. There have been
e-mails from users of other OS's -- even those that compile from source
-- expressing a desire for a smoother upgrade cycle. The RPM's, Debian
packages, and other binary packages just put the extant problem in
starker contrast. Until that occurs, I'll just have to continue doing
what I'm doing -- which I consider a stop-gap, not a solution.

And, BTW, welcome back from the summit. I heard that there was a little
'excitement' there :-).

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
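[The preinstall backup step described two messages up boils down to
something like the following sketch; the list of saved programs and the
paths are assumed, and the real spec file has more error checking:]

    %pre server
    # Before the new files overwrite the old, save the executables
    # needed to pull a dump from the old-format cluster.
    if [ -f /var/lib/pgsql/data/PG_VERSION ]; then
        mkdir -p /usr/lib/pgsql/backup/bin
        for f in postgres postmaster pg_dump pg_dumpall psql; do
            [ -x /usr/bin/$f ] && cp -p /usr/bin/$f /usr/lib/pgsql/backup/bin/
        done
    fi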
> Bruce Momjian wrote:
> > > At the risk of being redundant, here goes. As I've explained before,
> > > the RPM upgrade environment, thanks to our standing with multiple
> > > distributions as being shipped as a part of the OS, could be run as part

> > OK, maybe doing it in an RPM is the wrong way to go. If an old version
> > exists, maybe the RPM is only supposed to install the software in a
> > saved location, and the user must execute a command after the RPM
> > install that starts the old postmaster, does the dump, puts the new
> > PostgreSQL server in place, and reloads the database.

> That's more or less what's being done now. The RPM's preinstall script
> (run before any files are overwritten from the new package) backs up the
> required executables from the old installation. The RPM then overwrites
> the necessary files, and any old files left over are removed, along with
> their records in the RPM database.
>
> A script (actually, two scripts, due to a bug in the first one) is
> provided to dump the database using the old executables. Which works OK
> as long as the new OS release is executable-compatible with the old
> release. Oliver Elphick originally wrote the script for the Debian
> packages, and I adapted it to the RPM environment.
>
> However, the dependency upon the new version of the OS being able to run
> the old executables could be a killer in the future if executable
> compatibility is removed -- after all, an upgrade might not be from the
> immediately prior version of the OS.

That is a tough one. I see your point. How would the RPM do this anyway?
It is running the same version of the OS, right? Did they move the data
files from the old OS to the new OS, and now they want to upgrade? Hmm.

> > You are basically saying that because you can ship without a compiler
> > sometimes, we are supposed to change the way our regression tests work.
> > Let's suppose SCO says they don't ship with a compiler, and want us to
> > change our code to accommodate it. Should we? You can be certain we
> > would not, and in the RPM case, you get the same answer.

> > If the patch is trivial, we will work around OS limitations, but we do
> > not redesign code to work around OS limitations. We expect the OS to
> > get the proper features. That is what we do with NT: Cygwin provides
> > the needed features.

> No, I'm saying that someone running any OS might want to do this:
>
> 1.) They have two machines, a development machine and a production
> machine. Due to budget constraints, the dev machine is an el cheapo
> version of the production machine (for the sake of argument, let's say
> dev is a bare-bones Sun Ultra 1 workstation, and production is a
> high-end SMP Ultra server, both running the same version of Solaris).

Yes, but if we added capabilities every time someone wanted something so
it worked better in their environment, this software would be a mess,
right?

> > This is the "Hey, other people can do it, why can't you" issue. We are
> > looking for suggestions from Linux users on how this can be done.
> > Perhaps running a separate command after the RPM has been installed is
> > the only way to go.

> It's not really an RPM issue -- it's a PostgreSQL issue. There have been
> e-mails from users of other OS's -- even those that compile from source
> -- expressing a desire for a smoother upgrade cycle. The RPM's, Debian
> packages, and other binary packages just put the extant problem in
> starker contrast.
> Until that occurs, I'll just have to continue doing what I'm doing --
> which I consider a stop-gap, not a solution.

Yes, we all agree upgrades should be smoother. The problem is that the
cost/benefit analysis has always pushed us away from improving it.

> And, BTW, welcome back from the summit. I heard that there was a little
> 'excitement' there :-).

Yes, it was very nice. I will post a summary to announce/general today.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
> > However, the dependency upon the new version of the OS being able to run
> > the old executables could be a killer in the future if executable
> > compatibility is removed -- after all, an upgrade might not be from the
> > immediately prior version of the OS.

> That is a tough one. I see your point. How would the RPM do this
> anyway? It is running the same version of the OS, right? Did they move
> the data files from the old OS to the new OS, and now they want to
> upgrade? Hmm.

Well, let's suppose the following: J. Random User has a database server
that has been running smoothly for ages on RedHat 5.2, running
PostgreSQL 6.3.2. He has had no reason to upgrade since -- while MVCC
was a nice feature, he was really waiting for OUTER JOIN before
upgrading, as his server is lightly loaded and won't benefit greatly
from MVCC. Likewise, he hasn't upgraded from RedHat 5.2, because until
RedHat got the 2.4 kernel into a distribution he wasn't ready to
upgrade, as he needs the improved NFS performance available in Linux
kernel 2.4. And he wasn't about to go with a version of GCC that doesn't
officially exist. So he skips the whole RedHat 6.x series -- he doesn't
want to mess with kernel 2.2 in any form, thanks to its abysmal NFS
performance.

So he waits for RedHat 7.2 to be released -- around October 2001 (if the
typical RedHat schedule holds). At this point, PostgreSQL 7.2.1 is the
standard-bearer, with the OUTER JOIN support he craves, and robust WAL
for excellent recoverability, amongst other Neat Features(TM). Now, by
the time of RedHat 7.2, kernel 2.4 is up to .15 or so, with gcc 3.0
freshly (and officially) released, and glibc 2.2.5 finally fixing the
problems that had plagued both pre-2.2 glibc's AND the earliest 2.2
glibc's -- but the upshot is that glibc 2.0 compatibility is toast.

Now, J. Random slides the new OS CD into a backup of his main server,
and upgrades. RedHat 7.2's installer is very smart -- if no packages are
left that use glibc 2.0, it doesn't install the compat-libs necessary
for glibc 2.0 apps to run. The PostgreSQL RPMset's server subpackage
preinstall script runs about two-thirds of the way through the upgrade,
and backs up the old 6.3.2 executables necessary to pull a dump. The old
6.3.2 rpm database entries are removed, and, as far as the system is
concerned, no dependency upon glibc 2.0 remains, so no compat-libs get
installed.

J. Random checks out the new installation, and finds a conspicuous log
message telling him to read /usr/share/doc/postgresql-7.2.1/README.rpm.
He does so, and runs the (fixed by then) postgresql-dump script, which
attempts to start an old backend and do a pg_dumpall -- but, horrors,
the old postmaster can't start: glibc 2.0 is gone, and postmaster-6.3.2
dumps core when loaded against glibc 2.2. ARGGHHH....

That's the scenario I have nightmares about. Really.

> Yes, but if we added capabilities every time someone wanted something so
> it worked better in their environment, this software would be a mess,
> right?

Yes, it would. I'll work on a patch, and we'll see what it looks like.

> > been e-mails from users of other OS's -- even those that compile from
> > source -- expressing a desire for a smoother upgrade cycle. The RPM's,

> Yes, we all agree upgrades should be smoother. The problem is that the
> cost/benefit analysis has always pushed us away from improving it.

I understand.
At some point in the future, I'm looking at doing a 'postgresql-upgrade'
RPM that would include pre-built postmasters and the other binaries
necessary to dump any previous version of PostgreSQL (since about 6.2.1
or so -- 6.2.1 was the first official RedHat PostgreSQL RPM, although
there were 6.1.1 RPM's before that, and there is still a
postgres95-1.09 RPM out there), linked against the current libs for that
RPM's OS release. It would be a large RPM (and the source RPM for it
would be _huge_, containing entire tarballs for at least 6.2.1, 6.3.2,
6.4.2, 6.5.3, and 7.0.3). But this may be the only way to make it work,
barring a real migration utility.

> > And, BTW, welcome back from the summit. I heard that there was a little
> > 'excitement' there :-).

> Yes, it was very nice. I will post a summary to announce/general today.

Good. And a welcome back to Tom as well, as he went too, IIRC.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:

> Now, J. Random slides the new OS CD into a backup of his main server,
> and upgrades. RedHat 7.2's installer is very smart -- if no packages
> are left that use glibc 2.0, it doesn't install the compat-libs
> necessary for glibc 2.0 apps to run.

Actually, glibc is a bad example of things to break -- it has versioned
symbols, so PostgreSQL is pretty likely to continue working (barring
extremely low-level stuff, like doing weird things to the loader or
depending on buggy behaviour, like Oracle did). PostgreSQL doesn't use
C++ either (which is a horrible mess wrt. binary compatibility -- there
is no such thing, FTTB). However, if it depended on kernel-specific
behaviour (like things in /proc, which may or may not have changed their
output format), it could break.

--
Trond Eivind Glomsrød
Red Hat, Inc.
Lamar Owen <lamar.owen@wgcr.org> writes:

> All I'm saying is that regression should be _runnable_ in all modes
> without needing anything but a shell and the PostgreSQL binary
> installation.

I think this'd be mostly a waste of effort. IMHO, 99% of the problems
the regression tests might expose will be exposed if they are run
against the RPMs by the RPM maker. (Something we have sometimes failed
to do in the past ;-).) The regress tests are not that good at detecting
environment-specific problems; in fact, they go out of their way to
suppress environmental differences. So I don't see any strong need to
support regression test running in binary distributions. Especially not
if we have to kluge around a lack of essential tools.

            regards, tom lane