Thread: About pg_upgrade

About pg_upgrade

From
Peter Eisentraut
Date:
I was just looking at the latest pg_upgrade revision.  Maybe it works for
particular installations, but before we can think about releasing this, at
least the issues below need to be addressed.

It seems to require that the data directory is under a directory named
"data" and that your current directory is above that.  This is not
appropriate for real software.  I suggest using standard -D options and/or
the PGDATA environment variable.

The locations of directories like the INFODIR and the OLDDIR should be
configurable.  Don't just put them where they happen to fall.  (Remember
that possibly a lot of data will end up there.)

The way temp files are allocated looks very insecure.  And does this thing
even check under what user it's running?

'test -e' is not portable.

'grep -q' is not portable.  (At least it doesn't portably behave as you
might think it does.)

Although 'head' is probably portable, it has never been explored, because
'sed 1q' is better.

If you set an exit trap, then 'exit 1' is not portable.  (I'm not kidding,
see pg_regress.)

You can't nest " inside ` inside ".

Pattern matching with 'expr' should be invoked like this:  expr x"$OBJ" :
x'pg_', or it might blow up when $OBJ has funny values.

Moving directories with 'mv' is not necessarily a good idea.

Should do a lot more error checking.

psql, pg_ctl, etc. should not just be called from the path.  You know the
install directory, so use that.

awk should be called as determined by configure.

Poor quality (spelling, wording) of messages.

The man page is very confusing.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: About pg_upgrade

From
Bruce Momjian
Date:
These are all excellent points.  Peter, I will be out for a few hours so
I can't get to this right away.  If you want to get started, the current
version is in CVS;  feel free to wack it around.  I have been committing
to CVS so people would view/modify it.  I will send you email before I
get started with any changes.

Moving to /contrib I think makes sense, particularly because you need
7.2 pg_upgrade while 7.1 is still installed.   I believe this will
remain true for future releases as well.  I was going to add a mention
of that to pg_upgrade.sgml.

One more change I need which I haven't yet is to force sequence
regeneration even for 7.2 because the XID sequence status changed from
7.2beta4 to 7.2beta5.  I will fix that today.

I had no great plans for pg_upgrade;  I just knew it had to be done some
day and I figured now was a good time.  The group can decide when and
how to use it.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> I was just looking at the latest pg_upgrade revision.  Maybe it works for
> particular installations, but before we can think about releasing this, at
> least the issues below need to be addressed.
> 
> It seems to require that the data directory is under a directory named
> "data" and that your current directory is above that.  This is not
> appropriate for real software.  I suggest using standard -D options and/or
> the PGDATA environment variable.
> 
> The locations of directories like the INFODIR and the OLDDIR should be
> configurable.  Don't just put them where they happen to fall.  (Remember
> that possibly a lot of data will end up there.)
> 
> The way temp files are allocated looks very insecure.  And does this thing
> even check under what user it's running?
> 
> 'test -e' is not portable.
> 
> 'grep -q' is not portable.  (At least it doesn't portably behave as you
> might think it does.)
> 
> Although 'head' is probably portable, it has never been explored, because
> 'sed 1q' is better.
> 
> If you set an exit trap, then 'exit 1' is not portable.  (I'm not kidding,
> see pg_regress.)
> 
> You can't nest " inside ` inside ".
> 
> Pattern matching with 'expr' should be invoked like this:  expr x"$OBJ" :
> x'pg_', or it might blow up when $OBJ has funny values.
> 
> Moving directories with 'mv' is not necessarily a good idea.
> 
> Should do a lot more error checking.
> 
> psql, pg_ctl, etc. should not just be called from the path.  You know the
> install directory, so use that.
> 
> awk should be called as determined by configure.
> 
> Poor quality (spelling, wording) of messages.
> 
> The man page is very confusing.
> 
> -- 
> Peter Eisentraut   peter_e@gmx.net
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: About pg_upgrade

From
Bruce Momjian
Date:
> One more change I need which I haven't yet is to force sequence
> regeneration even for 7.2 because the XID sequence status changed from
> 7.2beta4 to 7.2beta5.  I will fix that today.

I have made this change and committed the fix.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: About pg_upgrade

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> I was just looking at the latest pg_upgrade revision.  Maybe it works for
> particular installations, but before we can think about releasing this, at
> least the issues below need to be addressed.
> 
> It seems to require that the data directory is under a directory named
> "data" and that your current directory is above that.  This is not
> appropriate for real software.  I suggest using standard -D options and/or
> the PGDATA environment variable.

I have added code to use PGDATA, which can be overridden with a new -D
option, as you suggested.

> The locations of directories like the INFODIR and the OLDDIR should be
> configurable.  Don't just put them where they happen to fall.  (Remember
> that possibly a lot of data will end up there.)

Well, actually, we want the INFODIR and SAVEDATA (was OLDDIR) to be in
the same filesystem as PGDATA.  In fact, we use mv so the data files are
just moved into different directories and not actually moved.  I added
code to put these in the directory _above_ PGDATA.  Does that help?

> The way temp files are allocated looks very insecure.  And does this thing
> even check under what user it's running?

It does now --- It checks for root user and makes sure you can read the
PGDATA directory.

> 'test -e' is not portable.

OK, changed to -f and -h.

> 'grep -q' is not portable.  (At least it doesn't portably behave as you
> might think it does.)

OK, changed to grep "" > /dev/null 2>&1

> Although 'head' is probably portable, it has never been explored, because
> 'sed 1q' is better.

I changed it so "sed -n '1p'" which is a more general solution;  sed
'Xq' only works for the top X lines, i.e. sed '2q' prints lines 1 and 2.

> If you set an exit trap, then 'exit 1' is not portable.  (I'm not kidding,
> see pg_regress.)

Fixed -- I change 'exit 1' to 'return 1' and changed the calls to:
func || exit "$?"

> You can't nest " inside ` inside ".

Can you show me where?  I don't see it.

> Pattern matching with 'expr' should be invoked like this:  expr x"$OBJ" :
> x'pg_', or it might blow up when $OBJ has funny values.

Done.

> Moving directories with 'mv' is not necessarily a good idea.

Can you be more specific?  Seems pretty portable.

> Should do a lot more error checking.

Suggestions?  Already lots in there.

> psql, pg_ctl, etc. should not just be called from the path.  You know the
> install directory, so use that.

Well, we do use the same script on the old install and the new one so I
am not sure if looking at the install or coupling it with 'configure'
run is a good idea --- it may cause more problems than it solves.

> awk should be called as determined by configure.

Again, requires configure handling.

> Poor quality (spelling, wording) of messages.

Improved.  Let me know if I missed something.

> The man page is very confusing.

I have improved it.  All moved to /contrib/pg_upgrade.


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: About pg_upgrade

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > Well, actually, we want the INFODIR and SAVEDATA (was OLDDIR) to be in
> > the same filesystem as PGDATA.  In fact, we use mv so the data files are
> > just moved into different directories and not actually moved.  I added
> > code to put these in the directory _above_ PGDATA.  Does that help?
> 
> Hmm, I think that's actually worse, because the directory above PGDATA
> might in fact be a different file system.  It's probably better to keep it
> within one directory then.  And using "dirname" isn't portable anyway.

Well, actually, it may not be better or worse.  We actually use PGDATA
to point not to the top-level PostgreSQL directory but to the /data part
of that directory, so the old behavior was to put it above /data and the
new code will do exactly that too.  Am I confused?

> > > The way temp files are allocated looks very insecure.  And does this thing
> > > even check under what user it's running?
> >
> > It does now --- It checks for root user and makes sure you can read the
> > PGDATA directory.
> 
> Still not sure about those temp files.  People like to see a possible
> exploit in every temp file.

Well, yes, if you get the pid, you can create symlink files in /tmp and
overwrite things.  How do I handle this properly, probably a directory
in /tmp that I create but I have to set my umask first -- is that a
plan?

> 
> > > 'test -e' is not portable.
> >
> > OK, changed to -f and -h.
> 
> -h isn't portable either.  I think you have to do something like

Oh, great!

> 
> if (test -L "$name") >/dev/null 2>&1 && (test -h "$name") >/dev/null 2>&1

I am confused about this, I don't have -L and the code seems to say that
the second part executes only if the first part is OK --- I don't really
need the test, I could just try the 'mv' and report a failure.

> If should also add here that "if ! command" is not portable, you'd need to
> write "if command; then : ; else action fi"

Wow, I have written portable scripts but never went to these lengths ---
are there actually that many broken OS's?

> > > If you set an exit trap, then 'exit 1' is not portable.  (I'm not kidding,
> > > see pg_regress.)
> >
> > Fixed -- I change 'exit 1' to 'return 1' and changed the calls to:
> >
> >     func || exit "$?"
> 
> That doesn't matter.  You have to write (exit x); exit x so the exit code
> is actually seen outside pg_upgrade.

So 'return 1' doesn't propogate to the caller?

> > > You can't nest " inside ` inside ".
> >
> > Can you show me where?  I don't see it.
> 
> Line 87 looks suspicious.  Especially with the echo and backslash in there
> it is not very portable.

Wow, that command was really stupid --- I just changed it to make the
assignment without the 'echo.'

> 
> > > Moving directories with 'mv' is not necessarily a good idea.
> >
> > Can you be more specific?  Seems pretty portable.
> 
> I was only concerned about moving across file systems, which you're not
> doing.

I sure hope so.  :-)  If not, the script will fail.

> 
> > > Should do a lot more error checking.
> >
> > Suggestions?  Already lots in there.
> 
> It looks better already.

Good.

> > > psql, pg_ctl, etc. should not just be called from the path.  You know the
> > > install directory, so use that.
> >
> > Well, we do use the same script on the old install and the new one so I
> > am not sure if looking at the install or coupling it with 'configure'
> > run is a good idea --- it may cause more problems than it solves.
> 
> It's better than using some random tool that might not even be in the
> path.  You have to run configure anyway, so what's the point in avoiding
> it?

But I don't assume they have run configure on the new source tree when
pg_upgrade is run in phase one  --- very likely they will grab
pg_upgrade, do phase one, then configure/compile the new code and
install it as part of the upgrade steps.

If they don't have a working 'awk' in their path, they have bigger
problems than pg_upgrade not working --- I know there are quirky
problems with awk (I saw them in pgmonitor) but these are vanilla awk
scripts that would work on any one.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: About pg_upgrade

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> Well, actually, we want the INFODIR and SAVEDATA (was OLDDIR) to be in
> the same filesystem as PGDATA.  In fact, we use mv so the data files are
> just moved into different directories and not actually moved.  I added
> code to put these in the directory _above_ PGDATA.  Does that help?

Hmm, I think that's actually worse, because the directory above PGDATA
might in fact be a different file system.  It's probably better to keep it
within one directory then.  And using "dirname" isn't portable anyway.

> > The way temp files are allocated looks very insecure.  And does this thing
> > even check under what user it's running?
>
> It does now --- It checks for root user and makes sure you can read the
> PGDATA directory.

Still not sure about those temp files.  People like to see a possible
exploit in every temp file.

> > 'test -e' is not portable.
>
> OK, changed to -f and -h.

-h isn't portable either.  I think you have to do something like

if (test -L "$name") >/dev/null 2>&1 && (test -h "$name") >/dev/null 2>&1

If should also add here that "if ! command" is not portable, you'd need to
write "if command; then : ; else action fi"

> > If you set an exit trap, then 'exit 1' is not portable.  (I'm not kidding,
> > see pg_regress.)
>
> Fixed -- I change 'exit 1' to 'return 1' and changed the calls to:
>
>     func || exit "$?"

That doesn't matter.  You have to write (exit x); exit x so the exit code
is actually seen outside pg_upgrade.

> > You can't nest " inside ` inside ".
>
> Can you show me where?  I don't see it.

Line 87 looks suspicious.  Especially with the echo and backslash in there
it is not very portable.

> > Moving directories with 'mv' is not necessarily a good idea.
>
> Can you be more specific?  Seems pretty portable.

I was only concerned about moving across file systems, which you're not
doing.

> > Should do a lot more error checking.
>
> Suggestions?  Already lots in there.

It looks better already.

> > psql, pg_ctl, etc. should not just be called from the path.  You know the
> > install directory, so use that.
>
> Well, we do use the same script on the old install and the new one so I
> am not sure if looking at the install or coupling it with 'configure'
> run is a good idea --- it may cause more problems than it solves.

It's better than using some random tool that might not even be in the
path.  You have to run configure anyway, so what's the point in avoiding
it?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: About pg_upgrade

From
Bruce Momjian
Date:
> > Still not sure about those temp files.  People like to see a possible
> > exploit in every temp file.
> 
> Well, yes, if you get the pid, you can create symlink files in /tmp and
> overwrite things.  How do I handle this properly, probably a directory
> in /tmp that I create but I have to set my umask first -- is that a
> plan?

Forget what I said, you don't need to change the umask, just do:
trap "rm -rf /tmp/$$" 0 1 2 3 15mkdir /tmp/$$ || exit 1

and you call all your temp files /tmp/$$/XXX, right?  Once you create
the directory, you own it and no one else can write into there.

I just did a Google search and no one came up with this idea, though I
believe X11 uses /tmp directories for this exact reason, right?

I finally found one mention of it:  Seems Suse uses it, but they did
'mkdir -p' which doesn't return an error if it fails so it was a
security problem itself:

http://groups.google.com/groups?q=tmp+security+race+directory+script+mkdir&hl=en&selm=bugtraq/Pine.LNX.4.30.0101170202040.15609-100000%40dent.suse.de&rnum=1

I just looked in /usr/bin on BSD/OS and found a whole bunch that do the
insecure /tmp/$$ trick I currently do in pg_upgrade:#$ file `grep -l '\$\$' *` | grep shell cvsbug:       Bourne shell
scripttextigawk:        Bourne shell script textlorder:       Bourne shell script textmkdep:        Bourne shell script
textpppattach:   Korn shell script textrcsfreeze:    Bourne shell script textsendbug:      Bourne shell script
textuupick:      Bourne shell script text
 

For example, cvsbug does:[ -z "$TMPDIR" ] && TMPDIR=/tmpTEMP=$TMPDIR/p$$BAD=$TMPDIR/pbad$$REF=$TMPDIR/pf$$

Bet everyone has that one on their system.  :-)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026