Thread: How to shoot yourself in the foot: kill -9 postmaster
I have spent several days now puzzling over the corrupted WAL logfile that Scott Parish was kind enough to send me from a 7.1beta4 crash.  It looks a lot like two different series of transactions were getting written into the same logfile.  I'd been digging like mad in the WAL code to try to explain this as a buffer-management logic error, but after a fresh exchange of info it turns out that I was barking up the wrong tree.  There *were* two different series of transactions.

Specifically, here's what happened:

1. Scott (or actually his associate) shut down and restarted the postmaster using the /etc/rc.d/init.d/pgsql script that ships with our RPMs.  That script shuts down the old postmaster with

	killproc postmaster

It turns out that at least on Scott's machine (RedHat 6.1), the default kill level for the killproc function is kill -9.  (This is clearly a bad bug in the init script, but I digress.)

2. So, the old postmaster was killed with kill -9, but its child backends were still running.  The new postmaster will start up successfully because it'll think the old postmaster crashed, and so it will go through the usual recovery procedure.

3. Now we have two sets of backends running in different shmem blocks (7.0 might have choked on that part, but 7.1 doesn't care) and running different sets of transactions.  But they're writing to the same WAL log.  Result: guaranteed corruption of the log.

It actually took two iterations of this to expose the bug: the third attempted postmaster start went looking for the checkpoint record last written by the second one, which meanwhile had got overwritten by activity of the first backend set.

Now, killing the postmaster -9 and not cleaning up the backends has always been a good way to shoot yourself in the foot, but up to now the worst thing that was likely to happen to you was isolated corruption in specific tables.  In the brave new world of WAL the stakes are higher, because the system will refuse to start up if it finds a corrupted checkpoint record.  Clueless admins who resort to kill -9 as a routine admin tool *will* lose their databases.  Moreover, the init scripts that are running around now are dangerous weapons if used with 7.1.

I think we need a stronger interlock to prevent this scenario, but I'm unsure what it should be.  Ideas?

			regards, tom lane
At 3/5/2001 04:30 PM, you wrote:
>Now, killing the postmaster -9 and not cleaning up the backends has
>always been a good way to shoot yourself in the foot, but up to now the
>worst thing that was likely to happen to you was isolated corruption in
>specific tables.  In the brave new world of WAL the stakes are higher,
>because the system will refuse to start up if it finds a corrupted
>checkpoint record.  Clueless admins who resort to kill -9 as a routine
>admin tool *will* lose their databases.  Moreover, the init scripts
>that are running around now are dangerous weapons if used with 7.1.
>
>I think we need a stronger interlock to prevent this scenario, but I'm
>unsure what it should be.  Ideas?

Is there any way to see if the other (child) processes have a lock on the log file?

On a lot of systems, a daemon will record its PID in a file when it starts, so that it/'the admin' can do a 'shutdown' script with the PID listed.  Could child processes list themselves as child.PID in a configurable directory, and the starting process look for all of these and shut the "orphaned" child processes down?

Just thoughts...

Thomas
* Tom Lane <tgl@sss.pgh.pa.us> [010305 14:51] wrote:
>
> I think we need a stronger interlock to prevent this scenario, but I'm
> unsure what it should be.  Ideas?

Re having multiple postmasters active by accident:

The sysV IPC stuff has some hooks in it that may help you.  One idea is to check the 'struct shmid_ds' field 'shm_nattch': basically, at startup, if it's not 1 (or 0) then you have more than one postgresql instance messing with it and it should not proceed.

I'd also suggest looking into using sysV semaphores and the semundo stuff; afaik it can be used to track the number of consumers of a resource.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Tom Lane wrote:
> checkpoint record.  Clueless admins who resort to kill -9 as a routine
> admin tool *will* lose their databases.  Moreover, the init scripts
> that are running around now are dangerous weapons if used with 7.1.

Thanks for the heads-up, Tom.  Time to nix killproc and do something cleaner -- compatible, but cleaner.  I'll have to research what the defaults are for later RHs -- but, as 6.1 is one of my target platforms at this time, I have to fix that issue for sure.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> Thanks for the heads-up, Tom.  Time to nix killproc and do something
> cleaner -- compatible, but cleaner.

As far as I could tell from the 6.1 scripts, it would work to do

	killproc postmaster -TERM

The problem is just that killproc has an overenthusiastic default...

			regards, tom lane
killproc should send a kill -15 to the process and wait a few seconds for it to exit.  If it does not, try kill -1, and if that doesn't kill it, then kill -9.

> Tom Lane wrote:
> > checkpoint record.  Clueless admins who resort to kill -9 as a routine
> > admin tool *will* lose their databases.  Moreover, the init scripts
> > that are running around now are dangerous weapons if used with 7.1.
>
> Thanks for the heads-up, Tom.  Time to nix killproc and do something
> cleaner -- compatible, but cleaner.  I'll have to research what the
> defaults are for later RHs -- but, as 6.1 is one of my target platforms
> at this time, I have to fix that issue for sure.
> --
> Lamar Owen
> WGCR Internet Radio
> 1 Peter 4:11

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > Thanks for the heads-up, Tom.  Time to nix killproc and do something
> > cleaner -- compatible, but cleaner.
>
> As far as I could tell from the 6.1 scripts, it would work to do
>
> 	killproc postmaster -TERM

Yes, amazing it has a -9 default.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> killproc should send a kill -15 to the process and wait a few seconds for
> it to exit.  If it does not, try kill -1, and if that doesn't kill it,
> then kill -9.

Tell it to the Linux people ... this is their boot-script code we're talking about.

			regards, tom lane
Tom Lane wrote:
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > killproc should send a kill -15 to the process and wait a few seconds for
> > it to exit.  If it does not, try kill -1, and if that doesn't kill it,
> > then kill -9.
>
> Tell it to the Linux people ... this is their boot-script code we're
> talking about.

RedHat, in particular.  I can't vouch for any others.  On my RH 6.2 box, with initscripts-5.00-1 loaded, here's what killproc does if no killlevel is set (even though a default $killlevel is set to -9, it's not used in this code).  ($pid is the pid of the proc to kill, $base is the name of the proc, etc.)

	if [ "$notset" = "1" ] ; then
	    if ps h $pid >/dev/null 2>&1 ; then
		# TERM first, then KILL if not dead
		kill -TERM $pid
		usleep 100000
		if ps h $pid >/dev/null 2>&1 ; then
		    sleep 1
		    if ps h $pid >/dev/null 2>&1 ; then
			sleep 3
			if ps h $pid >/dev/null 2>&1 ; then
			    kill -KILL $pid
			fi
		    fi
		fi
	    fi
	    ps h $pid >/dev/null 2>&1
	    RC=$?
	    [ $RC -eq 0 ] && failure "$base shutdown" || success "$base shutdown"
	    RC=$((! $RC))
	# use specified level only
	else
	    if ps h $pid >/dev/null 2>&1 ; then
		kill $killlevel $pid
		RC=$?
		[ $RC -eq 0 ] && success "$base $killlevel" || failure "$base $killlevel"
	    fi
	fi

Is 6.1 this different from 6.2?  This code on the surface seems reasonable to me -- am I missing something?  The 6.2 code (found in /etc/rc.d/init.d/functions, for those who might not know where to find killproc) sets a default killlevel but never uses it -- ignorant but not stupid.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
> 	if [ "$notset" = "1" ] ; then
> 	    if ps h $pid >/dev/null 2>&1 ; then
> 		# TERM first, then KILL if not dead
> 		kill -TERM $pid
> 		usleep 100000
> 		if ps h $pid >/dev/null 2>&1 ; then
> 		    sleep 1
> 		    if ps h $pid >/dev/null 2>&1 ; then
> 			sleep 3
> 			if ps h $pid >/dev/null 2>&1 ; then
> 			    kill -KILL $pid
> 			fi
> 		    fi
> 		fi
> 	    fi

Yes, this seems like the proper way to do it.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
>
> Now, killing the postmaster -9 and not cleaning up the backends has
> always been a good way to shoot yourself in the foot, but up to now the
> worst thing that was likely to happen to you was isolated corruption in
> specific tables.  In the brave new world of WAL the stakes are higher,
> because the system will refuse to start up if it finds a corrupted
> checkpoint record.  Clueless admins who resort to kill -9 as a routine
> admin tool *will* lose their databases.  Moreover, the init scripts
> that are running around now are dangerous weapons if used with 7.1.
>
> I think we need a stronger interlock to prevent this scenario, but I'm
> unsure what it should be.  Ideas?

Seems the simplest way is to inhibit starting postmaster if the pid file exists.  Another way is to use flock() if flock() is available.  We could flock() the pid file so that another postmaster could detect the lock of the file.

Regards,
Hiroshi Inoue
On Mon, Mar 05, 2001 at 08:55:41PM -0500, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > killproc should send a kill -15 to the process and wait a few seconds for
> > it to exit.  If it does not, try kill -1, and if that doesn't kill it,
> > then kill -9.
>
> Tell it to the Linux people ... this is their boot-script code we're
> talking about.

Not to be a zealot, but this isn't _Linux_ boot-script code, it's _Red Hat_ boot-script code.  Red Hat would like for us all to confuse the two, but they jes' ain't the same.  (As a rule of thumb, where it works right, credit Linux; where it doesn't, blame Red Hat. :-)

Nathan Myers
ncm@zembu.com
Bruce Momjian wrote:
> > # TERM first, then KILL if not dead
> Yes, this seems like the proper way to do it.

Now to verify that 6.1 is the same....or different....  Hmmmm....  The mirrors of ftp.redhat.com (and, in fact, RedHat.com itself) no longer have the updates or the original for 6.1's initscripts-4.70 package.

Can a RedHat 6.1 user (using as close as possible to 6.1's release initscripts package) send me a copy of /etc/rc.d/init.d/functions, or verify how that initscripts package defines killproc?  I cannot at this moment locate my RH 6.1 SRPMS CD.  Found my RH _4_.1 CD, but that's just a _little_ old :-).
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Tom Lane wrote:
>> I think we need a stronger interlock to prevent this scenario, but I'm
>> unsure what it should be.  Ideas?

> Seems the simplest way is to inhibit starting postmaster
> if the pid file exists.

Then we're unable to recover from a crash without manual intervention.  The tricky part of this is not to give up the ability to restart when there *has* been a crash.

> Another way is to use flock() if flock() is available.
> We could flock() the pid file so that another postmaster
> could detect the lock of the file.

This would only work if every backend is holding flock on the file, which would mean they'd all have to keep it open all the time.  Kind of annoying to use up that many file descriptors on it.  Might be the best answer though; I haven't thought of anything I like better...

			regards, tom lane
Nathan Myers wrote:
> Not to be a zealot, but this isn't _Linux_ boot-script code, it's
> _Red Hat_ boot-script code.  Red Hat would like for us all to confuse
> the two, but they jes' ain't the same.  (As a rule of thumb, where it
> works right, credit Linux; where it doesn't, blame Red Hat. :-)

So we're going to credit Linux for PostgreSQL being shipped as part of the RedHat distribution since RH 5.0, then? :-0
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> Is 6.1 this different from 6.2?

Scott sent me a copy of /etc/init.d/functions from his box, and it has largely the same behavior (I hadn't read the whole code to notice that it doesn't use the default killlevel...).  What's actually happening here is that the init script sends SIGTERM, and then SIGKILL four seconds later if the postmaster hasn't shut down yet.  Unfortunately, unless your clients are very short-lived, four seconds isn't going to be enough for a "polite" shutdown.  (It's pretty marginal even for an impolite one, since a checkpoint will take at least a couple of seconds.)

However, with an explicit kill level that doesn't happen: you get one signal of the specified value, no more.  Possibly it would be better for the init script to send SIGINT (forcibly disconnect clients) instead of SIGTERM, however.  So I'm now leaning to "killproc postmaster -INT".

			regards, tom lane
Lamar Owen <lamar.owen@wgcr.org> writes:
> Tom Lane wrote:
>> The tricky part of this is not to give up the ability to restart when
>> there *has* been a crash.

> But kill -9 effectively _is_ an admin-initiated crash.

Yeah, but only a partial crash.  If the admin finishes the job by killing the backends too, we're fine.  Postmaster down, backends alive is not a scenario we're currently prepared for.  We need a way to plug that gap.

			regards, tom lane
Tom Lane wrote:
> However, with an explicit kill level that doesn't happen: you get one
> signal of the specified value, no more.  Possibly it would be better for
> the init script to send SIGINT (forcibly disconnect clients) instead of
> SIGTERM, however.  So I'm now leaning to "killproc postmaster -INT".

Ok, since I can't seem to count on killproc's exact behavior, istm that I can:

	killproc postmaster -INT
	wait some number of seconds
	if postmaster still up
		killproc postmaster -TERM
		wait some number of seconds
		if postmaster STILL up
			killproc postmaster  # and let the grim reaper do its dirty work

After all, the system shutdown is relying on this script to properly and thoroughly shut things down, or it WILL do the 'kill -9 pid-of-postmaster' for you.

Now, what's a good delay here?  Or is there a better metric than a simple delay?  After all, I want to avoid the kill -9 unless we have an emergency hard lock situation -- what's a good indicator of the backend fleet of processes actually _doing_ something?  Or should I key on an indicator of processor speed (Linux does provide a nice bogus metric known as BogoMIPS for such a purpose)?  The last thing I want to do is wait too long on some platforms and not long enough on others.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
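To make the escalation concrete, it amounts to roughly the following (a minimal C sketch only, not the RPM script itself: the pidfile path, the 60-second waits, and the function name are illustrative assumptions, and the final kill -9 is deliberately left out per the discussion that follows):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Poll with the null signal until the process exits or we give up. */
    static int wait_for_exit(pid_t pid, int seconds)
    {
        int i;
        for (i = 0; i < seconds; i++) {
            if (kill(pid, 0) != 0)
                return 1;           /* process is gone */
            sleep(1);               /* sleep 1 and loop, per Lamar's plan */
        }
        return 0;                   /* still alive */
    }

    int main(void)
    {
        /* Hypothetical pidfile location; the RPMs put PGDATA elsewhere
         * depending on configuration. */
        FILE *fp = fopen("/var/lib/pgsql/data/postmaster.pid", "r");
        long pid;

        if (fp == NULL || fscanf(fp, "%ld", &pid) != 1)
            return 1;               /* no pidfile: nothing to stop */
        fclose(fp);

        kill((pid_t) pid, SIGINT);  /* fast shutdown: disconnect clients */
        if (wait_for_exit((pid_t) pid, 60))
            return 0;

        kill((pid_t) pid, SIGTERM); /* next step in the escalation */
        if (wait_for_exit((pid_t) pid, 60))
            return 0;

        /* No SIGKILL here: see Tom's warnings about kill -9 above. */
        return 1;
    }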
Tom Lane wrote:
> The tricky part of this is not to give up the ability to restart when
> there *has* been a crash.

But kill -9 effectively _is_ an admin-initiated crash.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> The last thing I want to do is
> wait too long on some platforms and not long enough on others.

The difficulty is to know how long the final checkpoint will take.  This depends on (at least) your hard disk speed and the number of dirty buffers, so I think you're going to have some difficulty estimating it with any reliability.  BogoMIPS won't help, for sure.  However, if you do SIGINT and then wait a few seconds, you can be fairly sure that all the extant backends are dead (if not frozen up...) and that the checkpoint is in progress.  That may be about the best you can do.

I do not agree that this script should take it on itself to kill -9 the postmaster.  Please note that the reason we're having this discussion at all is that the init script may be used for purposes other than system shutdown.  So the argument that "it's going to happen anyway" is wrong.

			regards, tom lane
> Ok, since I can't seem to count on killproc's exact behavior, istm that
> I can:
> 	killproc postmaster -INT
> 	wait some number of seconds
> 	if postmaster still up
> 		killproc postmaster -TERM
> 		wait some number of seconds
> 		if postmaster STILL up
> 			killproc postmaster  # and let the grim reaper do its dirty work
>
> After all, the system shutdown is relying on this script to properly and
> thoroughly shut things down, or it WILL do the 'kill -9
> pid-of-postmaster' for you.
>
> Now, what's a good delay here?  Or is there a better metric than a
> simple delay?  After all, I want to avoid the kill -9 unless we have an
> emergency hard lock situation -- what's a good indicator of the backend
> fleet of processes actually _doing_ something?  Or should I key on an
> indicator of processor speed (Linux does provide a nice bogus metric
> known as BogoMIPS for such a purpose)?  The last thing I want to do is
> wait too long on some platforms and not long enough on others.

In remembering how other databases handle it, I think you should use pg_ctl to shut it down.  You need to enable wait mode; not sure if that is the default or not.  That will wait for it to shut down before continuing.  I realize a hung shutdown would stop the kernel from shutting down.  You could put a sleep 100 in there and call a trap on a timeout.  Here is some shell code:

	TIME=60
	pg_ctl -w stop &
	BG="$!"; export BG
	(sleep "$TIME"; kill "$BG") &
	BG2="$!"; export BG2
	wait "$BG"
	if ! kill -0 "$BG2"
	then :	# watchdog already fired: pg_ctl was killed after $TIME seconds
	else kill "$BG2"
	fi

This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl.  You would then need a kill of your own.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
> Yeah, but only a partial crash.  If the admin finishes the job by
> killing the backends too, we're fine.  Postmaster down, backends alive
> is not a scenario we're currently prepared for.  We need a way to plug
> that gap.

Postmaster can easily enough find out if zombie backends are 'out there' during startup, right?  What can postmaster _do_ about it, though?  It won't necessarily be able to kill them -- but it also can't control them.  If it _can_ kill them, should it try?  After all, if those zombies are out there on this PGDATA there's going to be big trouble if we even try to start.  If we can't kill the zombies (that might still be doing something useful with their clients) from our starting postmaster, how can we possibly start up underneath running backends?

Should the backend look for the presence of its parent postmaster periodically and gracefully come down if postmaster goes away without the proper handshake?  A watchdog semaphore (or shared memory flag) that the backend resets and then checks periodically for it being set by its parent postmaster?

Should a set of backends detect a new postmaster coming up and try to 'sync up' with that postmaster, like the baroque GEMM handshake dance performed by 386 memory managers when Windows needs to start its own VMM?

Or should we spend that much time protecting Barney Fifes from their own single bullet? :-)

Just a nor'easter of a brainstorm....
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Tom Lane wrote:
> Please note that the reason we're having this discussion at
> all is that the init script may be used for purposes other than system
> shutdown.  So the argument that "it's going to happen anyway" is wrong.

Believe it or not, you just disproved your own statement that the initscript should not take it upon itself to issue the kill -9.  So, what if I issue '/etc/rc.d/init.d/postgresql restart' -- and backends don't go away during the 'stop' phase, while postmaster may actually have died?  Or is it even possible for postmaster to drop out with a running backend out there?

No, more is needed.  But I think a careful reap through the running backends to kill those that need killing if postmaster won't go down might be prudent.  Currently it is not possible to run multiple postmasters with the RPM install (I am working on that little problem, but it won't be for 7.1's RPMset yet), so all backends that are running on the RPM PGDATA location (which I am looking at making configurable as well) will belong to the one postmaster.  Of course, that would be an absolute last resort.

Oh well -- the real solution is elsewhere, anyway.  I just have to make sure it is not data-corruption broken.  And, if leaving the -9 out completely is the only solution, then, well, it's the only solution.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> Tom Lane wrote:
>> Postmaster down, backends alive is not a scenario we're currently
>> prepared for.  We need a way to plug that gap.

> Postmaster can easily enough find out if zombie backends are 'out there'
> during startup, right?

If you think it's easy enough, enlighten the rest of us ;-).  Be sure your solution only finds leftover backends from the previous instance of the same postmaster, else it will prevent running multiple postmasters on one system.

> What can postmaster _do_ about it, though?  It
> won't necessarily be able to kill them -- but it also can't control
> them.  If it _can_ kill them, should it try?

I think refusal to start is sufficient.  They should go away by themselves as their clients disconnect, and forcing the issue doesn't seem like it will improve matters.  The admin can kill them (hopefully with just a SIGTERM ;-)) if he wants to move things along ... but I'd not like to see a newly-starting postmaster do that automatically.

> Should the backend look for the presence of its parent postmaster
> periodically and gracefully come down if postmaster goes away without
> the proper handshake?

Unless we checked just before every disk write, this wouldn't represent a safe failure mode.  The onus has to be on the newly-starting postmaster, I think, not on the old backends.

> Should a set of backends detect a new postmaster coming up and try to
> 'sync up' with that postmaster,

Nice try ;-).  How will you persuade the kernel that these processes are now children of the new postmaster?

			regards, tom lane
Bruce Momjian wrote:
> This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl.  You
> would then need a kill of your own.

I missed something somewhere: wasn't the consensus a few weeks ago that pg_ctl shouldn't be used for a system initscript?  Or did I black out that day? :-)  I certainly have no problem using pg_ctl for this purpose -- as I have been using pg_ctl to start postmaster all along (then why am I not using it to stop -- don't answer that :-))......
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> I missed something somewhere: wasn't the consensus a few weeks ago that
> pg_ctl shouldn't be used for a system initscript?

I thought there was some concern about whether pg_ctl is really "ready for prime time".  But I don't recall the details either.

			regards, tom lane
> Bruce Momjian wrote:
> > This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl.  You
> > would then need a kill of your own.
>
> I missed something somewhere: wasn't the consensus a few weeks ago that
> pg_ctl shouldn't be used for a system initscript?  Or did I black out
> that day? :-)  I certainly have no problem using pg_ctl for this purpose
> -- as I have been using pg_ctl to start postmaster all along (then why
> am I not using it to stop -- don't answer that :-))......

I don't remember that discussion.  My guess was that you didn't want pg_ctl to hang forever.  My script handles that, I think.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Lamar Owen <lamar.owen@wgcr.org> writes:
> Tom Lane wrote:
>> Please note that the reason we're having this discussion at
>> all is that the init script may be used for purposes other than system
>> shutdown.  So the argument that "it's going to happen anyway" is wrong.

> Believe it or not, you just disproved your own statement that the
> initscript should not take it upon itself to issue the kill -9.

How?

> So, what if I issue '/etc/rc.d/init.d/postgresql restart' -- and
> backends don't go away during the 'stop' phase, while postmaster may
> actually have died?  Or is it even possible for postmaster to drop out
> with a running backend out there?

The postmaster will certainly not do so voluntarily.  If you kill -9 it, of course, that's the situation you're left with ... but your reasoning seems circular to me.  "I should kill -9 the postmaster to prevent the situation where I've kill -9'd the postmaster."

			regards, tom lane
Tom Lane wrote:
> Lamar Owen wrote:
> > Postmaster can easily enough find out if zombie backends are 'out there'
> > during startup, right?
> If you think it's easy enough, enlighten the rest of us ;-).

If postgres reported PGDATA on the command line it would be easy enough.

> > What can postmaster _do_ about it, though?  It
> > won't necessarily be able to kill them -- but it also can't control
> > them.  If it _can_ kill them, should it try?
> I think refusal to start is sufficient.  They should go away by
> themselves as their clients disconnect, and forcing the issue doesn't

????  I have misunderstood your previous statement about not wanting to force a manual crash recovery, then.

> > Should a set of backends detect a new postmaster coming up and try to
> > 'sync up' with that postmaster,
> Nice try ;-).  How will you persuade the kernel that these processes are
> now children of the new postmaster?

Yeah, that's the kicker.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> Tom Lane wrote:
>> If you think it's easy enough, enlighten the rest of us ;-).

> If postgres reported PGDATA on the command line it would be easy enough.

In ps status you mean?  I don't think we are prepared to require ps status functionality to let the system start up... we'd lose a number of supported platforms that way.

>> I think refusal to start is sufficient.  They should go away by
>> themselves as their clients disconnect, and forcing the issue doesn't

> ????  I have misunderstood your previous statement about not wanting to
> force a manual crash recovery, then.

In the case of an actual crash and restart, postgres should come back up without help.  However, the situation here is not a crash, it is incomplete admin intervention.  I don't think that expecting the admin to complete his intervention is the same thing as manual crash recovery.  I especially don't think that we should second-guess what the admin wants us to do by auto-killing backends that are still serving clients.

			regards, tom lane
Tom Lane wrote:
> of course, that's the situation you're left with ... but your reasoning
> seems circular to me.  "I should kill -9 the postmaster to prevent the
> situation where I've kill -9'd the postmaster."

Ok, while the script can certainly be used from the command line, its primary purpose is system shutdown.  And, I am thinking kind of circuitously at this point -- I only now realize just how circuitously.  If I keep slapping my forehead like this, I'm going to be bald in a few years....

I don't want to reap the postmaster off -- I want to reap off the backends associated with that particular postmaster, allowing that postmaster to die on its own.  Duh.  Doing this in a safe manner is not going to be easy, given that the PGDATA is not on the command line to the backend as echoed by ps.  Although I could key on PPID for the backends....  I'll have to experiment.  But not tonight -- last week was more taxing than I thought. :-(.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > Tom Lane wrote:
> >> If you think it's easy enough, enlighten the rest of us ;-).
> > If postgres reported PGDATA on the command line it would be easy enough.
> In ps status you mean?  I don't think we are prepared to require ps
> status functionality to let the system start up... we'd lose a number
> of supported platforms that way.

That is one downside.  A major downside.  Again, a lot of work to protect the Barney Fifes out there.

> In the case of an actual crash and restart, postgres should come back up
> without help.  However, the situation here is not a crash, it is
> incomplete admin intervention.  I don't think that expecting the admin

Is it a correct assumption that this is the only time postmaster might drop out?  But, thanks for the clarification, as I had misunderstood what you meant.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Lamar Owen <lamar.owen@wgcr.org> writes:
> Is it a correct assumption that this is the only time postmaster might
> drop out?

Well, there's always the possibility of a bug leading to postmaster coredump.  Historically those have been pretty rare though.

In any case, I'm not sure that the init script is the place to be solving these problems.  We do need some internal mechanism to protect against a crashed or kill -9'd postmaster.

			regards, tom lane
Lamar Owen <lamar.owen@wgcr.org> writes:
> I don't want to reap the postmaster off -- I want to reap off the
> backends associated with that particular postmaster, allowing that
> postmaster to die on its own.  Duh.  Doing this in a safe manner is not
> going to be easy, given that the PGDATA is not on the command line to
> the backend as echoed by ps.  Although I could key on PPID for the
> backends....  I'll have to experiment.

PPID should work fine, actually.  Keep in mind though that SIGINT'ing the postmaster will already have sent a terminate signal to its children (barring postmaster breakage), and that if you wait around for awhile and then kill off remaining children, you may well accomplish nothing except to kill off the checkpoint process :-(

			regards, tom lane
Tom Lane wrote:
> Well, there's always the possibility of a bug leading to postmaster
> coredump.  Historically those have been pretty rare though.

I have never personally seen one, since 6.1.1.

> In any case, I'm not sure that the init script is the place to be
> solving these problems.

Well, I do kind of have the responsibility to allow the system to shut down.....  I'll have to double check -- there may be a timeout mechanism in the RedHat init to reap off shutdown scripts -- but I haven't yet found it.  Better to gracefully yank the plugs than have the grim reaper yank them in the wrong order for you, in any case.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
* Tom Lane <tgl@sss.pgh.pa.us> [010305 19:13] wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > Tom Lane wrote:
> >> Postmaster down, backends alive is not a scenario we're currently
> >> prepared for.  We need a way to plug that gap.
>
> > Postmaster can easily enough find out if zombie backends are 'out there'
> > during startup, right?
>
> If you think it's easy enough, enlighten the rest of us ;-).  Be sure
> your solution only finds leftover backends from the previous instance of
> the same postmaster, else it will prevent running multiple postmasters
> on one system.

I'm sure some sort of encoding of the PGDATA directory along with the pids stored in the shm segment...

> > What can postmaster _do_ about it, though?  It
> > won't necessarily be able to kill them -- but it also can't control
> > them.  If it _can_ kill them, should it try?
>
> I think refusal to start is sufficient.  They should go away by
> themselves as their clients disconnect, and forcing the issue doesn't
> seem like it will improve matters.  The admin can kill them (hopefully
> with just a SIGTERM ;-)) if he wants to move things along ... but I'd
> not like to see a newly-starting postmaster do that automatically.

I agree; shooting down processes incorrectly should be left up to vendors' braindead scripts. :)

> > Should the backend look for the presence of its parent postmaster
> > periodically and gracefully come down if postmaster goes away without
> > the proper handshake?
>
> Unless we checked just before every disk write, this wouldn't represent
> a safe failure mode.  The onus has to be on the newly-starting
> postmaster, I think, not on the old backends.
>
> > Should a set of backends detect a new postmaster coming up and try to
> > 'sync up' with that postmaster,
>
> Nice try ;-).  How will you persuade the kernel that these processes are
> now children of the new postmaster?

Oh, easy, use ptrace. :)

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> I especially don't think that we should second-guess what the admin
> wants us to do by auto-killing backends that are still serving
> clients.

Sure.  But it would be nice anyway if pg_ctl could do this with a specific command line switch.

-- 
<< Not everything there is perfect, but the gardeners are certainly honored there >>
Dominique Quatravaux <dom@kilimandjaro.dyndns.org>
Lamar Owen writes:
> I missed something somewhere: wasn't the consensus a few weeks ago that
> pg_ctl shouldn't be used for a system initscript?

The consensus(?) was that there was some work to do in pg_ctl before it was robust enough to be used (for anything).  That work has been done.  An example Linux init.d script is at contrib/start-scripts/linux.

The only fault in that script that I can see is that it has no recipe for the case when the postmaster does not come down after 60 seconds.  But this is really no problem for the issue at hand, because if you do a normal runlevel switch then the postmaster will simply keep running, and during a system shutdown all the backends are going to die anyway.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
> Bruce Momjian writes:
> > This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl.  You
> > would then need a kill of your own.
>
> pg_ctl automatically times out after 60 seconds.

Oh, yea, that's right, I saw that in the documentation.  Forget my script.  Just run pg_ctl first, then kill the postmaster if it is still there.  Much safer than doing kill and checking, because pg_ctl knows when the system cleanly shuts down and exits.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Peter Eisentraut wrote:
> Lamar Owen writes:
> > I missed something somewhere: wasn't the consensus a few weeks ago that
> > pg_ctl shouldn't be used for a system initscript?
>
> The consensus(?) was that there was some work to do in pg_ctl before it
> was robust enough to be used (for anything).  That work has been done.

That was the detail I missed.

> case when the postmaster does not come down after 60 seconds.  But this is
> really no problem for the issue at hand because if you do a normal
> runlevel switch then the postmaster will simply keep running, and during a
> system shutdown all the backends are going to die anyway.

Only if each and every shutdown script succeeds in its task.  And I have to make sure that the RPM's shipping script successfully pulls down the system in an orderly fashion -- of course, I don't have to worry about the case where a postmaster is going to be started back up if we are in system shutdown -- but, as Tom also stated, I can't assume I'm in the system's death throes when called with the stop parameter.

And it _is_ possible for an admin to set up the runlevels such that a level is set aside where even networking isn't running (actually, that level already exists, and is called 'single user mode') -- or a run level for website maintenance where networking is still up, but the webserver and postgresql (and other associated) processes are to be shut down.  I personally use this -- I have set up runlevel 4 as a 'remote single user mode' of sorts where I still have sshd running (and the networking stack, obviously), but AOLserver, postgresql, and RealServer are shut down.  I then switch runlevels back to 3 to return to normal.  Much easier than manually stopping and restarting (in the correct order, as AOLserver is not a happy camper if postmaster drops out from underneath it) all the necessary pieces.

So I can't assume anything.  The default RPM installation used to automatically configure runlevels 3, 4, and 5 (not any more), but my script can't assume that the system is actually in that state by any means.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Alfred Perlstein <bright@wintelcom.net> writes:
> I'm sure some sort of encoding of the PGDATA directory along with
> the pids stored in the shm segment...

I thought about this too, but it strikes me as not very trustworthy.  The problem is that there's no guarantee that the new postmaster will even notice the old shmem segment: it might select a different shmem key.  (The 7.1 coding of shmem key selection makes this more likely than it used to be, but even under 7.0, it will certainly fail to work if I choose to start the new postmaster using a different port number than the old one had.  The shmem key is driven primarily by port number, not data directory...)

The interlock has to be tightly tied to the PGDATA directory, because what we're trying to protect is the files in and under that directory.  It seems that something based on file(s) in that directory is the way to go.

The best idea I've seen so far is Hiroshi's idea of having all the backends hold fcntl locks on the same file (probably postmaster.pid would do fine).  Then the new postmaster can test whether any backends are still alive by trying to lock the old postmaster.pid file.  Unfortunately, I read in the fcntl man page:

	Locks are not inherited by a child process in a fork(2) system call.

This makes the idea much less attractive than I originally thought: a new backend would not automatically inherit a lock on the postmaster.pid file from the postmaster, but would have to open/lock it for itself.  That means there's a window where the new backend exists but would be invisible to a hypothetical new postmaster.

We could work around this with the following, very ugly protocol:

1. Postmaster normally maintains an fcntl read lock on its postmaster.pid file.  Each spawned backend immediately opens and read-locks postmaster.pid, too, and holds that file open until it dies.  (Thus wasting a kernel FD per backend, which is one of the less attractive things about this.)  If the backend is unable to obtain read lock on postmaster.pid, then it complains and dies.  We must use read locks here so that all these processes can hold them separately.

2. If a newly started postmaster sees a pre-existing postmaster.pid file, it tries to obtain a *write* lock on that file.  If it fails, conclude that an old postmaster or backend is still alive; complain and quit.  If it succeeds, sit for say 1 second before deleting the file and creating a new one.  (The delay here is to allow any just-started old backends to fail to acquire read lock and quit.  A possible objection is that we have no way to guarantee 1 second is enough, though it ought to be plenty if the lock acquisition is just after the fork.)

One thing that worries me a little bit is that this means an fcntl read-lock request will exist inside the kernel for each active backend.  Does anyone know of any performance problems or hard kernel limits we might run into with large numbers of backends (lots and lots of fcntl locks)?  At least the locks are on a file that we don't actually touch in the normal course of business.

A small savings is that the backends don't actually need to open new FDs for the postmaster.pid file; they can use the one they inherit from the postmaster, even though they do need to lock it again.  I'm not sure how much that saves inside the kernel, but at least something.

There are also the usual set of concerns about portability of flock, though this time we're locking a plain file and not a socket, so it shouldn't be as much trouble as it was before.

Comments?  Does anyone see a better way to do it?

			regards, tom lane
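For concreteness, the two steps of the protocol look roughly like this (a minimal sketch, assuming postmaster.pid is already open on fd; the function names are made up for illustration, error handling is trimmed, and the one-second grace period is the same unproven assumption discussed above):

    #include <fcntl.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Step 1: postmaster and each spawned backend hold a shared (read)
     * lock on postmaster.pid for their whole lifetime.  Read locks can
     * be held by many processes at once. */
    static int acquire_read_lock(int fd)
    {
        struct flock fl;

        fl.l_type = F_RDLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;                   /* lock the whole file */
        return fcntl(fd, F_SETLK, &fl); /* -1 -> complain and die */
    }

    /* Step 2: a newly started postmaster probes the old pidfile with an
     * exclusive (write) lock; success means no old postmaster or backend
     * still holds a read lock on it. */
    static int old_processes_gone(int fd)
    {
        struct flock fl;

        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;
        if (fcntl(fd, F_SETLK, &fl) != 0)
            return 0;                   /* someone still holds a read lock */
        sleep(1);                       /* grace period for just-forked backends */
        return 1;                       /* safe to delete and recreate pidfile */
    }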
Lamar Owen writes:
> > case when the postmaster does not come down after 60 seconds.  But this is
> > really no problem for the issue at hand because if you do a normal
> > runlevel switch then the postmaster will simply keep running, and during a
> > system shutdown all the backends are going to die anyway.
>
> Only if each and every shutdown script succeeds in its task.  And I have
> to make sure that the RPM's shipping script successfully pulls down the
> system in an orderly fashion -- of course, I don't have to worry about
> the case where a postmaster is going to be started back up if we are in
> system shutdown -- but, as Tom also stated, I can't assume I'm in the
> system's death throes when called with the stop parameter.

Well, if you have something clever you want to do if the postmaster doesn't come down after an orderly shutdown, then please share it.  The current alternatives are 'leave running' or 'kill -9'.  I know I'd prefer the former.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
* Tom Lane <tgl@sss.pgh.pa.us> [010306 10:10] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > I'm sure some sort of encoding of the PGDATA directory along with
> > the pids stored in the shm segment...
>
> I thought about this too, but it strikes me as not very trustworthy.
> The problem is that there's no guarantee that the new postmaster will
> even notice the old shmem segment: it might select a different shmem
> key.  (The 7.1 coding of shmem key selection makes this more likely
> than it used to be, but even under 7.0, it will certainly fail to work
> if I choose to start the new postmaster using a different port number
> than the old one had.  The shmem key is driven primarily by port number,
> not data directory...)

This seems like a mistake.  I'm surprised you guys aren't just using some form of the FreeBSD ftok() algorithm for this:

	FTOK(3)          FreeBSD Library Functions Manual          FTOK(3)
	...
	The ftok() function attempts to create a unique key suitable for use
	with the msgget(3), semget(2) and shmget(2) functions given the path
	of an existing file and a user-selectable id.

	The specified path must specify an existing file that is accessible
	to the calling process or the call will fail.  Also, note that links
	to files will return the same key, given the same id.

	BUGS
	The returned key is computed based on the device minor number and
	inode of the specified path in combination with the lower 8 bits of
	the given id.  Thus it is quite possible for the routine to return
	duplicate keys.

The "BUGS" seems to be exactly what you guys are looking for: a somewhat reliable method of obtaining a system id.  If that sounds evil, read below for an alternate suggestion.

> The interlock has to be tightly tied to the PGDATA directory, because
> what we're trying to protect is the files in and under that directory.
> It seems that something based on file(s) in that directory is the way
> to go.
>
> [description of the fcntl-lock protocol snipped]
>
> Comments?  Does anyone see a better way to do it?

Possibly...  What about encoding the shm id in the pidfile?  Then one can just ask how many processes are attached to that segment?  (If it doesn't exist, one can assume all backends have exited.)  You want the field 'shm_nattch'.  The shmid_ds struct is defined as follows:

	struct shmid_ds {
	    struct ipc_perm shm_perm;     /* operation permission structure */
	    int             shm_segsz;    /* size of segment in bytes */
	    pid_t           shm_lpid;     /* process ID of last shared memory op */
	    pid_t           shm_cpid;     /* process ID of creator */
	    short           shm_nattch;   /* number of current attaches */
	    time_t          shm_atime;    /* time of last shmat() */
	    time_t          shm_dtime;    /* time of last shmdt() */
	    time_t          shm_ctime;    /* time of last change by shmctl() */
	    void           *shm_internal; /* sysv stupidity */
	};

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Alfred Perlstein <bright@wintelcom.net> writes:
> * Tom Lane <tgl@sss.pgh.pa.us> [010306 10:10] wrote:
>> The shmem key is driven primarily by port number,
>> not data directory...)

> This seems like a mistake.  I'm surprised you guys aren't just using
> some form of the FreeBSD ftok() algorithm for this:

This has been discussed before --- see the archives.  The conclusion was that since ftok doesn't guarantee uniqueness, it adds nothing except lack of predictability to the shmem key selection process.  We'd still need logic to cope with key collisions, and given that, we might as well select keys that have some obvious relationship to user-visible parameters, viz the port number.  As is, you can fairly easily tell which shmem segment belongs to which postmaster from the shmem key; with ftok-derived keys, you couldn't tell a thing.

>> Comments?  Does anyone see a better way to do it?

> What about encoding the shm id in the pidfile?  Then one can just ask
> how many processes are attached to that segment?  (If it doesn't
> exist, one can assume all backends have exited.)

Hmm ... that might actually be a pretty good idea.  A small problem is that the shm key isn't yet selected at the time we initially create the lockfile, but I can't think of any reason that we could not go back and append the key to the lockfile afterwards.

> You want the field 'shm_nattch'.

Are there any portability problems with relying on shm_nattch to be available?  If not, I like this a lot...

			regards, tom lane
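A rough sketch of the startup-time test this implies, assuming the old key has already been read back out of the leftover lockfile (the function name is hypothetical and error handling is abbreviated; the conservative refuse-on-failure behavior matches the EPERM reasoning that comes up a bit further on):

    #include <errno.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    /* Returns 1 if it is provably safe to start: the old segment is gone,
     * or nobody is attached to it anymore.  Sketch only. */
    static int old_shmem_unattached(key_t old_key)
    {
        struct shmid_ds buf;
        int shmid = shmget(old_key, 0, 0);  /* look up existing segment */

        if (shmid < 0)
            return errno == ENOENT;  /* segment already removed: fine */
        if (shmctl(shmid, IPC_STAT, &buf) < 0)
            return 0;                /* EACCES/EPERM etc: be conservative */
        return buf.shm_nattch == 0;  /* no leftover backends attached */
    }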
* Tom Lane <tgl@sss.pgh.pa.us> [010306 10:35] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
>
> > What about encoding the shm id in the pidfile?  Then one can just ask
> > how many processes are attached to that segment?  (If it doesn't
> > exist, one can assume all backends have exited.)
>
> Hmm ... that might actually be a pretty good idea.  A small problem is
> that the shm key isn't yet selected at the time we initially create the
> lockfile, but I can't think of any reason that we could not go back and
> append the key to the lockfile afterwards.
>
> > You want the field 'shm_nattch'.
>
> Are there any portability problems with relying on shm_nattch to be
> available?  If not, I like this a lot...

Well, it's available on FreeBSD and Solaris.  I'm sure RedHat has some daemon that resets the value to 0 periodically just for kicks, so it might not be viable... :)

Seriously, there's some dispute on the type that 'shm_nattch' is: under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD it's 'short' (I should fix this. :)).  But since you're really only testing for 0-ness, it shouldn't really be a problem.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Alfred Perlstein <bright@wintelcom.net> writes:
>> Are there any portability problems with relying on shm_nattch to be
>> available?  If not, I like this a lot...

> Well, it's available on FreeBSD and Solaris.  I'm sure RedHat has
> some daemon that resets the value to 0 periodically just for kicks,
> so it might not be viable... :)

I notice that our BeOS and QNX emulations of shmctl() don't support IPC_STAT, but that could be dealt with, at least to the extent of stubbing it out.

This does raise the question of what to do if shmctl(IPC_STAT) fails for a reason other than EINVAL.  I think the conservative thing to do is refuse to start up.  On EPERM, for example, it's possible that there is a postmaster running in your PGDATA but with a different userid.

> Seriously, there's some dispute on the type that 'shm_nattch' is:
> under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD
> it's 'short' (I should fix this. :)).
> But since you're really only testing for 0-ness, it shouldn't
> really be a problem.

We need not copy the value anywhere, so as long as the struct is correctly declared in the system header files, I don't think it matters what the field type is ...

			regards, tom lane
Peter Eisentraut wrote:
> Well, if you have something clever you want to do if the postmaster
> doesn't come down after an orderly shutdown, then please share it.  The
> current alternatives are 'leave running' or 'kill -9'.  I know I'd prefer
> the former.

Well, my preferences aren't really relevant here.  I have a job to do as an initscript in the RPMish environment -- and I really have to meet my obligations (using the first person pronoun there to anthropomorphize the initscript, allowing us to have a little sympathy for the poor shell script's plight :-)).  My preference is to let it float in limbo -- if it's in limbo and won't come out, then we have bigger issues.

However, I could do something really sneaky in the RedHat environment and let init do the dirty work for me -- but, again, I am not at all guaranteed that things will come down orderly.  If it is at all possible for me to bring about an orderly (if slow) shutdown that does terminate as the rest of the system needs it to, then I'll attempt to do so.

But the immediate issue is preventing chaotic stops within the initscript, so I'm going to experiment with things and see if I can make the initscript hang -- if I can't, then I'll likely put in the 'killproc postmaster -INT' with escalation to -TERM if it doesn't come down within sixty seconds (and, no, I am not going to sleep 60 then check things -- I am going to sleep 1 and loop sixty times) -- no need to unnecessarily delay system shutdown (and potential restart).  And I won't put in the -KILL unless I can find a safe and thorough way to do so.

Or I may go ahead and pg_ctl-ize things and let pg_ctl do the dirty work, as that IS what pg_ctl is supposed to accomplish.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Alfred Perlstein writes:
> Seriously, there's some dispute on the type that 'shm_nattch' is:
> under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD
> it's 'short' (I should fix this. :)).

What I don't like is that my /usr/include/sys/shm.h (through other headers) has:

	typedef unsigned long int shmatt_t;

	/* Data structure describing a set of semaphores.  */
	struct shmid_ds {
	    struct ipc_perm shm_perm;    /* operation permission struct */
	    size_t shm_segsz;            /* size of segment in bytes */
	    __time_t shm_atime;          /* time of last shmat() */
	    unsigned long int __unused1;
	    __time_t shm_dtime;          /* time of last shmdt() */
	    unsigned long int __unused2;
	    __time_t shm_ctime;          /* time of last change by shmctl() */
	    unsigned long int __unused3;
	    __pid_t shm_cpid;            /* pid of creator */
	    __pid_t shm_lpid;            /* pid of last shmop */
	    shmatt_t shm_nattch;         /* number of current attaches */
	    unsigned long int __unused4;
	    unsigned long int __unused5;
	};

whereas /usr/src/linux/include/shm.h has:

	struct shmid_ds {
	    struct ipc_perm shm_perm;    /* operation perms */
	    int shm_segsz;               /* size of segment (bytes) */
	    __kernel_time_t shm_atime;   /* last attach time */
	    __kernel_time_t shm_dtime;   /* last detach time */
	    __kernel_time_t shm_ctime;   /* last change time */
	    __kernel_ipc_pid_t shm_cpid; /* pid of creator */
	    __kernel_ipc_pid_t shm_lpid; /* pid of last operator */
	    unsigned short shm_nattch;   /* no. of current attaches */
	    unsigned short shm_unused;   /* compatibility */
	    void *shm_unused2;           /* ditto - used by DIPC */
	    void *shm_unused3;           /* unused */
	};

Not only note the shm_nattch type, but also shm_segsz, and the "unused" fields in between.  I don't know a thing about the Linux kernel sources, but this doesn't seem right.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
* Tom Lane <tgl@sss.pgh.pa.us> [010306 11:03] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> >> Are there any portability problems with relying on shm_nattch to be
> >> available?  If not, I like this a lot...
>
> > Well, it's available on FreeBSD and Solaris.  I'm sure RedHat has
> > some daemon that resets the value to 0 periodically just for kicks,
> > so it might not be viable... :)
>
> I notice that our BeOS and QNX emulations of shmctl() don't support
> IPC_STAT, but that could be dealt with, at least to the extent of
> stubbing it out.

Well, since we already have spinlocks, I can't see why we can't keep the refcount and spinlock in a special place in the shm for all cases?

> This does raise the question of what to do if shmctl(IPC_STAT) fails
> for a reason other than EINVAL.  I think the conservative thing to do
> is refuse to start up.  On EPERM, for example, it's possible that there
> is a postmaster running in your PGDATA but with a different userid.

Yes, if possible a more meaningful error message and a pointer to some docco would be nice, or even a nice "I don't care, I killed all the backends, just start darnit" flag.  It's really no fun at all to have to attempt to decipher some cryptic error message at 3am when the database/system is acting up. :)

> > Seriously, there's some dispute on the type that 'shm_nattch' is:
> > under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD
> > it's 'short' (I should fix this. :)).
> > But since you're really only testing for 0-ness, it shouldn't
> > really be a problem.
>
> We need not copy the value anywhere, so as long as the struct is
> correctly declared in the system header files, I don't think it matters
> what the field type is ...

Yup, my point exactly.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Alfred Perlstein <bright@wintelcom.net> writes:
> * Tom Lane <tgl@sss.pgh.pa.us> [010306 11:03] wrote:
>> I notice that our BeOS and QNX emulations of shmctl() don't support
>> IPC_STAT, but that could be dealt with, at least to the extent of
>> stubbing it out.

> Well, since we already have spinlocks, I can't see why we can't
> keep the refcount and spinlock in a special place in the shm
> for all cases?

No, we mustn't go there.  If the kernel isn't keeping the refcount, then it's worse than useless: as soon as some process crashes without decrementing its refcount, you have a condition that you can't recover from without reboot.

What I'm currently imagining is that the stub implementations will just return a failure code for IPC_STAT, and the outer code will in turn fail with a message along the lines of "It looks like there's a pre-existing shmem block (id XXX) still in use.  If you're sure there are no old backends still running, remove the shmem block with ipcrm(1), or just delete $PGDATA/postmaster.pid."  I dunno what shmem management tools exist on BeOS/QNX, but deleting the lockfile will definitely suppress the startup interlock ;-).

> Yes, if possible a more meaningful error message and a pointer to
> some docco would be nice

Is the above good enough?

			regards, tom lane
Peter Eisentraut wrote:
> Not only note the shm_nattch type, but also shm_segsz, and the "unused"
> fields in between. I don't know a thing about the Linux kernel sources,
> but this doesn't seem right.

RedHat 7, right? My RedHat 7 system isn't running RH 7 right now (it's this notebook, which is running Win95 at the moment), but see which RPMs own the two headers. You may be in for a shock. IIRC, the first system include is from the 2.4 kernel, and the second, in the kernel source tree, is from the 2.2 kernel.

Odd, but not really broken. It should be fixed in the latest public beta of RedHat, which actually has the 2.4 kernel. I can't really say any more about that, however.

--
Lamar Owen WGCR Internet Radio 1 Peter 4:11
Peter Eisentraut <peter_e@gmx.net> writes:
> What I don't like is that my /usr/include/sys/shm.h (through other
> headers) has [foo]
> whereas /usr/src/linux/include/shm.h has [bar]

Are those declarations perhaps bit-compatible? Looks a tad endian-dependent, though ...

regards, tom lane
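The endian hazard Tom alludes to looks like this in isolation: a short-declared field overlaying a value the kernel stored as an unsigned long reads the low-order half only on little-endian machines. A standalone illustration, not from the thread:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned long nattch = 3;   /* what a new-style kernel stores */
        unsigned short as_short;    /* what an old-style struct declares */

        /* Overlay the first two bytes, as a mismatched struct would. */
        memcpy(&as_short, &nattch, sizeof as_short);

        /* Little-endian: prints 3 (low half lands first, "works").
           Big-endian: prints 0 (high half lands first, silently wrong). */
        printf("%u\n", (unsigned) as_short);
        return 0;
    }

So a header mismatch could look harmless on x86 while a big-endian port would read zero attaches and cheerfully reuse a live segment.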
* Tom Lane <tgl@sss.pgh.pa.us> [010306 11:30] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > * Tom Lane <tgl@sss.pgh.pa.us> [010306 11:03] wrote:
> >> I notice that our BeOS and QNX emulations of shmctl() don't support
> >> IPC_STAT, but that could be dealt with, at least to the extent of
> >> stubbing it out.
>
> > Well since we already have spinlocks, I can't see why we can't
> > keep the refcount and spinlock in a special place in the shm
> > for all cases?
>
> No, we mustn't go there. If the kernel isn't keeping the refcount
> then it's worse than useless: as soon as some process crashes without
> decrementing its refcount, you have a condition that you can't recover
> from without reboot.

Not if the postmaster outputs the following:

> What I'm currently imagining is that the stub implementations will just
> return a failure code for IPC_STAT, and the outer code will in turn fail
> with a message along the lines of "It looks like there's a pre-existing
> shmem block (id XXX) still in use. If you're sure there are no old
> backends still running, remove the shmem block with ipcrm(1), or just
> delete $PGDATA/postmaster.pid." I dunno what shmem management tools
> exist on BeOS/QNX, but deleting the lockfile will definitely suppress
> the startup interlock ;-).
>
> > Yes, if possible a more meaningful error message and pointer to
> > some docco would be nice
>
> Is the above good enough?

Sure. :)

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
* Lamar Owen <lamar.owen@wgcr.org> [010306 11:39] wrote:
> Peter Eisentraut wrote:
> > Not only note the shm_nattch type, but also shm_segsz, and the "unused"
> > fields in between. I don't know a thing about the Linux kernel sources,
> > but this doesn't seem right.
>
> RedHat 7, right? My RedHat 7 system isn't running RH 7 right now (it's
> this notebook, which is running Win95 at the moment), but see which RPMs
> own the two headers. You may be in for a shock. IIRC, the first system
> include is from the 2.4 kernel, and the second, in the kernel source
> tree, is from the 2.2 kernel.
>
> Odd, but not really broken. It should be fixed in the latest public beta
> of RedHat, which actually has the 2.4 kernel. I can't really say any
> more about that, however.

Y'know, I was only kidding about Linux going out of its way to defeat the 'shm_nattch' trick... *sigh*

As a FreeBSD developer I'm wondering if Linux keeps compatibility calls around for old binaries or not. Any idea?

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
* Tom Lane <tgl@sss.pgh.pa.us> [010306 11:49] wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > What I don't like is that my /usr/include/sys/shm.h (through other
> > headers) has [foo]
> > whereas /usr/src/linux/include/shm.h has [bar]
>
> Are those declarations perhaps bit-compatible? Looks a tad
> endian-dependent, though ...

Of course not: the size of the struct changed (short -> unsigned long, basically int16_t -> uint32_t). Because the kernel and userland in Linux are hardly in sync, you have the fun of guessing which combination you get:

    old struct -> old syscall (ok)
    new struct -> old syscall (boom)
    old struct -> new syscall (boom)
    new struct -> new syscall (ok)

Honestly, I think this problem should be left to the vendor to fix properly (if it needs fixing); the sysV API was published at least 6 years ago, and they ought to have it mostly correct by now.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Alfred Perlstein <bright@wintelcom.net> writes:
> Of course not: the size of the struct changed (short -> unsigned long,
> basically int16_t -> uint32_t). Because the kernel and userland in Linux
> are hardly in sync, you have the fun of guessing which combination you get:
>
>     old struct -> old syscall (ok)
>     new struct -> old syscall (boom)
>     old struct -> new syscall (boom)
>     new struct -> new syscall (ok)

Ugh. However, it looks like it might be fairly fail-soft: if we have the wrong declaration then we pick up some other field of the struct, and probably end up complaining because nattch appears nonzero. The recovery method (clean up the shm seg or delete the lockfile) is the same.

I'm still inclined to go with this; it beats corrupting the WAL log, and the fcntl(SETLK) alternative has its own set of portability booby-traps.

regards, tom lane
On Tue, Mar 06, 2001 at 08:19:12PM +0100, Peter Eisentraut wrote:
> What I don't like is that my /usr/include/sys/shm.h (through other
> headers) has:
>
> [glibc's shmid_ds declaration, quoted in full in Peter's message above]
>
> whereas /usr/src/linux/include/shm.h has:
>
> [the kernel's shmid_ds declaration, likewise quoted above]
>
> Not only note the shm_nattch type, but also shm_segsz, and the "unused"
> fields in between. I don't know a thing about the Linux kernel sources,
> but this doesn't seem right.

On Linux, /usr/src/linux/include is meaningless for anything in userland; it's meant only for building the kernel and kernel modules. That Red Hat tends to expose it to user-level builds is a long-standing bug in Red Hat's distribution, in violation of the File Hierarchy Standard as well as explicit instructions from Linus & crew and from the maintainer of the C library.

User-level programs see what's in /usr/include, which only has to match what the C library wants. It's the C library's job to do any mapping needed, and it does. The C library is maintained very, very carefully to keep binary compatibility with all old versions. (One sometimes encounters commercial programs that rely on a bug or undocumented/unsupported feature that disappears in a later library version.)

That is why there is no problem with version skew in the syscall argument structures on a correctly-configured Linux system. (On a Red Hat system it is very easy to get them out of sync, but RH fans are used to problems.)

Nathan Myers ncm@zembu.com
Bruce Momjian writes:
> This will try a pg_ctl shutdown for 60 seconds, then kill pg_ctl. You
> would then need a kill of your own.

pg_ctl automatically times out after 60 seconds.

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
Nathan Myers wrote:
> That is why there is no problem with version skew in the syscall
> argument structures on a correctly-configured Linux system. (On a
> Red Hat system it is very easy to get them out of sync, but RH fans
> are used to problems.)

Is RedHat bashing really necessary here? At least they are paying the salary of the second chair of the Linux kernel hierarchy. And they are very supportive of PostgreSQL (by shipping us with their distribution).

--
Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tue, Mar 06, 2001 at 12:46:24PM -0800, Nathan Myers wrote:
> On Linux, /usr/src/linux/include is meaningless for anything in userland;
> it's meant only for building the kernel and kernel modules. That Red Hat
> tends to expose it to user-level builds is a long-standing bug in Red
> Hat's distribution, in violation of the File Hierarchy Standard as well
> as explicit instructions from Linus & crew and from the maintainer of the
> C library.

Red Hat's Fisher Beta has split the two include trees, which caused an error trying to compile a (I guess badly configured) kernel module. The header files in /usr/include now give an error if you try to build a kernel module that gets its header files from there.

So whether they were wrong in the past or not, they are now doing things the way you say is proper.
On Tue 06 Mar 2001 18:56, Samuel Sieb wrote:
> On Tue, Mar 06, 2001 at 12:46:24PM -0800, Nathan Myers wrote:
> > On Linux, /usr/src/linux/include is meaningless for anything in userland;
> > it's meant only for building the kernel and kernel modules. That Red Hat
> > tends to expose it to user-level builds is a long-standing bug in Red
> > Hat's distribution, in violation of the File Hierarchy Standard as well
> > as explicit instructions from Linus & crew and from the maintainer of the
> > C library.
>
> Red Hat's Fisher Beta has split the two include trees, which caused an
> error trying to compile a (I guess badly configured) kernel module. The
> header files in /usr/include now give an error if you try to build a
> kernel module that gets its header files from there.
>
> So whether they were wrong in the past or not, they are now doing things
> the way you say is proper.

I am very happy to see RedHat putting out beta releases of their distribution. That's what's important about it all.

--
System Administration: It's a dirty job, but someone told me I had to do it.
-----------------------------------------------------------------
Martín Marqués                  email: martin@math.unl.edu.ar
Santa Fe - Argentina            http://math.unl.edu.ar/~martin/
System administrator at math.unl.edu.ar
-----------------------------------------------------------------
* Lamar Owen <lamar.owen@wgcr.org> [010306 13:27] wrote:
> Nathan Myers wrote:
> > That is why there is no problem with version skew in the syscall
> > argument structures on a correctly-configured Linux system. (On a
> > Red Hat system it is very easy to get them out of sync, but RH fans
> > are used to problems.)
>
> Is RedHat bashing really necessary here? At least they are paying the
> salary of the second chair of the Linux kernel hierarchy. And they are
> very supportive of PostgreSQL (by shipping us with their distribution).

Just because they do some really nice things and have some really nice stuff doesn't mean they should get cut any slack for shipping out-of-sync kernel/system headers, kill -9'ing databases, and having programs like 'tmpwatch' running on the boxes. It really shows a lack of understanding of how Unix is supposed to run.

What they really need to do is hire some grey beards (old school Unix folks) to QA the releases and keep stuff like this from happening/shipping.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
BeOS doesn't have this stat (I have a bunch of others, but not this one).

If I understand correctly, you want to check if there is some backend still attached to the shared mem segment of a given key? In this case, I have an easy solution to fake the stat, because all segments have an encoded name containing this key, so I can count them.

cyril

>
>Alfred Perlstein <bright@wintelcom.net> writes:
>>> Are there any portability problems with relying on shm_nattch to be
>>> available? If not, I like this a lot...
>
>> Well it's available on FreeBSD and Solaris, I'm sure Redhat has
>> some daemon that resets the value to 0 periodically just for kicks
>> so it might not be viable... :)
>
>I notice that our BeOS and QNX emulations of shmctl() don't support
>IPC_STAT, but that could be dealt with, at least to the extent of
>stubbing it out.
>
>This does raise the question of what to do if shmctl(IPC_STAT) fails
>for a reason other than EINVAL. I think the conservative thing to do
>is refuse to start up. On EPERM, for example, it's possible that there
>is a postmaster running in your PGDATA but with a different userid.
>
>> Seriously, there's some dispute on the type that 'shm_nattch' is,
>> under Solaris it's "shmatt_t" (unsigned long afaik), under FreeBSD
>> it's 'short' (i should fix this. :)).
>
>> But since you're really only testing for 0'ness then it shouldn't
>> really be a problem.
>
>We need not copy the value anywhere, so as long as the struct is
>correctly declared in the system header files I don't think it matters
>what the field type is ...
>
> regards, tom lane
* Cyril VELTER <cyril.velter@libertysurf.fr> [010306 16:15] wrote:
> BeOS doesn't have this stat (I have a bunch of others, but not this one).
>
> If I understand correctly, you want to check if there is some backend
> still attached to the shared mem segment of a given key? In this case,
> I have an easy solution to fake the stat, because all segments have an
> encoded name containing this key, so I can count them.

We need to be able to take a single shared memory segment and determine if any other process is using it.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
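Cyril's counting idea could be faked up roughly as follows with the BeOS kernel kit. This is a sketch from memory of the R5 API, and the encoded area name is whatever the emulation already assigns (the helper name here is made up):

    #include <OS.h>
    #include <string.h>

    /* Count how many areas (across all teams) carry the name that
       encodes the given SysV key; each attached process would own one
       such clone, so the count approximates shm_nattch. */
    static int32 count_attachments(const char *encoded_name)
    {
        int32 count = 0;
        int32 team_cookie = 0;
        team_info tinfo;

        while (get_next_team_info(&team_cookie, &tinfo) == B_OK)
        {
            int32 area_cookie = 0;
            area_info ainfo;

            while (get_next_area_info(tinfo.team, &area_cookie, &ainfo) == B_OK)
            {
                if (strcmp(ainfo.name, encoded_name) == 0)
                    count++;
            }
        }
        return count;
    }

The IPC_STAT stub could then fill in shm_nattch from this count and let the common startup check work unchanged.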
Alfred Perlstein wrote:
> What they really need to do is hire some grey beards (old school
> Unix folks) to QA the releases and keep stuff like this from
> happening/shipping.

Like the 250-strong RedHat Beta Team, of which I am a member? :-) I can't disclose the discussions on that list, but suffice it to say the traffic there is at least as great as the traffic on this one.

Of course, 7.1 hasn't shipped with a RedHat release yet -- and it's my job to make sure the postmaster gets shut down properly by the initscript inside the package for 7.1 -- there will be no kill -9 of the postmaster unless it is an emergency.

I've seen the advisories and the bug lists -- RedHat is not alone with bugs -- not even unusual with bugs. And every OS I know of (and you too) has had a brown paper bag release before. Even PostgreSQL, given its high release quality standards, has had a brown paper bag release -- we all still make mistakes (I know -- I've made more than my share of them).

Anyway, that's more than the rest of the list wanted to read. Replies to private e-mail, please.

--
Lamar Owen WGCR Internet Radio 1 Peter 4:11
On Tue, Mar 06, 2001 at 04:20:13PM -0500, Lamar Owen wrote:
> Nathan Myers wrote:
> > That is why there is no problem with version skew in the syscall
> > argument structures on a correctly-configured Linux system. (On a
> > Red Hat system it is very easy to get them out of sync, but RH fans
> > are used to problems.)
>
> Is RedHat bashing really necessary here?

I recognize that my last seven words above contributed nothing. In the future I will only post strictly factual statements about Red Hat and similarly charged topics, and keep the opinions to myself. I value the collegiality of this list too much to risk it further. I offer my apologies for violating it.

By the way... do they call Red Hat "RedHat" at Red Hat?

Nathan Myers ncm@zembu.com
Nathan Myers wrote:
> it further. I offer my apologies for violating it.

Well, an apology is not really necessary -- but I do get a little tired of the treatment a good open source company gets at the hands of open source advocates. Yes, they make mistakes. Everyone does.

> By the way... do they call Red Hat "RedHat" at Red Hat?

No, they don't. I don't know how I got into the habit of leaving out the space, but the space is supposed to be there -- unless you are on the Red Hat CD, where you will find a directory called 'RedHat'. Oh well. Totally off topic. If the From header had your personal address in it (Reply-All only lets me reply to the list for that message) I wouldn't grieve the list further with it. My last words on that subject.

Let's go on making PostgreSQL better. And preventing the kill -9 will make PostgreSQL better, even if it is masking a certain amount of shortsightedness on a certain initscripts author's part. :-)

--
Lamar Owen WGCR Internet Radio 1 Peter 4:11
ncm@zembu.com (Nathan Myers) writes:
> On Linux, /usr/src/linux/include is meaningless for anything in userland;
> it's meant only for building the kernel and kernel modules. That Red Hat
> tends to expose it to user-level builds is a long-standing bug in Red
> Hat's distribution

1) it isn't this way anymore
2) this was so for most distributions for a long time; it's not a "Red Hat" bug.

> in violation of the File Hierarchy Standard as well as explicit
> instructions from Linus & crew and from the maintainer of the C
> library.

Which obviously hasn't always been the case - the FHS isn't exactly old. Things have changed since then; we have followed.

--
Trond Eivind Glomsrød   Red Hat, Inc.
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> The interlock has to be tightly tied to the PGDATA directory, because
> what we're trying to protect is the files in and under that directory.
> It seems that something based on file(s) in that directory is the way
> to go.
>
> The best idea I've seen so far is Hiroshi's idea of having all the
> backends hold fcntl locks on the same file (probably postmaster.pid
> would do fine). Then the new postmaster can test whether any backends
> are still alive by trying to lock the old postmaster.pid file.
> Unfortunately, I read in the fcntl man page:
>
>     Locks are not inherited by a child process in a fork(2) system call.

Yes, flock() works well here but fcntl() doesn't.

> This makes the idea much less attractive than I originally thought:
> a new backend would not automatically inherit a lock on the
> postmaster.pid file from the postmaster, but would have to open/lock it
> for itself. That means there's a window where the new backend exists
> but would be invisible to a hypothetical new postmaster.
>
> We could work around this with the following, very ugly protocol:
>
> 1. Postmaster normally maintains fcntl read lock on its postmaster.pid
> file. Each spawned backend immediately opens and read-locks
> postmaster.pid, too, and holds that file open until it dies. (Thus
> wasting a kernel FD per backend, which is one of the less attractive
> things about this.) If the backend is unable to obtain read lock on
> postmaster.pid, then it complains and dies. We must use read locks
> here so that all these processes can hold them separately.
>
> 2. If a newly started postmaster sees a pre-existing postmaster.pid
> file, it tries to obtain a *write* lock on that file. If it fails,
> conclude that an old postmaster or backend is still alive; complain
> and quit. If it succeeds, sit for say 1 second before deleting the file
> and creating a new one. (The delay here is to allow any just-started
> old backends to fail to acquire read lock and quit. A possible
> objection is that we have no way to guarantee 1 second is enough, though
> it ought to be plenty if the lock acquisition is just after the fork.)

I have another idea. My main point is to not remove the existing pidfile. For example:

1) A newly started postmaster tries to obtain a write lock on the first byte of the pidfile. If it fails, the postmaster quits.
2) The postmaster tries to obtain a write lock on the second byte of the pidfile. If it fails, the postmaster quits.
3) The postmaster releases the lock of 2).
4) Each backend obtains a read lock on the second byte of the pidfile.

Regards,
Hiroshi Inoue
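A minimal sketch of what Hiroshi's two-byte protocol might look like in C. The pidfile path and messages are placeholders and error handling is reduced to the bare minimum; it is meant to show the locking sequence, not the real PostgreSQL startup code:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Set (or clear) a one-byte fcntl lock at the given offset.
       type is F_RDLCK, F_WRLCK, or F_UNLCK; returns -1 on conflict. */
    static int lock_byte(int fd, short type, off_t offset)
    {
        struct flock fl;

        fl.l_type = type;
        fl.l_whence = SEEK_SET;
        fl.l_start = offset;
        fl.l_len = 1;
        return fcntl(fd, F_SETLK, &fl);
    }

    int main(void)
    {
        int fd = open("postmaster.pid", O_RDWR | O_CREAT, 0600);

        if (fd < 0)
            return 1;

        /* Step 1: byte 0 marks the live postmaster; if another
           postmaster already holds it, give up. */
        if (lock_byte(fd, F_WRLCK, 0) < 0)
        {
            fprintf(stderr, "another postmaster is running\n");
            return 1;
        }

        /* Step 2: byte 1 can be write-locked only if no backend
           still holds a read lock on it. */
        if (lock_byte(fd, F_WRLCK, 1) < 0)
        {
            fprintf(stderr, "old backends are still running\n");
            return 1;
        }

        /* Step 3: release byte 1 so new backends can read-lock it. */
        lock_byte(fd, F_UNLCK, 1);

        /* Step 4 happens in each backend, which must re-acquire
           lock_byte(fd, F_RDLCK, 1) itself after fork(), precisely
           because fcntl locks are not inherited across fork(). */
        return 0;
    }

The pidfile itself is never deleted, which is the point of the scheme: the locks, not the file's existence, carry the liveness information, so a stale file left behind by kill -9 can't wedge a restart.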
> I have spent several days now puzzling over the corrupted WAL logfile
> that Scott Parish was kind enough to send me from a 7.1beta4 crash.
> It looks a lot like two different series of transactions were getting
> written into the same logfile. I'd been digging like mad in the WAL
> code to try to explain this as a buffer-management logic error, but
> after a fresh exchange of info it turns out that I was barking up the
> wrong tree.

Sorry about that. This is the same situation I was in myself a couple of times, and a "fresh exchange of info" was what saved me too -:) Anyway, it's good to know that it wasn't a buffer/etc logic error -:) (Actually, the logs you sent looked so grave that it was unclear how WAL worked at all -:).

Nevertheless, the subject is raised. BTW, does anybody know the results of kill -9 in Oracle/Informix/etc? Just curious -:)

Vadim
Vadim Mikheev wrote:
> Nevertheless, the subject is raised. BTW, does anybody know the results
> of kill -9 in Oracle/Informix/etc? Just curious -:)

Progress has no problem with it that I have ever seen.

Regards, Andrew.
--
_____________________________________________________________________
Andrew McMillan, e-mail: Andrew@catalyst.net.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267
Samuel Sieb <samuel@sieb.net> writes:
> On Tue, Mar 06, 2001 at 12:46:24PM -0800, Nathan Myers wrote:
> > On Linux, /usr/src/linux/include is meaningless for anything in userland;
> > it's meant only for building the kernel and kernel modules. That Red Hat
> > tends to expose it to user-level builds is a long-standing bug in Red
> > Hat's distribution, in violation of the File Hierarchy Standard as well
> > as explicit instructions from Linus & crew and from the maintainer of the
> > C library.
>
> Red Hat's Fisher Beta has split the two include trees, which caused an
> error trying to compile a (I guess badly configured) kernel module.

It was split in Red Hat Linux 7 as well.

--
Trond Eivind Glomsrød   Red Hat, Inc.