Thread: server auto-restarts and ipcs

server auto-restarts and ipcs

From
"Ed L."
Date:
A power failure led to failed postmaster restart using 7.4.6 (see output
below).  The short-term fix is usually to delete the pid file and restart.

I often wonder why ipcs never seems to show the shared memory
block in question?  Am I using the wrong command?  Does the key
mentioned by pgsql map to the key in the ipcs output?  And if the
shared segment is simply not there, would it be possible for pgsql to
figure that out ala Apache, search the process table, and go ahead
and restart if it didn't see a postmaster already running?  I'm sure this
has been asked and answered, I just couldn't find it via google...

TIA.

Ed

Database and process is pg746dba...

$ cat logs-pg746-7.4.6/server_log.Mon
pg_ctl: Another postmaster may be running.  Trying to start postmaster anyway.
2004-11-08 17:17:22.398 [18038] FATAL:  pre-existing shared memory block (key 9746001, ID 658210829) is still in use
HINT:  If you're sure there are no old server processes still running, remove the shared memory block with the command
"ipcrm",or just delete the file "/users/pg746dba/dbclusters/pg746/postgresql-7.4.6/data/postmaster.pid". 
pg_ctl: cannot start postmaster
Examine the log output.

$ ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 32768      ed        777        393216     2          dest
0x00000000 131073     root      644        110592     4          dest
0x00000000 3538946    ed        777        393216     2          dest
0x00000000 3670019    ed        777        393216     2          dest
0x00000000 4685828    ed        777        393216     2          dest
0x00000000 4816901    ed        777        393216     2          dest
0x00000000 4915206    ed        777        393216     2          dest
0x00000000 4980743    ed        777        393216     2          dest
0x00000000 5046280    ed        777        393216     2          dest
0x00000000 5111817    ed        777        393216     2          dest
0x00000000 5537802    root      644        110592     3          dest
0x00000000 6651915    ed        777        393216     2          dest
0x00000000 19595276   ed        666        14400      1          dest
0x00000000 11272205   root      644        110592     2          dest

------ Semaphore Arrays --------
key        semid      owner      perms      nsems

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages



Re: server auto-restarts and ipcs

From
Tom Lane
Date:
"Ed L." <pgsql@bluepolka.net> writes:
> A power failure led to failed postmaster restart using 7.4.6 (see output
> below).  The short-term fix is usually to delete the pid file and restart.

> I often wonder why ipcs never seems to show the shared memory
> block in question?

The shared memory block would certainly not still exist after a system
reboot, so what we have here is a misleading error message.  Looking at
the code, the most plausible explanation appears to be that
shmctl(IPC_STAT) is failing (which it ought to) and returning some errno
code different from EINVAL (which is the case we are expecting to see).
What platform are you on, and what does its shmctl(2) man page document
as error conditions?

            regards, tom lane
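
The decision rule described in this message can be pictured roughly as follows. This is a sketch of the rule as stated here, not the PostgreSQL source; the function name is hypothetical, and the errno values come from Python's standard errno module.

```python
import errno

def old_segment_looks_in_use(shmctl_errno):
    """Interpret the errno from a failed shmctl(IPC_STAT) on the old segment ID.

    Only EINVAL (no such segment) is taken as proof that the segment is gone.
    Any other error is conservatively treated as "may still be in use".
    """
    return shmctl_errno != errno.EINVAL

# Show how a few candidate error codes would be classified under this rule.
for code in (errno.EINVAL, errno.EACCES, errno.EIDRM):
    print(errno.errorcode[code], old_segment_looks_in_use(code))
```

Under such a rule, any errno other than EINVAL would lead to the "still in use" message even when the old segment no longer exists.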

Re: server auto-restarts and ipcs

From
"Ed L."
Date:
On Monday November 8 2004 6:16, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > A power failure led to failed postmaster restart using 7.4.6 (see
> > output below).  The short-term fix is usually to delete the pid file
> > and restart.
> >
> > I often wonder why ipcs never seems to show the shared memory
> > block in question?
>
> The shared memory block would certainly not still exist after a system
> reboot, so what we have here is a misleading error message.  Looking at
> the code, the most plausible explanation appears to be that
> shmctl(IPC_STAT) is failing (which it ought to) and returning some errno
> code different from EINVAL (which is the case we are expecting to see).
> What platform are you on, and what does its shmctl(2) man page document
> as error conditions?

Platform is Linux 2.4.20-30.9 on i686 (Pentium 4, I think).

From man 2 shmctl:

ERRORS
       On error, errno will be set to one of the following:

       EACCES      is returned if IPC_STAT is requested and shm_perm.modes
                   does not allow read access for shmid.

       EFAULT      The argument cmd has value IPC_SET or IPC_STAT but the
                   address pointed to by buf isn’t accessible.

       EINVAL      is returned if shmid is not a valid identifier, or cmd
                   is not a valid command.

       EIDRM       is returned if shmid points to a removed identifier.

       EPERM       is returned if IPC_SET or IPC_RMID is attempted, and the
                   effective user ID of the calling process is not the
                   creator (as found in shm_perm.cuid), the owner (as found
                   in shm_perm.uid), or the super-user.

       EOVERFLOW   is returned if IPC_STAT is attempted, and the gid or uid
                   value is too large to be stored in the structure pointed
                   to by buf.


CONFORMING TO
       SVr4, SVID.  SVr4 documents additional error conditions EINVAL,
       ENOENT, ENOSPC, ENOMEM, EEXIST.  Neither SVr4 nor SVID documents an
       EIDRM error condition.


Re: server auto-restarts and ipcs

From
"Ed L."
Date:
On Monday November 8 2004 7:24, Ed L. wrote:
> On Monday November 8 2004 6:16, Tom Lane wrote:
> > "Ed L." <pgsql@bluepolka.net> writes:
> > > A power failure led to failed postmaster restart using 7.4.6 (see
> > > output below).  The short-term fix is usually to delete the pid file
> > > and restart.
> > >
> > > I often wonder why ipcs never seems to show the shared memory
> > > block in question?
> >
> > The shared memory block would certainly not still exist after a system
> > reboot, so what we have here is a misleading error message.  Looking at
> > the code, the most plausible explanation appears to be that
> > shmctl(IPC_STAT) is failing (which it ought to) and returning some
> > errno code different from EINVAL (which is the case we are expecting to
> > see). What platform are you on, and what does its shmctl(2) man page
> > document as error conditions?
>
> Platform is Linux 2.4.20-30.9 on i686 (Pentium 4, I think).

I recently saw this same thing happen after a power failure on several HP-UX
boxes as well (I think running B.11.00/11.23 with 7.3.4/7.3.7, but I'm not
sure).

Ed


Re: server auto-restarts and ipcs

From
Tom Lane
Date:
"Ed L." <pgsql@bluepolka.net> writes:
> A power failure led to failed postmaster restart using 7.4.6 (see
> output below).  The short-term fix is usually to delete the pid file
> and restart.

Thinking some more about this ... does anyone know the algorithm used
in Linux to assign shared memory segment IDs?

Your report shows about a dozen shmem segments in use, which would make
the probability of an accidental collision pretty tiny.  But if the
kernel's assignment algorithm is nonrandom then it'd be plausible for
the Postgres shmem ID from the previous system boot cycle to match
one of the shmem IDs already handed out in the current boot cycle.
In that case we'd get EACCES from shmctl() which we take to be a trouble
indication.  (This is probably over-conservatism, but I don't want to
relax it without knowing for sure that we need to.)

BTW, do you know what all those shmem segments are for?  My Linux box
shows only one segment in use besides the ones Postgres is using.

            regards, tom lane

Re: server auto-restarts and ipcs

From
"Ed L."
Date:
On Monday November 8 2004 8:41, Tom Lane wrote:
>
> BTW, do you know what all those shmem segments are for?  My Linux box
> shows only one segment in use besides the ones Postgres is using.

Looks like Ximian Evolution apps, X, Mozilla, Wombat, etc ...

Ed


Re: server auto-restarts and ipcs

From
Oliver Elphick
Date:
On Mon, 2004-11-08 at 17:47 -0700, Ed L. wrote:
> I often wonder why ipcs never seems to show the shared memory
> block in question?

The permissions of the shared memory block and the semaphore arrays are
600.  ipcs seems not to report objects which you cannot access.  Run
ipcs as root and you should see the PostgreSQL shared memory segment and
semaphores.

--
Oliver Elphick                                          olly@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E  1EC0 5664 7A2F A543 10EA
                 ========================================
     "O death, where is thy sting? O grave, where is
      thy victory?"             1 Corinthians 15:55


Re: server auto-restarts and ipcs

From
"Ed L."
Date:
On Tuesday November 9 2004 2:16, Oliver Elphick wrote:
> On Mon, 2004-11-08 at 17:47 -0700, Ed L. wrote:
> > I often wonder why ipcs never seems to show the shared memory
> > block in question?
>
> The permissions of the shared memory block and the semaphore arrays are
> 600.  ipcs seems not to report objects which you cannot access.  Run
> ipcs as root and you should see the PostgreSQL shared memory segment and
> semaphores.

I don't see them when running ipcs as root, either.  Not sure that would
make sense given the shared memory is created as the same user running
ipcs...

Ed



Re: server auto-restarts and ipcs

From
Oliver Elphick
Date:
On Tue, 2004-11-09 at 07:00 -0700, Ed L. wrote:
> On Tuesday November 9 2004 2:16, Oliver Elphick wrote:
> > On Mon, 2004-11-08 at 17:47 -0700, Ed L. wrote:
> > > I often wonder why ipcs never seems to show the shared memory
> > > block in question?
> >
> > The permissions of the shared memory block and the semaphore arrays are
> > 600.  ipcs seems not to report objects which you cannot access.  Run
> > ipcs as root and you should see the PostgreSQL shared memory segment and
> > semaphores.
>
> I don't see them when running ipcs as root, either.  Not sure that would
> make sense given the shared memory is created as the same user running
> ipcs...

If neither root nor their creator can see them, I assume they don't
exist.  Certainly, with Linux 2.6 and util-linux 2.12, ipcs sees the
postgres objects whether it is run by root or by the postgres user.



Re: server auto-restarts and ipcs

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> "Ed L." <pgsql@bluepolka.net> writes:
> > A power failure led to failed postmaster restart using 7.4.6 (see
> > output below).  The short-term fix is usually to delete the pid file
> > and restart.
>
> Thinking some more about this ... does anyone know the algorithm used
> in Linux to assign shared memory segment IDs?

At least in 2.6 it seems to avoid reuse of ids by keeping a global counter
that is incremented every time a segment is created which ranges from 0..128k
that it multiplies by 32k and adds to the array index (which is reused
quickly).

So it doesn't seem plausible that there was an id collision unless this was
different in 2.4.20. However looking at his list of ids they're all separated
by multiples of 32769 which is what you would expect from this algorithm at
least until they start being reused.

--
greg
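
To make the layout concrete, here is a small Python sketch that splits each shmid from Ed's ipcs listing into a counter part and an array-slot part. It assumes that "32k" means exactly 32768 and that an ID is simply counter * 32768 + slot; the names SEQ_MULTIPLIER and split_id are illustrative, not taken from the kernel source.

```python
SEQ_MULTIPLIER = 32768  # "32k" in the description above; assumed exact value

def split_id(ipc_id):
    """Return (sequence counter, array slot) for an IPC id."""
    return divmod(ipc_id, SEQ_MULTIPLIER)

# shmid column from Ed's ipcs listing, in the order shown there.
shmids = [32768, 131073, 3538946, 3670019, 4685828, 4816901, 4915206,
          4980743, 5046280, 5111817, 5537802, 6651915, 19595276, 11272205]

for shmid in shmids:
    seq, slot = split_id(shmid)
    print(f"{shmid:>9}  seq={seq:<4} slot={slot}")

# The stale ID reported in the FATAL message.
print(split_id(658210829))
```

Under that assumption the fourteen listed IDs decode to slots 0 through 13 in listing order, and the stale ID 658210829 from the error message decodes to counter 20087, slot 13, the same slot as shmid 11272205 but with a different counter value.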

Re: server auto-restarts and ipcs

From
Greg Stark
Date:
Greg Stark <gsstark@MIT.EDU> writes:

> At least in 2.6 it seems to avoid reuse of ids by keeping a global counter
> that is incremented every time a segment is created which ranges from 0..128k
> that it multiplies by 32k and adds to the array index (which is reused
> quickly).
>
> So it doesn't seem plausible that there was an id collision unless this was
> different in 2.4.20. However looking at his list of ids they're all separated
> by multiples of 32769 which is what you would expect from this algorithm at
> least until they start being reused.

Oh I missed the fact that you were talking about after a reboot. So the
algorithm I described would produce exactly the same sequence of ids after any
reboot given the same sequence of creation and deletions. Even if there's a
different sequence as long as the n'th creation is for the m'th array slot it
would get the same id. So collisions would be very common.

(though it seems the sequence is shared across all the ipc objects.)

--
greg
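
The reboot argument can be illustrated with a toy model in Python. This is only a sketch of the scheme as described in this thread, not the kernel code: the class name, the lowest-free-slot policy, and the exact constants for "32k" and "128k" are all assumptions.

```python
SEQ_MULTIPLIER = 32768  # assumed value of "32k"
SEQ_LIMIT = 131072      # assumed value of "128k"; the counter wraps below this

class IdAllocator:
    """Toy model of the id scheme described above; state starts fresh at boot."""

    def __init__(self):
        self.seq = 0
        self.slots = []  # slots[i] holds the id in array slot i, or None if free

    def create(self):
        # Reuse the lowest free slot, as an array-backed table might.
        try:
            slot = self.slots.index(None)
        except ValueError:
            slot = len(self.slots)
            self.slots.append(None)
        new_id = self.seq * SEQ_MULTIPLIER + slot
        self.seq = (self.seq + 1) % SEQ_LIMIT
        self.slots[slot] = new_id
        return new_id

    def remove(self, ipc_id):
        self.slots[ipc_id % SEQ_MULTIPLIER] = None

def boot_and_run():
    """Replay one fixed sequence of creations and removals after a 'boot'."""
    alloc = IdAllocator()
    a = alloc.create()
    b = alloc.create()
    alloc.remove(a)
    c = alloc.create()
    return [a, b, c]

first_boot = boot_and_run()
second_boot = boot_and_run()
print(first_boot, second_boot, first_boot == second_boot)
```

Because nothing in the model depends on anything but the order of creations and removals, two boots that replay the same order hand out identical IDs, which is the collision scenario described above.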

Re: server auto-restarts and ipcs

From
Tom Lane
Date:
Greg Stark <gsstark@MIT.EDU> writes:
> Oh I missed the fact that you were talking about after a reboot. So the
> algorithm I described would produce exactly the same sequence of ids after any
> reboot given the same sequence of creation and deletions. Even if there's a
> different sequence as long as the n'th creation is for the m'th array slot it
> would get the same id. So collisions would be very common.

This seems to square with Ed's complaint that he frequently sees a
collision after a reboot.  I've just committed some code that makes a
more extensive check as to whether a pre-existing segment actually has
any relevance to our data directory; should fix the problem.

            regards, tom lane

Re: server auto-restarts and ipcs

From
"Ed L."
Date:
On Tuesday November 9 2004 1:37, Tom Lane wrote:
> >> The shared memory block would certainly not still exist after a system
> >> reboot, so what we have here is a misleading error message.  Looking
> >> at the code, the most plausible explanation appears to be that
> >> shmctl(IPC_STAT) is failing (which it ought to) and returning some
> >> errno code different from EINVAL (which is the case we are expecting
> >> to see).
>
> I believe the attached patch will fix this problem for you, at least on
> the assumption that you are starting only one postmaster at system boot.

Just realizing we do start multiple postmasters under same user id when
upgrading a cluster (one on old port, one on new).

I noticed that ipcs on my linux box has a command-line option to list the
pid that created the segment.  Not sure if such a library exists in usable
form, but looking for segments owned by the downed postmaster's pid would
seem to be what is needed.  Just a thought...

Ed


Re: server auto-restarts and ipcs

From
Tom Lane
Date:
"Ed L." <pgsql@bluepolka.net> writes:
> I noticed that ipcs on my linux box has a command-line option to list the
> pid that created the segment.  Not sure if such a library exists in usable
> form, but looking for segments owned by the downed postmaster's pid would
> seem to be what is needed.  Just a thought...

[ thinks about it... ]  Nah, it's still not bulletproof, because in a
system reboot situation you can't trust the old PID either.  It could
easily be that the other guy gets both the PID and the shmem ID that
belonged to you last time.

I've committed changes for 8.0 that mark a shmem segment with the inode
of the associated data directory; that should be a stable enough ID to
handle all routine-reboot cases.  (If you had to restore your whole
filesystem from backup tapes, it might be wrong, but you're going to be
doing such recovery manually anyway ...)

            regards, tom lane
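
The idea of tagging a segment with the data directory's inode can be sketched in Python as below. This is an illustration of the comparison only, not the committed 8.0 code: the dictionary standing in for a segment header, its "inode" field name, and both helper functions are hypothetical.

```python
import os
import tempfile

def data_dir_inode(path):
    """Inode number of a data directory, as reported by stat()."""
    return os.stat(path).st_ino

def segment_matches(segment_header, data_dir):
    """True if the inode recorded in the segment is this data directory's."""
    return segment_header["inode"] == data_dir_inode(data_dir)

# Two stand-in "data directories" on the same filesystem.
with tempfile.TemporaryDirectory() as parent:
    ours = os.path.join(parent, "ours")
    theirs = os.path.join(parent, "theirs")
    os.mkdir(ours)
    os.mkdir(theirs)

    # Hypothetical headers left in segments by two different postmasters.
    own_header = {"inode": data_dir_inode(ours)}
    foreign_header = {"inode": data_dir_inode(theirs)}

    own_match = segment_matches(own_header, ours)
    foreign_match = segment_matches(foreign_header, ours)
    print(own_match, foreign_match)
```

A segment whose recorded inode differs from that of the data directory being started can be treated as unrelated, even if its shmem ID happens to equal the one left in the old pid file.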

Re: server auto-restarts and ipcs

From
"Ed L."
Date:
On Tuesday November 9 2004 4:35, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > I noticed that ipcs on my linux box has a command-line option to list
> > the pid that created the segment.  Not sure if such a library exists in
> > usable form, but looking for segments owned by the downed postmaster's
> > pid would seem to be what is needed.  Just a thought...
>
> [ thinks about it... ]  Nah, it's still not bulletproof, because in a
> system reboot situation you can't trust the old PID either.  It could
> easily be that the other guy gets both the PID and the shmem ID that
> belonged to you last time.

I see.  Ipcs on my box also lists the date/time of shared memory segment
attach/detach/change (ipcs -t), but ...

> I've committed changes for 8.0 that mark a shmem segment with the inode
> of the associated data directory; that should be a stable enough ID to
> handle all routine-reboot cases.  (If you had to restore your whole
> filesystem from backup tapes, it might be wrong, but you're going to be
> doing such recovery manually anyway ...)

...that will remove a major hassle for us and lots of others.  Thanks.

Ed