Re: [GENERAL] server auto-restarts and ipcs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [GENERAL] server auto-restarts and ipcs
Msg-id 26671.1100025872@sss.pgh.pa.us
List pgsql-hackers
"Ed L." <pgsql@bluepolka.net> writes:
> A power failure led to failed postmaster restart using 7.4.6 (see output
> below).  The short-term fix is usually to delete the pid file and restart.
> I often wonder why ipcs never seems to show the shared memory 
> block in question?

> 2004-11-08 17:17:22.398 [18038] FATAL:  pre-existing shared memory block (key 9746001, ID 658210829) is still in use

I did a bit of experimentation and found that the Linux kernel does seem
to reproducibly assign similar shmem IDs from one boot cycle to the
next.  Here's a smoking-gun case:

$ sudo ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 65536      postgres  600        10436608   1
0x00000000 131073     gdm       600        393216     2          dest
0x00530201 163842     tgl       600        10395648   2

[ reboot ]

$ sudo ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 65536      postgres  600        10436608   1
0x00530201 98305      tgl       600        10395648   2
0x00000000 163842     gdm       600        393216     2          dest

The "tgl" entry is a manually-started postmaster, which in the second
boot cycle I was able to start before gdm came up.  Notice that gdm has
been handed out a shmid that belonged to a different userID in the
previous boot cycle.

What this says is that given a little bit of variability in the boot
cycle, it is fairly likely for the postmaster.pid file to contain a
shared memory ID that has already been assigned to another daemon in
the current boot cycle.  The way that PGSharedMemoryIsInUse() is coded,
this will result in a failure as exhibited by Ed, because shmctl() will
fail with EACCES and we interpret that as a conflicting shmem segment.
(The reason this is considered dangerous is it suggests that there might
be backends still alive from a crashed previous postmaster; we dare not
start new backends that are not in sync with the old ones.)
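
To make the failure mode concrete, here is a minimal sketch of the kind of
check described above.  It is not the actual PostgreSQL source: the function
names are hypothetical, and the only assumption is that a failed
shmctl(IPC_STAT) is treated as "in use" for every errno except EINVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: interpret the errno left by a failed
 * shmctl(IPC_STAT).  Returns true if startup should be refused.
 */
bool
stat_failure_means_in_use(int err)
{
    if (err == EINVAL)
        return false;           /* no segment with that ID exists */
    return true;                /* anything else, EACCES included, blocks startup */
}

/*
 * Hypothetical stand-in for the check: does the segment whose ID was
 * recorded in postmaster.pid still look alive?
 */
bool
segment_looks_in_use(int shmid)
{
    struct shmid_ds st;

    if (shmctl(shmid, IPC_STAT, &st) < 0)
        return stat_failure_means_in_use(errno);

    /* Segment exists and is readable: in use if any process is attached. */
    return st.shm_nattch > 0;
}
```

With this logic, a stale ID that now belongs to some other user's segment
yields EACCES and is reported as a conflict, which matches Ed's symptom.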

After thinking about this awhile, I believe that it is safe to consider
EACCES as a don't-care situation.  EACCES could only happen if the shmem
ID belongs to a different userid, which implies that it is not a
postgres shared memory segment.  Even if you are running postmasters
under multiple userids, this can be ignored, because all that we care
about is whether the shared memory segment could indicate the presence
of backends running in the current $PGDATA directory.  With the file
permissions that we use, it is not possible for a shared memory segment
to belong to a userid different from the one that owns the data
directory, and so any postmaster having a different userid must be
managing a different data directory.

So we could reduce our exposure to failure-to-start conditions by
treating EACCES as "not in use" in PGSharedMemoryIsInUse().  Does anyone
see a flaw in this reasoning?
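
As a rough illustration of the proposed change (a sketch only, with a
hypothetical function name, not a patch against the real source), the errno
classification would gain one more don't-care case:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical relaxed policy for a failed shmctl(IPC_STAT).
 * Returns true if startup should be refused.
 */
bool
stat_failure_means_in_use_relaxed(int err)
{
    switch (err)
    {
        case EINVAL:
            return false;       /* segment no longer exists */
        case EACCES:
            return false;       /* owned by another userid, so not ours */
        default:
            return true;        /* stay conservative about anything else */
    }
}
```

Every other error still refuses to start, so the only behavior that changes
is the cross-userid case argued above.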

This isn't a complete solution, because if you are running multiple
postmasters under the *same* userid, they could still get confused.
We could probably fix that by marking each shmem seg to indicate which
data directory it goes with (eg, store the directory's inode number in
the seg header).  If we see an apparently live shmem segment of our own
userid, we could attach to it and check the header to determine whether
it's really a conflict or not.  There might be some portability issues
here though; didn't we find out that Windows doesn't really have inode
numbers?
        regards, tom lane
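
The data-directory marking idea could look roughly like the following.  This
is a sketch under stated assumptions: the header struct and function names
are hypothetical, and storing the device number next to the inode is an
addition of the sketch (inode numbers are only unique within one
filesystem).  It uses POSIX stat() and does nothing to answer the Windows
question.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical fields that would sit at the start of the shmem segment. */
typedef struct SegHeaderSketch
{
    dev_t       datadir_device; /* assumption: also record the device */
    ino_t       datadir_inode;  /* inode of the data directory */
} SegHeaderSketch;

/* Record which data directory a freshly created segment belongs to. */
bool
stamp_header(SegHeaderSketch *hdr, const char *datadir)
{
    struct stat st;

    if (stat(datadir, &st) < 0)
        return false;
    hdr->datadir_device = st.st_dev;
    hdr->datadir_inode = st.st_ino;
    return true;
}

/* After attaching to a live segment: does it belong to this directory? */
bool
header_matches_datadir(const SegHeaderSketch *hdr, const char *datadir)
{
    struct stat st;

    if (stat(datadir, &st) < 0)
        return false;
    return hdr->datadir_device == st.st_dev &&
        hdr->datadir_inode == st.st_ino;
}
```

A postmaster that finds a live segment of its own userid would attach, call
something like header_matches_datadir() with its own $PGDATA, and only treat
a match as a real conflict.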

