Thread: two shared memory segments?

two shared memory segments?

From

"Ed L."

Date:

21 December 2005, 21:36:34

I have a cluster configured for ~800MB of shared memory cache
(shared_buffers = 100000), but ipcs shows TWO shared memory
segments of ~800MB belonging to that postmaster.  What kind of
a problem do I have here?

T      ID     KEY        MODE        OWNER     GROUP   CREATOR    CGROUP NATTCH  SEGSZ  CPID  LPID   ATIME    DTIME    CTIME
Shared Memory:
m  114695 0x00000000 D-rw-------    pg       pg    pg       pg      2 861011968 17065 17065  7:00:07 13:38:22 13:38:22
m   16396 0x0089d911 --rw-------    pg       pg    pg       pg     47 861011968 17065 17065 13:38:22 no-entry 13:38:22

The "D" in the MODE for the first one means "the associated
shared memory segment has been removed.  It will disappear
when the last process attached to the segment detaches it."
(from 'man ipcs')
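
As a rough illustration of reading that listing mechanically, the sketch below parses the two "m" rows quoted above and reports any segment that is marked removed ("D" in MODE) yet still has attached processes (NATTCH > 0). The names ShmSegment, parse_ipcs_rows and lingering are made up for this example, and the column positions are assumed from this particular ipcs layout; other platforms print different columns.

```python
from typing import List, NamedTuple


class ShmSegment(NamedTuple):
    shmid: int
    key: str
    mode: str
    nattch: int
    segsz: int
    cpid: int


def parse_ipcs_rows(text: str) -> List[ShmSegment]:
    """Parse shared memory rows in the 15-column layout quoted in the post.

    Assumed column order:
    T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME
    """
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Shared memory rows start with "m"; skip headers and anything else.
        if len(fields) < 12 or fields[0] != "m":
            continue
        segments.append(ShmSegment(
            shmid=int(fields[1]),
            key=fields[2],
            mode=fields[3],
            nattch=int(fields[8]),
            segsz=int(fields[9]),
            cpid=int(fields[10]),
        ))
    return segments


def lingering(segments: List[ShmSegment]) -> List[ShmSegment]:
    """Return segments already removed ('D' flag) but still attached."""
    return [s for s in segments if s.mode.startswith("D") and s.nattch > 0]


# The two rows from the post, copied verbatim.
SAMPLE = """\
m  114695 0x00000000 D-rw-------    pg       pg    pg       pg      2 861011968 17065 17065  7:00:07 13:38:22 13:38:22
m   16396 0x0089d911 --rw-------    pg       pg    pg       pg     47 861011968 17065 17065 13:38:22 no-entry 13:38:22
"""

if __name__ == "__main__":
    for seg in lingering(parse_ipcs_rows(SAMPLE)):
        print(f"shmid {seg.shmid}: removed, {seg.nattch} process(es) "
              f"still attached, creator pid {seg.cpid}")
```

On the sample rows this flags only shmid 114695, the segment with two attached processes.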

However, ipcs says pid 17065 (the live postmaster pid)
created them both.  The postmaster has been running for
about 130 days, but the ATIME/DTIME/CTIME columns seem to
suggest both segments are still being accessed.

Ed
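
For reference, the "~800MB" figure can be checked against the SEGSZ column with a little arithmetic. This is only a sketch: it assumes the default 8192-byte block size (the post does not state it), and the function names are invented for the example. Whatever is left over after the buffers is presumably other shared structures, which the thread does not break down.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size in bytes


def buffer_pool_bytes(shared_buffers: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes occupied by the buffer pool itself."""
    return shared_buffers * block_size


def segment_overhead_bytes(segsz: int, shared_buffers: int,
                           block_size: int = BLOCK_SIZE) -> int:
    """Bytes in the segment beyond the buffer pool."""
    return segsz - buffer_pool_bytes(shared_buffers, block_size)


if __name__ == "__main__":
    shared_buffers = 100000   # from the post
    segsz = 861011968         # SEGSZ reported by ipcs for both segments

    pool = buffer_pool_bytes(shared_buffers)
    print(pool)                                           # 819200000
    print(pool // 2**20)                                  # 781 (whole MiB)
    print(segment_overhead_bytes(segsz, shared_buffers))  # 41811968
```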

Re: two shared memory segments?

From

Tom Lane

Date:

21 December 2005, 23:24:52

"Ed L." <pgsql@bluepolka.net> writes:
> I have a cluster configured for ~800MB of shared memory cache
> (shared_buffers = 100000), but ipcs shows TWO shared memory
> segments of ~800MB belonging to that postmaster.  What kind of
> a problem do I have here?

I'd say that you had a backend crash, causing the postmaster to abandon
the original shared memory segment and make a new one, but the old
segment is still attached to by a couple of processes.

There was a bug awhile back whereby the stats support processes failed
to detach from shared memory and thus would cause a dead shmem segment
to hang around like this.  What PG version are you running?

            regards, tom lane

Re: two shared memory segments?

From

Tom Lane

Date:

22 December 2005, 00:34:40

Ed Loehr <ed@loehrtech.com> writes:
> On Wednesday December 21 2005 8:24 pm, Tom Lane wrote:
>> I'd say that you had a backend crash, causing the postmaster
>> to abandon the original shared memory segment and make a new
>> one, but the old segment is still attached to by a couple of
>> processes.

> Does that make sense even if the creating pid is the same for
> both?

Sure.  The postmaster survives backend crashes --- that's the point
of having a separate postmaster process at all.

>> There was a bug awhile back whereby the stats support
>> processes failed to detach from shared memory and thus would
>> cause a dead shmem segment to hang around like this.  What PG
>> version are you running?

> This is an old 7.3.7 cluster.

[ digs in CVS logs... ]  Hmm.  AFAICT that bug was fixed in 7.3.5:

2003-11-30 16:56  tgl

    * src/: backend/port/sysv_shmem.c, backend/postmaster/pgstat.c,
    include/storage/pg_shmem.h (REL7_3_STABLE): Back-patch fix to cause
    stats processes to detach from shared memory, so that they do not
    prevent the postmaster from deleting the shmem segment during crash
    recovery.

You sure it's a 7.3.7 postmaster?  Can you dig down to determine exactly
which processes are attached to the older shmem segment?

            regards, tom lane
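
One possible way to answer the "which processes are attached" question is sketched below. It rests on a loud assumption: a Linux host, where /proc/<pid>/maps lists each attached System V segment as a mapping named /SYSV<key> with the shmid in the inode column. The ipcs output in this thread looks like it comes from a different Unix, where another tool would be needed, so treat this as an illustration rather than a recipe for that machine. The function names and the sample maps line in the usage note are made up.

```python
import glob
import re
from typing import List, Optional

# Assumption: Linux /proc/<pid>/maps format, i.e.
#   address perms offset dev inode pathname
# with SysV segments shown as "/SYSV<8 hex digits>" and inode == shmid.
_SYSV_RE = re.compile(
    r"^\S+\s+\S+\s+\S+\s+\S+\s+(\d+)\s+/SYSV[0-9a-fA-F]{8}"
)


def shmid_from_maps_line(line: str) -> Optional[int]:
    """Return the shmid if this maps line is a SysV shared memory mapping."""
    match = _SYSV_RE.match(line)
    return int(match.group(1)) if match else None


def pids_attached_to(shmid: int) -> List[int]:
    """Scan /proc for processes that have the given segment mapped."""
    pids = []
    for path in glob.glob("/proc/[0-9]*/maps"):
        try:
            with open(path) as fh:
                if any(shmid_from_maps_line(line) == shmid for line in fh):
                    pids.append(int(path.split("/")[2]))
        except OSError:
            # The process exited mid-scan, or we lack permission to read it.
            continue
    return sorted(pids)


if __name__ == "__main__":
    # 114695 is the shmid of the removed segment in the ipcs listing.
    print(pids_attached_to(114695))
```

A hypothetical maps line such as "7f1c2a000000-7f1c5d540000 rw-s 00000000 00:05 114695 /SYSV00000000 (deleted)" would be matched by shmid_from_maps_line and yield 114695. Running the scan as root, or as the database owner, avoids silently skipping processes whose maps file is unreadable.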

Re: two shared memory segments?

From

Ed Loehr

Date:

22 December 2005, 12:19:26

On Wednesday December 21 2005 8:24 pm, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > I have a cluster configured for ~800MB of shared memory
> > cache (shared_buffers = 100000), but ipcs shows TWO shared
> > memory segments of ~800MB belonging to that postmaster.
> > What kind of a problem do I have here?
>
> I'd say that you had a backend crash, causing the postmaster
> to abandon the original shared memory segment and make a new
> one, but the old segment is still attached to by a couple of
> processes.

Does that make sense even if the creating pid is the same for
both?

> There was a bug awhile back whereby the stats support
> processes failed to detach from shared memory and thus would
> cause a dead shmem segment to hang around like this.  What PG
> version are you running?

This is an old 7.3.7 cluster.

Ed