Re: 7.0RC1: possible query and backend problem - Mailing list pgsql-general

From	Tom Lane
Subject	Re: 7.0RC1: possible query and backend problem
Date	April 30, 2000 00:05:14
Msg-id	17744.957067447@sss.pgh.pa.us
In response to	Re: 7.0RC1: possible query and backend problem (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-general

I wrote:
>>> IpcMemoryCreate: shmget failed (Invalid argument) key=5432110,
>>> size=144, permission=700

> Hmm, that is odd.  The thing that looks peculiar to me is that
> it seems to be calculating a different size for the segment than
> it did the first time through:

>> # ipcs -a
>> IPC status from <running system> as of Wed Apr 19 16:45:42 2000
>> T         ID      KEY        MODE        OWNER    GROUP  CREATOR
>> CGROUP NATTCH      SEGSZ  CPID  LPID   ATIME    DTIME    CTIME
>> Shared Memory:
>> m        800   0x0052e32e --rw------- postgres postgres postgres
>> postgres      0        120 12737 12737 13:01:36 13:01:36 13:01:36

> See the difference?  120 vs 144?  What's causing that I wonder...
> and would it explain the failure to reattach?

After further investigation, "Invalid argument" is the typical kernel
error code from shmget() if one tries to attach to an existing shared
memory segment that is smaller than the size one asked for.  So that's
consistent.  The size requested for the spinlock segment (which is
the only one of Postgres' three shmem segments that could be as small
as 144 bytes) is computed by "sizeof(struct foo)"; there is no way
that that is going to change from one invocation to the next.  But
the numbers 120 and 144 are consistent with the theory that the shmem
segment was originally created by Postgres 6.5 and you are now trying
to attach to it with Postgres 7.0 --- 7.0 has more spinlocks than 6.5
did.
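
A minimal sketch of that failure mode, for anyone who wants to reproduce
it outside Postgres.  This is not Postgres code: the function name
reattach_errno and the scratch key are made up for illustration, and it
assumes a System V shared memory implementation that follows the usual
rule (an existing segment smaller than the requested size yields EINVAL).
It creates a segment of one size, asks for the same key with another
size, and reports the errno of the second call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper, not part of Postgres.
 *
 * Creates a segment of old_size bytes under `key`, then calls shmget()
 * again on the same key asking for new_size bytes.
 *
 * Returns 0 if the second shmget() succeeds, the errno it set if it
 * fails, or -1 if the scratch segment could not be created at all
 * (for example because the key is already in use).
 * The scratch segment is removed before returning.
 */
int reattach_errno(key_t key, size_t old_size, size_t new_size)
{
    int result = 0;

    /* IPC_EXCL: refuse to reuse somebody else's segment for the demo. */
    int old_id = shmget(key, old_size, IPC_CREAT | IPC_EXCL | 0600);
    if (old_id < 0)
        return -1;

    /* Second request on the same key, as a restarting server would do. */
    if (shmget(key, new_size, IPC_CREAT | 0600) < 0)
        result = errno;

    /* Nothing is attached, so this destroys the segment immediately. */
    shmctl(old_id, IPC_RMID, NULL);
    return result;
}
```

Called as reattach_errno(some_unused_key, 120, 144), this should return
EINVAL on such a system, matching the 120 vs 144 case above; with the
sizes swapped the second shmget() should succeed, since asking for less
than the existing segment holds is allowed.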

Your trace appeared to show a working 7.0 postmaster getting this error
while trying to reinitialize.  That doesn't make any sense to me; if
the 7.0 postmaster had managed to start up originally, then it must
have found or created a suitably-sized shmem segment.  So I'm confused
about the details, but I've got to think that we are looking at some
sort of interference between 6.5 and 7.0 installations.  One possibility
is that after you started the 7.0 postmaster, you accidentally tried to
start a 6.5 postmaster on the same port number, and the 6.5 code managed
to resize the shared mem segment before failing because of the port
number conflict.  Not sure if that could happen --- my shmget() man
page doesn't say anything about changing the size of an already-existing
shmem segment, but maybe your Unix works differently.

Comments anyone?  If this actually is what happened, we should reorder
the startup sequence to check for port-number conflicts before any
shared memory segments are touched.  But I'm not sure about it.

            regards, tom lane

PS: if you don't see the connection between port number and shmem,
it's this: the key numbers used for shmem segments are computed from
the port number.  So different postmasters can coexist on one machine
if they have different port numbers; they'll get different shmem
segments.  But starting two postmasters on the same port is bad news.
I thought we had adequate interlocks against that, but now I'm wondering.
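
To make the port/key link concrete, here is a small sketch that decodes
the key from the error message.  The exact formula is an assumption on
my part for illustration (key = port * 1000 + a small per-segment
offset); it is chosen because it fits the reported key=5432110 and the
default port 5432, and the names KEYS_PER_PORT, port_from_key and
offset_from_key are hypothetical.

```c
/*
 * ASSUMPTION (not confirmed in this message): each postmaster derives
 * its shmem keys as port * 1000 + offset, with offset < 1000.
 */
#define KEYS_PER_PORT 1000L

/* Port number that a key would belong to under the assumed scheme. */
long port_from_key(long key)
{
    return key / KEYS_PER_PORT;
}

/* Per-segment offset within that port's block of keys. */
long offset_from_key(long key)
{
    return key % KEYS_PER_PORT;
}
```

The KEY column in the ipcs output, 0x0052e32e, is 5432110 in decimal,
i.e. the same key as in the shmget failure message.  Under the assumed
scheme both decode to port 5432 with offset 110, while a postmaster on
port 5433 would use keys starting at 5433000 and so never touch that
segment.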
