Re: Something fishy happening on frogmouth - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Something fishy happening on frogmouth
Date
Msg-id CA+Tgmob859sdnDuHAF31BE55qEoREnCzzWeqDbgkNB0d_F+zmA@mail.gmail.com
In response to Something fishy happening on frogmouth  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Something fishy happening on frogmouth  (Andres Freund <andres@2ndquadrant.com>)
Re: Something fishy happening on frogmouth  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Oct 29, 2013 at 3:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The last two buildfarm runs on frogmouth have failed in initdb,
> like this:
>
> creating directory d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 100
> selecting default shared_buffers ... 128MB
> selecting dynamic shared memory implementation ... windows
> creating configuration files ... ok
> creating template1 database in d:/mingw-bf/root/HEAD/pgsql.2492/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not open shared memory segment "Global/PostgreSQL.851401618": Not enough space
> child process exited with exit code 1
>
> It shouldn't be failing like that, considering that we just finished
> probing for acceptable max_connections and shared_buffers without hitting
> any apparent limit.  I suppose it's possible that the final shm segment
> size is a bit larger than what was tested at the shared_buffer step,
> but that doesn't seem very likely to be the explanation.  What seems
> considerably more probable is that the probe for a shared memory
> implementation is screwing up the system state somehow.  It may not be
> unrelated that this machine was happy before commit d2aecae went in.

If I'm reading this correctly, the last three runs on frogmouth have
all failed, and all of them have failed with a complaint about,
specifically, Global/PostgreSQL.851401618.  Now, that really shouldn't
be happening, because the code to choose that number looks like this:
       dsm_control_handle = random();

One possibility that occurs to me is that if, for some reason, we're
using the same handle every time on Windows, and if Windows takes a
bit of time to reclaim the segment after the postmaster exits (which
is not hard to believe given some previous Windows behavior I've
seen), then running the postmaster lots of times in quick succession
(as initdb does) might fail.  I dunno what that has to do with the
patch, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


