Re: PostgreSQL 7.4RC1 crashes on Panther - Mailing list pgsql-bugs

From Scott Goodwin
Subject Re: PostgreSQL 7.4RC1 crashes on Panther
Msg-id AFFD4FA2-108B-11D8-8B0B-000A95A0910A@scottg.net
In response to Re: PostgreSQL 7.4RC1 crashes on Panther  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi Tom,

On Nov 4, 2003, at 4:48 PM, Tom Lane wrote:

>> Here's the code that triggers it:
>> create function pltcl_call_handler() RETURNS LANGUAGE_HANDLER
>>     as 'pltcl.so' language 'c';
>
> I don't think so.  That's a startup failure; it can not be triggered by
> executing a SQL command, because if the postmaster is alive enough to
> accept a SQL command in the first place, it's already gotten past
> creation of the shared memory segment.

I have to differ here. This problem is being triggered by the create
function statement above; it happens after startup, and it happens on
Mac OS X 10.3. Here are the commands I'm using, in the order I'm using
them. I'll be glad to admit I'm the one screwing it up, but I don't see
where.

# Define vars
ROOT=/Users/scott/m
INSTALL=$ROOT/install
PG=$INSTALL/postgresql
PGLIB=$PG/lib
PGDATA=$ROOT/var/db
PORT=5432
DB=m

DYLD_LIBRARY_PATH=$INSTALL/tcl/lib:$INSTALL/postgresql/lib:$INSTALL/openssl/lib
export DYLD_LIBRARY_PATH
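
Since the crash only shows up when pltcl.so is loaded, it may be worth confirming that every directory named in DYLD_LIBRARY_PATH really exists. The following is only a sketch: check_lib_path is a hypothetical helper, not part of PostgreSQL or Mac OS X, and it checks directories only, not the libraries inside them.

```shell
#!/bin/sh
# Hypothetical helper: report each entry of a colon-separated path list
# as "ok" (directory exists) or "missing".
check_lib_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        if [ -d "$dir" ]; then
            echo "ok      $dir"
        else
            echo "missing $dir"
        fi
    done
    IFS=$old_ifs
}

# Check whatever is currently exported (prints nothing if it is unset).
check_lib_path "${DYLD_LIBRARY_PATH:-}"
```

A wrapped or mistyped entry in the variable would show up here as a "missing" line.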


# Initialize the database cluster
$PG/bin/initdb -D $PGDATA --locale=C -L $PG/share

...output of the above command is:

The files belonging to this database system will be owned by user "scott".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /Users/scott/m/var/db... ok
creating directory /Users/scott/m/var/db/base... ok
creating directory /Users/scott/m/var/db/global... ok
creating directory /Users/scott/m/var/db/pg_xlog... ok
creating directory /Users/scott/m/var/db/pg_clog... ok
selecting default max_connections... 30
selecting default shared_buffers... 200
creating configuration files... ok
creating template1 database in /Users/scott/m/var/db/base/1... ok
initializing pg_shadow... ok
enabling unlimited row size for system tables... ok
initializing pg_depend... ok
creating system views... ok
loading pg_description... ok
creating conversions... ok
setting privileges on built-in objects... ok
creating information schema... ok
vacuuming database template1... ok
copying template1 to template0... ok

Success. You can now start the database server using:

     /Users/scott/m/install/postgresql/bin/postmaster -D /Users/scott/m/var/db
or
     /Users/scott/m/install/postgresql/bin/pg_ctl -D /Users/scott/m/var/db -l logfile start



# Start the database
$PG/bin/pg_ctl start -D $PGDATA -l $ROOT/database/postgres.log -o "-i"

...at this point the database is running, as shown by ps:

scott  2712   0.0  0.1    37288    936 std  S    12:10PM   0:00.02 /Users/scott/m/install/postgresql/bin/postmaster -i -D /Users/scott/m/var/db
scott  2715   0.0  0.0    38276    168 std  S    12:10PM   0:00.00 /Users/scott/m/install/postgresql/bin/postmaster -i -D /Users/scott/m/var/db
scott  2717   0.0  0.0    37288    260 std  S    12:10PM   0:00.00 /Users/scott/m/install/postgresql/bin/postmaster -i -D /Users/scott/m/var/db

...and by the log file:

LOG:  database system was shut down at 2003-11-06 12:10:49 CST
LOG:  checkpoint record is at 0/9B13D8
LOG:  redo record is at 0/9B13D8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 534; next OID: 17142
LOG:  database system is ready


# Create the database
$PG/bin/psql -d template1 -c "create database $DB"

...output on the command line:
CREATE DATABASE


# Add PL/pgsql and PL/tcl
$PG/bin/psql -d $DB -f $OPS/database/sql/add_languages.sql

...output on the command line is:

psql:/Users/scott/m/ops/database/sql/add_languages.sql:13: server closed the connection unexpectedly
         This probably means the server terminated abnormally
         before or while processing the request.
psql:/Users/scott/m/ops/database/sql/add_languages.sql:13: connection to server was lost

...output in the log file is:

LOG:  server process (PID 2739) was terminated by signal 10
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
FATAL:  could not create shared memory segment: Cannot allocate memory
DETAIL:  Failed system call was shmget(key=5432001, size=3809280, 03600).
HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory or swap space. To reduce the request size (currently 3809280 bytes), reduce PostgreSQL's shared_buffers parameter (currently 300) and/or its max_connections parameter (currently 50).
         The PostgreSQL documentation contains more information about shared memory configuration.

...at this point, the server is no longer running.



The add_languages.sql file contains:

create function plpgsql_call_handler() RETURNS LANGUAGE_HANDLER
    as 'plpgsql.so' language 'c';

create trusted procedural language 'plpgsql'
    HANDLER plpgsql_call_handler
    LANCOMPILER 'PL/pgSQL';

create function pltcl_call_handler() RETURNS LANGUAGE_HANDLER
    as 'pltcl.so' language 'c';

create trusted procedural language 'pltcl'
    HANDLER pltcl_call_handler
    LANCOMPILER 'PL/Tcl';


(Line 13 of my add_languages.sql corresponds to the creation of the
pltcl call handler -- I left off comments at the top of the file when I
copied and pasted it here).

The above process worked fine with PostgreSQL 7.3.4 on Mac OS 10.2.8.


The next thing I tried was reducing the shared memory footprint:
   max_connections = 10
   shared_buffers = 40

I then wiped out the database area, and followed the exact same process
above. This time around, it didn't complain about shmget problems, but
it still caught a SIGBUS; it restarted gracefully, as shown by the log
file:

LOG:  server process (PID 2959) was terminated by signal 10
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2003-11-06 12:28:02 CST
LOG:  checkpoint record is at 0/9B13D8
LOG:  redo record is at 0/9B13D8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 534; next OID: 17142
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 0/9B1418
LOG:  record with zero length at 0/9CDA00
LOG:  redo done at 0/9CD9DC
LOG:  database system is ready


The final thing I tried was altering the add_languages.sql file. I
commented out the parts that loaded Tcl, wiped out the database, and
followed the same procedure above, leaving max_connections and
shared_buffers as defaults (50 and 300). This worked great -- I can
load PL/pgsql fine; it's only when I attempt to load Tcl that it barfs.


>> Not sure whether this is a PostgreSQL problem or a Mac OS 10.3
>> problem,
>
> It's a user problem.  If you're going to run multiple
> shared-memory-using applications, it's up to you to adjust the kernel
> limit or the per-application requests to fit.  I can't tell from this
> what other app is using shared memory, though.  Are you trying to start
> more than one postmaster?  If not, see whether OS X provides "ipcs" ---
> that would give you some data about what shared-memory requests are
> already present in the system.

After this last test I started Mac OS X's Activity Monitor and looked
at the postgres processes -- there were three, as shown in the 'ps'
output above. Shared memory size was between 3 and 5 MB for each. This
is on a PowerBook with 1GB of memory, and with Activity Monitor showing
626MB of that as being free. VM size is showing 3.84GB. I'm as sure as
I can be that I'm not running into a resource problem.
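
For the "ipcs" check suggested in the quoted text, something like the following could be used to list the System V shared memory segments that already exist. This is a minimal sketch; it assumes ipcs is installed and accepts the common -m option, and the IPCS_FOUND variable is just an illustrative name.

```shell
#!/bin/sh
# List existing System V shared memory segments, if ipcs is available.
if command -v ipcs >/dev/null 2>&1; then
    IPCS_FOUND=yes
    ipcs -m
else
    IPCS_FOUND=no
    echo "ipcs not found on this system"
fi
```

Any segment left behind by a crashed postmaster should be visible in that listing.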


I added the following to the /System/Library/StartupItems/SystemTuning
file:

sysctl -w kern.sysv.shmmax=167772160 # bytes: 160 megs
sysctl -w kern.sysv.shmmin=1
sysctl -w kern.sysv.shmmni=32
sysctl -w kern.sysv.shmseg=8
sysctl -w kern.sysv.shmall=65536 # 4k pages: 256 megs

I then rebooted and reran the experiment -- the problem still exists.
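
As a sanity check on those numbers, the arithmetic can be sketched in shell. The values are copied from the sysctl lines and from the failed shmget() call in the log; the 4096-byte page size is an assumption taken from the "4k pages" comment, and the variable names are only illustrative.

```shell
#!/bin/sh
# Values copied from the sysctl settings and the shmget() failure above.
SHMMAX_BYTES=167772160   # kern.sysv.shmmax
SHMALL_PAGES=65536       # kern.sysv.shmall, counted in pages
PAGE_BYTES=4096          # assumed page size (4k)
REQUEST_BYTES=3809280    # size from the failed shmget() call

SHMALL_BYTES=$((SHMALL_PAGES * PAGE_BYTES))
SHMMAX_MB=$((SHMMAX_BYTES / 1048576))
SHMALL_MB=$((SHMALL_BYTES / 1048576))

echo "shmmax: ${SHMMAX_MB} MB, shmall: ${SHMALL_MB} MB"

# The request fits only if it is within both limits.
if [ "$REQUEST_BYTES" -le "$SHMMAX_BYTES" ] && [ "$REQUEST_BYTES" -le "$SHMALL_BYTES" ]; then
    FITS=yes
else
    FITS=no
fi
echo "request of ${REQUEST_BYTES} bytes fits: ${FITS}"
```

On paper, then, a request of under 4 MB is far below both limits once these settings are in effect.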


One thing I'm going to try next is using an earlier version of GCC.
Panther defaults to:

    gcc (GCC) 3.3 20030304 (Apple Computer, Inc. build 1495);

I've used gcc_select to go back to GCC 3.1 and I'm rebuilding all the
parts now.


I'll keep digging as I have time.


thanks,

/s.
