Thread: signal 11

signal 11

From
"Tanya Mamedalin"
Date:
============================================================================
                        POSTGRESQL BUG REPORT TEMPLATE
============================================================================


Your name               : Tanya Mamedalin
Your email address      : tmamedalin@registrypro.pro


System Configuration
---------------------
  Architecture (example: Intel Pentium)         : XSER305 INTEL PENTIUM 4
  Operating System (example: Linux 2.0.26 ELF)  : RedHat 7.3

  PostgreSQL version (example: PostgreSQL-7.4):   PostgreSQL-7.4

  Compiler used (example:  gcc 2.95.2)          : gcc version 2.96 20000731
(Red Hat Linux 7.3 2.96-110)


Please enter a FULL description of your problem:
------------------------------------------------

I was running the same Python code with Postgres 7.2.3 and psycopg 1.0.9
with no errors.  As soon as I upgraded to Postgres 7.4-0.3 and pyscopg 1.1.6
I started getting errors when trying to execute a certain query.  The
offending multi-part query is shown below.  It seemed to specifically not
like the subquery portion.  To circumvent the problem I changed the subquery
from ".pathname = (..subquery...)" to "e.pathname IN (...subquery...)"
suspecting that postgres was not happy with the one-to-one relation even
though the subquery was guaranteed to only return one record.



**---------------BEGIN ERROR----------------------------------------------
Traceback (most recent call last):
  File "./auto_escalate.py", line 100, in ?
    conn3 = c3.execute(third_query)
psycopg.ProgrammingError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

select distinct o.outage_id, o.machine, o.service, o.level,
                o.comments, o2.time, e.num
                from outages o, outages o2, escalation e
                where o.outage_id='31009'
                and e.pathname=(select pathname from escalation_link el
                where el.machine=o.machine and el.service=o.service)
                and e.level=o.level and o2.outage_id=o.outage_id
                and o2.status in ('ALERT','WARN','CRIT','DOWN','UNKNOWN')
order by level desc limit 1
There was an error connecting to the database
Traceback (most recent call last):
  File "./ops2server.py", line 60, in dbopen
    db=psycopg.connect(host=HOST,user=username, database=DB,
password=passwd)
OperationalError: FATAL:  the database system is starting up
**-----------------END ERROR----------------------------------------------


From the postgres logs I was showing that every time the offending query was
attempted the process would receive a signal 11, quit and restart.


**-----------------BEGIN LOG----------------------------------------------
2004-01-23 10:44:00 LOG:  database system is ready
2004-01-23 10:44:20 LOG:  connection received: host=[local] port=
2004-01-23 10:44:20 LOG:  connection authorized: user=postgres database=ops2
2004-01-23 10:44:45 LOG:  server process (PID 18880) was terminated by
signal 11
2004-01-23 10:44:45 LOG:  terminating any other active server processes
2004-01-23 10:44:45 LOG:  all server processes terminated; reinitializing
2004-01-23 10:44:45 LOG:  database system was interrupted at 2004-01-23
10:44:00 EST
2004-01-23 10:44:45 LOG:  checkpoint record is at 0/D0337C
2004-01-23 10:44:45 LOG:  redo record is at 0/D0337C; undo record is at 0/0;
shutdown TRUE
2004-01-23 10:44:45 LOG:  next transaction ID: 1242; next OID: 98209
2004-01-23 10:44:45 LOG:  database system was not properly shut down;
automatic recovery in progress
2004-01-23 10:44:45 LOG:  connection received: host=[local] port=
2004-01-23 10:44:45 LOG:  record with zero length at 0/D033BC
2004-01-23 10:44:45 LOG:  redo is not required
2004-01-23 10:44:45 FATAL:  the database system is starting up
2004-01-23 10:44:47 LOG:  database system is ready
**-----------------END LOG----------------------------------------------





Please describe a way to repeat the problem.   Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------


Possible ways to reproduce the error:  Run a multipart query similar to:

SELECT DISTINCT o.outage_id, o.machine, o.service, o.level, o.comments,
o2.time, e.num
       FROM outages o, outages o2, escalation e
       WHERE o.outage_id='31009'
       AND e.pathname = (SELECT pathname from escalation_link el
                         WHERE el.machine=o.machine AND el.service=o.service)
       AND e.level=o.level
     AND o2.outage_id=o.outage_id
       AND o2.status in ('ALERT','WARN','CRIT','DOWN','UNKNOWN')
     ORDER BY LEVEL DESC LIMIT 1;





Thanks,
Tanya Mamedalin

Re: signal 11

From
Tom Lane
Date:
"Tanya Mamedalin" <tmamedalin@registrypro.pro> writes:
> Possible ways to reproduce the error:  Run a multipart query similar to:

> SELECT DISTINCT o.outage_id, o.machine, o.service, o.level, o.comments,
> o2.time, e.num
>        FROM outages o, outages o2, escalation e
>        WHERE o.outage_id='31009'
>        AND e.pathname = (SELECT pathname from escalation_link el
>                          WHERE el.machine=o.machine AND el.service=o.service)
>        AND e.level=o.level
>      AND o2.outage_id=o.outage_id
>        AND o2.status in ('ALERT','WARN','CRIT','DOWN','UNKNOWN')
>      ORDER BY LEVEL DESC LIMIT 1;

This is difficult to do when you have not offered the definitions of the
tables used by the query ...

            regards, tom lane

Re: signal 11

From
Tom Lane
Date:
"Tanya Mamedalin" <tmamedalin@registrypro.pro> writes:
> I was running the same Python code with Postgres 7.2.3 and psycopg 1.0.9
> with no errors.  As soon as I upgraded to Postgres 7.4-0.3 and pyscopg 1.1.6
> I started getting errors when trying to execute a certain query.

BTW, this might be the same bug fixed here:
http://archives.postgresql.org/pgsql-committers/2003-11/msg00251.php

If setting "enable_hashjoin" to OFF makes the crash go away, then it
probably is the same bug, and 7.4.1 has a fix.  If you still see the
problem in 7.4.1, please file a more complete bug report.

            regards, tom lane

Re: signal 11

From
"Tanya Mamedalin"
Date:
Sorry, I wasn't sure how much information I should provide.  To be thorough:

-Tanya

----------------------------------------------------------------------------
-

-- TOC Entry ID 18 (OID 16580)
--
-- Name: escalation Type: TABLE Owner: postgres
--

CREATE TABLE "escalation" (
    "pathname" character varying(20) NOT NULL,
    "level" smallint,
    "contact" character varying(20),
    "notify_method" character varying(12),
    "days" character varying(12),
    "hours" character varying(12),
    "period" integer,
    "num" integer
);

--
-- TOC Entry ID 36 (OID 16611)
--
-- Name: outages Type: TABLE Owner: postgres
--

CREATE TABLE "outages" (
    "outage_id" serial NOT NULL,
    "machine" character varying(30),
    "service" character varying(30),
    "status" character varying(12) NOT NULL,
    "time" timestamp with time zone,
    "owner" character varying(60),
    "level" smallint DEFAULT 0 NOT NULL,
    "comments" text
);


--
-- TOC Entry ID 40 (OID 16618)
--
-- Name: escalation_link Type: TABLE Owner: postgres
--

CREATE TABLE "escalation_link" (
    "machine" character varying(30),
    "service" character varying(30),
    "status" character varying(12),
    "pathname" character varying(20)
);

--
-- TOC Entry ID 41 (OID 16618)
--
-- Name: escalation_link Type: ACL Owner:
--

REVOKE ALL on "escalation_link" from PUBLIC;
GRANT ALL on "escalation_link" to "postgres";
GRANT SELECT on "escalation_link" to "ops2";
GRANT SELECT on "escalation_link" to GROUP "operators";
GRANT ALL on "escalation_link" to GROUP "opsadmins";
--
-- TOC Entry ID 37 (OID 16611)
--
-- Name: outages Type: ACL Owner:
--

REVOKE ALL on "outages" from PUBLIC;
GRANT ALL on "outages" to "postgres";
GRANT INSERT,SELECT,UPDATE,DELETE on "outages" to "ops2";
GRANT INSERT,SELECT on "outages" to GROUP "operators";
GRANT ALL on "outages" to GROUP "opsadmins";
--
-- TOC Entry ID 19 (OID 16580)
--
-- Name: escalation Type: ACL Owner:
--

REVOKE ALL on "escalation" from PUBLIC;
GRANT ALL on "escalation" to "postgres";
GRANT SELECT on "escalation" to "ops2";
GRANT SELECT on "escalation" to GROUP "operators";



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, January 26, 2004 11:59 AM
To: Tanya Mamedalin
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] signal 11


"Tanya Mamedalin" <tmamedalin@registrypro.pro> writes:
> Possible ways to reproduce the error:  Run a multipart query similar to:

> SELECT DISTINCT o.outage_id, o.machine, o.service, o.level, o.comments,
> o2.time, e.num
>        FROM outages o, outages o2, escalation e
>        WHERE o.outage_id='31009'
>        AND e.pathname = (SELECT pathname from escalation_link el
>                          WHERE el.machine=o.machine AND el.service=o.service)
>        AND e.level=o.level
>      AND o2.outage_id=o.outage_id
>        AND o2.status in ('ALERT','WARN','CRIT','DOWN','UNKNOWN')
>      ORDER BY LEVEL DESC LIMIT 1;

This is difficult to do when you have not offered the definitions of the
tables used by the query ...

            regards, tom lane

Re: signal 11

From
Tom Lane
Date:
"Tanya Mamedalin" <tmamedalin@registrypro.pro> writes:
> Sorry, I wasn't sure how much information I should provide.  To be thorough:

Thanks for the background.  I tried the query with these table
definitions and it didn't crash for me.  This may mean that the bug is
fixed in 7.4.1, or it might just mean that the problem is not triggered
when executing the query with empty tables.  Please try 7.4.1 and report
back if you still see the failure.

            regards, tom lane