Re: exec_execute_message crash - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: exec_execute_message crash
Msg-id 20091230.214635.09776596.t-ishii@sraoss.co.jp
In response to exec_execute_message crush  (Tatsuo Ishii <ishii@postgresql.org>)
Responses Re: exec_execute_message crash  (Andrew Dunstan <andrew@dunslane.net>)
Re: exec_execute_message crash  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> While inspecting a complaint from a pgpool user, I found that
> PostgreSQL crashes with the following stack trace:
> 
> #0  0x0826436a in list_length (l=0xaabe4e28)
>     at ../../../src/include/nodes/pg_list.h:94
> #1  0x08262168 in IsTransactionStmtList (parseTrees=0xaabe4e28)
>     at postgres.c:2429
> #2  0x0826132e in exec_execute_message (portal_name=0x857bab0 "", max_rows=0)
>     at postgres.c:1824
> #3  0x08263b2a in PostgresMain (argc=4, argv=0x84f6c28,
>     username=0x84f6b08 "t-ishii") at postgres.c:3671
> #4  0x0823299e in BackendRun (port=0x8511e68) at postmaster.c:3449
> #5  0x08231f78 in BackendStartup (port=0x8511e68) at postmaster.c:3063
> #6  0x0822f90a in ServerLoop () at postmaster.c:1387
> #7  0x0822f131 in PostmasterMain (argc=3, argv=0x84f4bf8) at postmaster.c:1040
> #8  0x081c6217 in main (argc=3, argv=0x84f4bf8) at main.c:188

Ok, I think I understand what's going on.

parse
bind
describe
execute

This sequence of commands creates a cached plan in the unnamed portal.

$5 = {name = 0x8574de4 "", prepStmtName = 0x0, heap = 0x8598400,
  resowner = 0x8598488, cleanup = 0x81632ca <PortalCleanup>,
  createSubid = 1, sourceText = 0x85ab818 " SELECT <omitted>"...,
  commandTag = 0x84682ca "SELECT", stmts = 0xaabf43b0, cplan = 0xaabf4950,
  portalParams = 0x0, strategy = PORTAL_ONE_SELECT, cursorOptions = 4,
  status = PORTAL_READY, queryDesc = 0x85abc20, tupDesc = 0x85ddcb0,
  formats = 0x85abc68, holdStore = 0x0, holdContext = 0x0,
  atStart = 1 '\001', atEnd = 1 '\001', posOverflow = 0 '\0', portalPos = 0,
  creation_time = 315487957498169, visible = 1 '\001'}

The cached plan (portal->cplan) and statements (portal->stmts) are
created by exec_bind_message():

    /*
     * Revalidate the cached plan; this may result in replanning.  Any
     * cruft will be generated in MessageContext. The plan refcount will
     * be assigned to the Portal, so it will be released at portal
     * destruction.
     */
    cplan = RevalidateCachedPlan(psrc, false);
    plan_list = cplan->stmt_list;
 

Please note that cplan and stmts belong to the same memory context.

Then the following commands arrive:

parse (invalid SQL, which aborts the transaction)
bind (error)
describe (error)
execute (crash)

The failing parse aborts the transaction, which triggers the call chain
AbortCurrentTransaction->AbortTransaction->AtAbort_Portals->ReleaseCachedPlan.
AtAbort_Portals calls ReleaseCachedPlan(portal->cplan), and
ReleaseCachedPlan calls MemoryContextDelete(plan->context), which
destroys both portal->cplan and portal->stmts.

That is why accessing portal->stmts caused the segfault.
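
The ownership problem described above can be reproduced in miniature. The
following is only a toy sketch, not PostgreSQL code: ToyContext, ToyPlan,
ToyPortal and every toy_* function are hypothetical stand-ins invented for
illustration. It shows a plan and its statement list allocated from one
context, and a portal that still claims to be bound after that context has
been deleted. The sketch deliberately never dereferences the stale pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a memory context: it owns every chunk allocated from it. */
typedef struct ToyContext {
    void *chunks[8];
    int   nchunks;
    bool  deleted;
} ToyContext;

/* Toy stand-in for a cached plan; it lives inside its own context. */
typedef struct ToyPlan {
    ToyContext *context;
    int        *stmt_list;
} ToyPlan;

/* Toy stand-in for a portal that borrows pointers into the plan's context. */
typedef struct ToyPortal {
    ToyPlan *cplan;
    int     *stmts;
    bool     bound;   /* set at bind time, never cleared by plan release */
} ToyPortal;

static void *toy_alloc(ToyContext *cxt, size_t size)
{
    void *p = calloc(1, size);

    assert(p != NULL && cxt->nchunks < 8);
    cxt->chunks[cxt->nchunks++] = p;
    return p;
}

/* Frees every chunk at once, like deleting a whole memory context. */
static void toy_context_delete(ToyContext *cxt)
{
    for (int i = 0; i < cxt->nchunks; i++)
        free(cxt->chunks[i]);
    cxt->nchunks = 0;
    cxt->deleted = true;
}

/* Bind: the plan and its statement list come from the same context. */
static ToyPortal toy_bind(ToyContext *cxt)
{
    ToyPlan  *plan = toy_alloc(cxt, sizeof(ToyPlan));
    ToyPortal portal;

    plan->context = cxt;
    plan->stmt_list = toy_alloc(cxt, sizeof(int));
    portal.cplan = plan;
    portal.stmts = plan->stmt_list;
    portal.bound = true;
    return portal;
}

/* Releasing the plan deletes its context, and the statement list with it. */
static void toy_release_plan(ToyPlan *plan)
{
    ToyContext *cxt = plan->context;   /* read before the plan is freed */

    toy_context_delete(cxt);
}

/* True when the portal still claims a plan whose memory is already gone. */
static bool toy_portal_is_dangling(const ToyPortal *portal,
                                   const ToyContext *cxt)
{
    return portal->bound && cxt->deleted;
}
```

After toy_bind() followed by toy_release_plan(portal.cplan), the portal is
left dangling: nothing told it that both of its pointers are now stale,
which is the situation exec_execute_message ran into.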

To fix this, I think exec_execute_message should throw an error if
portal->cleanup is NULL, since portal->cleanup is set to NULL by
AtAbort_Portals at transaction abort (or when the portal is dropped).

Here is a suggested fix:

diff -c postgres.c~ postgres.c
*** postgres.c~    2009-06-18 19:08:08.000000000 +0900
--- postgres.c    2009-12-30 21:34:49.000000000 +0900
***************
*** 1804,1810 ****
          dest = DestRemoteExecute;
  
      portal = GetPortalByName(portal_name);
!     if (!PortalIsValid(portal))
          ereport(ERROR,
                  (errcode(ERRCODE_UNDEFINED_CURSOR),
                   errmsg("portal \"%s\" does not exist", portal_name)));
--- 1804,1810 ----
          dest = DestRemoteExecute;
  
      portal = GetPortalByName(portal_name);
!     if (!PortalIsValid(portal) || (PortalIsValid(portal) && portal->cleanup == NULL))
          ereport(ERROR,
                  (errcode(ERRCODE_UNDEFINED_CURSOR),
                   errmsg("portal \"%s\" does not exist", portal_name)));
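
The behaviour the patch aims for can be sketched in a self-contained toy
model. This is an illustration under stated assumptions, not the backend
code: ToyPortal, toy_cleanup, toy_at_abort and toy_execute are hypothetical
names, and the "valid" flag merely stands in for PortalIsValid(). The only
idea taken from the patch is that a cleared cleanup hook marks a portal
whose resources have already been released.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyPortal ToyPortal;
typedef void (*ToyCleanupHook)(ToyPortal *);

struct ToyPortal {
    bool           valid;    /* stand-in for PortalIsValid() */
    ToyCleanupHook cleanup;  /* cleared once abort processing has run */
    int            nstmts;   /* stand-in for the statement list */
};

typedef enum { TOY_EXEC_OK, TOY_EXEC_NO_PORTAL } ToyExecResult;

/* Hypothetical cleanup hook; a real one would release executor state. */
static void toy_cleanup(ToyPortal *portal)
{
    (void) portal;
}

/* Model of abort processing: run the hook once, then clear it. */
static void toy_at_abort(ToyPortal *portal)
{
    if (portal->cleanup != NULL)
    {
        portal->cleanup(portal);
        portal->cleanup = NULL;
    }
    portal->nstmts = -1;     /* the statement list is gone after abort */
}

/* Execute refuses a portal that is missing or was already cleaned up. */
static ToyExecResult toy_execute(const ToyPortal *portal)
{
    if (portal == NULL || !portal->valid || portal->cleanup == NULL)
        return TOY_EXEC_NO_PORTAL;
    return TOY_EXEC_OK;      /* safe to look at portal->nstmts here */
}
```

In this model, a portal initialized as { true, toy_cleanup, 1 } executes
normally, and the same portal is rejected after toy_at_abort() has run. As a
side note on the condition in the patch, !A || (A && B) is logically
equivalent to !A || B, so the second PortalIsValid() test could be dropped
without changing the result.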
 

--
Tatsuo Ishii
SRA OSS, Inc. Japan

