Thread: Re: [JDBC] Strange server error with current 8.0beta driver

Re: [JDBC] Strange server error with current 8.0beta driver

From
"Barry Lind"
Date:
Tom,

Here is what you requested.  (Thanks to Oliver for the good logging in
the jdbc driver).

I also have the test case (in java) down to the bare minimum that
generated the following output (that test case is attached).  (Note that
if the FETCH in the test case is not executed then the backend crashes;
with the FETCH you get an error: "ERROR: unrecognized node type: 0")

Thanks,
--Barry


PostgreSQL 8.0devel JDBC3 with SSL (build 308)
Trying to establish a protocol version 3 connection to localhost:5432
 FE=> StartupPacket(user=blind, database=fileswfs43,
client_encoding=UNICODE, DateStyle=ISO)
 <=BE AuthenticationOk
 <=BE ParameterStatus(client_encoding = UNICODE)
 <=BE ParameterStatus(DateStyle = ISO, MDY)
 <=BE ParameterStatus(integer_datetimes = off)
 <=BE ParameterStatus(is_superuser = on)
 <=BE ParameterStatus(server_encoding = UNICODE)
 <=BE ParameterStatus(server_version = 8.0.0beta4)
 <=BE ParameterStatus(session_authorization = blind)
 <=BE BackendKeyData(pid=3348,ckey=914259969)
 <=BE ReadyForQuery(I)
    compatible = 8.0
    loglevel = 2
    prepare threshold = 1
getConnection returning
driver[className=org.postgresql.Driver,org.postgresql.Driver@a9ae05]
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er@1df5a8f, maxRows=0, fetchSize=0, flags=0
 FE=> Parse(stmt=S_1,query="BEGIN",oids={})
 FE=> Bind(stmt=S_1,portal=null)
 FE=> Execute(portal=null,limit=0)
 FE=> Parse(stmt=S_2,query="DECLARE CUR CURSOR FOR SELECT 1",oids={})
 FE=> Bind(stmt=S_2,portal=null)
 FE=> Describe(portal=null)
 FE=> Execute(portal=null,limit=0)
 FE=> Sync
 <=BE ParseComplete [S_1]
 <=BE BindComplete [null]
 <=BE CommandStatus(BEGIN)
 <=BE ParseComplete [S_2]
 <=BE BindComplete [null]
 <=BE NoData
 <=BE CommandStatus(DECLARE CURSOR)
 <=BE ReadyForQuery(T)
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er@1db699b, maxRows=0, fetchSize=0, flags=0
 FE=> Parse(stmt=S_3,query="FETCH FORWARD 10 FROM CUR",oids={})
 FE=> Bind(stmt=S_3,portal=null)
 FE=> Describe(portal=null)
 FE=> Execute(portal=null,limit=0)
 FE=> Sync
 <=BE ParseComplete [S_3]
 <=BE BindComplete [null]
 <=BE RowDescription(1)
 <=BE DataRow
 <=BE CommandStatus(FETCH)
 <=BE ReadyForQuery(T)
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er@12a0f6c, maxRows=0, fetchSize=0, flags=0
 FE=> Parse(stmt=S_4,query="CLOSE CUR",oids={})
 FE=> Bind(stmt=S_4,portal=null)
 FE=> Describe(portal=null)
 FE=> Execute(portal=null,limit=0)
 FE=> Sync
 <=BE ParseComplete [S_4]
 <=BE BindComplete [null]
 <=BE NoData
 <=BE CommandStatus(CLOSE CURSOR)
 <=BE ReadyForQuery(T)
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er@134a7d8, maxRows=0, fetchSize=0, flags=0
 FE=> Bind(stmt=S_2,portal=null)
 FE=> Describe(portal=null)
 FE=> Execute(portal=null,limit=0)
 FE=> Sync
 <=BE BindComplete [null]
 <=BE NoData
 <=BE ErrorMessage(ERROR: unrecognized node type: 0
  Location: File: clauses.c, Routine: expression_tree_mutator, Line:
3220
  Server SQLState: XX000)
java.sql.SQLException: ERROR: unrecognized node type: 0
  Location: File: clauses.c, Routine: expression_tree_mutator, Line:
3220
  Server SQLState: XX000
    at
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecu
torImpl.java:1356)
    at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImp
l.java:1151)
    at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:
166)
    at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Stateme
nt.java:363)
    at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdb
c2Statement.java:308)
    at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Stateme
nt.java:299)
    at test80.main(test80.java:31)
SQLException: SQLState(XX000)
 <=BE ReadyForQuery(E)
java.sql.SQLException: ERROR: unrecognized node type: 0
  Location: File: clauses.c, Routine: expression_tree_mutator, Line:
3220
  Server SQLState: XX000



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, November 23, 2004 7:10 AM
To: Barry Lind
Cc: pgsql-jdbc@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta
driver

"Barry Lind" <blind@xythos.com> writes:
> OK, getting closer.  The error happens if in jdbc I reuse
> PreparedStatement objects to reexecute the same set of queries
> multiple times.

The TRAP you showed looked like it might have something to do with
trying to execute code outside any transaction.  But I dunno how it got
there.

> If you still need a test case, let me know, and I will continue to
> package up what I have been working on.

What I'd actually rather have is a list of the exact sequence of
messages sent to the server.

            regards, tom lane


Attachment

Re: [JDBC] Strange server error with current 8.0beta driver

From
Oliver Jowett
Date:
Barry Lind wrote:

> I also have the test case (in java) down to the bare minimum that
> generated the following output (that test case is attached).  (Note that
> if the FETCH in the test case is not executed then the backend crashes;
> with the FETCH you get an error: "ERROR: unrecognized node type: 0")

I narrowed this down to:
  while (true) {    l_stmtDeclare.execute();  }

producing:

>  FE=> Parse(stmt=S_1,query="BEGIN",oids={})
>  FE=> Bind(stmt=S_1,portal=null)
>  FE=> Execute(portal=null,limit=0)
>  FE=> Parse(stmt=S_2,query="DECLARE CUR CURSOR FOR SELECT 1",oids={})
>  FE=> Bind(stmt=S_2,portal=null)
>  FE=> Describe(portal=null)
>  FE=> Execute(portal=null,limit=0)
>  FE=> Sync
>  <=BE ParseComplete [S_1]
>  <=BE BindComplete [null]
>  <=BE CommandStatus(BEGIN)
>  <=BE ParseComplete [S_2]
>  <=BE BindComplete [null]
>  <=BE NoData
>  <=BE CommandStatus(DECLARE CURSOR)
>  <=BE ReadyForQuery(T)
> simple execute, handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandler@7ecd2c3c, maxRows=0,
fetchSize=0,flags=0
 
>  FE=> Bind(stmt=S_2,portal=null)
>  FE=> Describe(portal=null)
>  FE=> Execute(portal=null,limit=0)
>  FE=> Sync
>  <=BE BindComplete [null]
>  <=BE NoData
>  <=BE ErrorMessage(ERROR: unrecognized node type: 2139062143
>   Location: File: clauses.c, Routine: expression_tree_mutator, Line: 3237
>   Server SQLState: XX000)

Valgrind says this is the culprit:

> ==26451== Invalid read of size 4
> ==26451==    at 0x8185C86: eval_const_expressions_mutator (clauses.c:1185)
> ==26451==    by 0x8185C32: eval_const_expressions (clauses.c:1152)
> ==26451==    by 0x817D1A6: preprocess_expression (planner.c:415)
> ==26451==    by 0x817CEBF: subquery_planner (planner.c:240)
> ==26451==    by 0x817CD59: planner (planner.c:129)
> ==26451==    by 0x810DF03: PerformCursorOpen (portalcmds.c:87)
> ==26451==    by 0x81C1402: PortalRunUtility (pquery.c:934)
> ==26451==    by 0x81C1762: PortalRunMulti (pquery.c:1001)
> ==26451==    by 0x81C0D8E: PortalRun (pquery.c:617)
> ==26451==    by 0x81BDDA7: exec_execute_message (postgres.c:1673)
> ==26451==    by 0x81BF6E1: PostgresMain (postgres.c:3035)
> ==26451==    by 0x818FC39: BackendRun (postmaster.c:2817)
> ==26451==    by 0x818F642: BackendStartup (postmaster.c:2453)
> ==26451==    by 0x818D989: ServerLoop (postmaster.c:1198)
> ==26451==    by 0x818CDBA: PostmasterMain (postmaster.c:917)
> ==26451==    by 0x81570F4: main (main.c:268)
> ==26451==  Address 0x1BBBF704 is 260 bytes inside a block of size 1024 free'd
> ==26451==    at 0x1B905460: free (vg_replace_malloc.c:153)
> ==26451==    by 0x8245706: AllocSetDelete (aset.c:466)
> ==26451==    by 0x82468B8: MemoryContextDelete (mcxt.c:193)
> ==26451==    by 0x8247BCF: PortalDrop (portalmem.c:384)
> ==26451==    by 0x82475B5: CreatePortal (portalmem.c:179)
> ==26451==    by 0x81BD735: exec_bind_message (postgres.c:1369)
> ==26451==    by 0x81BF4EF: PostgresMain (postgres.c:3023)
> ==26451==    by 0x818FC39: BackendRun (postmaster.c:2817)
> ==26451==    by 0x818F642: BackendStartup (postmaster.c:2453)
> ==26451==    by 0x818D989: ServerLoop (postmaster.c:1198)
> ==26451==    by 0x818CDBA: PostmasterMain (postmaster.c:917)
> ==26451==    by 0x81570F4: main (main.c:268)

With a bit of gdb work, I think what is happening is this:

The first Execute of S_2, running in portal context, calls the planner 
on the query contained in S_2's DeclareCursorStmt. The planner modifies 
the query tree in the course of planning it (specifically, it modifies 
parse->targetList). Memory allocated for the modified query comes from 
the portal context.

The portal context is freed implicitly by the second Bind of S_2 (second 
stack trace above).

The second Execute of S_2 then tries to use parse->targetList when 
planning (first stack trace above), but that's now pointing to freed 
memory. Boom.

Perhaps PerformCursorOpen should copy the query tree before planning, or 
plan in a different memory context?

-O


Re: [JDBC] Strange server error with current 8.0beta driver

From
Oliver Jowett
Date:
Oliver Jowett wrote:

> Perhaps PerformCursorOpen should copy the query tree before planning, or
> plan in a different memory context?

Patch attached. It moves query planning inside the new portal's memory
context. With this applied I can run Barry's testcase without errors,
and valgrind seems OK with it too.

-O
Index: src/backend/commands/portalcmds.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/portalcmds.c,v
retrieving revision 1.36
diff -c -r1.36 portalcmds.c
*** src/backend/commands/portalcmds.c    16 Sep 2004 16:58:28 -0000    1.36
--- src/backend/commands/portalcmds.c    24 Nov 2004 09:28:34 -0000
***************
*** 67,73 ****
       * query, so we are not expecting rule rewriting to do anything
       * strange.
       */
!     rewritten = QueryRewrite((Query *) stmt->query);
      if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query))
          elog(ERROR, "unexpected rewrite result");
      query = (Query *) linitial(rewritten);
--- 67,86 ----
       * query, so we are not expecting rule rewriting to do anything
       * strange.
       */
!
!     /* Create a new portal, and do all query planning on a copy of
!      * the query allocated in the new portal's memory context. The
!      * planner may modify the query, and it is not safe to have
!      * those modifications persist as we are ourselves running in a
!      * transient portal context.
!      */
!     portal = CreatePortal(stmt->portalname, false, false);
!
!     oldContext = MemoryContextSwitchTo(PortalGetHeapMemory(portal));
!
!     query = copyObject(stmt->query);
!
!     rewritten = QueryRewrite(query);
      if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query))
          elog(ERROR, "unexpected rewrite result");
      query = (Query *) linitial(rewritten);
***************
*** 86,102 ****

      plan = planner(query, true, stmt->options, NULL);

-     /*
-      * Create a portal and copy the query and plan into its memory
-      * context.
-      */
-     portal = CreatePortal(stmt->portalname, false, false);
-
-     oldContext = MemoryContextSwitchTo(PortalGetHeapMemory(portal));
-
-     query = copyObject(query);
-     plan = copyObject(plan);
-
      PortalDefineQuery(portal,
                        NULL,        /* unfortunately don't have sourceText */
                        "SELECT", /* cursor's query is always a SELECT */
--- 99,104 ----

Re: [JDBC] Strange server error with current 8.0beta driver

From
Tom Lane
Date:
Oliver Jowett <oliver@opencloud.com> writes:
>> Perhaps PerformCursorOpen should copy the query tree before planning, or 
>> plan in a different memory context?

> Patch attached. It moves query planning inside the new portal's memory 
> context. With this applied I can run Barry's testcase without errors, 
> and valgrind seems OK with it too.

I think the better solution is the first way (copy the querytree first).
The problem with the way you did it is that all the temporary structures
built by the planner will be left behind in the cursor's memory context,
and can't be reclaimed until the cursor is destroyed.  In the case of a
complex query that could represent a pretty serious memory leak.  It
seems better to eat the cost of copying the querytree an extra time,
especially since this way forms a patch that's easy to reverse whenever
we fix the planner to be less cavalier about scribbling on its input.

I've applied the attached patch instead (and analogously in 7.4 branch).
Would you confirm it fixes the problem you see?
        regards, tom lane

*** src/backend/commands/portalcmds.c.orig    Thu Sep 16 12:58:28 2004
--- src/backend/commands/portalcmds.c    Sun Nov 28 17:02:22 2004
***************
*** 62,73 ****         RequireTransactionChain((void *) stmt, "DECLARE CURSOR");      /*      * The query has been
throughparse analysis, but not rewriting or      * planning as yet.  Note that the grammar ensured we have a SELECT
* query, so we are not expecting rule rewriting to do anything      * strange.      */
 
!     rewritten = QueryRewrite((Query *) stmt->query);     if (list_length(rewritten) != 1 || !IsA(linitial(rewritten),
Query))        elog(ERROR, "unexpected rewrite result");     query = (Query *) linitial(rewritten);
 
--- 62,82 ----         RequireTransactionChain((void *) stmt, "DECLARE CURSOR");      /*
+      * Because the planner is not cool about not scribbling on its input,
+      * we make a preliminary copy of the source querytree.  This prevents
+      * problems in the case that the DECLARE CURSOR is in a portal and is
+      * executed repeatedly.  XXX the planner really shouldn't modify its
+      * input ... FIXME someday.
+      */
+     query = copyObject(stmt->query);
+ 
+     /*      * The query has been through parse analysis, but not rewriting or      * planning as yet.  Note that the
grammarensured we have a SELECT      * query, so we are not expecting rule rewriting to do anything      * strange.
*/
 
!     rewritten = QueryRewrite(query);     if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query))
elog(ERROR,"unexpected rewrite result");     query = (Query *) linitial(rewritten);