Re: libxml incompatibility - Mailing list pgsql-hackers

From Tom Lane
Subject Re: libxml incompatibility
Date
Msg-id 4223.1237691806@sss.pgh.pa.us
Whole thread Raw
In response to Re: libxml incompatibility  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> We could possibly use xmlMemGet() to fetch the prior settings and then
> restore them after we are done, but making sure that happens after an
> error would be a bit tricky.

I experimented with this a bit, and came up with the attached patch.
Basically what it does is revert libxml to its native memory management
methods anytime LibxmlContext doesn't exist.  It fixes Alvaro's original
test case and some variants that I stumbled across, but I can't say that
I have a lot of faith in it.  I see at least a couple of risk factors:

* it doesn't scale to the case where some other code is doing the same
kind of thing --- the pointers we saved during xml_init might or might
not still be appropriate to restore at end of transaction.

* suppose that a plperl function does some Perlish XML stuff, then calls
a SQL function that calls something in xml.c.  When we start up use of
LibxmlContext we'll wipe the internal state of libxml (which we *have*
to do; this still crashes trivially without the added xmlCleanupParser
call).  Can this break anything that the perl XML code is expecting to
still be valid when control gets back to it?

If this doesn't work then I'm afraid we'll need some radical rethinking
of the way we handle libxml memory management...

Please test.  I'm not much with either Perl or XML and have little
idea of how to stress this.

            regards, tom lane

Index: src/backend/utils/adt/xml.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.83
diff -c -r1.83 xml.c
*** src/backend/utils/adt/xml.c    7 Jan 2009 13:44:37 -0000    1.83
--- src/backend/utils/adt/xml.c    22 Mar 2009 03:00:34 -0000
***************
*** 40,45 ****
--- 40,49 ----
   * not very good about specifying this, but for now we assume that
   * xmlCleanupParser() will get rid of anything we need to worry about.
   *
+  * libxml's original memory management callbacks are saved when we create
+  * LibxmlContext, and restored when we delete it.  This is so that there
+  * is some hope for other code (eg, plperl) to use libxml without crashing.
+  *
   * We use palloc --- which will throw a longjmp on error --- for allocation
   * callbacks that officially should act like malloc, ie, return NULL on
   * out-of-memory.  This is a bit risky since there is a chance of leaving
***************
*** 93,98 ****
--- 97,106 ----

  static StringInfo xml_err_buf = NULL;
  static MemoryContext LibxmlContext = NULL;
+ static xmlFreeFunc libxml_freeFunc = NULL;
+ static xmlMallocFunc libxml_mallocFunc = NULL;
+ static xmlReallocFunc libxml_reallocFunc = NULL;
+ static xmlStrdupFunc libxml_strdupFunc = NULL;

  static void xml_init(void);
  static void xml_memory_init(void);
***************
*** 1224,1237 ****
       * sure it doesn't go away before we've called xmlCleanupParser().
       */
      if (LibxmlContext == NULL)
          LibxmlContext = AllocSetContextCreate(TopMemoryContext,
                                                "LibxmlContext",
                                                ALLOCSET_DEFAULT_MINSIZE,
                                                ALLOCSET_DEFAULT_INITSIZE,
                                                ALLOCSET_DEFAULT_MAXSIZE);

!     /* Re-establish the callbacks even if already set */
!     xmlMemSetup(xml_pfree, xml_palloc, xml_repalloc, xml_pstrdup);
  }

  static void
--- 1232,1260 ----
       * sure it doesn't go away before we've called xmlCleanupParser().
       */
      if (LibxmlContext == NULL)
+     {
+         /*
+          * First, run xmlCleanupParser() to get rid of any libxml data
+          * structures that exist now.  If there are any, they were created
+          * with the native memory-management functions and will cause big
+          * trouble if they're touched using our functions.
+          */
+         xmlCleanupParser();
+
+         /* Next, save away libxml's native memory-management functions */
+         xmlMemGet(&libxml_freeFunc, &libxml_mallocFunc,
+                   &libxml_reallocFunc, &libxml_strdupFunc);
+
+         /* Create the context (note this could fail) */
          LibxmlContext = AllocSetContextCreate(TopMemoryContext,
                                                "LibxmlContext",
                                                ALLOCSET_DEFAULT_MINSIZE,
                                                ALLOCSET_DEFAULT_INITSIZE,
                                                ALLOCSET_DEFAULT_MAXSIZE);

!         /* Establish our memory management callbacks */
!         xmlMemSetup(xml_pfree, xml_palloc, xml_repalloc, xml_pstrdup);
!     }
  }

  static void
***************
*** 1242,1247 ****
--- 1265,1274 ----
          /* Give libxml a chance to clean up dangling pointers */
          xmlCleanupParser();

+         /* Restore native memory-management functions */
+         xmlMemSetup(libxml_freeFunc, libxml_mallocFunc,
+                     libxml_reallocFunc, libxml_strdupFunc);
+
          /* And flush the context */
          MemoryContextDelete(LibxmlContext);
          LibxmlContext = NULL;

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Open 8.4 item list
Next
From: Robert Haas
Date:
Subject: Re: contrib function naming, and upgrade issues