Thread: Re: [BUGS] BUG #3860: xpath crashes backend when is querying xmlagg result

Re: [BUGS] BUG #3860: xpath crashes backend when is querying xmlagg result

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Alvaro Herrera <alvherre@commandprompt.com> writes:

> > Perhaps a better idea is to create a separate LibxmlContext memcxt,
> > child of QueryContext, and have xml_palloc etc always use that.  That
> > way it won't be reset between calls.  It probably also means we could
> > wire xml reset to transaction abort, making it a bit simpler.
>
> Might as well go back to letting it use malloc :-(.
>
> I actually don't see a problem with letting it use malloc for stuff that
> it is going to manage for itself.  I guess the problem comes with
> transient return values of libxml functions; we'd want to explicitly
> move those into palloc-based storage and then free() them.
>
> This looks like a rather large rewrite though.  Peter, have you any
> better ideas?

OK, after a lot of research I think the best way to deal with this is
what I suggest above.  With the attached patch, I cannot cause the
system to crash with any of the given examples.

Furthermore this may help clean up the PG_TRY blocks that are currently
sprinkled through the xml.c code.

Handling of subtransactions is no good at this point, but I think that
could easily be improved.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes:
> + void
> + AtEOXact_xml(void)
> + {
> +     if (LibxmlContext == NULL)
> +         return;
> +
> +     MemoryContextDelete(LibxmlContext);
> +     LibxmlContext = NULL;
> +
> +     xmlCleanupParser();
> + }

[ blink... ]  Shouldn't that be the other way around?  It looks to me
like xmlCleanupParser will be pfree'ing already-deleted data.  Did you
test this with CLOBBER_FREED_MEMORY active?

Also, I don't see how this patch fixes what I believe to be the
fundamental problem, which is that we have code paths that invoke
libxml without having previously initialized the alloc support.
I think the sequence

    if (LibxmlContext == NULL)
    {
        create LibxmlContext;
        xmlMemSetup(...);
    }

ought to be executed before *any* use of libxml stuff (which suggests
factoring it out as a subroutine...)

We also need to do some testing to see if letting the thing go until
transaction end is OK, or whether we will see unacceptable memory leaks
over long series of XML calls.  (In particular I suspect that we'd
better zap the context at subtransaction abort; but even without any
error, are we sure that normal operations don't leak memory?)

One thing I was wondering about earlier today is whether libxml isn't
expecting NULL-return-on-failure from the malloc-substitute routine.
If we take control away from it unexpectedly, I wouldn't be a bit
surprised if its data structures are left corrupt.  This might lead to
failures during cleanup.

I do like the idea of getting rid of the PG_TRY blocks and the
associated cleanup attempts; with the approach you're converging on
here, we need only assume that xmlCleanupParser() will get rid of
all staticly-allocated pointers and not crash, whereas right now we
are assuming a great deal of libxml stuff still works after an ereport
(which makes throwing ereport from xml_palloc even more risky).

            regards, tom lane

Re: [BUGS] BUG #3860: xpath crashes backend when is querying xmlagg result

From
Alvaro Herrera
Date:
Tom Lane escribió:

> One thing I was wondering about earlier today is whether libxml isn't
> expecting NULL-return-on-failure from the malloc-substitute routine.
> If we take control away from it unexpectedly, I wouldn't be a bit
> surprised if its data structures are left corrupt.  This might lead to
> failures during cleanup.

Hmm, this is a very good point.  I quick look at the source shows that
they are not very consistent on its own checking for memory allocation
errors.  For example, see a bug I just reported:

http://bugzilla.gnome.org/show_bug.cgi?id=508662

The problem is that many routines look like this:

xmlXPathNewNodeSet(xmlNodePtr val) {
    xmlXPathObjectPtr ret;

    ret = (xmlXPathObjectPtr) xmlMalloc(sizeof(xmlXPathObject));
    if (ret == NULL) {
        xmlXPathErrMemory(NULL, "creating nodeset\n");
        return(NULL);
    }


and others would call this code and then happily use the return value
without checking for null.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane escribi�:
>> One thing I was wondering about earlier today is whether libxml isn't
>> expecting NULL-return-on-failure from the malloc-substitute routine.
>> If we take control away from it unexpectedly, I wouldn't be a bit
>> surprised if its data structures are left corrupt.  This might lead to
>> failures during cleanup.

> Hmm, this is a very good point.  I quick look at the source shows that
> they are not very consistent on its own checking for memory allocation
> errors.  For example, see a bug I just reported:
> http://bugzilla.gnome.org/show_bug.cgi?id=508662

Ugh.  So we're pretty much damned if we do and damned if we don't.

Given what you showed, it is certain that we are at risk if we return
NULL, whereas it is merely hypothetical that we are at risk if we
longjmp.  So let's stick to the palloc infrastructure for now.

            regards, tom lane