Re: Memory bug in dsnowball_lexize - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Memory bug in dsnowball_lexize
Date
Msg-id CAE-h2Tq8XpyxPfUhkh=uv3Q8S3Z9VZz=E4m4rhTQGRyEzXhqkg@mail.gmail.com
In response to Re: Memory bug in dsnowball_lexize  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Memory bug in dsnowball_lexize
List pgsql-hackers
On Thu, May 23, 2019 at 8:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Mark Dilger <hornschnorter@gmail.com> writes:
> > In src/backend/snowball/libstemmer/utilities.c, 'create_s' uses
> > malloc (not palloc) to allocate memory, and on memory exhaustion
> > returns NULL rather than throwing an exception.
>
> Actually not, see macros in src/include/snowball/header.h.

You are correct.  Thanks for the pointer.
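
For anyone following along, the kind of macro redirection Tom is pointing at can be sketched roughly as below. This is a standalone illustration and not the contents of header.h: demo_palloc, demo_pfree, create_buffer and release_buffer are hypothetical names standing in for palloc, pfree and a snowball routine such as create_s, and the stand-in aborts on failure where the real allocator would raise an error instead.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for palloc: it never returns NULL to the caller. */
static int demo_palloc_calls = 0;

static void *
demo_palloc(size_t size)
{
	void	   *p = malloc(size);	/* still the real malloc at this point */

	if (p == NULL)
	{
		/* Assumption: the real allocator raises an error here instead. */
		fprintf(stderr, "out of memory\n");
		abort();
	}
	demo_palloc_calls++;
	return p;
}

/* Hypothetical stand-in for pfree. */
static void
demo_pfree(void *p)
{
	free(p);					/* still the real free at this point */
}

/*
 * From here on, any code that says malloc/free gets the wrappers instead,
 * without its source having to change.
 */
#define malloc(a)	demo_palloc(a)
#define free(a)		demo_pfree(a)

/* Hypothetical "library" routine written against plain malloc/free. */
static char *
create_buffer(size_t size)
{
	char	   *buf = malloc(size);	/* expands to demo_palloc(size) */

	buf[0] = '\0';
	return buf;
}

static void
release_buffer(char *buf)
{
	free(buf);					/* expands to demo_pfree(buf) */
}
```

The point of the sketch is only that the library code keeps its NULL checks but never sees a NULL, because the failure path is taken inside the wrapper.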

> > In src/backend/snowball/dict_snowball.c, 'dsnowball_lexize'
> > calls 'SN_set_current' and ignores the return value, thereby
> > failing to notice the error, if any.
>
> Hm.  This seems like possibly a bug, in that even if we cover the
> malloc issue, there's no API guarantee that OOM is the only possible
> reason for reporting failure.

Ok, that sounds fair.  Since the memory is being palloc'd, I suppose
it would be safe to just ereport when the return value is -1?
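
A rough standalone sketch of the check I have in mind follows. It is not a patch against dict_snowball.c: demo_set_current, demo_report_error and demo_lexize are hypothetical stand-ins for SN_set_current, ereport and dsnowball_lexize, and the only detail taken from this thread is that failure is signalled by a return value of -1.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the stemmer call: copies len bytes of src into
 * dst and returns 0, or returns -1 if the input cannot be accepted.
 */
static int
demo_set_current(char *dst, int dst_size, const char *src, int len)
{
	if (len < 0 || len >= dst_size)
		return -1;
	memcpy(dst, src, (size_t) len);
	dst[len] = '\0';
	return 0;
}

/*
 * Hypothetical stand-in for error reporting.  Assumption: in the backend an
 * ERROR-level report would not return to the caller; here we only count.
 */
static int demo_error_count = 0;

static void
demo_report_error(const char *msg)
{
	fprintf(stderr, "ERROR: %s\n", msg);
	demo_error_count++;
}

/* Caller that checks the return value instead of ignoring it. */
static int
demo_lexize(char *dst, int dst_size, const char *word, int len)
{
	if (demo_set_current(dst, dst_size, word, len) == -1)
	{
		demo_report_error("stemmer could not accept the input word");
		return -1;
	}
	return 0;
}
```

Since the check does not look at why the call failed, it also covers failure reasons other than OOM.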

> > There is a comment higher up in dict_snowball.c that seems to
> > use some handwaving about all this, or perhaps it is documenting
> > something else entirely.  In any event, I find the documentation
> > about dictCtx insufficient to explain why this memory handling
> > is correct.
>
> Fair complaint --- do you want to propose some new wording that
> references what header.h does?

Perhaps something along these lines?

        /*
-        * snowball saves alloced memory between calls, so we should run it in our
-        * private memory context. Note, init function is executed in long lived
-        * context, so we just remember CurrentMemoryContext
+        * snowball saves alloced memory between calls, which we force to be
+        * allocated using palloc and friends via preprocessing macros (see
+        * snowball/header.h), so we should run snowball in our private memory
+        * context.  Note, init function is executed in long lived context, so we
+        * just remember CurrentMemoryContext.
         */


