Thread: [HACKERS] Custom allocators in libpq

[HACKERS] Custom allocators in libpq

From: Aaron Patterson

Hello!

I would like to be able to configure libpq with custom malloc functions.
The reason is that we have a Ruby wrapper that exposes libpq in Ruby.
The problem is that Ruby's GC doesn't know how much memory has been
allocated by libpq, so no pressure is applied to the GC when it should
be.  Ruby exports malloc functions that automatically apply GC pressure,
and I'd like to be able to configure libpq to use those malloc
functions.

I've attached two patches that add this functionality.  The first patch
introduces a new function `PQunescapeByteaConn` which takes a
connection (so we have a place to get the malloc functions).  We already
have `PQescapeBytea` and `PQescapeByteaConn`; this first patch gives us
the analogous pair of `PQunescapeBytea` and `PQunescapeByteaConn`.

The second patch adds malloc function pointer fields to `PGEvent`,
`pg_result`, and `pg_conn` structs, and changes libpq internals to use
those allocators rather than directly calling `malloc`.
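
As a rough illustration of the idea (not the attached patch itself), a
result object could carry the allocator function pointers it was created
with, so that later allocations and the final free all go through the
same functions.  Every name below (`demo_allocator`, `demo_result`,
`demo_make_result`, `demo_clear_result`, the counting wrappers) is
hypothetical and not part of libpq; the counting wrappers merely stand in
for GC-aware functions such as the ones Ruby exports.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator table (realloc omitted for brevity); not a libpq type. */
typedef struct demo_allocator
{
    void *(*malloc_fn) (size_t size);
    void  (*free_fn) (void *ptr);
} demo_allocator;

/* Hypothetical stand-in for a result object that remembers its allocator. */
typedef struct demo_result
{
    demo_allocator alloc;   /* copied in when the result is created */
    char          *data;    /* payload obtained from alloc.malloc_fn */
} demo_result;

/* Create a result whose memory all comes from the supplied allocator. */
demo_result *
demo_make_result(const demo_allocator *alloc, const char *payload)
{
    demo_result *res;
    size_t       len;

    assert(alloc != NULL && payload != NULL);

    res = alloc->malloc_fn(sizeof(*res));
    if (res == NULL)
        return NULL;
    res->alloc = *alloc;

    len = strlen(payload) + 1;
    res->data = alloc->malloc_fn(len);
    if (res->data == NULL)
    {
        alloc->free_fn(res);
        return NULL;
    }
    memcpy(res->data, payload, len);
    return res;
}

/* Free a result using the allocator stored inside it. */
void
demo_clear_result(demo_result *res)
{
    void (*free_fn) (void *ptr);

    if (res == NULL)
        return;
    free_fn = res->alloc.free_fn;   /* save before freeing the struct */
    free_fn(res->data);
    free_fn(res);
}

/* Counting wrappers: track how many blocks are currently live. */
size_t demo_live_blocks = 0;

void *
demo_counting_malloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        demo_live_blocks++;
    return p;
}

void
demo_counting_free(void *ptr)
{
    if (ptr != NULL)
        demo_live_blocks--;
    free(ptr);
}
```

Because the allocator is stored in the object, the code that frees a
result does not need to know which allocator created it.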

This patch doesn't replace all malloc calls with the configured ones, just
the mallocs related to creating result objects (which is what I'm
concerned with).

If there's something I'm missing, please let me know.  This is my first
patch to libpq, so I look forward to hearing feedback.  Thanks for your
time!

-- 
Aaron Patterson
http://tenderlovemaking.com/


Re: [HACKERS] Custom allocators in libpq

From: Tom Lane

Aaron Patterson <tenderlove@ruby-lang.org> writes:
> I would like to be able to configure libpq with custom malloc functions.

I can see the potential value of this ...

> This patch doesn't replace all malloc calls with the configured ones, just
> the mallocs related to creating result objects (which is what I'm
> concerned with).

... but it seems like you're giving up a lot of the possible uses if
you don't make it apply uniformly.  I admit I'm not sure how we'd handle
the initial creation of a connection object with a custom malloc.  The
obvious solution of requiring the functions to be specified at PQconnect
time seems to require Yet Another PQconnect Variant, which is not very
appetizing.

I also wonder whether you wouldn't want a passthrough argument.
For instance, one of the use-cases that springs to mind immediately is
teaching postgres_fdw and dblink to use this so that their result objects
are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
While I suppose we could have the hook functions always allocate in
CurrentMemoryContext, it'd likely be useful to be able to specify
"use context X" at creation time.
        regards, tom lane
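
A passthrough argument of the kind suggested above might look roughly
like the following sketch: each callback receives an opaque pointer that
was supplied together with the hooks, so one set of functions can serve
several "contexts".  All names here (`demo_hooks`, `demo_arena`,
`demo_copy_string`, and so on) are hypothetical, and the arena is a toy
stand-in for a memory context, not PostgreSQL's MemoryContext API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook table: callbacks plus an opaque passthrough pointer. */
typedef struct demo_hooks
{
    void *(*alloc_fn) (size_t size, void *passthrough);
    void  (*free_fn) (void *ptr, void *passthrough);
    void  *passthrough;
} demo_hooks;

/* Allocation unit chosen so every returned pointer is suitably aligned. */
typedef union demo_align
{
    long long   ll;
    long double ld;
    void       *p;
} demo_align;

#define DEMO_ARENA_SLOTS 16

/* Toy bump arena; the caller sets used = 0 before first use. */
typedef struct demo_arena
{
    demo_align slots[DEMO_ARENA_SLOTS];
    size_t     used;        /* number of slots handed out so far */
} demo_arena;

/* Allocate from the arena named by the passthrough pointer. */
void *
demo_arena_alloc(size_t size, void *passthrough)
{
    demo_arena *arena = passthrough;
    size_t      needed;
    void       *p;

    assert(arena != NULL);
    if (size > sizeof(arena->slots))
        return NULL;
    needed = (size + sizeof(demo_align) - 1) / sizeof(demo_align);
    if (needed > DEMO_ARENA_SLOTS - arena->used)
        return NULL;
    p = &arena->slots[arena->used];
    arena->used += needed;
    return p;
}

/* Individual frees are no-ops; memory goes away with the whole arena. */
void
demo_arena_free(void *ptr, void *passthrough)
{
    (void) ptr;
    (void) passthrough;
}

/* "Library" code: allocates only through the hooks it was given. */
char *
demo_copy_string(const demo_hooks *hooks, const char *s)
{
    size_t len;
    char  *copy;

    assert(hooks != NULL && s != NULL);
    len = strlen(s) + 1;
    copy = hooks->alloc_fn(len, hooks->passthrough);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

Two hook tables that share the same functions but point at different
arenas give the "use context X" behaviour without any global state.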



Re: [HACKERS] Custom allocators in libpq

From: Aaron Patterson

On Mon, Aug 28, 2017 at 03:11:26PM -0400, Tom Lane wrote:
> Aaron Patterson <tenderlove@ruby-lang.org> writes:
> > I would like to be able to configure libpq with custom malloc functions.
> 
> I can see the potential value of this ...
> 
> > This patch doesn't replace all malloc calls with the configured ones, just
> > the mallocs related to creating result objects (which is what I'm
> > concerned with).
> 
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.

I'm happy to make the changes uniformly!  I'll do that and update the
patch.

> I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc.  The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

Other libraries I've worked with allow me to malloc a struct, then pass
it to an initialization function.  This might take a bit of refactoring,
like introducing a new `PQconnectStart`, but might be worthwhile.
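
The "malloc a struct, then pass it to an initialization function"
pattern could be sketched as below.  This is only an illustration under
assumed names: `demo_conn`, `demo_conn_size`, `demo_conn_init`, and
`demo_open` are hypothetical and do not exist in libpq.  Since the real
connection struct is opaque, the sketch has the library report the size
the caller must allocate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical connection object; a real library would keep this opaque. */
typedef struct demo_conn
{
    char host[64];
    int  port;
    int  initialized;
} demo_conn;

/* Tell the caller how many bytes to allocate for one connection. */
size_t
demo_conn_size(void)
{
    return sizeof(demo_conn);
}

/*
 * Initialize caller-provided memory.  The library never allocates the
 * object itself.  Returns NULL if the host name does not fit.
 */
demo_conn *
demo_conn_init(void *mem, const char *host, int port)
{
    demo_conn *conn = mem;

    assert(mem != NULL && host != NULL);
    if (strlen(host) >= sizeof(conn->host))
        return NULL;
    memset(conn, 0, sizeof(*conn));
    strcpy(conn->host, host);   /* length checked above */
    conn->port = port;
    conn->initialized = 1;
    return conn;
}

/*
 * Example caller using plain malloc(); a language binding would
 * substitute its own GC-aware allocator here.
 */
demo_conn *
demo_open(const char *host, int port)
{
    void      *mem = malloc(demo_conn_size());
    demo_conn *conn;

    if (mem == NULL)
        return NULL;
    conn = demo_conn_init(mem, host, port);
    if (conn == NULL)
        free(mem);
    return conn;
}
```

With this shape the caller both allocates and frees the object, so no
allocator has to be passed to the library just to create it.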

> I also wonder whether you wouldn't want a passthrough argument.
> For instance, one of the use-cases that springs to mind immediately is
> teaching postgres_fdw and dblink to use this so that their result objects
> are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
> While I suppose we could have the hook functions always allocate in
> CurrentMemoryContext, it'd likely be useful to be able to specify
> "use context X" at creation time.

We don't need this for the Ruby wrapper, but I've seen other libraries
do it.  I'm happy to add it as well.

Thanks!

-- 
Aaron Patterson
http://tenderlovemaking.com/



Re: [HACKERS] Custom allocators in libpq

From: Peter Eisentraut

On 8/28/17 15:11, Tom Lane wrote:
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc.  The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

I would have expected a separate function just to register the callback
functions, before doing anything else with libpq.  Similar to libxml:
http://xmlsoft.org/xmlmem.html

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Custom allocators in libpq

From: Tom Lane

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 8/28/17 15:11, Tom Lane wrote:
>> ... but it seems like you're giving up a lot of the possible uses if
>> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
>> the initial creation of a connection object with a custom malloc.  The
>> obvious solution of requiring the functions to be specified at PQconnect
>> time seems to require Yet Another PQconnect Variant, which is not very
>> appetizing.

> I would have expected a separate function just to register the callback
> functions, before doing anything else with libpq.  Similar to libxml:
> http://xmlsoft.org/xmlmem.html

I really don't much care for libxml's solution, because it implies
global variables, with the attendant thread-safety issues.  That's
especially bad if you want a passthrough such as a memory context
pointer, since it's quite likely that different call sites would
need different passthrough values, even assuming that a single set
of callback functions would suffice for an entire application.
That latter assumption isn't so pleasant either.  One could expect
that by using such a solution, postgres_fdw could be expected to
break, say, a libpq-based DBI library inside plperl.
        regards, tom lane
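
The clash between two users of a process-wide registration can be
sketched as follows.  The names are hypothetical (`demo_register_allocator`
is not a libpq or libxml function), and the sketch shows only the
"last registration wins" problem, not the thread-safety issue.

```c
#include <assert.h>
#include <stdlib.h>

/* Process-wide hooks, as a global registration API would keep them. */
static void *(*demo_global_malloc) (size_t size) = malloc;
static void  (*demo_global_free) (void *ptr) = free;

/* Hypothetical registration call: replaces the hooks for everyone. */
void
demo_register_allocator(void *(*malloc_fn) (size_t size),
                        void (*free_fn) (void *ptr))
{
    assert(malloc_fn != NULL && free_fn != NULL);
    demo_global_malloc = malloc_fn;
    demo_global_free = free_fn;
}

/* "Library" entry points that use whatever is registered right now. */
void *
demo_lib_alloc(size_t size)
{
    return demo_global_malloc(size);
}

void
demo_lib_free(void *ptr)
{
    demo_global_free(ptr);
}

/* Two independent users, each keeping its own allocation balance. */
int demo_a_balance = 0;
int demo_b_balance = 0;

void *
demo_a_malloc(size_t size)
{
    demo_a_balance++;
    return malloc(size);
}

void
demo_a_free(void *ptr)
{
    demo_a_balance--;
    free(ptr);
}

void *
demo_b_malloc(size_t size)
{
    demo_b_balance++;
    return malloc(size);
}

void
demo_b_free(void *ptr)
{
    demo_b_balance--;
    free(ptr);
}
```

If user A allocates through the library, user B then registers its own
functions, and the block is freed afterwards, the free goes to B: A's
balance stays at 1 and B's drops to -1.  With allocators that are not
interchangeable, that mismatch would corrupt memory rather than just
skew a counter.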



Re: [HACKERS] Custom allocators in libpq

From: Craig Ringer

On 29 August 2017 at 05:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> On 8/28/17 15:11, Tom Lane wrote:
>>> ... but it seems like you're giving up a lot of the possible uses if
>>> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
>>> the initial creation of a connection object with a custom malloc.  The
>>> obvious solution of requiring the functions to be specified at PQconnect
>>> time seems to require Yet Another PQconnect Variant, which is not very
>>> appetizing.
>
>> I would have expected a separate function just to register the callback
>> functions, before doing anything else with libpq.  Similar to libxml:
>> http://xmlsoft.org/xmlmem.html
>
> I really don't much care for libxml's solution, because it implies
> global variables, with the attendant thread-safety issues.  That's
> especially bad if you want a passthrough such as a memory context
> pointer, since it's quite likely that different call sites would
> need different passthrough values, even assuming that a single set
> of callback functions would suffice for an entire application.
> That latter assumption isn't so pleasant either.  One could expect
> that by using such a solution, postgres_fdw could be expected to
> break, say, a libpq-based DBI library inside plperl.

Yeah, the 'register a malloc() function pointer in a global via a
registration function call' approach seems fine and dandy until you find
yourself with an app that, via shared library loads, has more than one
user of libpq, each with its own ideas about memory allocation.

RTLD_LOCAL can help, but may introduce other issues.

So there doesn't seem to be much of a way around another PQconnect
variant. Yay? We could switch to a struct-passing argument model, but by
the time you add the necessary "nfields" argument that tells libpq how
much of the struct it can safely access, etc., just adding new connect
functions starts to look good in comparison.
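
For reference, the struct-passing model and its bookkeeping could look
something like this sketch.  It uses a byte-size field rather than a
field count ("nfields"); both serve the same purpose.  The names
`demo_connect_opts`, `DEMO_HAS_FIELD`, `demo_pick_malloc`, and
`demo_custom_malloc` are hypothetical, not libpq API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*demo_malloc_fn) (size_t size);

/* Hypothetical options struct; new fields are only ever appended. */
typedef struct demo_connect_opts
{
    size_t         struct_size; /* sizeof the struct the caller compiled against */
    const char    *conninfo;    /* present since "version 1" */
    demo_malloc_fn malloc_fn;   /* appended in "version 2" */
} demo_connect_opts;

/* True if the caller's struct is large enough to contain the field. */
#define DEMO_HAS_FIELD(opts, field) \
    ((opts)->struct_size >= \
     offsetof(demo_connect_opts, field) + sizeof((opts)->field))

/*
 * Library side: read a newer field only if the caller's struct has it,
 * and fall back to plain malloc() otherwise.
 */
demo_malloc_fn
demo_pick_malloc(const demo_connect_opts *opts)
{
    assert(opts != NULL);
    if (DEMO_HAS_FIELD(opts, malloc_fn) && opts->malloc_fn != NULL)
        return opts->malloc_fn;
    return malloc;
}

/* A custom allocator for the example; it simply forwards to malloc(). */
void *
demo_custom_malloc(size_t size)
{
    return malloc(size);
}
```

A caller built against the older layout reports a smaller `struct_size`,
so the library never touches `malloc_fn` for it.  Every field access on
the library side needs this kind of guard, which is the overhead the
paragraph above alludes to.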

Which reminds me, it kind of stinks that PQconnectdbParams and
PQpingParams accept key and value char* arrays, but PQconninfoParse
produces a PQconninfoOption*. This makes it seriously annoying when you
want to parse a connstring, make some transformations, and pass it to a
connect function. I pretty much always just put the user's original
connstring in 'dbname' and set expand_dbname = true instead.

It might make sense to have any new function accept PQconninfoOption*.
Or a variant of PQconninfoParse that populates k/v arrays with 'n' extra
fields allocated and zeroed on return, I guess.
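
A minimal sketch of that second idea follows.  To stay compilable
without libpq it uses a stand-in struct, `demo_option`, holding only the
`keyword` and `val` fields of PQconninfoOption; real code would include
libpq-fe.h and walk the array returned by PQconninfoParse instead.  The
function name `demo_options_to_arrays` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Stand-in for PQconninfoOption with just the two fields used here.
 * As in libpq, the array ends at the entry whose keyword is NULL.
 */
typedef struct demo_option
{
    const char *keyword;
    const char *val;        /* NULL when the option is unset */
} demo_option;

/*
 * Build keyword/value arrays from an option list, skipping unset options
 * and leaving n_extra NULL slots (plus a terminator) for the caller to
 * fill in.  The arrays borrow the option strings; release each array
 * with free().  Returns the number of pairs copied, or -1 if out of
 * memory.
 */
int
demo_options_to_arrays(const demo_option *opts, size_t n_extra,
                       const char ***keywords_out,
                       const char ***values_out)
{
    const demo_option *opt;
    const char **keywords;
    const char **values;
    size_t       n_set = 0;
    size_t       slots;
    size_t       i;

    assert(opts != NULL && keywords_out != NULL && values_out != NULL);

    for (opt = opts; opt->keyword != NULL; opt++)
        if (opt->val != NULL)
            n_set++;

    slots = n_set + n_extra + 1;    /* +1 for the NULL terminator */
    keywords = malloc(slots * sizeof(*keywords));
    values = malloc(slots * sizeof(*values));
    if (keywords == NULL || values == NULL)
    {
        free(keywords);
        free(values);
        return -1;
    }

    i = 0;
    for (opt = opts; opt->keyword != NULL; opt++)
    {
        if (opt->val == NULL)
            continue;
        keywords[i] = opt->keyword;
        values[i] = opt->val;
        i++;
    }
    for (; i < slots; i++)          /* zero the extra slots and terminator */
    {
        keywords[i] = NULL;
        values[i] = NULL;
    }

    *keywords_out = keywords;
    *values_out = values;
    return (int) n_set;
}
```

The caller can then write additional pairs into the spare slots and hand
both arrays to a function that, like PQconnectdbParams, expects
NULL-terminated keyword and value arrays.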

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services