Thread: [HACKERS] Custom allocators in libpq
Hello!

I would like to be able to configure libpq with custom malloc functions.
The reason is that we have a Ruby wrapper that exposes libpq in Ruby.
The problem is that Ruby's GC doesn't know how much memory has been
allocated by libpq, so no pressure is applied to the GC when it should
be.  Ruby exports malloc functions that automatically apply GC pressure,
and I'd like to be able to configure libpq to use those malloc
functions.

I've attached two patches that add this functionality.  The first patch
introduces a new function `PQunescapeByteaConn`, which takes a
connection (so we have a place to get the malloc functions).  We already
have `PQescapeBytea` and `PQescapeByteaConn`; this first patch gives us
the analogous `PQunescapeBytea` and `PQunescapeByteaConn`.

The second patch adds malloc function pointer fields to the `PGEvent`,
`pg_result`, and `pg_conn` structs, and changes libpq internals to use
those allocators rather than calling `malloc` directly.  This patch
doesn't replace all malloc calls with the configured ones, just the
mallocs related to creating result objects (which is what I'm concerned
with).

If there's something I'm missing, please let me know.  This is my first
patch to libpq, so I look forward to hearing feedback.  Thanks for your
time!

-- 
Aaron Patterson
http://tenderlovemaking.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
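For readers following along, the shape of the second patch's approach can be sketched outside libpq like this. All names here are invented for illustration (the real patch adds the pointer fields to `pg_conn`, `pg_result`, and `PGEvent`); the point is that result creation goes through the connection's allocator instead of bare `malloc`:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a connection carrying allocator hooks,
 * as the second patch proposes for pg_conn. */
typedef struct SketchConn
{
    void *(*malloc_fn)(size_t);
    void  (*free_fn)(void *);
} SketchConn;

typedef struct SketchResult
{
    SketchConn *conn;
    char       *data;
} SketchResult;

/* Result creation uses the connection's allocator rather than calling
 * malloc() directly, so a wrapper (e.g. Ruby's GC-aware allocator)
 * sees every result-related allocation. */
static SketchResult *
sketch_make_result(SketchConn *conn, const char *value)
{
    SketchResult *res = conn->malloc_fn(sizeof(SketchResult));

    if (res == NULL)
        return NULL;
    res->conn = conn;
    res->data = conn->malloc_fn(strlen(value) + 1);
    if (res->data == NULL)
    {
        conn->free_fn(res);
        return NULL;
    }
    strcpy(res->data, value);
    return res;
}

static void
sketch_free_result(SketchResult *res)
{
    res->conn->free_fn(res->data);
    res->conn->free_fn(res);
}
```

A caller that doesn't care simply installs `malloc`/`free` in the struct and gets today's behavior.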
Aaron Patterson <tenderlove@ruby-lang.org> writes:
> I would like to be able to configure libpq with custom malloc functions.

I can see the potential value of this ...

> This patch doesn't replace all malloc calls with the configured ones, just
> the mallocs related to creating result objects (which is what I'm
> concerned with).

... but it seems like you're giving up a lot of the possible uses if
you don't make it apply uniformly.  I admit I'm not sure how we'd handle
the initial creation of a connection object with a custom malloc.  The
obvious solution of requiring the functions to be specified at PQconnect
time seems to require Yet Another PQconnect Variant, which is not very
appetizing.

I also wonder whether you wouldn't want a passthrough argument.
For instance, one of the use-cases that springs to mind immediately is
teaching postgres_fdw and dblink to use this so that their result objects
are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
While I suppose we could have the hook functions always allocate in
CurrentMemoryContext, it'd likely be useful to be able to specify
"use context X" at creation time.

			regards, tom lane
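Tom's passthrough idea could look roughly like the following standalone sketch (the hook struct and all names are hypothetical, not a proposed libpq API). The opaque pointer is what would let, say, postgres_fdw hand a `MemoryContext` to its hooks while a Ruby wrapper passes `NULL`; here a trivial byte-counting arena stands in for the context, just to show the extra argument reaching the hook:

```c
#include <stdlib.h>

/* Hypothetical hook set: every hook receives the opaque pointer that
 * was supplied at registration time. */
typedef struct PQallocHooks
{
    void *(*alloc_fn)(size_t size, void *passthrough);
    void  (*free_fn)(void *ptr, void *passthrough);
    void  *passthrough;
} PQallocHooks;

static void *
hooked_alloc(const PQallocHooks *hooks, size_t size)
{
    return hooks->alloc_fn(size, hooks->passthrough);
}

/* Trivial passthrough consumer standing in for a MemoryContext:
 * it just counts bytes handed out. */
typedef struct CountingArena
{
    size_t total;
} CountingArena;

static void *
arena_alloc(size_t size, void *passthrough)
{
    CountingArena *a = passthrough;

    a->total += size;
    return malloc(size);
}

static void
arena_free(void *ptr, void *passthrough)
{
    (void) passthrough;
    free(ptr);
}
```

Different call sites can carry different `passthrough` values even while sharing one pair of hook functions, which is the property Tom is after.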
On Mon, Aug 28, 2017 at 03:11:26PM -0400, Tom Lane wrote:
> Aaron Patterson <tenderlove@ruby-lang.org> writes:
> > I would like to be able to configure libpq with custom malloc functions.
>
> I can see the potential value of this ...
>
> > This patch doesn't replace all malloc calls with the configured ones, just
> > the mallocs related to creating result objects (which is what I'm
> > concerned with).
>
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.

I'm happy to make the changes uniformly!  I'll do that and update the
patch.

> I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc.  The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

Other libraries I've worked with allow me to malloc a struct, then pass
it to an initialization function.  This might take a bit of refactoring,
like introducing a new `PQconnectStart`, but might be worthwhile.

> I also wonder whether you wouldn't want a passthrough argument.
> For instance, one of the use-cases that springs to mind immediately is
> teaching postgres_fdw and dblink to use this so that their result objects
> are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
> While I suppose we could have the hook functions always allocate in
> CurrentMemoryContext, it'd likely be useful to be able to specify
> "use context X" at creation time.

We don't need this for the Ruby wrapper, but I've seen other libraries
do it.  I'm happy to add it as well.

Thanks!

-- 
Aaron Patterson
http://tenderlovemaking.com/
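The caller-allocates-then-initialize pattern Aaron describes can be sketched in standalone C like this (names invented; this is not the proposed libpq interface). Because the application performs the allocation itself, no connect-time variant is needed just to smuggle in malloc pointers:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical connection object whose storage is owned by the caller. */
typedef struct SketchConn2
{
    void *(*malloc_fn)(size_t);
    void  (*free_fn)(void *);
    int    initialized;
} SketchConn2;

/* The library only initializes memory the caller already allocated,
 * defaulting to the standard allocator when no hooks are given. */
static int
sketch_conn_init(SketchConn2 *conn,
                 void *(*malloc_fn)(size_t),
                 void (*free_fn)(void *))
{
    memset(conn, 0, sizeof(*conn));
    conn->malloc_fn = malloc_fn ? malloc_fn : malloc;
    conn->free_fn = free_fn ? free_fn : free;
    conn->initialized = 1;
    return 0;
}
```

The caller does `conn = my_alloc(sizeof(...)); sketch_conn_init(conn, my_alloc, my_free);`, so the connection object itself comes from the custom allocator, which is the case the PQconnect-variant approach struggles with.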
On 8/28/17 15:11, Tom Lane wrote:
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc.  The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

I would have expected a separate function just to register the callback
functions, before doing anything else with libpq.  Similar to libxml:
http://xmlsoft.org/xmlmem.html

-- 
Peter Eisentraut                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
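For comparison, the libxml-style approach amounts to process-global hook variables plus a one-time setup call, roughly as below. (libxml2's real entry point is `xmlMemSetup()`; the PQ-flavoured names here are invented for illustration.) Note that the hooks live in globals shared by the whole process:

```c
#include <stdlib.h>

/* Process-wide hooks, defaulting to the standard allocator. */
static void *(*pq_malloc_hook)(size_t) = malloc;
static void  (*pq_free_hook)(void *) = free;

/* One-time registration, to be called before any other library use. */
static void
sketch_PQmemSetup(void *(*malloc_fn)(size_t), void (*free_fn)(void *))
{
    pq_malloc_hook = malloc_fn;
    pq_free_hook = free_fn;
}

/* All internal allocations would then funnel through the hooks. */
static void *
sketch_hooked_malloc(size_t size)
{
    return pq_malloc_hook(size);
}
```

The appeal is that nothing in the per-connection API changes; the cost is that there is exactly one hook set per process.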
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 8/28/17 15:11, Tom Lane wrote:
>> ... but it seems like you're giving up a lot of the possible uses if
>> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
>> the initial creation of a connection object with a custom malloc.  The
>> obvious solution of requiring the functions to be specified at PQconnect
>> time seems to require Yet Another PQconnect Variant, which is not very
>> appetizing.

> I would have expected a separate function just to register the callback
> functions, before doing anything else with libpq.  Similar to libxml:
> http://xmlsoft.org/xmlmem.html

I really don't much care for libxml's solution, because it implies
global variables, with the attendant thread-safety issues.  That's
especially bad if you want a passthrough such as a memory context
pointer, since it's quite likely that different call sites would
need different passthrough values, even assuming that a single set
of callback functions would suffice for an entire application.

That latter assumption isn't so pleasant either.  One could easily
imagine postgres_fdw's use of such a solution breaking, say, a
libpq-based DBI library inside plperl.

			regards, tom lane
On 29 August 2017 at 05:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> On 8/28/17 15:11, Tom Lane wrote:
>>> ... but it seems like you're giving up a lot of the possible uses if
>>> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
>>> the initial creation of a connection object with a custom malloc.  The
>>> obvious solution of requiring the functions to be specified at PQconnect
>>> time seems to require Yet Another PQconnect Variant, which is not very
>>> appetizing.
>
>> I would have expected a separate function just to register the callback
>> functions, before doing anything else with libpq.  Similar to libxml:
>> http://xmlsoft.org/xmlmem.html
>
> I really don't much care for libxml's solution, because it implies
> global variables, with the attendant thread-safety issues.  That's
> especially bad if you want a passthrough such as a memory context
> pointer, since it's quite likely that different call sites would
> need different passthrough values, even assuming that a single set
> of callback functions would suffice for an entire application.
>
> That latter assumption isn't so pleasant either.  One could easily
> imagine postgres_fdw's use of such a solution breaking, say, a
> libpq-based DBI library inside plperl.
Yeah, the 'register a malloc() function pointer in a global via a registration function call' approach seems fine and dandy until you find yourself with an app that, via shared library loads, ends up with more than one user of libpq, each with its own ideas about memory allocation.
RTLD_LOCAL can help, but may introduce other issues.
So there doesn't seem much way around another PQconnect variant. Yay? We could switch to a struct-passing argument model, but by the time you add the necessary "nfields" argument to allow you to know how much of the struct you can safely access, etc, just adding new connect functions starts to look good in comparison.
Which reminds me, it kind of stinks that PQconnectdbParams and PQpingParams accept key and value char* arrays, but PQconninfoParse produces a PQconninfoOption*.  This makes it seriously annoying when you want to parse a connstring, apply some transformations, and pass the result to a connect function.  I pretty much always just put the user's original connstring in 'dbname' and set expand_dbname = true instead.
It might make sense to have any new function accept PQconninfoOption*. Or a variant of PQconninfoParse that populates k/v arrays with 'n' extra fields allocated and zeroed on return, I guess.
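The glue code Craig is complaining about amounts to a loop like the one below. (`SketchOption` stands in for `PQconninfoOption`'s `keyword`/`val` members so the example compiles without libpq; in real code the input would come from `PQconninfoParse` and the output would feed `PQconnectdbParams`.)

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in mirroring the two PQconninfoOption members that matter here. */
typedef struct SketchOption
{
    const char *keyword;
    const char *val;     /* NULL when the option was not set */
} SketchOption;

/* Build NULL-terminated parallel keyword/value arrays from an option
 * list terminated by a NULL keyword, skipping unset options.  The
 * caller must size the output arrays for the worst case (all options
 * set, plus the terminator). */
static void
options_to_arrays(const SketchOption *opts,
                  const char **keywords, const char **values)
{
    int n = 0;

    for (; opts->keyword != NULL; opts++)
    {
        if (opts->val == NULL)
            continue;
        keywords[n] = opts->keyword;
        values[n] = opts->val;
        n++;
    }
    keywords[n] = NULL;
    values[n] = NULL;
}
```

Every libpq caller that wants to parse-transform-connect ends up writing some version of this by hand, which is the mismatch a PQconninfoOption*-accepting connect variant would remove.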