Re: Allow substitute allocators for PGresult. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Allow substitute allocators for PGresult.
Date
Msg-id 14204.1321055335@sss.pgh.pa.us
Whole thread Raw
In response to Re: Allow substitute allocators for PGresult.  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Allow substitute allocators for PGresult.
List pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 11.11.2011 11:18, Kyotaro HORIGUCHI wrote:
>> The comment at the the begging of pqexpbuffer.c says that libpq
>> should not rely on palloc(). Besides, Tom Lane said that palloc
>> should not be visible outside the backend(*1) and I agree with
>> it.
>> 
>> *1: http://archives.postgresql.org/pgsql-hackers/1999-02/msg00364.php
>> 
>> On the other hand, in dblink, dblink-plus (our product!), and
>> maybe FDW's connect to other PostgreSQL servers are seem to copy
>> the result of the query contained in PGresult into tuple store. I
>> guess that this is in order to avoid memory leakage on
>> termination in halfway.
>> 
>> But it is rather expensive to copy whole PGresult, and the
>> significance grows as the data received gets larger. Furthermore,
>> it requires about twice as much memory as the net size of the
>> data. And it is fruitless to copy'n modify libpq or reinvent it
>> from scratch. So we shall be happy to be able to use palloc's in
>> libpq at least for PGresult for such case in spite of the policy.
>> 
>> For these reasons, I propose to make allocators for PGresult
>> replaceable.

> You could use the resource owner mechanism to track them.

Heikki's idea is probably superior so far as PG backend usage is
concerned in isolation, but I wonder if there are scenarios where a
client application would like to be able to manage libpq's allocations.
If so, Kyotaro-san's approach would solve more problems than just
dblink's.

However, the bigger picture here is that I think Kyotaro-san's desire to
not have dblink return a tuplestore may be misplaced.  Tuplestores can
spill to disk, while PGresults don't; so the larger the result, the
more important it is to push it into a tuplestore and PQclear it as soon
as possible.

Despite that worry, it'd likely be a good idea to adopt one or the other
of these solutions anyway, because I think there are corner cases where
dblink.c can leak a PGresult --- for instance, what if dblink_res_error
fails due to out-of-memory before reaching PQclear?  And we could get
rid of the awkward and none-too-cheap PG_TRY blocks that it uses to try
to defend against such leaks in other places.

So I'm in favor of making a change along that line, although I'd want
to see more evidence before considering changing dblink to not return
tuplestores.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [GENERAL] VACUUM touching file but not updating relation
Next
From: Josh Kupershmidt
Date:
Subject: fix for psql's \dd version check