Re: Allow substitute allocators for PGresult. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Allow substitute allocators for PGresult.
Date
Msg-id 20111223.163828.75337757.horiguchi.kyotaro@oss.ntt.co.jp
Whole thread Raw
In response to Re: Allow substitute allocators for PGresult.  (Robert Haas <robertmhaas@gmail.com>)
Responses Speed dblink using alternate libpq tuple storage  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
Hello, thank you for taking the time for comment.

At Wed, 21 Dec 2011 11:09:59 -0500, Robert Haas <robertmhaas@gmail.com> wrote
> I find the names of the functions added here to be quite
> confusing and would suggest renaming them.  I expected
> PQgetAsCstring to do something similar to PQgetvalue, but the
> code is completely different,

To be honest, I've also felt that kind of perplexity. If the
problem is simply of the naming, I can propose the another name
"PQreadAttValue"... This is not so good too...
But...

> and even after reading the documentation I still don't
> understand what that function is supposed to be used for.  Why
> "as cstring"?  What would the other option be?

Is it a problem of the poor description? Or about the raison
d'être of the function?

The immediate cause of the existence of the function is that
getAnotherTuple internally stores the field values of the tuples
sent from the server, in the form of PGresAttValue, and I have
found only one route to store a tuple into TupleStore is
BuildeTupleFromCStrings() to tupelstore_puttuple() which is
dblink does in materializeResult(), and of cource C-string is the
most natural format in C program, and I have hesitated to modify
execTuples.c, and I wanted to hide the details of PGresAttValue.

Assuming that the values are passed as PGresAttValue* is given
(for the reasons of performance and the extent of the
modification), the "adding tuple" functions should get the value
from the struct. This can be done in two ways from the view of
authority (`encapsulation', in other words) and convenience, one
is with the knowledge of the structure, and the other is without
it. PQgetAsCstring is the latter approach. (And it is
inconsistent with the fact that the definition of PGresAttValue
is moved into lipq-fe.h from libpq-int.h. The details of the
structure should be hidden like PGresult in this approach).

But it is not obvious that the choice is better than the another
one. If we consider that PGresAttValue is too simple and stable
to hide the details, PQgetAsCString will be taken off and the
problem will go out. PGresAttValue needs documentation in this
case.

I prefer to handle PGresAttValue directly if no problem.

> I also don't think the "add tuple" terminology is particularly good.
> It's not obvious from the name that what you're doing is overriding
> the way memory is allocated and results are stored.

This phrase is taken from pqAddTuple() in fe-exec.c at first and
have not been changed after the function is integrated with other
functions.

I propose "tuple storage handler" for the alternative.

- typedef void *(*addTupleFunction)(...);
+ typedef void *(*tupleStorageHandler)(...);

- typedef enum { ADDTUP_*, } AddTupFunc;
+ typedef enum { TSHANDLER_*, } TSHandlerCommand;

- void *PQgetAddTupleParam(...);
+ void *PQgetTupleStrageHandlerContext(...);

- void PQregisterTupleAdder(...);
+ void PQregisterTupleStoreHandler(...);

- addTupleFunction PGresult.addTupleFunc;
+ tupleStorageHandler PGresult.tupleStorageHandlerFunc;

- void *PGresult.addTuleFuncParam;
+ void *PGresult.tupleStorageHandlerContext;

- char *PGresult.addTuleFuncErrMes;
+ void *PGresult.tupelStrageHandlerErrMes;

> Also, what about the problem Tom mentioned here?
>
> http://archives.postgresql.org/message-id/1042.1321123761@sss.pgh.pa.us

The plan that simply replace malloc's with something like
palloc's is abandoned for the narrow scope.

dblink-plus copies whole PGresult into TupleStore in order to
avoid making orphaned memory on SIGINT. The resource owner
mechanism is principally applicable to that but practically hard
for the reason that current implementation without radically
modification couldn't accept it.. In addition to that, dblink
also does same thing for maybe the same reason with dblink-plus
and another reason as far as I heard.

Whatever the reason is, both dblink and dblink-plus do the same
thing that could lower the performance than expected.

If TupleStore(TupleDesc) is preferable to PGresult for in-backend
use and oridinary(client-use) libpq users can handle only
PGresult, the mechanism like this patch would be reuired to
maintain the compatibility, I think. To the contrary, if there is
no significant reason to use TupleStore in backend use - it
implies that existing mechanisms like resource owner can save the
backend inexpensively from possible inconvenience caused by using
PGresult storage in backends - PGresult should be used as it is.

I think TupleStore prepared to be used in backend is preferable
for the usage and don't want to get data making detour via
PGresult.


regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: WIP: explain analyze with 'rows' but not timing
Next
From: Heikki Linnakangas
Date:
Subject: Re: Moving more work outside WALInsertLock