Thread: SPI question (or not): trying to read from Large Objects from within a function

From: David Helgason
What:
I'm having trouble finding out how to find the current PGconn
connection inside a C function. Looking through the documentation
didn't give this up. Could anyone suggest where to look? I didn't even
see anything similar to this in the SPI_* documentation. Perhaps I am
totally misled here?

Why:
I am writing a wrapper around librsync, allowing differential updating
of large amounts of data (librsync wraps the clever algorithm of
rsync).

The first function I'm wrapping is the one which generates a signature
(a hash of each block of data, with some specified block-size) from a
LO. Its signature would be:

    create function rsync_signature(oid /* of a Large Object */) returns bytea
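
To make "a hash of each block of data" concrete, here is a minimal standalone sketch. It is not librsync's actual signature format: it only computes a weak per-block checksum in the style of rsync's rolling checksum, and the names weak_checksum and block_signature are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Weak checksum of one block, in the style of rsync's rolling checksum:
 * a = sum of bytes, b = sum of (len - i) * buf[i], each kept mod 2^16.
 * Illustrative only; librsync's real signature format differs. */
uint32_t weak_checksum(const unsigned char *buf, size_t len)
{
    uint32_t a = 0, b = 0;

    for (size_t i = 0; i < len; i++) {
        a += buf[i];
        b += (uint32_t)(len - i) * buf[i];
    }
    return (a & 0xffff) | ((b & 0xffff) << 16);
}

/* Writes one checksum per block of block_size bytes into out[] (the last
 * block may be shorter) and returns the number of checksums written,
 * which never exceeds out_cap. */
size_t block_signature(const unsigned char *data, size_t len,
                       size_t block_size, uint32_t *out, size_t out_cap)
{
    size_t n = 0;

    if (block_size == 0)
        return 0;

    for (size_t off = 0; off < len && n < out_cap; off += block_size) {
        size_t chunk = (len - off < block_size) ? len - off : block_size;

        out[n++] = weak_checksum(data + off, chunk);
    }
    return n;
}
```

With a 10-byte input and a block size of 4, block_signature yields three checksums, the last one covering only the trailing 2 bytes. A real signature would also carry a strong hash per block.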

But I can't figure out how to get the current PGconn to be able to run
the lo_* functions.


On another note, would it be possible to avoid using Large Objects and
use TOAST columns instead? I.e., is it possible to quickly read/write
partial TOAST columns (I know it's not possible for clients, but on the
server side)?


There may be more questions later, but I'll try to pay back by
submitting the final implementation to contrib/ (if anyone is
interested). It'll allow for really fast incremental updates of
columns, which I'll use to make storing huge blobs less of a pain
(although it depends on the client also speaking rsync-ese, but that'll
be included with the package).


Regards,

d.
--
David Helgason
Over the Edge Entertainments




David Helgason <david@uti.is> writes:
> I'm having trouble finding out how to find the current PGconn
> connection inside a C function.

What makes you think that "*the* current PGconn" is a valid concept?
libpq has always supported multiple active connections.

            regards, tom lane

Thank you very much,

I figured I needed to open my own using SPI_connect(). I had assumed
that there was something like a
the-connection-this-function-is-being-run-through.

Now I'm having problems with

size_t inBufSize = 8192;
char* inBuffer = (char*)palloc(inBufSize);
int bytes_read = DatumGetInt32(DirectFunctionCall3(loread,
Int32GetDatum(fd), CStringGetDatum(inBuffer),
UInt32GetDatum(inBufSize)));

which returns an extremely large number in bytes_read (consistently
46235672), regardless of the value of inBufSize.

I tried using lo_lseek(fd, 0, SEEK_END) on this fd already, which
correctly returned the size of the Large Object, so it's not a question
of an invalid descriptor. Also, that seek didn't affect the result at
all. I guess it's wrong usage of the DatumGet*() / *GetDatum()
functions, but I can't see where.

Any suggestions?

d.

On 7. jan 2004, at 05:40, Tom Lane wrote:

> David Helgason <david@uti.is> writes:
>> I'm having trouble finding out how to find the current PGconn
>> connection inside a C function.
>
> What makes you think that "*the* current PGconn" is a valid concept?
> libpq has always supported multiple active connections.
>
>             regards, tom lane


Sorry for spamming.... I'm getting the hang of this, and figured this
one out myself :)

These internal functions (loread/lowrite) have quite different
signatures from their C equivalents (as opposed to lo_lseek). I found out
from the sources that I was using loread very incorrectly. But I discovered
a lo_read with a signature different from the one documented for the Large
Object client interface: it doesn't take a connection parameter
at all. This really simplifies my code, which can now be:

> size_t inBufSize = 8192;
> char* inBuffer = (char*)palloc(inBufSize);
> int bytes_read = DatumGetInt32(DirectFunctionCall3(loread,
> Int32GetDatum(fd), CStringGetDatum(inBuffer),
> UInt32GetDatum(inBufSize)));
int bytes_read = lo_read(fd, inBuffer, inBufSize);

and all is well... just too bad there aren't similarly simple versions
of the other lo_{lseek,open,...}.
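
For anyone hitting the same constant, a likely explanation (an assumption based on the SQL-level signature, loread(integer, integer) returning bytea): the function-manager version takes only a descriptor and a length, and hands back a pointer to a freshly allocated result, so DatumGetInt32() on its return value reads a pointer, not a byte count. The standalone mock below illustrates that mistake. MockDatum, MockBytea and mock_loread are stand-ins invented for this sketch, not PostgreSQL's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t MockDatum;    /* stand-in for PostgreSQL's Datum */

typedef struct {
    int32_t size;               /* payload length in bytes */
    char    data[];             /* payload bytes */
} MockBytea;                    /* stand-in for a bytea value */

/* Mimics an fmgr-style loread(fd, len): no caller-supplied buffer; the
 * result is a pointer to new memory, packed into a Datum-sized integer. */
MockDatum mock_loread(MockDatum fd, MockDatum len)
{
    int32_t    n = (int32_t) len;
    MockBytea *result = malloc(sizeof(MockBytea) + (size_t) n);

    (void) fd;                  /* a real implementation would read from fd */
    if (result == NULL)
        abort();
    memset(result->data, 'x', (size_t) n);
    result->size = n;
    return (MockDatum) result;
}

/* Wrong: treats the returned pointer itself as the byte count, which
 * gives a large, address-like number unrelated to the requested length. */
int32_t bytes_read_wrong(MockDatum d)
{
    return (int32_t) d;
}

/* Right: follows the pointer and reads the length stored in the result. */
int32_t bytes_read_right(MockDatum d)
{
    return ((MockBytea *) d)->size;
}
```

In the real server the corresponding step would be to convert the returned Datum to a bytea pointer and take its length from the varlena header, or simply to call lo_read() as above.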

Thanks for the audience, and keep up the good work!

d.

On 7. jan 2004, at 06:22, David Helgason wrote:

> Thank you very much,
>
> I figured I needed to open my own using SPI_connect(). I had assumed
> that there was something like a
> the-connection-this-function-is-being-run-through.
>
> Now I'm having problems with
>
>
> which returns an extremely large number in bytes_read (consistently
> 46235672), regardless of the value of inBufSize.
>
> I tried using lo_lseek(fd, 0, SEEK_END) on this fd already, which
> correctly returned the size of the Large Object, so it's not a
> question of an invalid descriptor. Also, that seek didn't affect the
> result at all. I guess it's wrong usage of the DatumGet*() /
> *GetDatum() functions, but I can't see where.
>
> Any suggestions?
>
> d.


David Helgason <david@uti.is> writes:
> These internal functions (loread/lowrite) have quite different
> signatures from their C equivalents (as opposed to lo_lseek). Found out
> from the sources that I was using it very incorrectly.

I had just realized from reading your last message that you were trying
to write server-side functions, and reading the client-side
documentation to do so :-(.  The server-side stuff is really not the
same API at all, though it tends to use the same function names.
In general, you have to be prepared to read the source code when writing
server-side stuff.

            regards, tom lane