Thread: SPI_getvalue calls output function w/o pushing existing SPI connection + 2 extra issues

SPI_getvalue calls output function w/o pushing existing SPI connection + 2 extra issues

From
"J. Greg Davidson"
Date:
I have a user defined type implemented in C and SPI which has been
crashing a lot.  I found a small enough case to trace the whole thing
with gdb and found that while connected to SPI I was doing a get_value
on a type T implemented in C and SPI.  SPI_getvalue led to a call to T's
output routine which called SPI_connect which failed because the
previous SPI connection was still in force. The whole transaction was
aborted, so without a long gdb session I wouldn't have caught it.
Ironically the call to SPI_getvalue was in my debugging code!

This behavior was unexpected and very hard to track down.  If
SPI_getvalue is supposed to be this fragile, it would be good to
document that.  I'm now experimenting with creating a wrapper function
which surrounds SPI_getvalue with SPI_push and SPI_pop.

On a related issue - it would be very handy to have something which can
be called or tested from a type extension module to find out if an SPI
connection exists.  Such a feature would be especially useful for
asserts.  As far as I can tell from reading the server code, any such
state is hidden in static variables.

Finally, I tried to submit this bug at
http://wwwmaster.postgresql.org/system/handleform.php
as requested by the documentation at PostgreSQL.org, but it rejected my
attempt saying "The email address you entered does not appear to be
valid."   The email address in question is "jgd@well.com" which is
indeed valid and is the one I use to receive postgresql mailing lists.

Thanks for reading this feedback and keep up the good work,

_Greg
J. Greg Davidson wrote:
> I have a user defined type implemented in C and SPI which has been
> crashing a lot.  I found a small enough case to trace the whole thing
> with gdb and found that while connected to SPI I was doing a get_value
> on a type T implemented in C and SPI.  SPI_getvalue led to a call to T's
> output routine which called SPI_connect which failed because the
> previous SPI connection was still in force. The whole transaction was
> aborted, so without a long gdb session I wouldn't have caught it.
> Ironically the call to SPI_getvalue was in my debugging code!

The question that jumps at me is why are you using SPI inside a type's
output function?  You should really avoid doing that.

If you absolutely need to do it, enclosing the function in SPI_push/pop
seems to me the least bad answer.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> The question that jumps at me is why are you using SPI inside a type's
> output function?  You should really avoid doing that.

Offhand it seems like it should work, but the real problem is that there
are probably a ton of code paths besides SPI_getvalue() that would need
to be changed to make it bulletproof.  I don't have a big problem with
adding SPI_push/pop inside SPI_getvalue, but what else might need to
change?

> If you absolutely need to do it, enclosing the function in SPI_push/pop
> seems to me the least bad answer.

That would be the Wrong Thing if the function could ever be called when
*not* connected to SPI, which of course is the normal case for a type
I/O function.  Correct code would be something like

    bool connected = false;

    if (_SPI_curid + 1 == _SPI_connected)        /* connected */
    {
        connected = true;
        SPI_push();
    }

    ... do it ...

    if (connected)
        SPI_pop();

and this is only possible inside spi.c because those variables aren't
exported.

            regards, tom lane