Thread: *_to_xml() should copy SPI_processed/SPI_tuptable

*_to_xml() should copy SPI_processed/SPI_tuptable

From
Chapman Flack
Date:
I encountered this in 9.5 and haven't yet written a reproducer
for 10 or 11, but the code in question looks the same in master.


In xml.c, query_to_xml_internal() contains a loop that refers
to SPI_processed every iteration:

for (i = 0; i < SPI_processed; i++)
    SPI_sql_row_to_xmlelement(i, result, tablename, nulls,
                              tableforest, targetns, top_level);

likewise, that function it is calling contains a loop that
repeatedly dereferences SPI_tuptable:

for (i = 1; i <= SPI_tuptable->tupdesc->natts; i++)
...
... map_sql_value_to_xml_value(colval,
                               SPI_gettypeid(SPI_tuptable->tupdesc, ...


and map_sql_value_to_xml_value can use OidOutputFunctionCall. If that
attribute's data type has an output function that uses SPI, the next
iteration here dereferences a null SPI_tuptable.

The root of the problem does not seem to be an output function that
uses SPI; indeed, the chance that one might do so seems to have been
the original motivation for OutputFunctionCall().

The essence of the problem here seems to be simply that these routines
in xml.c are not making local-variable copies of SPI_processed and
SPI_tuptable, as the docs for SPI_execute recommend.

Would it be worth also expanding that note in the docs? It currently
says:

---
All SPI query-execution functions set both SPI_processed and
SPI_tuptable .... Save these two global variables into local procedure
variables if you need to access the result table of SPI_execute or
another query-execution function across later calls.
---

but might it bear emphasis that those later calls can happen
indirectly from things that you call (e.g. output functions)?
Those will be nested in their own SPI_connect/SPI_finish, which
a reasonable person might guess would save/restore those globals,
but they don't; SPI_finish merely clears them, on the assumption
that you've got your local copies if you need them.

-Chap


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> In xml.c, query_to_xml_internal() contains a loop that refers
> to SPI_processed every iteration:
> for (i = 0; i < SPI_processed; i++)
>     SPI_sql_row_to_xmlelement(i, result, tablename, nulls,
>                               tableforest, targetns, top_level);
> likewise, that function it is calling contains a loop that
> repeatedly dereferences SPI_tuptable:
> for (i = 1; i <= SPI_tuptable->tupdesc->natts; i++)
> ...
> ... map_sql_value_to_xml_value(colval,
>                                SPI_gettypeid(SPI_tuptable->tupdesc, ...
> and map_sql_value_to_xml_value can use OidOutputFunctionCall
> [ which could call arbitrary code that clobbers SPI_tuptable ]

Ugh.  I wonder how many other places have latent issues of the same kind.

> The essence of the problem here seems to be simply that these routines
> in xml.c are not making local-variable copies of SPI_processed and
> SPI_tuptable, as the docs for SPI_execute recommend.

I think that's blaming the messenger TBH.  The *real* problem is the
damfool decision to use global variables for this in the first place.
Should we change that somehow?

Really the right API would have entailed something like

int SPI_execute(const char *src, bool read_only, long tcount,
                SPITupleTable **tuptable, uint64 *processed);

... and I guess we'd have to think about SPI_lastoid too, so maybe
returning a struct would be smarter than a pointer-per-value.

So some possible alternatives are:

* Just do it.  Probably breaks too much 3rd-party code to be very
popular, though.

* Invent SPI_execute_r() and so on with APIs as above, and deprecate
the old functions, but don't remove 'em right away.  Grotty, and it
does little to fix any latent problems that may exist outside core.

* Replace SPI_tuptable et al with macros that access fields in the
current SPI stack level (similar to the way that, eg, errno works
on most modern platforms).  This seems do-able, if a bit grotty.

None of this is very back-patchable, so I guess we need a one-off
backpatched change for query_to_xml_internal and anyplace else that
looks to be at serious risk.

            regards, tom lane


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Chapman Flack
Date:
On 09/05/18 18:07, Tom Lane wrote:

> * Replace SPI_tuptable et al with macros that access fields in the
> current SPI stack level (similar to the way that, eg, errno works
> on most modern platforms).  This seems do-able, if a bit grotty.

It would mean they'd have to *be* in the stack frame, where they
currently aren't; they're ignored by SPI_connect and just zeroed
by SPI_finish, on the grounds that they're transient and you've
copied them if you care.

Another alternative might be to have SPI_connect save them and
SPI_finish put them back, which leaves you just responsible for
reasoning about your own code. You'd still be expected to save them
across your own uses of other SPI calls, but no longer exposed to
spooky action at a distance from nested uses of SPI in stuff you call.

-Chap


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> On 09/05/18 18:07, Tom Lane wrote:
>> * Replace SPI_tuptable et al with macros that access fields in the
>> current SPI stack level (similar to the way that, eg, errno works
>> on most modern platforms).  This seems do-able, if a bit grotty.

> It would mean they'd have to *be* in the stack frame, where they
> currently aren't;

Right, I was assuming that was obvious ...

> Another alternative might be to have SPI_connect save them and
> SPI_finish put them back, which leaves you just responsible for
> reasoning about your own code. You'd still be expected to save them
> across your own uses of other SPI calls, but no longer exposed to
> spooky action at a distance from nested uses of SPI in stuff you call.

Hmm.  I'd thought about that briefly and concluded that it didn't offer
a full fix, but on reflection it's not clear why it'd be any less of
a fix than the macroized-SPI_tuptable approach.  You end up with
per-SPI-stack-level storage either way, and while that isn't perfect
it does go a long way, as you say.  More, this has the huge advantage
of being back-patchable, because there'd be no API/ABI change.

            regards, tom lane


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Tom Lane
Date:
I wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> Another alternative might be to have SPI_connect save them and
>> SPI_finish put them back, which leaves you just responsible for
>> reasoning about your own code. You'd still be expected to save them
>> across your own uses of other SPI calls, but no longer exposed to
>> spooky action at a distance from nested uses of SPI in stuff you call.

> Hmm.  I'd thought about that briefly and concluded that it didn't offer
> a full fix, but on reflection it's not clear why it'd be any less of
> a fix than the macroized-SPI_tuptable approach.  You end up with
> per-SPI-stack-level storage either way, and while that isn't perfect
> it does go a long way, as you say.  More, this has the huge advantage
> of being back-patchable, because there'd be no API/ABI change.

Here's a proposed patch along that line.  I included SPI_result (SPI's
equivalent of errno) in the set of saved-and-restored variables,
so that all the exposed SPI globals behave the same.

Also, in further service of providing insulation between SPI nesting
levels, I thought it'd be a good idea for SPI_connect() to reset the
globals to zero/NULL after saving them.  This ensures that a nesting
level can't accidentally be affected by the state of an outer level.
It's barely possible that there's somebody out there who's *intentionally*
accessing state from a calling SPI level, but that seems like it'd be
pretty dangerous programming practice.  Still, maybe there's an argument
for omitting that hunk in released branches.

Comments?

            regards, tom lane

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 5756365..11ca800 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -36,10 +36,16 @@
 #include "utils/typcache.h"
 
 
+/*
+ * These global variables are part of the API for various SPI functions
+ * (a horrible API choice, but it's too late now).  To reduce the risk of
+ * interference between different SPI callers, we save and restore them
+ * when entering/exiting a SPI nesting level.
+ */
 uint64        SPI_processed = 0;
 Oid            SPI_lastoid = InvalidOid;
 SPITupleTable *SPI_tuptable = NULL;
-int            SPI_result;
+int            SPI_result = 0;
 
 static _SPI_connection *_SPI_stack = NULL;
 static _SPI_connection *_SPI_current = NULL;
@@ -132,6 +138,10 @@ SPI_connect_ext(int options)
     _SPI_current->queryEnv = NULL;
     _SPI_current->atomic = (options & SPI_OPT_NONATOMIC ? false : true);
     _SPI_current->internal_xact = false;
+    _SPI_current->outer_processed = SPI_processed;
+    _SPI_current->outer_lastoid = SPI_lastoid;
+    _SPI_current->outer_tuptable = SPI_tuptable;
+    _SPI_current->outer_result = SPI_result;
 
     /*
      * Create memory contexts for this procedure
@@ -154,6 +164,15 @@ SPI_connect_ext(int options)
     /* ... and switch to procedure's context */
     _SPI_current->savedcxt = MemoryContextSwitchTo(_SPI_current->procCxt);
 
+    /*
+     * Reset API global variables so that current caller cannot accidentally
+     * depend on state of an outer caller.
+     */
+    SPI_processed = 0;
+    SPI_lastoid = InvalidOid;
+    SPI_tuptable = NULL;
+    SPI_result = 0;
+
     return SPI_OK_CONNECT;
 }
 
@@ -176,12 +195,13 @@ SPI_finish(void)
     _SPI_current->procCxt = NULL;
 
     /*
-     * Reset result variables, especially SPI_tuptable which is probably
+     * Restore outer API variables, especially SPI_tuptable which is probably
      * pointing at a just-deleted tuptable
      */
-    SPI_processed = 0;
-    SPI_lastoid = InvalidOid;
-    SPI_tuptable = NULL;
+    SPI_processed = _SPI_current->outer_processed;
+    SPI_lastoid = _SPI_current->outer_lastoid;
+    SPI_tuptable = _SPI_current->outer_tuptable;
+    SPI_result = _SPI_current->outer_result;
 
     /* Exit stack level */
     _SPI_connected--;
@@ -274,9 +294,11 @@ SPICleanup(void)
 {
     _SPI_current = NULL;
     _SPI_connected = -1;
+    /* Reset API global variables, too */
     SPI_processed = 0;
     SPI_lastoid = InvalidOid;
     SPI_tuptable = NULL;
+    SPI_result = 0;
 }
 
 /*
@@ -336,18 +358,20 @@ AtEOSubXact_SPI(bool isCommit, SubTransactionId mySubid)
         }
 
         /*
-         * Pop the stack entry and reset global variables.  Unlike
+         * Restore outer global variables and pop the stack entry.  Unlike
          * SPI_finish(), we don't risk switching to memory contexts that might
          * be already gone.
          */
+        SPI_processed = connection->outer_processed;
+        SPI_lastoid = connection->outer_lastoid;
+        SPI_tuptable = connection->outer_tuptable;
+        SPI_result = connection->outer_result;
+
         _SPI_connected--;
         if (_SPI_connected < 0)
             _SPI_current = NULL;
         else
             _SPI_current = &(_SPI_stack[_SPI_connected]);
-        SPI_processed = 0;
-        SPI_lastoid = InvalidOid;
-        SPI_tuptable = NULL;
     }
 
     if (found && isCommit)
diff --git a/src/include/executor/spi_priv.h b/src/include/executor/spi_priv.h
index 401fd99..0da3a41 100644
--- a/src/include/executor/spi_priv.h
+++ b/src/include/executor/spi_priv.h
@@ -42,6 +42,12 @@ typedef struct
                                  * transactions */
     bool        internal_xact;    /* SPI-managed transaction boundary, skip
                                  * cleanup */
+
+    /* saved values of API global variables for previous nesting level */
+    uint64        outer_processed;
+    Oid            outer_lastoid;
+    SPITupleTable *outer_tuptable;
+    int            outer_result;
 } _SPI_connection;
 
 /*

Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Peter Eisentraut
Date:
On 06/09/2018 18:25, Tom Lane wrote:
> Here's a proposed patch along that line.  I included SPI_result (SPI's
> equivalent of errno) in the set of saved-and-restored variables,
> so that all the exposed SPI globals behave the same.
> 
> Also, in further service of providing insulation between SPI nesting
> levels, I thought it'd be a good idea for SPI_connect() to reset the
> globals to zero/NULL after saving them.  This ensures that a nesting
> level can't accidentally be affected by the state of an outer level.
> It's barely possible that there's somebody out there who's *intentionally*
> accessing state from a calling SPI level, but that seems like it'd be
> pretty dangerous programming practice.  Still, maybe there's an argument
> for omitting that hunk in released branches.

These look like good changes.  I'm not sure about backpatching them, but
as you say the chances that someone wants to do this intentionally are low.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 06/09/2018 18:25, Tom Lane wrote:
>> Also, in further service of providing insulation between SPI nesting
>> levels, I thought it'd be a good idea for SPI_connect() to reset the
>> globals to zero/NULL after saving them.  This ensures that a nesting
>> level can't accidentally be affected by the state of an outer level.
>> It's barely possible that there's somebody out there who's *intentionally*
>> accessing state from a calling SPI level, but that seems like it'd be
>> pretty dangerous programming practice.  Still, maybe there's an argument
>> for omitting that hunk in released branches.

> These look like good changes.  I'm not sure about backpatching them, but
> as you say the chances that someone wants to do this intentionally are low.

Yeah.  I'd put a higher probability on the idea that this will fix
somebody's latent bug in code that's never been tested inside an
outer level of SPI call.  It would, for example, work to rely on
SPI_tuptable being initially NULL, as long as you hadn't tried it
inside an outer call.

            regards, tom lane


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> Also, in further service of providing insulation between SPI
 Tom> nesting levels, I thought it'd be a good idea for SPI_connect() to
 Tom> reset the globals to zero/NULL after saving them. This ensures
 Tom> that a nesting level can't accidentally be affected by the state
 Tom> of an outer level. It's barely possible that there's somebody out
 Tom> there who's *intentionally* accessing state from a calling SPI
 Tom> level, but that seems like it'd be pretty dangerous programming
 Tom> practice. Still, maybe there's an argument for omitting that hunk
 Tom> in released branches.

 Tom> Comments?

+1 to backpatching it all, including the initialization in SPI_connect.

-- 
Andrew (irc:RhodiumToad)


Re: *_to_xml() should copy SPI_processed/SPI_tuptable

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Comments?

> +1 to backpatching it all, including the initialization in SPI_connect.

Done.

            regards, tom lane