Thread: [PATCH] Use correct types and limits for PL/Perl SPI query results
[PATCH] Use correct types and limits for PL/Perl SPI query results
From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Hi hackers, Commit 23a27b039d94ba359286694831eafe03cd970eef changed the type of numbers-of-tuples-processed counters to uint64 and adjusted various PLs to cope with this. I noticed the PL/Perl changes did not take full advantage of what Perl is capable of handling, so here's a patch that improves that. 1) Perl's integers are at least pointer-sized and either signed or unsigned, so can potentially hold up to 2⁶⁴-1. Floating point numbers can also be larger than double (up to 128bit), allowing for exact representation of integers beyond 2⁵³. Hence, adjust the setting of the "processed" hash item to use UV_MAX for the limit and (NV) or (UV) for the casts. 2) Perl 5.20 and later use SSize_t for array indices, so can cope with up to SSize_t_max items in an array (if you have the memory). 3) To be able to actually return such result sets from sp_exec_query(), I had to change the repalloc() in spi_printtup() to repalloc_huge(). -- "A disappointingly low fraction of the human race is, at any given time, on fire." - Stig Sandbeck Mathisen >From f6d38f1a5c7d6135dd5f580aa8ed38bb74162a0f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org> Date: Sun, 13 Mar 2016 01:21:43 +0000 Subject: [PATCH] Use correct types and limits for PL/Perl SPI query results Perl's integers are pointer-sized, so can hold more than INT_MAX on LP64 platforms, and come in both signed (IV) and unsigned (UV). Floating point values (NV) may also be larger than double. Since Perl 5.19.4 array indices are SSize_t instead of I32, so allow up to SSize_t_max on those versions. The limit is not imposed just by av_extend's argument type, but all the array handling code, so remove the speculative comment. Finally, change spi_printtup() to use repalloc_huge() to be able to return a tuple table of more than 1GiB. --- src/backend/executor/spi.c | 2 +- src/pl/plperl/plperl.c | 13 ++++--------- src/pl/plperl/plperl.h | 7 +++++++ 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 3d04c23..fd94179 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -1800,7 +1800,7 @@ spi_printtup(TupleTableSlot *slot, DestReceiver *self) /* Double the size of the pointer array */ tuptable->free = tuptable->alloced; tuptable->alloced += tuptable->free; - tuptable->vals = (HeapTuple *) repalloc(tuptable->vals, + tuptable->vals = (HeapTuple *) repalloc_huge(tuptable->vals, tuptable->alloced * sizeof(HeapTuple)); } diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c index 269f7f3..d405179 100644 --- a/src/pl/plperl/plperl.c +++ b/src/pl/plperl/plperl.c @@ -3104,9 +3104,9 @@ plperl_spi_execute_fetch_result(SPITupleTable *tuptable, uint64 processed, hv_store_string(result, "status", cstr2sv(SPI_result_code_string(status))); hv_store_string(result, "processed", - (processed > (uint64) INT_MAX) ? - newSVnv((double) processed) : - newSViv((int) processed)); + (processed > (uint64) UV_MAX) ? + newSVnv((NV) processed) : + newSVuv((UV) processed)); if (status > 0 && tuptable) { @@ -3114,12 +3114,7 @@ plperl_spi_execute_fetch_result(SPITupleTable *tuptable, uint64 processed, SV *row; uint64 i; - /* - * av_extend's 2nd argument is declared I32. It's possible we could - * nonetheless push more than INT_MAX elements into a Perl array, but - * let's just fail instead of trying. - */ - if (processed > (uint64) INT_MAX) + if (processed > AV_SIZE_MAX) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), errmsg("query result has too many rows to fit in a Perl array"))); diff --git a/src/pl/plperl/plperl.h b/src/pl/plperl/plperl.h index 9179bbd..2f376ee 100644 --- a/src/pl/plperl/plperl.h +++ b/src/pl/plperl/plperl.h @@ -100,4 +100,11 @@ void plperl_spi_freeplan(char *); void plperl_spi_cursor_close(char *); char *plperl_sv_to_literal(SV *, char *); +/* Perl 5.19.4 changed array indices from I32 to SSize_t */ +#if PERL_BCDVERSION >= 0x5019004 +#define AV_SIZE_MAX ((uint64) SSize_t_MAX) +#else +#define AV_SIZE_MAX ((uint64) I32_MAX) +#endif + #endif /* PL_PERL_H */ -- 2.7.0
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes: > 1) Perl's integers are at least pointer-sized and either signed or > unsigned, so can potentially hold up to 2⁶⁴-1. Floating point numbers > can also be larger than double (up to 128bit), allowing for exact > representation of integers beyond 2⁵³. Hence, adjust the setting of > the "processed" hash item to use UV_MAX for the limit and (NV) or > (UV) for the casts. I thought about using UV where feasible, but it was not clear to me whether unsigned numbers behave semantically differently from signed ones in Perl. If they do, the change you suggest would create a small semantic change in the behavior of code using spi_exec_query. I'm not sure it's worth taking any risk of that, considering we already discourage people from using spi_exec_query for large results. > 2) Perl 5.20 and later use SSize_t for array indices, so can cope with > up to SSize_t_max items in an array (if you have the memory). I don't much like having such hard-wired knowledge about Perl versions in our code. Is there a way to identify this change other than #if PERL_BCDVERSION >= 0x5019004 ? > 3) To be able to actually return such result sets from sp_exec_query(), > I had to change the repalloc() in spi_printtup() to repalloc_huge(). Hmm, good point. I wonder if there are any hazards to doing that. regards, tom lane
Re: [PATCH] Use correct types and limits for PL/Perl SPI query results
From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes: > ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes: >> 1) Perl's integers are at least pointer-sized and either signed or >> unsigned, so can potentially hold up to 2⁶⁴-1. Floating point numbers >> can also be larger than double (up to 128bit), allowing for exact >> representation of integers beyond 2⁵³. Hence, adjust the setting of >> the "processed" hash item to use UV_MAX for the limit and (NV) or >> (UV) for the casts. > > I thought about using UV where feasible, but it was not clear to me > whether unsigned numbers behave semantically differently from signed ones > in Perl. If they do, the change you suggest would create a small semantic > change in the behavior of code using spi_exec_query. I'm not sure it's > worth taking any risk of that, considering we already discourage people > from using spi_exec_query for large results. IVs and UVs are semantically equivalent, and Perl will automatically convert between them (and NVs) as necessary, i.e. when crossing IV_MAX/UV_MAX/IV_MIN. >> 2) Perl 5.20 and later use SSize_t for array indices, so can cope with >> up to SSize_t_max items in an array (if you have the memory). > > I don't much like having such hard-wired knowledge about Perl versions > in our code. Is there a way to identify this change other than > #if PERL_BCDVERSION >= 0x5019004 ? There isn't, unfortunately. I could add e.g. AVidx_t and _MAX defines, but that wouldn't be available until 5.26 (May 2017) at the earliest, since full code freeze for May's 5.24 is next week. >> 3) To be able to actually return such result sets from sp_exec_query(), >> I had to change the repalloc() in spi_printtup() to repalloc_huge(). > > Hmm, good point. I wonder if there are any hazards to doing that. One hazard would be that people not heeding the warning in spi_exec_query will get a visit from the OOM killer (or death by swap) instead of a friendly "invalid memory alloc request size". > regards, tom lane ilmari -- - Twitter seems more influential [than blogs] in the 'gets reported in the mainstream press' sense at least. - Matt McLeod - That'd be because the content of a tweet is easier to condense down to a mainstream media article. - Calle Dybedahl
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> I thought about using UV where feasible, but it was not clear to me >> whether unsigned numbers behave semantically differently from signed ones >> in Perl. If they do, the change you suggest would create a small semantic >> change in the behavior of code using spi_exec_query. I'm not sure it's >> worth taking any risk of that, considering we already discourage people >> from using spi_exec_query for large results. > IVs and UVs are semantically equivalent, and Perl will automatically > convert between them (and NVs) as necessary, i.e. when crossing > IV_MAX/UV_MAX/IV_MIN. OK, thanks for the authoritative statement. >> I don't much like having such hard-wired knowledge about Perl versions >> in our code. Is there a way to identify this change other than >> #if PERL_BCDVERSION >= 0x5019004 ? > There isn't, unfortunately. Sigh ... it was worth asking anyway. >>> 3) To be able to actually return such result sets from sp_exec_query(), >>> I had to change the repalloc() in spi_printtup() to repalloc_huge(). >> Hmm, good point. I wonder if there are any hazards to doing that. > One hazard would be that people not heeding the warning in > spi_exec_query will get a visit from the OOM killer (or death by swap) > instead of a friendly "invalid memory alloc request size". Yeah. But there are plenty of other ways to drive a backend into swap hell, and the whole point of this change is to eliminate arbitrary boundaries on query result size. So let's do it. Thanks for the patch and discussion! regards, tom lane