Thread: Improper use about DatumGetInt32
Hi In (/contrib/bloom/blutils.c:277), I found it use DatumGetInt32 to get UInt32 type. Is it more appropriate to use DatumGetUInt32 here? See the attachment for the patch Bes regards, houzj
Attachment
On Mon, Sep 21, 2020 at 6:47 AM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote: > > In (/contrib/bloom/blutils.c:277), I found it use DatumGetInt32 to get UInt32 type. > Is it more appropriate to use DatumGetUInt32 here? > Makes sense. +1 for the patch. I think with the existing code also we don't have any problem. If we assume that the hash functions return uint32, with DatumGetInt32() we are typecasting that uint32 result to int32, we are assigning it to uint32 i.e. typecasting int32 back to uint32. Eventually, I think, we will see the proper value in hashVal. I did a small experiment to prove this [1]. uint32 hashVal; hashVal = DatumGetInt32(FunctionCall1Coll(&state->hashFn[attno], state->collations[attno], value)); It's good to run a few test cases/test suites(if they exist) that hit this part of the code, just to ensure we don't break anything. [1] int main() { unsigned int u = 3 * 1024 * 1024 * 1024; printf("%u\n", u); int i = u; printf("%d\n", i); unsigned int u1 = i; printf("%u\n", u1); return 0; } Output of the above snippet: 3221225472 -1073741824 3221225472 With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Sun, Sep 20, 2020 at 9:17 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote: > In (/contrib/bloom/blutils.c:277), I found it use DatumGetInt32 to get UInt32 type. > Is it more appropriate to use DatumGetUInt32 here? Typically, the DatumGetBlah() function that you pick should match the SQL data type that the function is returning. So if the function returns pg_catalog.int4, which corresponds to the C data type int32, you would use DatumGetInt32. There is no SQL type corresponding to the C data type uint32, so I'm not sure why we even have DatumGetUInt32. I'm sort of suspicious that there's some fuzzy thinking going on there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Typically, the DatumGetBlah() function that you pick should match the > SQL data type that the function is returning. So if the function > returns pg_catalog.int4, which corresponds to the C data type int32, > you would use DatumGetInt32. There is no SQL type corresponding to the > C data type uint32, so I'm not sure why we even have DatumGetUInt32. xid? regards, tom lane
Hi, On 2020-09-21 14:08:22 -0400, Robert Haas wrote: > There is no SQL type corresponding to the C data type uint32, so I'm > not sure why we even have DatumGetUInt32. I'm sort of suspicious that > there's some fuzzy thinking going on there. I think we mostly use it for the few places where we currently expose data as a signed integer on the SQL level, but internally actually treat it as a unsigned data. There's not a lot of those, but there e.g. is pg_class.relpages. There also may be places where we use it for functions that can be created but not called from SQL (using the INTERNAL type). - Andres
On Mon, Sep 21, 2020 at 3:53 PM Andres Freund <andres@anarazel.de> wrote: > On 2020-09-21 14:08:22 -0400, Robert Haas wrote: > > There is no SQL type corresponding to the C data type uint32, so I'm > > not sure why we even have DatumGetUInt32. I'm sort of suspicious that > > there's some fuzzy thinking going on there. > > I think we mostly use it for the few places where we currently expose > data as a signed integer on the SQL level, but internally actually treat > it as a unsigned data. So why is the right solution to that not DatumGetInt32() + a cast to uint32? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Sep 21, 2020 at 3:53 PM Andres Freund <andres@anarazel.de> wrote: >> I think we mostly use it for the few places where we currently expose >> data as a signed integer on the SQL level, but internally actually treat >> it as a unsigned data. > So why is the right solution to that not DatumGetInt32() + a cast to uint32? You're ignoring the xid use-case, for which DatumGetUInt32 actually is the right thing. I tend to agree though that if the SQL argument is of a signed type, the least API-abusing answer is a signed DatumGetXXX macro followed by whatever cast you need. regards, tom lane
On Wed, Sep 23, 2020 at 1:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Robert Haas <robertmhaas@gmail.com> writes: > > On Mon, Sep 21, 2020 at 3:53 PM Andres Freund <andres@anarazel.de> wrote: > >> I think we mostly use it for the few places where we currently expose > >> data as a signed integer on the SQL level, but internally actually treat > >> it as a unsigned data. > > > So why is the right solution to that not DatumGetInt32() + a cast to uint32? > > You're ignoring the xid use-case, for which DatumGetUInt32 actually is > the right thing. There is DatumGetTransactionId() which should be used instead. That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's there but only defined in xid.c. So pg_xact_commit_timestamp(), pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use PG_GETARG_UNIT32. IMO those should be changed to use PG_GETARG_TRANSACTIONID. That would require moving PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where other PG_GETARG_* are. > I tend to agree though that if the SQL argument is > of a signed type, the least API-abusing answer is a signed DatumGetXXX > macro followed by whatever cast you need. > I looked for some uses of PG_GETARG_UNIT32() which is the counterpart of DatumGetUint32(). Found some buggy usages apart from the ones which can be converted to PG_GETARG_TRANSACTIONID listed above. normal_rand() for example returns a huge number of rows and takes forever if we pass a negative first argument to it. Someone could misuse that for a DOS attack or it could be just an accident that they pass a negative value to that function and the query takes forever. explain analyze select count(*) from normal_rand(-1000000, 1.0, 1.0); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=12.50..12.51 rows=1 width=8) (actual time=2077574.718..2077574.719 rows=1 loops=1) -> Function Scan on normal_rand (cost=0.00..10.00 rows=1000 width=0) (actual time=1005176.149..1729994.366 rows=4293967296 loops=1) Planning Time: 0.346 ms Execution Time: 2079034.835 ms get_raw_page() also does similar thing but the effect is not as dangerous SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1; ERROR: block number 4294967295 is out of range for relation "test1" Similarly for bt_page_stats() and bt_page_items() PFA patches to correct those. There's Oracle compatible chr() which also uses PG_GETARG_UINT32() but it's (accidentally?) reporting the negative inputs correctly because it filters out very large values and reports those using %d. It's arguable whether we should change that, so I have left it untouched. But I think we should change that as well and get rid of PG_GETARG_UNIT32 altogether. This will prevent any future misuse. -- Best Wishes, Ashutosh Bapat
Attachment
On 2020-Sep-23, Ashutosh Bapat wrote: > > You're ignoring the xid use-case, for which DatumGetUInt32 actually is > > the right thing. > > There is DatumGetTransactionId() which should be used instead. > That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's > there but only defined in xid.c. So pg_xact_commit_timestamp(), > pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use > PG_GETARG_UNIT32. IMO those should be changed to use > PG_GETARG_TRANSACTIONID. That would require moving > PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where > other PG_GETARG_* are. Hmm, yeah, I think this would be a good idea. > get_raw_page() also does similar thing but the effect is not as dangerous > SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1; > ERROR: block number 4294967295 is out of range for relation "test1" > Similarly for bt_page_stats() and bt_page_items() Hmm, but page numbers above signed INT_MAX are valid. So this would prevent reading all legitimate pages past that.
On Fri, 16 Oct 2020 at 19:26, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2020-Sep-23, Ashutosh Bapat wrote:
> > You're ignoring the xid use-case, for which DatumGetUInt32 actually is
> > the right thing.
>
> There is DatumGetTransactionId() which should be used instead.
> That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's
> there but only defined in xid.c. So pg_xact_commit_timestamp(),
> pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use
> PG_GETARG_UNIT32. IMO those should be changed to use
> PG_GETARG_TRANSACTIONID. That would require moving
> PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where
> other PG_GETARG_* are.
Hmm, yeah, I think this would be a good idea.
The patch 0003 does that.
> get_raw_page() also does similar thing but the effect is not as dangerous
> SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1;
> ERROR: block number 4294967295 is out of range for relation "test1"
> Similarly for bt_page_stats() and bt_page_items()
Hmm, but page numbers above signed INT_MAX are valid. So this would
prevent reading all legitimate pages past that.
Best Wishes,
Ashutosh
Attachment
I have committed 0003. For 0001, normal_rand(), I think you should reject negative arguments with an error. For 0002, I think you should change the block number arguments to int8, same as other contrib modules do.
On 02.11.2020 18:59, Peter Eisentraut wrote: > I have committed 0003. > > For 0001, normal_rand(), I think you should reject negative arguments > with an error. I've updated 0001. The change is trivial, see attached. > > For 0002, I think you should change the block number arguments to > int8, same as other contrib modules do. > Agree. It will need a bit more work, though. Probably a new version of pageinspect contrib, as the public API will change. Ashutosh, are you going to continue working on it? -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 2020-11-02 16:59, Peter Eisentraut wrote: > I have committed 0003. > > For 0001, normal_rand(), I think you should reject negative arguments > with an error. I have committed a fix for that. > For 0002, I think you should change the block number arguments to int8, > same as other contrib modules do. Looking further into this, almost all of pageinspect needs to be updated to handle block numbers larger than INT_MAX correctly. Attached is a patch for this. It is meant to work like other contrib modules, such as pg_freespace and pg_visibility. I haven't tested this much yet.
Attachment
On 2020-Nov-25, Peter Eisentraut wrote: > bt_page_stats(PG_FUNCTION_ARGS) > { > text *relname = PG_GETARG_TEXT_PP(0); > - uint32 blkno = PG_GETARG_UINT32(1); > + int64 blkno = PG_GETARG_INT64(1); As a matter of style, I think it'd be better to have an int64 variable that gets the value from PG_GETARG_INT64(), then you cast that to another variable that's a BlockNumber and use that throughout the rest of the code. So you'd avoid changes like this: > static bytea *get_raw_page_internal(text *relname, ForkNumber forknum, > - BlockNumber blkno); > + int64 blkno); where the previous coding was correct, and the new one is dubious and it forces you to add unnecessary range checks in that function: > @@ -144,11 +144,16 @@ get_raw_page_internal(text *relname, ForkNumber forknum, BlockNumber blkno) > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("cannot access temporary tables of other sessions"))); > > + if (blkno < 0 || blkno > MaxBlockNumber) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("invalid block number"))); > +
On 2020-11-25 20:04, Alvaro Herrera wrote: > On 2020-Nov-25, Peter Eisentraut wrote: > >> bt_page_stats(PG_FUNCTION_ARGS) >> { >> text *relname = PG_GETARG_TEXT_PP(0); >> - uint32 blkno = PG_GETARG_UINT32(1); >> + int64 blkno = PG_GETARG_INT64(1); > > As a matter of style, I think it'd be better to have an int64 variable > that gets the value from PG_GETARG_INT64(), then you cast that to > another variable that's a BlockNumber and use that throughout the rest > of the code. So you'd avoid changes like this: > >> static bytea *get_raw_page_internal(text *relname, ForkNumber forknum, >> - BlockNumber blkno); >> + int64 blkno); > > where the previous coding was correct, and the new one is dubious and it > forces you to add unnecessary range checks in that function: > >> @@ -144,11 +144,16 @@ get_raw_page_internal(text *relname, ForkNumber forknum, BlockNumber blkno) >> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), >> errmsg("cannot access temporary tables of other sessions"))); >> >> + if (blkno < 0 || blkno > MaxBlockNumber) >> + ereport(ERROR, >> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), >> + errmsg("invalid block number"))); >> + The point of the patch is to have the range check somewhere. If you just cast it, then you won't notice out of range arguments. Note that other contrib modules that take block numbers work the same way.
On 2020-Nov-26, Peter Eisentraut wrote: > The point of the patch is to have the range check somewhere. If you just > cast it, then you won't notice out of range arguments. Note that other > contrib modules that take block numbers work the same way. I'm not saying not to do that; just saying we should not propagate it to places that don't need it. get_raw_page gets its page number from PG_GETARG_INT64(), and the range check should be there. But then it calls get_raw_page_internal, and it could pass a BlockNumber -- there's no need to pass an int64. So get_raw_page_internal does not need a range check.
On 2020-11-26 14:27, Alvaro Herrera wrote: > On 2020-Nov-26, Peter Eisentraut wrote: > >> The point of the patch is to have the range check somewhere. If you just >> cast it, then you won't notice out of range arguments. Note that other >> contrib modules that take block numbers work the same way. > > I'm not saying not to do that; just saying we should not propagate it to > places that don't need it. get_raw_page gets its page number from > PG_GETARG_INT64(), and the range check should be there. But then it > calls get_raw_page_internal, and it could pass a BlockNumber -- there's > no need to pass an int64. So get_raw_page_internal does not need a > range check. Yeah, I had it like that for a moment, but then you need to duplicate the check in get_raw_page() and get_raw_page_fork(). I figured since get_raw_page_internal() does all the other argument checking also, it seems sensible to put the block range check there too. But it's not a big deal either way.
On Wed, Nov 25, 2020 at 8:13 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 02.11.2020 18:59, Peter Eisentraut wrote:
> I have committed 0003.
>
> For 0001, normal_rand(), I think you should reject negative arguments
> with an error.
I've updated 0001. The change is trivial, see attached.
>
> For 0002, I think you should change the block number arguments to
> int8, same as other contrib modules do.
>
Agree. It will need a bit more work, though. Probably a new version of
pageinspect contrib, as the public API will change.
Ashutosh, are you going to continue working on it?
Sorry I was away on Diwali vacation so couldn't address Peter's comments in time. Thanks for taking this further. I will review Peter's patch.
--
Best Wishes,
Ashutosh
On Thu, Nov 26, 2020 at 9:57 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
On 2020-11-26 14:27, Alvaro Herrera wrote:
> On 2020-Nov-26, Peter Eisentraut wrote:
>
>> The point of the patch is to have the range check somewhere. If you just
>> cast it, then you won't notice out of range arguments. Note that other
>> contrib modules that take block numbers work the same way.
>
> I'm not saying not to do that; just saying we should not propagate it to
> places that don't need it. get_raw_page gets its page number from
> PG_GETARG_INT64(), and the range check should be there. But then it
> calls get_raw_page_internal, and it could pass a BlockNumber -- there's
> no need to pass an int64. So get_raw_page_internal does not need a
> range check.
Yeah, I had it like that for a moment, but then you need to duplicate
the check in get_raw_page() and get_raw_page_fork(). I figured since
get_raw_page_internal() does all the other argument checking also, it
seems sensible to put the block range check there too. But it's not a
big deal either way.
--
Best Wishes,
Ashutosh
On 2020-11-27 13:37, Ashutosh Bapat wrote: > Yeah, I had it like that for a moment, but then you need to duplicate > the check in get_raw_page() and get_raw_page_fork(). I figured since > get_raw_page_internal() does all the other argument checking also, it > seems sensible to put the block range check there too. But it's not a > big deal either way. > > > FWIW, my 2c. Though I agree with both sides, I > prefer get_raw_page_internal() accepting BlockNumber, since that's what > it deals with and not the entire int8. Patch updated this way. I agree it's better that way.
Attachment
On 2020-Nov-30, Peter Eisentraut wrote: > Patch updated this way. I agree it's better that way. Thanks, LGTM.
On 2020-11-30 16:32, Alvaro Herrera wrote: > On 2020-Nov-30, Peter Eisentraut wrote: > >> Patch updated this way. I agree it's better that way. > > Thanks, LGTM. For a change like this, do we need to change the C symbol names, so that there is no misbehavior if the shared library is not updated at the same time as the extension is upgraded in SQL?
On 2020-Dec-03, Peter Eisentraut wrote: > On 2020-11-30 16:32, Alvaro Herrera wrote: > > On 2020-Nov-30, Peter Eisentraut wrote: > > > > > Patch updated this way. I agree it's better that way. > > > > Thanks, LGTM. > > For a change like this, do we need to change the C symbol names, so that > there is no misbehavior if the shared library is not updated at the same > time as the extension is upgraded in SQL? Good question. One point is that since the changes to the arguments are just in the way we read the values from the Datum C-values, there's no actual ABI change. So if I understand correctly, there's no danger of a crash; there's just a danger of misinterpreting a value. I don't know if it's possible to determine (at function execution time) that we're running with the old extension version; if so it might suffice to throw a warning but still have the SQL function run the same C function. If we really think that we ought to differentiate, then we could do what pg_stat_statement does, and have a separate C function that's called with the obsolete signature (pg_stat_statements_1_8 et al).
On Fri, Dec 04, 2020 at 03:58:22PM -0300, Alvaro Herrera wrote: > I don't know if it's possible to determine (at function execution time) > that we're running with the old extension version; if so it might > suffice to throw a warning but still have the SQL function run the same > C function. Hmm. You could look after extversion? Usually we just handle that with compatibility routines. > If we really think that we ought to differentiate, then we could do what > pg_stat_statement does, and have a separate C function that's called > with the obsolete signature (pg_stat_statements_1_8 et al). With the 1.8 flavor, it is possible to pass down a negative number and it may not fail depending on the number of blocks of the relation, so I think that you had better have a compatibility layer if a user has the new binaries but is still on 1.8. And that's surely a safe move. -- Michael
Attachment
On 2020-12-25 08:45, Michael Paquier wrote: >> If we really think that we ought to differentiate, then we could do what >> pg_stat_statement does, and have a separate C function that's called >> with the obsolete signature (pg_stat_statements_1_8 et al). > With the 1.8 flavor, it is possible to pass down a negative number > and it may not fail depending on the number of blocks of the relation, > so I think that you had better have a compatibility layer if a user > has the new binaries but is still on 1.8. And that's surely a safe > move. I think on 64-bit systems it's actually safe, but on 32-bit systems (with USE_FLOAT8_BYVAL), if you use the new binaries with the old SQL-level definitions, you'd get the int4 that is passed in interpreted as a pointer, which would lead to very bad things. So I think we need to create new functions with a different C symbol. I'll work on that.
On 2021-01-08 10:21, Peter Eisentraut wrote: > I think on 64-bit systems it's actually safe, but on 32-bit systems > (with USE_FLOAT8_BYVAL), if you use the new binaries with the old > SQL-level definitions, you'd get the int4 that is passed in interpreted > as a pointer, which would lead to very bad things. So I think we need > to create new functions with a different C symbol. I'll work on that. Updated patch that does that.
Attachment
On Fri, Jan 08, 2021 at 04:54:47PM +0100, Peter Eisentraut wrote: > Updated patch that does that. Thanks. Looks sane seen from here. +/* LCOV_EXCL_START */ Does it really make sense to add those markers here? It seems to me that we would ignore any new coverage if regression tests based on older versions are added (we may really want to have such tests for more in-core extensions to be able to verify the portability of an extension, but that's not the job of this patch of course). - elog(ERROR, "block 0 is a meta page"); + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("block 0 is a meta page"))); [...] + errmsg("block number %llu is out of range for relation \"%s\"", This does not follow the usual style for error reports that should not be written as full sentences? Maybe something like "invalid block number %u referring to meta page" and "block number out of range for relation %s: %llu"? -- Michael
Attachment
On 2021-01-09 02:46, Michael Paquier wrote: > +/* LCOV_EXCL_START */ > Does it really make sense to add those markers here? It seems to me > that we would ignore any new coverage if regression tests based on > older versions are added (we may really want to have such tests for > more in-core extensions to be able to verify the portability of an > extension, but that's not the job of this patch of course). If we had a way to do such testing then we wouldn't need these markers. But AFAICT, we don't. > - elog(ERROR, "block 0 is a meta page"); > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("block 0 is a meta page"))); > [...] > + errmsg("block number %llu is out of range for relation \"%s\"", > This does not follow the usual style for error reports that should not > be written as full sentences? Maybe something like "invalid block > number %u referring to meta page" and "block number out of range for > relation %s: %llu"? There are many error messages that say "[something] is out of range". I don't think banning that would serve any purpose.
On Sat, Jan 09, 2021 at 01:41:39PM +0100, Peter Eisentraut wrote: > If we had a way to do such testing then we wouldn't need these markers. But > AFAICT, we don't. Not sure I am following your point here. Taking your patch, it is possible to trigger the version of get_raw_page() <= 1.8 just with something like the following: create extension pageinspect version "1.8"; select get_raw_page('pg_class', 0); There are no in-core regression tests that check the compatibility of extensions with older versions, but it is technically possible to do so. -- Michael
Attachment
On 2021-01-11 09:09, Michael Paquier wrote: > On Sat, Jan 09, 2021 at 01:41:39PM +0100, Peter Eisentraut wrote: >> If we had a way to do such testing then we wouldn't need these markers. But >> AFAICT, we don't. > > Not sure I am following your point here. Taking your patch, it is > possible to trigger the version of get_raw_page() <= 1.8 just with > something like the following: > create extension pageinspect version "1.8"; > select get_raw_page('pg_class', 0); > > There are no in-core regression tests that check the compatibility of > extensions with older versions, but it is technically possible to do > so. Interesting idea. Here is a patch that incorporates that.
Attachment
On Wed, Jan 13, 2021 at 09:27:37AM +0100, Peter Eisentraut wrote: > Interesting idea. Here is a patch that incorporates that. Thanks for adding some coverage. This patch needs a small rebase as Heikki has just introduced some functions for gist, bumping the module to 1.9 (no need to bump to 1.10, right?). I don't have more comments by reading the code and my tests have passed after applying the patch on top of df10ac62. I would have also added some tests that check after blkno < 0 and > MaxBlockNumber in all the functions where it can be triggered as that's cheap for 1.8 and 1.9, but that it's a minor point. -- Michael
Attachment
On 2021-01-14 09:00, Michael Paquier wrote: > I don't have more comments by reading the code and my tests have > passed after applying the patch on top of df10ac62. I would have also > added some tests that check after blkno < 0 and > MaxBlockNumber in > all the functions where it can be triggered as that's cheap for 1.8 > and 1.9, but that it's a minor point. committed with some additional tests