Re: Improper use about DatumGetInt32 - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Improper use about DatumGetInt32
Date
Msg-id CAExHW5s7e_4F3ANDTVZ8ESMojCnFUrpanC9srTrui8HMb895Zg@mail.gmail.com
Whole thread Raw
In response to Re: Improper use about DatumGetInt32  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Improper use about DatumGetInt32  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Wed, Sep 23, 2020 at 1:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Mon, Sep 21, 2020 at 3:53 PM Andres Freund <andres@anarazel.de> wrote:
> >> I think we mostly use it for the few places where we currently expose
> >> data as a signed integer on the SQL level, but internally actually treat
> >> it as a unsigned data.
>
> > So why is the right solution to that not DatumGetInt32() + a cast to uint32?
>
> You're ignoring the xid use-case, for which DatumGetUInt32 actually is
> the right thing.

There is DatumGetTransactionId() which should be used instead.
That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's
there but only defined in xid.c. So pg_xact_commit_timestamp(),
pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use
PG_GETARG_UNIT32. IMO those should be changed to use
PG_GETARG_TRANSACTIONID. That would require moving
PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where
other PG_GETARG_* are.

> I tend to agree though that if the SQL argument is
> of a signed type, the least API-abusing answer is a signed DatumGetXXX
> macro followed by whatever cast you need.
>

I looked for some uses of PG_GETARG_UNIT32() which is the counterpart
of DatumGetUint32(). Found some buggy usages apart from the ones which
can be converted to PG_GETARG_TRANSACTIONID listed above.
normal_rand() for example returns a huge number of rows and takes
forever if we pass a negative first argument to it. Someone could
misuse that for a DOS attack or it could be just an accident that they
pass a negative value to that function and the query takes forever.
explain analyze select count(*) from normal_rand(-1000000, 1.0, 1.0);
                                                               QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12.50..12.51 rows=1 width=8) (actual
time=2077574.718..2077574.719 rows=1 loops=1)
   ->  Function Scan on normal_rand  (cost=0.00..10.00 rows=1000
width=0) (actual time=1005176.149..1729994.366 rows=4293967296
loops=1)
 Planning Time: 0.346 ms
 Execution Time: 2079034.835 ms

get_raw_page() also does similar thing but the effect is not as dangerous
SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1;
  ERROR:  block number 4294967295 is out of range for relation "test1"
Similarly for bt_page_stats() and bt_page_items()

PFA patches to correct those.

There's Oracle compatible chr() which also uses PG_GETARG_UINT32() but
it's (accidentally?) reporting the negative inputs correctly because
it filters out very large values and reports those using %d. It's
arguable whether we should change that, so I have left it untouched.
But I think we should change that as well and get rid of
PG_GETARG_UNIT32 altogether. This will prevent any future misuse.


--
Best Wishes,
Ashutosh Bapat

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: new heapcheck contrib module
Next
From: Fujii Masao
Date:
Subject: Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)