Thread: bigint out of range

bigint out of range

From
Daulat Ram
Date:

Hello team,

We are getting ERROR:  bigint out of range. Please help with this.

kbdb=# INSERT INTO kb_dar_ran_url_check (url_hash,stat_date,topic_id,site_id,url,status,user_comments,review_comments) VALUES (72894677781133866997924561390146294513,E'19-04-2019',32793,1035,E'https://bikez.com/search/index.php',1,NULL,NULL);
ERROR:  bigint out of range

The table structure is:

                    Table "kb_test.kb_dar_ran_url_check"

     Column      |          Type           | Collation | Nullable | Default

-----------------+-------------------------+-----------+----------+---------

status          | bigint                  |           |          |

url_hash        | bigint                  |           | not null |

url             | character varying(4000) |           | not null |

review_comments | character varying(4000) |           |          |

user_comments   | character varying(4000) |           |          |

stat_date       | character varying(128)  |           | not null |

topic_id        | numeric(38,0)           |           | not null |

site_id         | numeric(38,0)           |           | not null |

Partition key: LIST (stat_date)

 

 

Thanks,

Daulat

Re: bigint out of range

From
Geoff Winkless
Date:
On Thu, 16 May 2019 at 16:31, Daulat Ram <Daulat.Ram@exponential.com> wrote:

> Hello team,
>
> We are getting ERROR:  bigint out of range. Please help with this.

bigint ranges from -9223372036854775808 to 9223372036854775807; the 38-digit value you're inserting into url_hash is far outside that range.
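
A quick check (illustrative) that the supplied value exceeds that maximum:

    SELECT 72894677781133866997924561390146294513::numeric
         > 9223372036854775807::numeric AS exceeds_bigint_max;
    -- returns t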

Re: bigint out of range

From
"David G. Johnston"
Date:
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com> wrote:

> url_hash        | bigint                  |           | not null |


Change the type of url_hash; make it text instead of bigint.

As a bonus:

Use text instead of arbitrary varchar(4000) fields and add, e.g., check (length(url) < 4000) or something better.

Needing 38-digit integers for "id" fields seems odd.

Not sure what kind of date would need 128 characters to represent; ideally you'd just make that a "date"-typed field and normalize on input.
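
A sketch of those changes (illustrative only; the constraint name is made up,
and the partition key stat_date is left as-is here):

    ALTER TABLE kb_dar_ran_url_check
        ALTER COLUMN url_hash TYPE text USING url_hash::text;
    ALTER TABLE kb_dar_ran_url_check
        ALTER COLUMN url TYPE text,
        ADD CONSTRAINT url_length_chk CHECK (length(url) < 4000);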

David J.

Re: bigint out of range

From
Tony Shelver
Date:
Adding to what David said, I don't see much point in having a bigint status. Usually status attributes are a fixed set of values that can be checked programmatically.
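
For example (illustrative; the allowed values and constraint name are invented):

    ALTER TABLE kb_dar_ran_url_check
        ALTER COLUMN status TYPE smallint,
        ADD CONSTRAINT status_chk CHECK (status IN (0, 1, 2));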

Re: bigint out of range

From
"Peter J. Holzer"
Date:
On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
> On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com> wrote:
>
>
>     url_hash        | bigint                  |           | not null |
>
>
> Change the type of url_hash; make it text instead of bigint.

Or numeric(38, 0). I think it isn't a coincidence that he's trying to store
a 38-digit number in it.
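
Illustratively, the failing value fits that type without complaint:

    SELECT 72894677781133866997924561390146294513::numeric(38,0);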

> As a bonus:
>
> Use text instead of arbitrary varchar(4000) fields and add, e.g., check (length
> (url) < 4000) or something better.
>
> Needing 38 digit integers for "id" fields seems odd.

I guess the schema was copied from Oracle. In Oracle, all numbers are
really 38-digit decimal floating-point numbers and the limit for
varchar2 is 4000.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


Re: bigint out of range

From
"David G. Johnston"
Date:
On Saturday, May 18, 2019, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
> On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
> > On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com> wrote:
> >
> >     url_hash        | bigint                  |           | not null |
> >
> > Change the type of url_hash; make it text instead of bigint.
>
> Or numeric(38, 0). I think it isn't a coincidence that he's trying to store
> a 38-digit number in it.

You don’t perform math on a hash, thus it’s not a number, no matter that it may contain only digits.

David J.

Re: bigint out of range

From
"Peter J. Holzer"
Date:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> On Saturday, May 18, 2019, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
>     On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
>     > On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
>     wrote:
>     >
>     >
>     >     url_hash        | bigint                  |           | not null |
>     >
>     >
>     > Change the type of url_hash; make it text instead of bigint.
>
>     Or numeric(38, 0). I think it isn't coincidence that he tries to store
>     a 38-digit number in it.
>
>
> You don’t perform math on a hash

That's not generally true. Hashes are used for further computation, for
example in hash tables or in cryptography.

> thus its not a number

This is just silly. All hash functions I have ever encountered compute a
single fixed-size integer from a stream of integers. The result may be
larger than a machine word, in which case the representation in C (or a
similar low-level language) may be an array of words (or bytes), but
it's still an integer.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


Re: bigint out of range

From
Ron
Date:
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
> On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
>> On Saturday, May 18, 2019, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>>
>>      On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
>>      > On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
>>      wrote:
>>      >
>>      >
>>      >     url_hash        | bigint                  |           | not null |
>>      >
>>      >
>>      > Change the type of url_hash; make it text instead of bigint.
>>
>>      Or numeric(38, 0). I think it isn't coincidence that he tries to store
>>      a 38-digit number in it.
>>
>>
>> You don’t perform math on a hash
> That's not generally true. Hashes are used for further computation for
> example in hash tables or in cryptography.

How is it "using math" to use a hash key in a hash lookup table?

-- 
Angular momentum makes the world go 'round.



Re: bigint out of range

From
"Peter J. Holzer"
Date:
On 2019-05-18 15:19:22 -0500, Ron wrote:
> On 5/18/19 2:27 PM, Peter J. Holzer wrote:
> > On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> > > You don’t perform math on a hash
> > That's not generally true. Hashes are used for further computation for
> > example in hash tables or in cryptography.
>
> How is it "using math" to use a hash key in a hash lookup table?

hash modulo table size.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


Re: bigint out of range

From
Ron
Date:
On 5/18/19 3:49 PM, Peter J. Holzer wrote:
> On 2019-05-18 15:19:22 -0500, Ron wrote:
> > On 5/18/19 2:27 PM, Peter J. Holzer wrote:
> > > On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> > > > You don’t perform math on a hash
> > > That's not generally true. Hashes are used for further computation, for
> > > example in hash tables or in cryptography.
> > How is it "using math" to use a hash key in a hash lookup table?
> hash modulo table size.

I've seen that used when the tablespace is pre-allocated, and you hash modulo the tablespace page number.  (Yes, performance tanks when you start filling up pages.)  How do you hash on the (ever-growing) table size?

--
Angular momentum makes the world go 'round.

Re: bigint out of range

From
"Peter J. Holzer"
Date:
On 2019-05-18 17:14:59 -0500, Ron wrote:
> On 5/18/19 3:49 PM, Peter J. Holzer wrote:
>
>     On 2019-05-18 15:19:22 -0500, Ron wrote:
>
>         On 5/18/19 2:27 PM, Peter J. Holzer wrote:
>
>             On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
>
>                 You don’t perform math on a hash
>
>             That's not generally true. Hashes are used for further computation for
>             example in hash tables or in cryptography.
>
>         How is it "using math" to use a hash key in a hash lookup table?
>
>     hash modulo table size.
>
>
> I've seen that used when the tablespace is pre-allocated, and you hash modulo
> the tablespace page number.  (Yes, performance tanks when you start filling up
> pages.)  How do you hash on the (ever growing) table size?

The hash function returns a number in a range much larger than the
possible number of buckets. 64 bits is a good choice today.

To determine the bucket you need to reduce this number to something in
the range [0, nr_buckets). This is where modulo comes in:

i = h % nr_buckets

If the table fills up, you increase nr_buckets, reallocate and
rehash all entries.

(If nr_buckets is a power of two, the modulo operation can be
implemented efficiently with a bitwise AND.)
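
A toy SQL illustration, using PostgreSQL's built-in hashtext() (a signed
32-bit hash): with nr_buckets = 16, a power of two, h % 16 reduces to a
bitwise AND with 15:

    SELECT hashtext('https://bikez.com/search/index.php') AS h,
           hashtext('https://bikez.com/search/index.php') & 15 AS bucket;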

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


Re: bigint out of range

From
Ron
Date:
On 5/18/19 5:39 PM, Peter J. Holzer wrote:
> On 2019-05-18 17:14:59 -0500, Ron wrote:
>> On 5/18/19 3:49 PM, Peter J. Holzer wrote:
>>
>>      On 2019-05-18 15:19:22 -0500, Ron wrote:
>>
>>          On 5/18/19 2:27 PM, Peter J. Holzer wrote:
>>
>>              On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
>>
>>                  You don’t perform math on a hash
>>
>>              That's not generally true. Hashes are used for further computation for
>>              example in hash tables or in cryptography.
>>
>>          How is it "using math" to use a hash key in a hash lookup table?
>>
>>      hash modulo table size.
>>
>>
>> I've seen that used when the tablespace is pre-allocated, and you hash modulo
>> the tablespace page number.  (Yes, performance tanks when you start filling up
>> pages.)  How do you hash on the (ever growing) table size?
> The hash function returns a number in a range much larger than the
> possible number of buckets. 64 bits is a good choice today.
>
> To determine the bucket you need to reduce this number to something in
> the range [0, nr_buckets). This is where modulo comes in:
>
> i = h % nr_buckets
>
> If the table fills up, you increase nr_buckets, reallocate and
> rehash all entries.

Ouch.  Response time on a big table would take a serious hit if that rehash 
happened in the middle of the day on a big OLTP system.  Even worse if it 
were a 24x365 system, because you couldn't schedule an enlargement/rehash 
during a down period.


-- 
Angular momentum makes the world go 'round.



Re: bigint out of range

From
"Peter J. Holzer"
Date:
On 2019-05-18 19:16:19 -0500, Ron wrote:
> On 5/18/19 5:39 PM, Peter J. Holzer wrote:
> > On 2019-05-18 17:14:59 -0500, Ron wrote:
> > > On 5/18/19 3:49 PM, Peter J. Holzer wrote:
> > >      On 2019-05-18 15:19:22 -0500, Ron wrote:
> > >          On 5/18/19 2:27 PM, Peter J. Holzer wrote:
> > >              On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> > >
> > >                  You don’t perform math on a hash
> > >
> > >              That's not generally true. Hashes are used for further computation for
> > >              example in hash tables or in cryptography.
> > >
> > >          How is it "using math" to use a hash key in a hash lookup table?
> > >
> > >      hash modulo table size.
> > >
> > >
> > > I've seen that used when the tablespace is pre-allocated, and you hash modulo
> > > the tablespace page number.  (Yes, performance tanks when you start filling up
> > > pages.)  How do you hash on the (ever growing) table size?
> > The hash function returns a number in a range much larger than the
> > possible number of buckets. 64 bits is a good choice today.
> >
> > To determine the bucket you need to reduce this number to something in
> > the range [0, nr_buckets). This is where modulo comes in:
> >
> > i = h % nr_buckets
> >
> > If the table fills up, you increase nr_buckets, reallocate and
> > rehash all entries.
>
> Ouch.  Response time on a big table would take a serious hit if that rehash
> happened in the middle of the day on a big OLTP system.

So that might be a reason not to use hash indexes on tables where the worst
case insert time is a concern (the average insert time is still O(1)).

But please be aware that I answered your question 'How is it "using
math" to use a hash key?', not 'How are hash indexes in PostgreSQL
implemented?'. So my answer covered the most simple and generic
implementation.

You can split buckets lazily, and in fact PostgreSQL does this. I just
looked at the code briefly, but I don't really understand how it works.
Guess I would have to read Margo Seltzer's paper for that.

Still, even with that optimization (or tradeoff - it may make the lookup
slower), the README notes that splits are expensive.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>


Re: bigint out of range

From
Ron
Date:
On 5/19/19 5:43 AM, Peter J. Holzer wrote:
[snip]
> But please be aware that I answered your question 'How is it "using
> math" to use a hash key?', not 'How are hash indexes in PostgreSQL
> implemented?'. So my answer covered the most simple and generic
> implementation.

I understand.

-- 
Angular momentum makes the world go 'round.



Re: bigint out of range

From
Stephen Frost
Date:
Greetings,

* David G. Johnston (david.g.johnston@gmail.com) wrote:
> On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
> wrote:
>
> > url_hash        | bigint                  |           | not null |
>
> Change the type of url_hash; make it text instead of bigint.

Making it text wastes a bunch of space actually, since it's really a
binary value.  I tend to recommend using a bytea for storing hashes.
Given that hashes are really fixed width, it'd be nice if we had a set
of proper datatypes for them, perhaps not so much to avoid the 1-byte
overhead from storing as a variable-length bytea, but because we could
then avoid having a 7-byte hole due to padding if the hash is followed
by a bigint or such.
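
For instance, a sketch of the bytea approach (illustrative only; md5 here is
just a stand-in for whatever hash the application actually computes):

    CREATE TABLE example_url_check (
        url_hash bytea NOT NULL,  -- raw bytes, e.g. 16 bytes for a 128-bit hash
        url      text  NOT NULL
    );

    INSERT INTO example_url_check (url_hash, url)
    VALUES (decode(md5('https://bikez.com/search/index.php'), 'hex'),
            'https://bikez.com/search/index.php');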

Thanks,

Stephen


Re: bigint out of range

From
Stephen Frost
Date:
Greetings,

* Peter J. Holzer (hjp-pgsql@hjp.at) wrote:
> On 2019-05-18 19:16:19 -0500, Ron wrote:
> > > If the table fills up, you increase nr_buckets, reallocate and
> > > rehash all entries.
> >
> > Ouch.  Response time on a big table would take a serious hit if that rehash
> > happened in the middle of the day on a big OLTP system.

As noted below, that isn't actually how it works with PG's hash indexes.

> So that might be a reason not to use hash indexes on tables where the worst
> case insert time is a concern (the average insert time is still O(1)).

The worst-case insert time is certainly worse than the average but it's
not nearly as bad as being imagined here.

> You can split buckets lazily, and in fact PostgreSQL does this. I just
> looked at the code briefly, but I don't really understand how it works.
> Guess I would have to read Margo Seltzer's paper for that.
>
> Still, even with that optimization (or tradeoff - it may make the lookup
> slower) the README notes that splits are expensive.

There are multiple different levels, really, from "best case" where the
new item can just be placed directly onto a page that has free space,
to "slightly worse case" where an overflow page has to be added, to
"even worse case" where a prior page split has to be finished, to
"probably worst case" where a full page split has to happen.  There
might even be some pathological cases where a prior page split has to be
finished and then an overflow page has to be added and then another page
split has to be done, or some such.  Even so, for equality-based
lookups (where the data set has few or zero duplicates), hash indexes
can work quite well, but it's of course good to understand that inserts
aren't always going to be exactly the same speed every time.
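
For reference, a hash index is created with USING hash, and equality lookups
on url_hash could then use it (illustrative; the index name is made up):

    CREATE INDEX kb_dar_ran_url_check_hash_idx
        ON kb_dar_ran_url_check USING hash (url_hash);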

Thanks,

Stephen
