Thread: Issues with factorial operator
I'm working with a customer that recently discovered that some code had generated the following nice query... SELECT ... WHERE table_id = 92838278! AND ... So their production server now has several processes that are trying to compute some absurdly large factorial. There's two issues here: 1) the computation doesn't check for signals. This means both a plain kill and pg_cancel_backend() are useless. 2) Even though the answer is going to be an obscene number of digits, and that's supposed to be fed into a numeric, there's no overflow or bounds checking occurring. This is true even if I store into a field defined as numeric: decibel=# create table n(n numeric); CREATE TABLE decibel=# insert into n select 3333!; INSERT 0 1 decibel=# select char_length(trim(n, '0')) from n;char_length ------------- 9466 (1 row) So at the very least the documentation is confusing: The type numeric can store numbers with up to 1000 digits of precision and perform calculations exactly. ... Specifying NUMERIC without any precision or scale creates a column in which numeric values of any precision and scale can be stored, up to the implementation limit on precision. Yet here we have a numeric that's storing nearly 10,000 digits of precision. -- Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
It makes sense with factorial function to do an error check on the domain. Calculate beforehand, and figure out what the largest sensible domain value is. For instance, in Maple, I get this: > y:=92838278!; Error, object too large > The error message returns instantly. For reasonably large values, it might make sense to pre-compute factorials and store them in an array. It should also be possible to store 1/2 of Pascal's triangle in memory and demand load that memory segment the first time someone asks for factorials or combinations or permutations. Just a thought. > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Jim C. Nasby > Sent: Friday, June 08, 2007 6:45 PM > To: pgsql-hackers@postgresql.org > Subject: [HACKERS] Issues with factorial operator > > I'm working with a customer that recently discovered that some code had > generated the following nice query... > > SELECT ... WHERE table_id = 92838278! AND ... > > So their production server now has several processes that are trying to > compute some absurdly large factorial. There's two issues here: > > 1) the computation doesn't check for signals. This means both a plain > kill and pg_cancel_backend() are useless. > > 2) Even though the answer is going to be an obscene number of digits, > and that's supposed to be fed into a numeric, there's no overflow or > bounds checking occurring. This is true even if I store into a field > defined as numeric: > > decibel=# create table n(n numeric); > CREATE TABLE > decibel=# insert into n select 3333!; > INSERT 0 1 > decibel=# select char_length(trim(n, '0')) from n; > char_length > ------------- > 9466 > (1 row) > > So at the very least the documentation is confusing: > > The type numeric can store numbers with up to 1000 digits of precision > and perform calculations exactly. > ... > Specifying > > NUMERIC > > without any precision or scale creates a column in which numeric values > of any precision and scale can be stored, up to the implementation limit > on precision. > > Yet here we have a numeric that's storing nearly 10,000 digits of > precision. > -- > Jim Nasby decibel@decibel.org > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Hi, 2007/6/9, Dann Corbit <DCorbit@connx.com>: > It makes sense with factorial function to do an error check on the > domain. Calculate beforehand, and figure out what the largest sensible > domain value is. well, in fact what we need is to calculate log10(n!) first to see if the result will get exceeded. > > For instance, in Maple, I get this: > > y:=92838278!; > Error, object too large > > > > The error message returns instantly. > > For reasonably large values, it might make sense to pre-compute > factorials and store them in an array. >It should also be possible to > store 1/2 of Pascal's triangle in memory and demand load that memory > segment the first time someone asks for factorials or combinations or > permutations. there may be too much memories to waste in that case... :-( Regards CUI Shijun
> -----Original Message----- > From: Cui Shijun [mailto:rancpine@gmail.com] > Sent: Friday, June 08, 2007 11:11 PM > To: Dann Corbit > Cc: Jim C. Nasby; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Issues with factorial operator > > Hi, > > 2007/6/9, Dann Corbit <DCorbit@connx.com>: > > It makes sense with factorial function to do an error check on the > > domain. Calculate beforehand, and figure out what the largest sensible > > domain value is. > > well, in fact what we need is to calculate log10(n!) first to see if > the result will get exceeded. #include <math.h> double log10nfactorialestimate(unsigned n) { unsigned i; double estimate = 0; for (i = 1; i < n; i++) estimate += log10(n); return estimate; } #ifdef UNIT_TEST #include <stdio.h> #include <time.h> int main(void) { clock_t start, end; double answer; start = clock(); end = clock(); answer= log10nfactorialestimate(92838278); printf("log 10 of 92838278! is pretty close to %g and took %g seconds\n", answer, (end - start) / (1.0 * CLOCKS_PER_SEC)); return 0; } #endif /* C:\tmp>cl /W4 /Ox /DUNIT_TEST log10EST.C Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86 Copyright (C) Microsoft Corporation. All rights reserved. log10EST.C Microsoft (R) Incremental Linker Version 8.00.50727.42 Copyright (C) Microsoft Corporation. All rights reserved. /out:log10EST.exe log10EST.obj C:\tmp>log10est log 10 of 92838278! is pretty close to 7.3971e+008 and took 0 seconds */ > > > > For instance, in Maple, I get this: > > > y:=92838278!; > > Error, object too large > > > > > > > The error message returns instantly. > > > > For reasonably large values, it might make sense to pre-compute > > factorials and store them in an array. > >It should also be possible to > > store 1/2 of Pascal's triangle in memory and demand load that memory > > segment the first time someone asks for factorials or combinations or > > permutations. > > there may be too much memories to waste in that case... :-( 64 bit address space is coming. Are you ready for it? > Regards > CUI Shijun
2007/6/9, Dann Corbit <DCorbit@connx.com>: > #include <math.h> > > double log10nfactorialestimate(unsigned n) > { > unsigned i; > double estimate = 0; > for (i = 1; i < n; i++) > estimate += log10(n); > return estimate; > } > > #ifdef UNIT_TEST > #include <stdio.h> > #include <time.h> > int main(void) > { > clock_t start, > end; > double answer; > start = clock(); > end = clock(); > answer = log10nfactorialestimate(92838278); > printf("log 10 of 92838278! is pretty close to %g and took %g > seconds\n", > answer, (end - start) / (1.0 * CLOCKS_PER_SEC)); > return 0; > } > #endif > /* > C:\tmp>cl /W4 /Ox /DUNIT_TEST log10EST.C > Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 > for 80x86 > Copyright (C) Microsoft Corporation. All rights reserved. > > log10EST.C > Microsoft (R) Incremental Linker Version 8.00.50727.42 > Copyright (C) Microsoft Corporation. All rights reserved. > > /out:log10EST.exe > log10EST.obj > > C:\tmp>log10est > log 10 of 92838278! is pretty close to 7.3971e+008 and took 0 seconds > */ Hum... I think there is a little improvement: when n is too large,(say n>10, 000) we can use Stirling's formula to get the estimated value of n! :-)
> -----Original Message----- [snip] > Hum... I think there is a little improvement: when n is too large,(say > n>10, 000) we can use Stirling's formula to get the estimated value of > n! :-) Or (rather) the log base 10 of Stirling's formula. The n! estimator will overflow for sure, unless we take the log of it. Rather than all that, why not just figure out what the largest number of digits we will allow is and then don't allow inputs that will generate more than that. The program I gave could be run with the target accuracy as the break out of the loop and then the test would be: <type> factorial(<type> n) { if (n > CONSTANT_PRECOMPUTED_LIMIT) return NULL; else { return compute_actual_factorial(n); } }
yeah, simple and correct, I like that. :-) 2007/6/9, Dann Corbit <DCorbit@connx.com>: > > -----Original Message----- > [snip] > > Hum... I think there is a little improvement: when n is too large,(say > > n>10, 000) we can use Stirling's formula to get the estimated value of > > n! :-) > > Or (rather) the log base 10 of Stirling's formula. The n! estimator > will overflow for sure, unless we take the log of it. > > Rather than all that, why not just figure out what the largest number of > digits we will allow is and then don't allow inputs that will generate > more than that. > > The program I gave could be run with the target accuracy as the break > out of the loop and then the test would be: > > <type> factorial(<type> n) > { > if (n > CONSTANT_PRECOMPUTED_LIMIT) > return NULL; > else > { > return compute_actual_factorial(n); > } > } >
"Jim C. Nasby" <decibel@decibel.org> writes: > So at the very least the documentation is confusing: > The type numeric can store numbers with up to 1000 digits of precision > and perform calculations exactly. This documentation is outright wrong. The grain of truth behind the statement is that the parser won't let you declare numeric(N) columns with N > 1000. But unconstrained numeric can be a lot larger. The hard limit of the format seems to be 10^128K. I agree that a CHECK_FOR_INTERRUPTS in numeric_fac wouldn't be a bad idea, and we can reject arguments that are clearly going to overflow. regards, tom lane