Thread: gamma() and lgamma() functions
Attached is a patch adding support for the gamma and log-gamma functions. See, for example: https://en.wikipedia.org/wiki/Gamma_function I think these are very useful general-purpose mathematical functions. They're part of C99, and they're commonly included in other mathematical libraries, such as the python math module, so I think it's useful to make them available from SQL. The error-handling for these functions seems to be a little trickier than most, so that might need further discussion. Regards, Dean
Attachment
On Mon, Jul 1, 2024 at 5:33 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Attached is a patch adding support for the gamma and log-gamma
functions. See, for example:
https://en.wikipedia.org/wiki/Gamma_function
I think these are very useful general-purpose mathematical functions.
They're part of C99, and they're commonly included in other
mathematical libraries, such as the python math module, so I think
it's useful to make them available from SQL.
The error-handling for these functions seems to be a little trickier
than most, so that might need further discussion.
Regards,
Dean
Hi! The patch file seems broken.
git apply gamma-and-lgamma.patch
error: git apply: bad git-diff — exptec /dev/null in line 2
Best regards, Stepan Neretin.
> On 1 Jul 2024, at 16:20, Stepan Neretin <sncfmgg@gmail.com> wrote: > The patch file seems broken. > git apply gamma-and-lgamma.patch error: git apply: bad git-diff — exptec /dev/null in line 2 It's a plain patch file, if you apply it with patch and not git it will work fine: $ patch -p1 < gamma-and-lgamma.patch patching file 'doc/src/sgml/func.sgml' patching file 'src/backend/utils/adt/float.c' patching file 'src/include/catalog/pg_proc.dat' patching file 'src/test/regress/expected/float8.out' patching file 'src/test/regress/sql/float8.sql' -- Daniel Gustafsson
On Mon, Jul 1, 2024 at 5:33 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Attached is a patch adding support for the gamma and log-gamma
functions. See, for example:
https://en.wikipedia.org/wiki/Gamma_function
I think these are very useful general-purpose mathematical functions.
They're part of C99, and they're commonly included in other
mathematical libraries, such as the python math module, so I think
it's useful to make them available from SQL.
The error-handling for these functions seems to be a little trickier
than most, so that might need further discussion.
Regards,
Dean
ERROR: value out of range: overflow. I think we need to add information about the available ranges in the error message
On 2024-Jul-01, Stepan Neretin wrote: > I have one notice: > ERROR: value out of range: overflow. I think we need to add information > about the available ranges in the error message I think this is a project of its own. The error comes from calling float_overflow_error(), which is a generic routine used in several functions which have different overflow conditions. It's not clear to me that throwing errors listing the specific range for each case is worth the effort. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio)
On Mon, 1 Jul 2024 at 16:04, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Jul-01, Stepan Neretin wrote: > > > I have one notice: > > ERROR: value out of range: overflow. I think we need to add information > > about the available ranges in the error message > > I think this is a project of its own. The error comes from calling > float_overflow_error(), which is a generic routine used in several > functions which have different overflow conditions. It's not clear to > me that throwing errors listing the specific range for each case is > worth the effort. > Right. It's also worth noting that gamma() has several distinct ranges of validity for which it doesn't overflow, so it'd be hard to codify that succinctly in an error message. Something that bothers me about float.c is that there is precious little consistency as to whether functions raise overflow errors or return infinity. For example, exp() gives an overflow error for sufficiently large (finite) inputs, whereas sinh() and cosh() (very closely related) return infinity. I think raising an error is the more technically correct thing to do, but returning infinity is sometimes perhaps more practically useful. However, given how much more quickly the result from gamma() explodes, I think it's more useful for it to raise an error so that you know you have a problem, and that you probably want to use lgamma() instead. (As an aside, I've seen people (and ChatGPT) write algorithms to solve the Birthday Problem using the gamma() function. That doesn't work because gamma(366) overflows, so you have to use lgamma().) Regards, Dean
On 01.07.24 12:33, Dean Rasheed wrote: > Attached is a patch adding support for the gamma and log-gamma > functions. See, for example: > > https://en.wikipedia.org/wiki/Gamma_function > > I think these are very useful general-purpose mathematical functions. > They're part of C99, and they're commonly included in other > mathematical libraries, such as the python math module, so I think > it's useful to make them available from SQL. What are examples of where this would be useful in a database context? > The error-handling for these functions seems to be a little trickier > than most, so that might need further discussion. Yeah, this is quite something. I'm not sure why you are doing the testing for special values (NaN etc.) yourself when the C library function already does it. For example, if I remove the isnan(arg1) check in your dlgamma(), then it still behaves the same in all tests. However, the same does not happen in your dgamma(). The overflow checks after the C library call are written differently for the two functions. dgamma() does not check errno for ERANGE for example. It might also be good if dgamma() checked errno for EDOM, because other the result of gamma(-1) being "overflow" seems a bit wrong. I'm also not clear why you are turning a genuine result overflow into infinity in lgamma(). I think this could use some kind of chart or something about all the possible behaviors and how the C library reports them and what we intend to do with them. Btw., I'm reading that lgamma() in the C library is not necessarily thread-safe. Is that a possible problem?
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On Fri, 23 Aug 2024 at 13:40, Peter Eisentraut <peter@eisentraut.org> wrote: >> What are examples of where this would be useful in a database context? > Of course, there's a somewhat fuzzy line between what is generally > useful enough, and what is too specialised for core Postgres, but I > would argue that these qualify, since they are part of C99, and > commonly appear in other general-purpose math libraries like the > Python math module. Yeah, I think any math function that's part of C99 or POSIX is arguably of general interest. >> I'm not sure why you are doing the testing for special values (NaN etc.) >> yourself when the C library function already does it. For example, if I >> remove the isnan(arg1) check in your dlgamma(), then it still behaves >> the same in all tests. > It's useful to do that so that we don't need to assume that every > platform conforms to the POSIX standard, and because it simplifies the > later overflow checks. This is consistent with the approach taken in > other functions, such as dexp(), dsin(), dcos(), etc. dexp() and those other functions date from the late stone age, before it was safe to assume that libm's behavior matched the POSIX specs. Today I think we can assume that, at least till proven differently. There's not necessarily anything wrong with what you've coded, but I don't buy this argument for it. >> Btw., I'm reading that lgamma() in the C library is not necessarily >> thread-safe. Is that a possible problem? > It's not quite clear what to do about that. Per the Linux man page, the reason lgamma() isn't thread-safe is The lgamma(), lgammaf(), and lgammal() functions return the natural logarithm of the absolute value of the Gamma function. The sign of the Gamma function is returned in the external integer signgam declared in <math.h>. It is 1 when the Gamma function is positive or zero, -1 when it is negative. AFAICS this patch doesn't inspect signgam, so whether it gets overwritten by a concurrent thread wouldn't matter. However, it'd be a good idea to add a comment noting the hazard. (Presumably, the reason POSIX says "need not be thread-safe" is that an implementation could make it thread-safe by making signgam thread-local, but the standard doesn't wish to mandate that.) regards, tom lane
I wrote: > AFAICS this patch doesn't inspect signgam, so whether it gets > overwritten by a concurrent thread wouldn't matter. However, > it'd be a good idea to add a comment noting the hazard. Further to that ... I looked at POSIX issue 8 (I had been reading 7) and found this illuminating discussion: Earlier versions of this standard did not require lgamma(), lgammaf(), and lgammal() to be thread-safe because signgam was a global variable. They are now required to be thread-safe to align with the ISO C standard (which, since the introduction of threads in 2011, requires that they avoid data races), with the exception that they need not avoid data races when storing a value in the signgam variable. Since signgam is not specified by the ISO C standard, this exception is not a conflict with that standard. So the other reason to avoid using signgam is that it might not exist at all in some libraries. regards, tom lane
Hi! I think having gamma() and lgamma() functions in PostgreSQL would be useful for some users, this is asked [1]. I have a question regarding the current implementation of gamma() function. Code block: + if (errno == ERANGE && arg1 != 0) + { + if (result != 0.0) + float_overflow_error(); + else + float_underflow_error(); + } + else if (isinf(result) && arg1 != 0 && !isinf(arg1)) + float_overflow_error(); + else if (result == 0.0) + float_underflow_error(); Why in some cases (if arg1 is close to 0, but not 0) an error (float_overflow_error) will be returned, but in the case of "arg1 = 0" the value 'Infinity' will be returned? For example: >SELECT gamma(float8 '1e-320'); ERROR: value out of range: overflow >SELECT gamma(float8 '0'); gamma ---------- Infinity (1 row) Perhaps it would be logical if the behavior in these cases was the same (either ERROR or 'Infinity')? Links: [1] https://stackoverflow.com/questions/58884066/how-can-i-run-the-equivalent-of-excels-gammaln-function-in-postgres -- With best regards, Dmitry Koval Postgres Professional: http://postgrespro.com
On Thu, 14 Nov 2024 at 16:28, Dmitry Koval <d.koval@postgrespro.ru> wrote: > > I think having gamma() and lgamma() functions in PostgreSQL would be > useful for some users, this is asked [1]. > Thanks for looking. > >SELECT gamma(float8 '1e-320'); > ERROR: value out of range: overflow > > >SELECT gamma(float8 '0'); > gamma > ---------- > Infinity > (1 row) > That's correct since gamma(1e-320) is roughly 1e320, which overflows the double precision type, but it's not actually infinite, whereas gamma(0) is infinite. > Perhaps it would be logical if the behavior in these cases was the same > (either ERROR or 'Infinity')? > In general, I think that having gamma() throw overflow errors is useful for spotting real problems in user code. In my experience, it's uncommon to pass negative or even close-to-zero inputs to gamma(), but it is common to fail to appreciate just how quickly the result grows for positive inputs (it overflows for inputs just over 171.6), so it seems better to be made aware of such problems (which might be solved by using lgamma() instead). So I think throwing overflow errors is the most useful thing to do from a practical point of view, and if we do that for too-large inputs, it makes sense to do the same for too-close-to-zero inputs. OTOH, gamma(+/-0) = +/-Infinity is right from a mathematical point of view, and consistent with the POSIX spec, and it's also consistent with the functions cot() and cotd(). Regards, Dean
On Thu, 14 Nov 2024 at 22:35, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > On Thu, 14 Nov 2024 at 16:28, Dmitry Koval <d.koval@postgrespro.ru> wrote: > > > > >SELECT gamma(float8 '1e-320'); > > ERROR: value out of range: overflow > > > > >SELECT gamma(float8 '0'); > > gamma > > ---------- > > Infinity > > (1 row) > > > > Perhaps it would be logical if the behavior in these cases was the same > > (either ERROR or 'Infinity')? > > In general, I think that having gamma() throw overflow errors is > useful for spotting real problems in user code. > Thinking about this afresh, I remain of the opinion that having the gamma function throw overflow errors is useful for inputs that are too large, like gamma(200). But then, if we're going to do that, maybe we should just do the same for other invalid inputs (zero, negative integers, and -Inf). That resolves the consistency issue with inputs very close to zero, and seems like a practical solution. Attached is an update doing that. Regards, Dean