Thread: standard normal cumulative distribution function

standard normal cumulative distribution function

From
SunWuKung
Date:
Does somebody have/know of a function for pg returning the standard
normal cumulative distribution for a Z score?

Thanks.
Balazs

Re: standard normal cumulative distribution function

From
Michael Fuhr
Date:
On Sat, Jan 21, 2006 at 12:00:49AM +0100, SunWuKung wrote:
> Does somebody have/know of a function for pg returning the standard
> normal cumulative distribution for a Z score?

Are you looking for something like this?

test=> SELECT z, cdf_ugaussian_p(z) FROM generate_series(-3, 3) AS g(z);
 z  |   cdf_ugaussian_p
----+---------------------
 -3 | 0.00134989803163009
 -2 |  0.0227501319481792
 -1 |   0.158655253931457
  0 |                 0.5
  1 |   0.841344746068543
  2 |   0.977249868051821
  3 |    0.99865010196837
(7 rows)

cdf_ugaussian_p() is just a little wrapper I put around the GNU
Scientific Library's gsl_cdf_ugaussian_P() function.  I can post
an example of how to do that if it's what you're looking for.

--
Michael Fuhr

Re: standard normal cumulative distribution function

From
SunWuKung
Date:
This is exactly what I was looking for, could you post that please.
thx
B.

> On Sat, Jan 21, 2006 at 12:00:49AM +0100, SunWuKung wrote:
> > Does somebody have/know of a function for pg returning the standard
> > normal cumulative distribution for a Z score?
>
> Are you looking for something like this?
>
> test=> SELECT z, cdf_ugaussian_p(z) FROM generate_series(-3, 3) AS g(z);
>  z  |   cdf_ugaussian_p
> ----+---------------------
>  -3 | 0.00134989803163009
>  -2 |  0.0227501319481792
>  -1 |   0.158655253931457
>   0 |                 0.5
>   1 |   0.841344746068543
>   2 |   0.977249868051821
>   3 |    0.99865010196837
> (7 rows)
>
> cdf_ugaussian_p() is just a little wrapper I put around the GNU
> Scientific Library's gsl_cdf_ugaussian_P() function.  I can post
> an example of how to do that if it's what you're looking for.
>
>

Re: standard normal cumulative distribution function

From
Michael Fuhr
Date:
On Sat, Jan 21, 2006 at 01:01:28AM +0100, SunWuKung wrote:
> This is exactly what I was looking for, could you post that please.

The instructions that follow are for building a function written
in C on Unix-like systems; if you're on another platform like Windows
then I don't know what steps you'll have to follow.  The instructions
also rely on the PGXS build infrastructure available in PostgreSQL
8.0 and later, although the Makefile should be trivial to modify
to work with 7.4 and earlier.  If you have any trouble building or
installing the code then you might wish to read "C-Language Functions"
and especially "Compiling and Linking Dynamically-Loaded Functions"
and "Extension Building Infrastructure" in the documentation.

http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html
http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#DFUNC
http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#XFUNC-C-PGXS

1. Install the GNU Scientific Library (GSL), either from a package
or by building it from source.

http://www.gnu.org/software/gsl/

We'll wrap a single GSL function, gsl_cdf_ugaussian_P(), and call
it cdf_ugaussian_p().  If you want to call it something else then
substitute your own name where I've written cdf_ugaussian_p.  This
example should also be useful as a template if you want to create
a PostgreSQL interface to other functions.

2. Create a directory for the code we're going to build; let's call
it pg_gsl.

mkdir pg_gsl
cd pg_gsl

3. Put the following C code in a file named pg_gsl.c:

#include "postgres.h"
#include "fmgr.h"
#include <gsl/gsl_cdf.h>

Datum  cdf_ugaussian_p(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(cdf_ugaussian_p);

Datum
cdf_ugaussian_p(PG_FUNCTION_ARGS)
{
    PG_RETURN_FLOAT8(gsl_cdf_ugaussian_P(PG_GETARG_FLOAT8(0)));
}

4. Put the following SQL in a file named pg_gsl.sql.in (the build
process will create a file named pg_gsl.sql, substituting MODULE_PATHNAME
with an appropriate value):

CREATE OR REPLACE FUNCTION cdf_ugaussian_p(double precision)
RETURNS double precision
AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;

5. Put the following in a file named Makefile:

MODULE_big  = pg_gsl
OBJS        = pg_gsl.o
DATA_built  = pg_gsl.sql

PG_CPPFLAGS = `gsl-config --cflags`
SHLIB_LINK  = `gsl-config --libs`

PGXS := $(shell pg_config --pgxs)
include $(PGXS)

6. Build the code by running "make".  This needs to be GNU Make,
which on some systems is "gmake".  If you have any trouble with
this step then read the documentation mentioned above; if you still
can't get it to work then please post the exact output from this
command.

gmake

The result should be a shared object with a name like libpg_gsl.so
(possibly with a different suffix, and there might be another file
with a similar name like libpg_gsl.so.0).

7. Install the code so PostgreSQL can find it.  You might need to
become root or another user to do this if you don't have permission
to write to the PostgreSQL directories.

gmake install

8. Create the function in your database.  Since this is a C function
you'll need to do this as a database superuser.

psql -U postgres -d dbname -f pg_gsl.sql

9. Test the function:

psql dbname

dbname=> select cdf_ugaussian_p(0);
 cdf_ugaussian_p
-----------------
             0.5
(1 row)

dbname=> select cdf_ugaussian_p(1);
  cdf_ugaussian_p
-------------------
 0.841344746068543
(1 row)

dbname=> select cdf_ugaussian_p(-1);
  cdf_ugaussian_p
-------------------
 0.158655253931457
(1 row)

That's it.  If you have trouble then please post what step you were
at, exactly what command you ran, the exact output you got, what
platform you're on, and what version of PostgreSQL you're using.

Now that I've written all this I suppose I could turn it into a
PgFoundry project.  Would there be any interest in a PostgreSQL
interface to the GNU Scientific Library?  Or has somebody already
done that and I simply overlooked it?

--
Michael Fuhr

Re: standard normal cumulative distribution function

From
Michael Fuhr
Date:
On Fri, Jan 20, 2006 at 04:54:00PM -0700, Michael Fuhr wrote:
> On Sat, Jan 21, 2006 at 12:00:49AM +0100, SunWuKung wrote:
> > Does somebody have/know of a function for pg returning the standard
> > normal cumulative distribution for a Z score?
>
> Are you looking for something like this?
>
> test=> SELECT z, cdf_ugaussian_p(z) FROM generate_series(-3, 3) AS g(z);

You can also do this with the R language, which you can access from
PostgreSQL via PL/R.

http://www.r-project.org/
http://www.joeconway.com/plr/

CREATE FUNCTION pnorm(z double precision) RETURNS double precision AS $$
return(pnorm(z))
$$ LANGUAGE plr IMMUTABLE STRICT;

SELECT z, pnorm(z) FROM generate_series(-3, 3) AS g(z);
 z  |        pnorm
----+---------------------
 -3 | 0.00134989803163009
 -2 |  0.0227501319481792
 -1 |   0.158655253931457
  0 |                 0.5
  1 |   0.841344746068543
  2 |   0.977249868051821
  3 |    0.99865010196837

--
Michael Fuhr

Re: standard normal cumulative distribution function

From
Klein Balázs
Date:
Thanks for posting this.
I looked at
 - this function, but I work on a Windows so it seemed too difficult for me,
 - looked at the R language, but installing a new language just for this
function seemed like an overkill,
- I also looked at the pgnumeric project that has a function for this - but
it installs cc.300 interdependent functions, which was again a bit too much
for me.

Finally I just uploaded a small table to look up values and wrote a very
simple function for this - at least I know what is happening in the db this
way.

Thanks again for the help.
SWK


-----Original Message-----
From: Michael Fuhr [mailto:mike@fuhr.org]
Sent: 2006. január 21. 3:28
To: SunWuKung
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] standard normal cumulative distribution function

On Sat, Jan 21, 2006 at 01:01:28AM +0100, SunWuKung wrote:
> This is exactly what I was looking for, could you post that please.

The instructions that follow are for building a function written
in C on Unix-like systems; if you're on another platform like Windows
then I don't know what steps you'll have to follow.  The instructions
also rely on the PGXS build infrastructure available in PostgreSQL
8.0 and later, although the Makefile should be trivial to modify
to work with 7.4 and earlier.  If you have any trouble building or
installing the code then you might wish to read "C-Language Functions"
and especially "Compiling and Linking Dynamically-Loaded Functions"
and "Extension Building Infrastructure" in the documentation.

http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html
http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#DFUNC
http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#XFUNC-C-PGXS

1. Install the GNU Scientific Library (GSL), either from a package
or by building it from source.

http://www.gnu.org/software/gsl/

We'll wrap a single GSL function, gsl_cdf_ugaussian_P(), and call
it cdf_ugaussian_p().  If you want to call it something else then
substitute your own name where I've written cdf_ugaussian_p.  This
example should also be useful as a template if you want to create
a PostgreSQL interface to other functions.

2. Create a directory for the code we're going to build; let's call
it pg_gsl.

mkdir pg_gsl
cd pg_gsl

3. Put the following C code in a file named pg_gsl.c:

#include "postgres.h"
#include "fmgr.h"
#include <gsl/gsl_cdf.h>

Datum  cdf_ugaussian_p(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(cdf_ugaussian_p);

Datum
cdf_ugaussian_p(PG_FUNCTION_ARGS)
{
    PG_RETURN_FLOAT8(gsl_cdf_ugaussian_P(PG_GETARG_FLOAT8(0)));
}

4. Put the following SQL in a file named pg_gsl.sql.in (the build
process will create a file named pg_gsl.sql, substituting MODULE_PATHNAME
with an appropriate value):

CREATE OR REPLACE FUNCTION cdf_ugaussian_p(double precision)
RETURNS double precision
AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;

5. Put the following in a file named Makefile:

MODULE_big  = pg_gsl
OBJS        = pg_gsl.o
DATA_built  = pg_gsl.sql

PG_CPPFLAGS = `gsl-config --cflags`
SHLIB_LINK  = `gsl-config --libs`

PGXS := $(shell pg_config --pgxs)
include $(PGXS)

6. Build the code by running "make".  This needs to be GNU Make,
which on some systems is "gmake".  If you have any trouble with
this step then read the documentation mentioned above; if you still
can't get it to work then please post the exact output from this
command.

gmake

The result should be a shared object with a name like libpg_gsl.so
(possibly with a different suffix, and there might be another file
with a similar name like libpg_gsl.so.0).

7. Install the code so PostgreSQL can find it.  You might need to
become root or another user to do this if you don't have permission
to write to the PostgreSQL directories.

gmake install

8. Create the function in your database.  Since this is a C function
you'll need to do this as a database superuser.

psql -U postgres -d dbname -f pg_gsl.sql

9. Test the function:

psql dbname

dbname=> select cdf_ugaussian_p(0);
 cdf_ugaussian_p
-----------------
             0.5
(1 row)

dbname=> select cdf_ugaussian_p(1);
  cdf_ugaussian_p
-------------------
 0.841344746068543
(1 row)

dbname=> select cdf_ugaussian_p(-1);
  cdf_ugaussian_p
-------------------
 0.158655253931457
(1 row)

That's it.  If you have trouble then please post what step you were
at, exactly what command you ran, the exact output you got, what
platform you're on, and what version of PostgreSQL you're using.

Now that I've written all this I suppose I could turn it into a
PgFoundry project.  Would there be any interest in a PostgreSQL
interface to the GNU Scientific Library?  Or has somebody already
done that and I simply overlooked it?

--
Michael Fuhr