Thread: standard normal cumulative distribution function
Does somebody have/know of a function for pg returning the standard normal cumulative distribution for a Z score? Thanks. Balazs
On Sat, Jan 21, 2006 at 12:00:49AM +0100, SunWuKung wrote: > Does somebody have/know of a function for pg returning the standard > normal cumulative distribution for a Z score? Are you looking for something like this? test=> SELECT z, cdf_ugaussian_p(z) FROM generate_series(-3, 3) AS g(z); z | cdf_ugaussian_p ----+--------------------- -3 | 0.00134989803163009 -2 | 0.0227501319481792 -1 | 0.158655253931457 0 | 0.5 1 | 0.841344746068543 2 | 0.977249868051821 3 | 0.99865010196837 (7 rows) cdf_ugaussian_p() is just a little wrapper I put around the GNU Scientific Library's gsl_cdf_ugaussian_P() function. I can post an example of how to do that if it's what you're looking for. -- Michael Fuhr
This is exactly what I was looking for, could you post that please. thx B. > On Sat, Jan 21, 2006 at 12:00:49AM +0100, SunWuKung wrote: > > Does somebody have/know of a function for pg returning the standard > > normal cumulative distribution for a Z score? > > Are you looking for something like this? > > test=> SELECT z, cdf_ugaussian_p(z) FROM generate_series(-3, 3) AS g(z); > z | cdf_ugaussian_p > ----+--------------------- > -3 | 0.00134989803163009 > -2 | 0.0227501319481792 > -1 | 0.158655253931457 > 0 | 0.5 > 1 | 0.841344746068543 > 2 | 0.977249868051821 > 3 | 0.99865010196837 > (7 rows) > > cdf_ugaussian_p() is just a little wrapper I put around the GNU > Scientific Library's gsl_cdf_ugaussian_P() function. I can post > an example of how to do that if it's what you're looking for. > >
On Sat, Jan 21, 2006 at 01:01:28AM +0100, SunWuKung wrote: > This is exactly what I was looking for, could you post that please. The instructions that follow are for building a function written in C on Unix-like systems; if you're on another platform like Windows then I don't know what steps you'll have to follow. The instructions also rely on the PGXS build infrastructure available in PostgreSQL 8.0 and later, although the Makefile should be trivial to modify to work with 7.4 and earlier. If you have any trouble building or installing the code then you might wish to read "C-Language Functions" and especially "Compiling and Linking Dynamically-Loaded Functions" and "Extension Building Infrastructure" in the documentation. http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#DFUNC http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#XFUNC-C-PGXS 1. Install the GNU Scientific Library (GSL), either from a package or by building it from source. http://www.gnu.org/software/gsl/ We'll wrap a single GSL function, gsl_cdf_ugaussian_P(), and call it cdf_ugaussian_p(). If you want to call it something else then substitute your own name where I've written cdf_ugaussian_p. This example should also be useful as a template if you want to create a PostgreSQL interface to other functions. 2. Create a directory for the code we're going to build; let's call it pg_gsl. mkdir pg_gsl cd pg_gsl 3. Put the following C code in a file named pg_gsl.c: #include "postgres.h" #include "fmgr.h" #include <gsl/gsl_cdf.h> Datum cdf_ugaussian_p(PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(cdf_ugaussian_p); Datum cdf_ugaussian_p(PG_FUNCTION_ARGS) { PG_RETURN_FLOAT8(gsl_cdf_ugaussian_P(PG_GETARG_FLOAT8(0))); } 4. Put the following SQL in a file named pg_gsl.sql.in (the build process will create a file named pg_gsl.sql, substituting MODULE_PATHNAME with an appropriate value): CREATE OR REPLACE FUNCTION cdf_ugaussian_p(double precision) RETURNS double precision AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; 5. Put the following in a file named Makefile: MODULE_big = pg_gsl OBJS = pg_gsl.o DATA_built = pg_gsl.sql PG_CPPFLAGS = `gsl-config --cflags` SHLIB_LINK = `gsl-config --libs` PGXS := $(shell pg_config --pgxs) include $(PGXS) 6. Build the code by running "make". This needs to be GNU Make, which on some systems is "gmake". If you have any trouble with this step then read the documentation mentioned above; if you still can't get it to work then please post the exact output from this command. gmake The result should be a shared object with a name like libpg_gsl.so (possibly with a different suffix, and there might be another file with a similar name like libpg_gsl.so.0). 7. Install the code so PostgreSQL can find it. You might need to become root or another user to do this if you don't have permission to write to the PostgreSQL directories. gmake install 8. Create the function in your database. Since this is a C function you'll need to do this as a database superuser. psql -U postgres -d dbname -f pg_gsl.sql 9. Test the function: psql dbname dbname=> select cdf_ugaussian_p(0); cdf_ugaussian_p ----------------- 0.5 (1 row) dbname=> select cdf_ugaussian_p(1); cdf_ugaussian_p ------------------- 0.841344746068543 (1 row) dbname=> select cdf_ugaussian_p(-1); cdf_ugaussian_p ------------------- 0.158655253931457 (1 row) That's it. If you have trouble then please post what step you were at, exactly what command you ran, the exact output you got, what platform you're on, and what version of PostgreSQL you're using. Now that I've written all this I suppose I could turn it into a PgFoundry project. Would there be any interest in a PostgreSQL interface to the GNU Scientific Library? Or has somebody already done that and I simply overlooked it? -- Michael Fuhr
On Fri, Jan 20, 2006 at 04:54:00PM -0700, Michael Fuhr wrote: > On Sat, Jan 21, 2006 at 12:00:49AM +0100, SunWuKung wrote: > > Does somebody have/know of a function for pg returning the standard > > normal cumulative distribution for a Z score? > > Are you looking for something like this? > > test=> SELECT z, cdf_ugaussian_p(z) FROM generate_series(-3, 3) AS g(z); You can also do this with the R language, which you can access from PostgreSQL via PL/R. http://www.r-project.org/ http://www.joeconway.com/plr/ CREATE FUNCTION pnorm(z double precision) RETURNS double precision AS $$ return(pnorm(z)) $$ LANGUAGE plr IMMUTABLE STRICT; SELECT z, pnorm(z) FROM generate_series(-3, 3) AS g(z); z | pnorm ----+--------------------- -3 | 0.00134989803163009 -2 | 0.0227501319481792 -1 | 0.158655253931457 0 | 0.5 1 | 0.841344746068543 2 | 0.977249868051821 3 | 0.99865010196837 -- Michael Fuhr
Thanks for posting this. I looked at - this function, but I work on a Windows so it seemed too difficult for me, - looked at the R language, but installing a new language just for this function seemed like an overkill, - I also looked at the pgnumeric project that has a function for this - but it installs cc.300 interdependent functions, which was again a bit too much for me. Finally I just uploaded a small table to look up values and wrote a very simple function for this - at least I know what is happening in the db this way. Thanks again for the help. SWK -----Original Message----- From: Michael Fuhr [mailto:mike@fuhr.org] Sent: 2006. január 21. 3:28 To: SunWuKung Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] standard normal cumulative distribution function On Sat, Jan 21, 2006 at 01:01:28AM +0100, SunWuKung wrote: > This is exactly what I was looking for, could you post that please. The instructions that follow are for building a function written in C on Unix-like systems; if you're on another platform like Windows then I don't know what steps you'll have to follow. The instructions also rely on the PGXS build infrastructure available in PostgreSQL 8.0 and later, although the Makefile should be trivial to modify to work with 7.4 and earlier. If you have any trouble building or installing the code then you might wish to read "C-Language Functions" and especially "Compiling and Linking Dynamically-Loaded Functions" and "Extension Building Infrastructure" in the documentation. http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#DFUNC http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html#XFUNC-C-PGXS 1. Install the GNU Scientific Library (GSL), either from a package or by building it from source. http://www.gnu.org/software/gsl/ We'll wrap a single GSL function, gsl_cdf_ugaussian_P(), and call it cdf_ugaussian_p(). If you want to call it something else then substitute your own name where I've written cdf_ugaussian_p. This example should also be useful as a template if you want to create a PostgreSQL interface to other functions. 2. Create a directory for the code we're going to build; let's call it pg_gsl. mkdir pg_gsl cd pg_gsl 3. Put the following C code in a file named pg_gsl.c: #include "postgres.h" #include "fmgr.h" #include <gsl/gsl_cdf.h> Datum cdf_ugaussian_p(PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(cdf_ugaussian_p); Datum cdf_ugaussian_p(PG_FUNCTION_ARGS) { PG_RETURN_FLOAT8(gsl_cdf_ugaussian_P(PG_GETARG_FLOAT8(0))); } 4. Put the following SQL in a file named pg_gsl.sql.in (the build process will create a file named pg_gsl.sql, substituting MODULE_PATHNAME with an appropriate value): CREATE OR REPLACE FUNCTION cdf_ugaussian_p(double precision) RETURNS double precision AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; 5. Put the following in a file named Makefile: MODULE_big = pg_gsl OBJS = pg_gsl.o DATA_built = pg_gsl.sql PG_CPPFLAGS = `gsl-config --cflags` SHLIB_LINK = `gsl-config --libs` PGXS := $(shell pg_config --pgxs) include $(PGXS) 6. Build the code by running "make". This needs to be GNU Make, which on some systems is "gmake". If you have any trouble with this step then read the documentation mentioned above; if you still can't get it to work then please post the exact output from this command. gmake The result should be a shared object with a name like libpg_gsl.so (possibly with a different suffix, and there might be another file with a similar name like libpg_gsl.so.0). 7. Install the code so PostgreSQL can find it. You might need to become root or another user to do this if you don't have permission to write to the PostgreSQL directories. gmake install 8. Create the function in your database. Since this is a C function you'll need to do this as a database superuser. psql -U postgres -d dbname -f pg_gsl.sql 9. Test the function: psql dbname dbname=> select cdf_ugaussian_p(0); cdf_ugaussian_p ----------------- 0.5 (1 row) dbname=> select cdf_ugaussian_p(1); cdf_ugaussian_p ------------------- 0.841344746068543 (1 row) dbname=> select cdf_ugaussian_p(-1); cdf_ugaussian_p ------------------- 0.158655253931457 (1 row) That's it. If you have trouble then please post what step you were at, exactly what command you ran, the exact output you got, what platform you're on, and what version of PostgreSQL you're using. Now that I've written all this I suppose I could turn it into a PgFoundry project. Would there be any interest in a PostgreSQL interface to the GNU Scientific Library? Or has somebody already done that and I simply overlooked it? -- Michael Fuhr