Postgres Analysis Tool-Pak - Mailing list pgsql-hackers
From | Mike Mascari |
---|---|
Subject | Postgres Analysis Tool-Pak |
Date | |
Msg-id | 39260C4F.3BA68B77@mascari.com Whole thread Raw |
Responses |
Re: Postgres Analysis Tool-Pak
|
List | pgsql-hackers |
Hello guys, As a personal project, I have written an Excel-compatible spreadsheet as a component of a 3D charting and graphing application. It uses a Phong-style renderer with texture mapping, bump mapping, Z-buffer shadow mapping, non-refractive transparency and antialiasing to produce some very nice looking images. Anyway, a component of the data analysis tool is, as mentioned, a spreadsheet. I've been able to implement around 300 of the 320 Excel-style functions and was wondering if they would be useful to PostgreSQL. Here's the list: ABS ACCRINT ACCRINTM ACOS ACOSH ADDRESS AMORDEGRC AMORLINC AND AREAS ASIN ASINH ATAN ATAN2 ATANH AVEDEV AVERAGE BESSELI BESSELJ BESSELK BESSELY BETADIST BETAINV BIN2DEC BIN2HEX BIN2OCT BINOMDIST CALL CEILING CELL CHAR CHIDIST CHIINV CHITEST CHOOSE CLEAN CODE COLUMN COLUMNS COMBIN COMPLEX CONCATENATE CONFIDENCE CONVERT CORREL COS COSH COUNT COUNTA COUNTBLANK COUNTIF COUPDAYS COUPDAYSBS COUPDAYSNC COUPNCD COUPNUM COUPPCD COVAR CRITBINOM CUMIPMT CUMPRINC DATE DATEVALUE DAVERAGE DAY DAYS360 DB DCOUNT DCOUNTA DDB DEC2BIN DEC2HEX DEC2OCT DEGREES DELTA DEVSQ DGET DISC DMAX DMIN DOLLAR DOLLARDE DOLLARFR DPRODUCT DSTDEV DSTDEVP DSUM DURATION DVAR DVARP EDATE EFFECT EOMONTH ERF ERFC ERRORTYPE EVEN EXACT EXP EXPONDIST FACT FACTDOUBLE FALSE FDIST FIND FINV FISHER FISHERINV FIXED FLOOR FORECAST FREQUENCY FTEST FV FVSCHEDULE GAMMADIST GAMMAINV GAMMALN GCD GEOMEAN GESTEP GROWTH HARMEAN HEX2BIN HEX2DEC HEX2OCT HLOOKUP HOUR HYPGEOMDIST IF IMABS IMAGINARY IMARGUMENT IMCONJUGATE IMCOS IMDIV IMEXP IMLN IMLOG10 IMLOG2 IMPOWER IMPRODUCT IMREAL IMSIN IMSQRT IMSUB IMSUM INDEX INDIRECT INFO INT INTERCEPT INTRATE IPMT IRR ISBLANK ISERR ISERROR ISEVEN ISLOGICAL ISNA ISNONTEXT ISNUMBER ISODD ISREF ISTEXT KURT LARGE LCM LEFT LEN LINEST LN LOG LOG10 LOGEST LOGINV LOGNORMDIST LOOKUP LSOLVE LOWER MATCH MAX MDETERM MDURATION MEDIAN MID MIN MINUTE MINVERSE MIRR MMULT MOD MODE MONTH MROUND MULTINOMIAL N NA NEGBINOMDIST NETWORKDAYS NOMINAL NORMDIST NORMINV NORMSDIST NORMSINV NOT NOW NPER NPV OCT2BIN OCT2DEC OCT2HEX ODD ODDFPPRICE ODDFYIELD ODDLPRICE ODDLYIELD OFFSET OR PEARSON PERCENTILE PERCENTRANK PERMUT PI PMT POISSON POWER PPMT PRICE PRICEDISC PRICEMAT PROB PRODUCT PROPER PV QUARTILE QUOTIENT RADIANS RAND RANDBETWEEN RANK RATE RECEIVED REGISTERID REPLACE REPT RIGHT ROMAN ROUND ROUNDDOWN ROUNDUP ROW ROWS RSQ SEARCH SECOND SERIESSUM SIGN SIN SINH SKEW SLN SLOPE SMALL SQLREQUEST SQRT SQRTPI STANDARDIZE STDEV STDEVP STEYX SUBSTITUTE SUBTOTAL SUM SUMIF SUMPRODUCT SUMSQ SUMX2MY2 SUMX2PY2 SUMXMY2 SYD T TAN TANH TBILLEQ TBILLPRICE TBILLYIELD TDIST TEXT TIME TIMEVALUE TINV TODAY TRANSPOSE TREND TRIM TRIMMEAN TRUE TRUNC TTEST TYPE UPPER VALUE VAR VARP VDB VLOOKUP WEEKDAY WEIBULL WORKDAY XIRR XNPV YEAR YEARFRAC YIELD YIELDDISC YIELDMAT Naturally, some of these functions already map to existing PostgreSQL functions and some are irrelevant to PostgreSQL (such as ADDRESS, CALL, HLOOKUP for example). Others map with different SQL names. All of them have been written using double precision numbers (like Excel) and therefore would take some time to use numeric -- if that's what is desired. The cumulative distribution functions and their inverses I derived solely from recurrence relations, polynomial approximations, and evalutation of various infinite series to n-terms from the "Handbook of Mathematical Functions" and so, like Excel, only realize 32-bit float precision. I suspect there may be some BSD licensed libraries out there that do a better job (hopefully with arbitrary precision). Some of the functions, it would seem to me, would be very useful, but I'm not sure of how the semantics would map. For example, TREND() performs multi-variate linear regression analysis (i.e.: good for picking stocks), but, not only would it require the new function manager interface to return a result set, but it takes several 'sets' as input as well: TREND(known y's, known x1's, known x2's, ..., new x's) and so it would act almost like an aggregate except that it would yield a result set, not just a single-valued result. I visualize something like: SELECT TREND(quotes.prices, quotes.date1, quotes.date2) FROM quotes; Anyways, I guess I'm like Chris Bitmead. What do you want me to do with this crap? I could port the non-set returning functions and single-pass aggregates now (such as DDB(), RATE(), PMT(), etc.) and wait on the others until the rewrite. Or I could just wait until the rewrite if the function call interface is going to change dramatically for this such as NULL handling etc. Any comments? Mike Mascari
pgsql-hackers by date: