Thread: Wierdness using SUM to add results of custom C function.
Hello from a brand new Postgres user. I first want to thank and congratulate all those who have done an amazing job on Postgres. It has been a godsend for the cash strapped univerity project I am part of. That said I want show you something that has got me really stumped. Machine: SunOS brancusi 5.7 Generic_106541-08 sun4u sparc SUNW, UltraSPARC-IIi-Engine Postgres version: PostgreSQL 7.1.3 on sparc-sun-solaris2.7, compiled by GCC 2.95.3 Starting info: countcommas is a very simple small C function that counts the number of commas in a text string and returns int. Compiled as .so and added using Create Function .. Language 'C'. It works as expected. Headwords is a text field of comma seperated words with no constraints. The where clause is not important, it was a convenient way to restrict output to 3 rows. Begin ouput wfdb=> select countcommas(headwords) from workflow where family like 'abat%'; countcommas ------------- 18 3 1 (3 rows) Comment: These numbers are correct. I verified them by hand _many_ times. wfdb=> select sum(countcommas(headwords)) from workflow where family like 'abat%'; sum ----- 24 (1 row) Comment: Repeat run wfdb=> select countcommas(headwords) from workflow where family like 'abat%'; countcommas ------------- 18 3 1 (3 rows) wfdb=> select sum(countcommas(headwords)) from workflow where family like 'abat%'; sum ----- 23 (1 row) End output The C function returns the correct result all the time but the sum returns the wrong result all the time. It mostly returns 23, but it returned 24 once! All these queries where done without any other activity on the database (it is the test version) within 20 seconds max. I find it hard to believe there is a bug in SUM that is why I posted this in pgsql-general. I haven't a clue whether it is SQL, interface or installation related. I don't have root on the machine so I couldn't try it on 7.2. If I know there is a fix I can presuade "them" to install it. Can someone please help me make sense of this before I go source diving? Thank you in advance. Konrad Scherer
On Fri, 2002-02-22 at 21:07, Konrad Scherer wrote: > The C function returns the correct result all the time but the sum > returns the wrong result all the time. It mostly returns 23, but it > returned 24 once! Can you reproduce the results? You said it returned incorrect data once -- has it only happened a single time, or does it happen on some regular basis? Are you sure that the C function always returns the right results? IMHO the most likely culprit is a bug in your C function. You might want to run postgres under gdb (compile with debugging; start up a single 'postgres' process in interactive mode w/o postmaster), and step through the query to see where the bug comes from. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
Konrad Scherer <kscherer@magma.ca> writes: > The C function returns the correct result all the time but the sum > returns the wrong result all the time. It mostly returns 23, but it > returned 24 once! Very strange. May we see the full text of the C function, and the SQL declaration you gave for it? regards, tom lane
Tom Lane wrote: > > Konrad Scherer <kscherer@magma.ca> writes: > > The C function returns the correct result all the time but the sum > > returns the wrong result all the time. It mostly returns 23, but it > > returned 24 once! > > Very strange. May we see the full text of the C function, and the > SQL declaration you gave for it? I was hoping someone would just say "Yeah I had that bug. Upgrade to .." ;-) Here is everything you need to reproduce this. C function #include "postgres.h" #include "executor/spi.h" int countwords(text *txt); //this function counts the number of commas in a comma seperated list //and then deduces the number of words in the list int countwords(text *txt) { int i, count; char * str; //retrieve string from text field str = DatumGetCString(DirectFunctionCall1(textout, PointerGetDatum(txt))); //if no text, then no words if (txt->vl_len < 1) { return 0; } //if there is text, but no commas there must be one word count = 1; //ignore commas in first and last postion in list for (i=1;i<(txt->vl_len)-1;i++) { if (str[i]==',') { count++; } } return count; } An aside: should I be freeing str after I am done with it? It looks like it is only returning a pointer into the field. Compiled on Solaris using gcc -Wall -Wmissing-prototypes -Wmissing-declarations -fPIC -I/export/home/kscherer/postgresql-7.1.3/src/interfaces/libpq -I/export/home/kscherer/postgresql-7.1.3/src/include -c -o countWords.o countWords.c /usr/ccs/bin/ld -G -Bdynamic -o countWords.so countWords.o A quick script to run drop function countWords(text); CREATE FUNCTION countwords(text) RETURNS int AS '/export/home/kscherer/src/countWords.so' LANGUAGE 'C'; DROP table "sumtest"; CREATE table "sumtest" ( headwords text ); INSERT into sumtest values ( 'abattage/nm , abat/nm , abats/npl , abattis/npl , abattoir/nm , abatteuse/nf , abattant/nm , abat-son/nm , abat-foin/nm , abat-feuille/nm , abattable/adj , abat-voix/nm , abat-jour/nm , abatteur/nm , abat-vent/nm , abattu/adj , abattement/nm , s\'abattre/vp'); INSERT into sumtest values ( 'abate.vt, abate.vi, abatement.n'); INSERT into sumtest values ('abattoir/n'); select countwords(headwords) from sumtest; select sum(countwords(headwords)) from sumtest; Gives output (repeatedly) on my machine as follows countwords ------------ 18 3 1 (3 rows) sum ----- 23 (1 row) You can quickly verify that 18,3,1 are indeed the correct answers and that 23 is the worng one. It may have something to do with the parameter passing of the int in the C function, but then why does it work without using sum?? Thank You in advance Konrad Scherer
Konrad Scherer <kscherer@magma.ca> writes: > //if no text, then no > words > if (txt->vl_len < 1) > { > return > 0; > } > //if there is text, but no commas there must be one > word > count = > 1; > //ignore commas in first and last postion in > list > for (i=1;i<(txt->vl_len)-1;i++) > { > if (str[i]==',') > { > count++; > } You appear to be assuming that txt->vl_len is the length of the string. It's not (it includes the size of the header). You're scanning a few garbage bytes beyond the data, and evidently those sometimes look like commas. The apparent difference in behavior between aggregate and non-aggregate context likely has something to do with whether the argument you're given has been copied or not. > An aside: should I be freeing str after I am done with it? You can, but it's not really necessary; the palloc'd storage will be reclaimed soon after your function exits anyway. regards, tom lane