Thread: C set return function differences on 8.0?
I have a C .so set-returning function which is called to create views, pulling data from an older non-SQL flat-file data structure:

    CREATE OR REPLACE VIEW view_name AS
    SELECT call_d.col_0, call_d.col_1, call_d.col_2, call_d.col_3
    FROM call_d('select * from aname '::text, 'aname'::text)
         AS (col_0 bigint, col_1 numeric, col_2 text, col_3 text);

The call_d function calls the C .so, passing in the SQL and the target, and returns a record set conforming to the request. The column names and data types are supplied for each target/view.

This worked when compiled under PostgreSQL 7.4 for all the target files and the various data types that I have.

I re-compiled the .so using the 8.0 server development source and rebuilt the views on 8.0, and now only some views will pull up data. These are views using the same data that worked under 7.4 and now just abort. Some of the views do select.

Could the problem be related to a difference in the way 8.0 handles data types? When I define a column as numeric without precision or scale, it should handle any numeric value. However, in a data set that has a numeric field with mostly XX.XX data, one record had a XX.XXX value, and once I removed that record from the set the view selected OK. This doesn't make sense to me.
On Thu, Jul 07, 2005 at 05:58:28PM -0700, Tim Jackson wrote:
>
> I re-compiled the .so using 8.0 server development source and rebuilt
> the views on 8.0 and now only some views will pull up data. These are
> views using the same data that worked under 7.4 and now just abort.

What do you mean by "abort" -- does the backend crash? If so, and if you get a core dump, then it might be useful to see a stack trace.

Could you post a simple, self-contained test case that works in 7.4 and fails in 8.0? That is, all SQL, C code, and data that somebody could use to reproduce the problem from an empty database. Please do distill the example so it's as simple as possible -- that way the analysis won't be distracted by irrelevant factors.

Exactly which versions of 7.4 and 8.0 are you running, and on what platform?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
[Please copy the mailing list on replies so others can contribute to and learn from the discussion.]

On Fri, Jul 08, 2005 at 05:12:58PM -0700, Tim Jackson wrote:
> Michael Fuhr wrote:
> > What do you mean by "abort" -- does the backend crash? If so, and
> > if you get a core dump, then it might be useful to see a stack trace.
>
> The backend crashes; in pgadmin, attempting to display the view simply
> says aborting. There is no core dump, and the log only seems to reflect
> the standard
> WARNING: terminating connection because of crash of another server process

Where did you look for a core dump? If one was made then it'll probably be somewhere under $PGDATA (e.g., $PGDATA/base/XXX/core) unless your system is configured to put core dumps elsewhere.

If there isn't a core dump then you could add some debugging ereport() calls to your code so you can find out where the crash is happening. Another possibility might be to attach a debugger to the backend.

> > Could you post a simple, self-contained test case that works in 7.4
> > and fails in 8.0? That is, all SQL, C code, and data that somebody
>
> The problem I have is we are using a licensed library to access the
> target database. We have c code which parses SQLSRF and makes a call to
> these 3rd party lib functions which returns strings of data.

So your C function calls these library functions, which query some other data source and return strings back to you, right? How are these strings returned -- as char * values? Can you at least post your code?

> Today we tried to just hard code some data strings and pass that into
> SRF part of the code. We put in values that we expected to cause the
> problem and it did not cause the problem. So we are stuck with how to
> proceed.

Let's see if attaching a debugger to the backend or adding some ereport() calls can at least tell us where the crash is happening. Then maybe we can figure out why.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
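The search for a core file suggested above can be sketched in shell. This is only an illustration: the helper name find_cores is made up, and it assumes core files are named "core" or "core.<pid>", which depends on how the system is configured.

```shell
# Hypothetical helper: list candidate core files under a data directory.
# Assumes the system names them "core" or "core.<pid>"; a system with a
# different core file naming pattern needs a different -name test.
find_cores() {
    dir="${1:-$PGDATA}"
    # Table files under base/ have numeric names, so 'core*' should only
    # match actual core files.
    find "$dir" -type f -name 'core*' 2>/dev/null
}
```

Calling find_cores "$PGDATA" as the postgres user prints one path per core file found, and prints nothing if no core file was written there.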
Michael Fuhr <mike@fuhr.org> writes:
> On Fri, Jul 08, 2005 at 05:12:58PM -0700, Tim Jackson wrote:
>> The backend crashes, in pgadmin attempting to display the view simple
>> says aborting. There is no core dump, and the log only seems to reflect
>> the standard
>> WARNING: terminating connection because of crash of another server process

> Where did you look for a core dump?

"There is no core dump" is not an acceptable answer: if there isn't one, the first item on your agenda must be to get one (or else run the problem case under gdb so you don't need a dump to get a stack trace).

Usually if there's no dump it's because the default configuration on your system is "ulimit -c 0" to suppress core dumps from daemons. Put "ulimit -c unlimited" (or local equivalent) into your postmaster start script and restart the postmaster.

On the whole though I'd recommend learning how to attach gdb to a live backend, since then not only can you get a stack trace from the point of the fault, but you can then work backwards by setting breakpoints ahead of the crash and stepping through the code to see where it goes wrong.

			regards, tom lane
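As a rough sketch of the two steps described above, assuming pg_ctl and gdb are on the PATH and PGDATA points at the data directory. The function names are hypothetical, and a real start script may launch the postmaster differently.

```shell
# Sketch only. Function names are hypothetical; pg_ctl and gdb must be on PATH.

# Lift the core-size limit, then restart the postmaster from this shell so
# that it and the backends it forks inherit the new limit.
# Run as the postgres user.
restart_with_core_dumps() {
    ulimit -c unlimited || return 1
    pg_ctl -D "${PGDATA:?PGDATA must be set}" restart
}

# Attach gdb to a running backend. The argument is the pid of the server
# process handling the session, not the pid of the psql client; running
# "SELECT pg_backend_pid();" in that session reports it on releases that
# provide the function.
attach_backend() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: attach_backend BACKEND_PID" >&2
            return 2
            ;;
    esac
    gdb -p "$1"
}
```

Once gdb has attached, typing "continue" lets the backend run again; after the failing SELECT is issued in that session and gdb reports the segfault, "bt" prints the stack trace.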
Michael Fuhr wrote:
>
> Let's see if attaching a debugger to the backend or adding some
> ereport() calls can at least tell us where the crash is happening.
> Then maybe we can figure out why.
>

I was able to attach gdb to the pid of the psql session and saw that selects on the "bad" views segfaulted on pfree().

The documentation for SRF had a comment that said pfree() was not really necessary, so I commented that code out and now I can select all the views. :)

Perhaps you can advise me on why pfree() would segfault on 8.0 but not 7.4.

Anyway, the .so generated views are working now as they were on 7.4.

Thanks for your help.

Tim

Not sure what all this means, but the following lines were gleaned from the core dump:

    cannot allocate memory for thread-local data: ABORT
    result == _rtld_local._dl_tls_max_dtv_idx + 1
    result <= _rtld_local._dl_tls_max_dtv_idx + 1
    @cnt < _rtld_local._dl_tls_dtv_slotinfo_list->len
    _rtld_local._dl_tls_dtv_slotinfo_list->next == ((void *)0)
    _rtld_local._dl_tls_dtv_slotinfo_list != ((void *)0)
    _rtld_local._dl_tls_max_dtv_idx == 0
    _rtld_local._dl_tls_dtv_slotinfo_list == ((void *)0)
    (size_t) map->l_tls_offset >= map->l_tls_blocksize
    map->l_tls_blocksize >= map->l_tls_initimage_size
    ../sysdeps/unix/sysv/linux/dl-origin.c
    FATAL: cannot determine library version
    ../sysdeps/generic/dl-sysdep.c
    Inconsistency detected by ld.so: %s: %u: %s%sAssertion `%s' failed!
    Inconsistency detected by ld.so: %s: %u: %s%sUnexpected error: %s.

And here is what gdb says about the core:

    Core was generated by `postgres: postgres lubesoft_tj [local] SELECT '.
    Program terminated with signal 11, Segmentation fault.
    #0  0x08222a86 in ?? ()
    (no debugging symbols found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".