Thread: SIGSEGV in 'select * from pg_user'
Hi,

I've found the following SIGSEGV while playing around with a snapshot of September 3rd. I did a make all (with -g); make install; rm -rf data; initdb

Here's what I've done in gdb:

[postgres@jeroenv bin]$ gdb postgres
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i586-unknown-linux), Copyright 1996 Free Software Foundation, Inc...
(gdb) run -D /usr/local/pgsql/data template1
Starting program: /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data template1

POSTGRES backend interactive interface
$Revision: 1.89 $ $Date: 1998/09/01 04:32:13 $

> select * from pg_shadow
blank
 1: usename (typeid = 19, len = 32, typmod = -1, byval = f)
 2: usesysid (typeid = 23, len = 4, typmod = -1, byval = t)
 3: usecreatedb (typeid = 16, len = 1, typmod = -1, byval = t)
 4: usetrace (typeid = 16, len = 1, typmod = -1, byval = t)
 5: usesuper (typeid = 16, len = 1, typmod = -1, byval = t)
 6: usecatupd (typeid = 16, len = 1, typmod = -1, byval = t)
 7: passwd (typeid = 25, len = -1, typmod = -1, byval = f)
 8: valuntil (typeid = 702, len = 4, typmod = -1, byval = t)
 ----
 1: usename = "postgres" (typeid = 19, len = 32, typmod = -1, byval = f)
 2: usesysid = "203" (typeid = 23, len = 4, typmod = -1, byval = t)
 3: usecreatedb = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 4: usetrace = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 5: usesuper = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 6: usecatupd = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 8: valuntil = "Sat Jan 31 07:00:00 2037 MET" (typeid = 702, len = 4, typmod = -1, byval = t)
 ----

[So far, no problems]

> select * from pg_user
blank
 1: usename (typeid = 19, len = 32, typmod = -1, byval = f)
 2: usesysid (typeid = 23, len = 4, typmod = -1, byval = t)
 3: usecreatedb (typeid = 16, len = 1, typmod = -1, byval = t)
 4: usetrace (typeid = 16, len = 1, typmod = -1, byval = t)
 5: usesuper (typeid = 16, len = 1, typmod = -1, byval = t)
 6: usecatupd (typeid = 16, len = 1, typmod = -1, byval = t)
 7: passwd (typeid = 25, len = -1, typmod = -1, byval = f)
 8: valuntil (typeid = 702, len = 4, typmod = -1, byval = t)
 ----
 1: usename = "postgres" (typeid = 19, len = 32, typmod = -1, byval = f)
 2: usesysid = "203" (typeid = 23, len = 4, typmod = -1, byval = t)
 3: usecreatedb = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 4: usetrace = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 5: usesuper = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 6: usecatupd = "t" (typeid = 16, len = 1, typmod = -1, byval = t)
 7: passwd = "********" (typeid = 25, len = -1, typmod = -1, byval = f)
 8: valuntil = "Sat Jan 31 07:00:00 2037 MET" (typeid = 702, len = 4, typmod = -1, byval = t)
 ----

Program received signal SIGSEGV, Segmentation fault.
0x400e90eb in __libc_free (mem=0x400f9740)
(gdb) bt
#0 0x400e90eb in __libc_free (mem=0x400f9740)
#1 0x81cf188 in ?? ()

As the backtrace gives no clues, I have no idea where this goes wrong. Note that selecting from pg_shadow works fine.

select version() returns:
PostgreSQL 6.4.0 on i586-pc-linux-gnu, compiled by gcc 2.8.1

Does anybody know what's going wrong (and where)?

Thanks,

Jeroen van Vianen

> I've found the following SIGSEGV while playing around with a snapshot
> of September 3rd.
(did a fresh install with initdb)
> > select * from pg_shadow
> > select * from pg_user
> Program received signal SIGSEGV, Segmentation fault.

I see the same thing with a fresh source tree on my linux box. Is this normal?

Also, I've been working on a (small) test case, and have at least some indication that the problem is not solely indices. I'll send a better documented example in a bit, but at least the following one will result in errors on a fresh install:

CREATE TABLE onek (
    unique1 int4,
    unique2 int4,
    two int4,
    four int4,
    ten int4,
    twenty int4,
    hundred int4,
    thousand int4,
    twothousand int4,
    fivethous int4,
    tenthous int4,
    odd int4,
    even int4,
    stringu1 name,
    stringu2 name,
    string4 name
);

COPY onek FROM '/opt/postgres/current/src/test/regress/input/../data/onek.data';

create table k1 as select unique1, unique2 from onek;
copy k1 to '/opt/postgres/current/src/test/regress/k1.data';
delete from k1;
copy k1 from '/opt/postgres/current/src/test/regress/k1.data';
CREATE INDEX k1_unique1 ON k1 USING btree(unique1 int4_ops);
CREATE INDEX k1_unique2 ON k1 USING btree(unique2 int4_ops);
ERROR: DefineIndex: k1 relation not found

If I leave out the "delete from" I don't get the errors. If I do these steps, then do a new initdb and create k1 from the saved data file, I still see the error.

- Tom

I have just cvsuped the source tree and have tried some tests.

>> I've found the following SIGSEGV while playing around with a snapshot
>> of September 3rd.
>(did a fresh install with initdb)
>> > select * from pg_shadow
>> > select * from pg_user
>> Program received signal SIGSEGV, Segmentation fault.
>
>I see the same thing with a fresh source tree on my linux box. Is this
>normal?

I saw this too on my LinuxPPC box. In my case, just doing:

select * from pg_user

crashes the backend. The backtrace shows it crashed in chunk_free() while committing the transaction. I guess something messed up the tables managed by malloc().

Talking about the regression, two tests (constraints, select_views) produced core dumps. Seems no difference even after applying Bruce's latest patches.
--
Tatsuo Ishii
t-ishii@sra.co.jp

> I saw this too on my LinuxPPC box. In my case, just doing:
> select * from pg_user
>
> crashes the backend. The backtrace shows it crashed in chunk_free()
> while committing the transaction. I guess something messed up the
> tables managed by malloc().
>
> Talking about the regression, two tests (constraints, select_views)
> produced core dumps. Seems no difference even after applying Bruce's
> latest patches.

I see the same behavior, with a simple "select * from pg_user" enough to crash the backend, and with the same two regression tests resulting in core dumps.

As I've mentioned earlier, I believe that the select_views test has been failing for quite a while, whereas the other problems are more recent. Presumably the pg_user problem is similar to the select_views problem??

Would it help to choose a (simple) test case which shows a problem (either a core dump or the "relation not found" problem) and start working it through together? We could then exchange notes on what we are finding. I'm not absolutely certain that the problems are directly related to changes for indexing; other changes (the oid removal, the "name" type changes, others??) happened in the same time frame...

- Tom

> I see the same behavior, with a simple "select * from pg_user" enough
> to crash the backend

The segfault is coming from a call to free() after the command has executed and while the "CommitTransactionCommand" phase is running. Putting the query inside a begin/end block does not help.

Presumably there is a bad pointer or something getting free'd twice. Any ideas?

- Tom
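
To make the "bad pointer or something getting free'd twice" hypothesis concrete, here is a minimal sketch in C of one way such a bug could be caught at the point of the faulty call instead of later inside free(). It is not PostgreSQL code: tracked_alloc(), tracked_free(), and MAX_TRACKED are hypothetical names invented for this illustration, and the fixed-size table is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical debugging aid, not part of PostgreSQL: remember every
 * pointer handed out by tracked_alloc() so that tracked_free() can
 * refuse a pointer it does not own or has already released. */
#define MAX_TRACKED 1024

static void *live[MAX_TRACKED];   /* pointers currently allocated */
static size_t nlive = 0;          /* number of valid entries in live[] */

/* Allocate memory and record the pointer; returns NULL on failure. */
void *tracked_alloc(size_t size)
{
    void *p;

    if (nlive == MAX_TRACKED)
        return NULL;              /* table full: fail rather than lose track */
    p = malloc(size);
    if (p != NULL)
        live[nlive++] = p;
    return p;
}

/* Free only pointers known to be live.
 * Returns 0 on success, -1 for a double free or a foreign pointer. */
int tracked_free(void *p)
{
    size_t i;

    assert(nlive <= MAX_TRACKED);
    for (i = 0; i < nlive; i++) {
        if (live[i] == p) {
            live[i] = live[--nlive];  /* remove entry by swapping in the last one */
            free(p);
            return 0;
        }
    }
    return -1;                    /* unknown pointer: never hand it to free() */
}
```

Calling tracked_free() twice on the same pointer returns 0 the first time and -1 the second, so the offending call site can be logged or trapped in a debugger right away. The linear search is a deliberate simplification for a sketch; it would be too slow for a real allocator.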