Thread: BUG #4838: Database corruption after btree_gin index creation
The following bug has been logged online:

Bug reference: 4838
Logged by: Daniele Bortoluzzi
Email address: bortoluz@gmail.com
PostgreSQL version: 8.4beta2
Operating system: Linux amd64 2.6.24 (Debian 4.0)
Description: Database corruption after btree_gin index creation
Details:

I am testing this db. I created a multicolumn GIN index with btree_gin functionality (fulltext column + timestamp). After creating the index the db segfaulted:

    LOG: server process (PID 14195) was terminated by signal 11: Segmentation fault
    LOG: terminating any other active server processes
    WARNING: terminating connection because of crash of another server process
    DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
    HINT: In a moment you should be able to reconnect to the database and repeat your command.

The WARNING-DETAIL-HINT messages repeated 4 times, then postgres restarted:

    LOG: all server processes terminated; reinitializing
    LOG: database system was interrupted; last known up at 2009-06-04 12:47:19 CEST
    LOG: database system was not properly shut down; automatic recovery in progress
    LOG: redo starts at 2/778687D0
    LOG: record with zero length at 2/779392A8
    LOG: redo done at 2/77938E20
    LOG: last completed transaction was at log time 2009-06-04 12:47:35.55392+02
    LOG: autovacuum launcher started
    LOG: database system is ready to accept connections

but it segfaulted 2 more times. Then I launched a VACUUM FULL ANALYZE. There were no segmentation faults and it completed successfully, but now fulltext queries throw one of these errors:

    ERROR: tuple offset out of range: 48090
    ERROR: tuple offset out of range: 0

I was using postgres 8.4devel (SVN revision 28901) happily...
"Daniele Bortoluzzi" <bortoluz@gmail.com> writes:
> Description: Database corruption after btree_gin index creation

Can you provide a self-contained test case to reproduce this problem? We had a similar report yesterday but no one can reproduce it.

regards, tom lane
"Daniele Bortoluzzi" <bortoluz@gmail.com> writes:
> I created a multicolumn GIN index with btree_gin functionality (fulltext
> column + timestamp). After creating the index the db segfaulted:
> LOG: server process (PID 14195) was terminated by signal 11: Segmentation
> fault

I cannot replicate this problem based on the little information provided. The GIN bug we found a couple of days ago would explain the "tuple offset out of range" errors, and if you had had Asserts enabled it would explain Assert failures; but I don't see that it explains a segfault. Can you still reproduce this with CVS HEAD, and if so would you submit a test case? Or at least a stack trace from the crash?

regards, tom lane
2009/6/10 Tom Lane <tgl@sss.pgh.pa.us>:
[...]
> I cannot replicate this problem based on the little information
> provided. The GIN bug we found a couple of days ago would explain
> the "tuple offset out of range" errors, and if you had had Asserts
> enabled it would explain Assert failures; but I don't see that it
> explains a segfault. Can you still reproduce this with CVS HEAD,
> and if so would you submit a test case? Or at least a stack trace
> from the crash?

I tried to replicate the error with a small set of data (our db weighs ~700MB) but I could not manage it. I'm now checking out from the CVS server and will post a new message today, or tomorrow at the latest.

If I cannot reproduce the error, what is the best way to catch the stack trace? Do I have to recompile with --enable-debug? I read this article:

http://wiki.postgresql.org/wiki/Developer_FAQ#What_debugging_features_are_available.3F

but I have never debugged postgresql with gdb. Can you give me some hints?

I am sorry for the megadelay. Thank you for your support.
Daniele Bortoluzzi <bortoluz@gmail.com> writes:
> If I cannot reproduce the error, what is the best way to catch the
> stack trace? Do I have to recompile with --enable-debug?

Yes, that would be the best thing. If you are using gcc there is no harm in using --enable-debug all the time; it just makes the executable files a bit bigger, there's no performance change.

Make sure the postmaster is started with "ulimit -c unlimited", else the crash might not drop a core file. The core file will normally appear in $PGDATA, but sometimes in a system-dependent special place such as /cores/. Once you've got a core file, do

    $ gdb /path/to/postgres-executable /path/to/core-file
    gdb> bt
    ... stack trace ...
    gdb> quit

and send the whole output of gdb.

regards, tom lane
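For readers who want the steps in the message above in one place, here is a minimal shell sketch. It is not part of the original thread. The default values of PGBIN and PGDATA, the log file name, and the function names start_with_cores, find_cores and print_backtrace are illustrative assumptions. It also assumes the server was already built from source with ./configure --enable-debug, and that gdb is installed.

```shell
#!/bin/sh
# Sketch only. PGBIN and PGDATA defaults are hypothetical paths; adjust them.
# Assumes a build configured with: ./configure --enable-debug
PGBIN="${PGBIN:-/usr/local/pgsql/bin}"
PGDATA="${PGDATA:-/usr/local/pgsql/data}"

# Start the postmaster with the core-file size limit lifted.
# The body runs in a subshell, so the caller's own limit is left unchanged.
start_with_cores() (
    ulimit -c unlimited || exit 1
    "$PGBIN/pg_ctl" -D "$PGDATA" -l "$PGDATA/server.log" start
)

# List files whose names start with "core" in the given directory.
# Core files may instead land in a system-dependent place such as /cores/.
find_cores() {
    ls -1 "$1"/core* 2>/dev/null
}

# Print a stack trace from a core file without an interactive gdb session.
print_backtrace() {
    gdb -batch -ex bt "$PGBIN/postgres" "$1"
}
```

After a crash, running find_cores "$PGDATA" and then print_backtrace on the file it reports should give the same "bt" output as the interactive session, ready to be pasted into a reply.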
[...]
> Can you still reproduce this with CVS HEAD,

With CVS HEAD the error does not occur. Did you fix a GIN bug in this version?

Thank you for your support.
Daniele Bortoluzzi <bortoluz@gmail.com> writes:
>> Can you still reproduce this with CVS HEAD,
> with CVS HEAD the error is not occurring. Did you fix some GIN bug in
> this version?

Yes, I told you so.

http://archives.postgresql.org/pgsql-committers/2009-06/msg00081.php

But I don't see how that bug would've led to a segfault. Bogus TIDs in the index should be caught without that.

regards, tom lane