Thread: postgresql 8.2 rc1 - crash
hi,
i have been testing 8.2 rc1 and ran into this problem.
base data:
linux, 32bit, kernel: 2.6.18.3; debian
postgresql version:
PostgreSQL 8.2rc1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)
the problematic table is over 2 gigabytes in size and has several indices - one of them is a gin index.
problem:
when i issue vacuum full verbose analyze it runs for a while, but then crashes with signal 11.
it always happens in the same situation.
i was not able to find out what the reason is.
i did:
recompile with debug, set ulimit -c unlimited, and rerun the query.
it crashed.
i bundled:
1. logs
2. core file
3. config of postgresql
4. saved output of vacuum
all of this can be fetched from: http://depesz.com/various/crash.data.tar.bz2
unfortunately i'm not a c programmer, so i don't know gdb, but i hope you will be able to make sense of it.
the bz2 file is > 20mb in size.
any help? is it a hardware problem? or a missed bug in the code?
if i can provide more information - please tell me what you need.
depesz
--
http://www.depesz.com/ - nowy, lepszy depesz
While I'm downloading your file, please do the following:
gdb /usr/local/pgsql/bin/postgres your_core_file
Change the path to the postgres binary if needed. At the gdb prompt, type
(gdb) bt
and send the output.
--
Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
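For reference, the same backtrace can be captured without an interactive gdb session, which makes it easier to paste into a mail. This is only a minimal sketch, assuming gdb is on the PATH; the function names write_gdb_commands and core_backtrace are made up for illustration, and the paths in the usage comment are the placeholders from the message above.

```shell
# Sketch only: helper names are hypothetical, not part of gdb or PostgreSQL.

# Write the gdb commands to execute into the file named by $1.
write_gdb_commands() {
    printf '%s\n' 'bt' > "$1"
}

# Print a backtrace for the binary in $1 and the core file in $2.
core_backtrace() {
    cmds=$(mktemp) || return 1
    write_gdb_commands "$cmds"
    # -batch exits after the commands in the -x file have run.
    gdb -batch -x "$cmds" "$1" "$2"
    status=$?
    rm -f "$cmds"
    return "$status"
}

# Usage (placeholder paths):
#   core_backtrace /usr/local/pgsql/bin/postgres your_core_file > backtrace.txt
```

The command file is used instead of typing at the prompt so that the whole run can be redirected into a file.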
On 11/30/06, Teodor Sigaev <teodor@sigaev.ru> wrote:
sure, here it is:
$ gdb /home/pgdba/work/bin/postgres /home/pgdba/data/core
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
Core was generated by `postgres: trader_ru_tomcat tra'.
Program terminated with signal 11, Segmentation fault.
warning: current_sos: Can't read pathname for load map: Input/output error
Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.7
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.7
Reading symbols from /lib/tls/libcrypt.so.1...done.
Loaded symbols for /lib/tls/libcrypt.so.1
Reading symbols from /lib/tls/libdl.so.2...done.
Loaded symbols for /lib/tls/libdl.so.2
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/libnss_files.so.2...done.
Loaded symbols for /lib/tls/libnss_files.so.2
Reading symbols from /home/pgdba/work/lib/postgresql/tsearch2.so...done.
Loaded symbols for /home/pgdba/work/lib/postgresql/tsearch2.so
#0 0xb7ce4a85 in memmove () from /lib/tls/libc.so.6
(gdb) bt
#0 0xb7ce4a85 in memmove () from /lib/tls/libc.so.6
#1 0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020", offset=53719) at gindatapage.c:291
#2 0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194, leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') at ginvacuum.c:268
#3 0x080bf95c in ginScanToDelete (gvs=0xbfc2ab80, blkno=29194, isRoot=0 '\0', parent=0xb2df39a0, myoff=351) at ginvacuum.c:412
#4 0x080bf8f2 in ginScanToDelete (gvs=0xbfc2ab80, blkno=99489, isRoot=0 '\0', parent=0xb2b359a0, myoff=2) at ginvacuum.c:399
#5 0x080bf8f2 in ginScanToDelete (gvs=0xbfc2ab80, blkno=43, isRoot=1 '\001', parent=0xb28019a0, myoff=0) at ginvacuum.c:399
#6 0x080bfa83 in ginVacuumPostingTree (gvs=0xbfc2ab80, rootBlkno=43) at ginvacuum.c:446
#7 0x080bffd0 in ginbulkdelete (fcinfo=0xb2804768) at ginvacuum.c:638
#8 0x08259717 in FunctionCall4 (flinfo=0xfffffff6, arg1=2994751336, arg2=2994751336, arg3=2994751336, arg4=2994751336) at fmgr.c:1206
#9 0x08097005 in index_bulk_delete (info=0xbfc2af10, stats=0xb2804768, callback=0xb2804768, callback_state=0xb2804768) at indexam.c:573
#10 0x08143d42 in vacuum_index (vacpagelist=0xb2804768, indrel=0xa788c6a8, num_tuples=1675710, keep_tuples=0) at vacuum.c:3029
#11 0x08140c00 in full_vacuum_rel (onerel=0xa787aba0, vacstmt=0x83f74c0) at vacuum.c:1172
#12 0x08140a21 in vacuum_rel (relid=2994751336, vacstmt=0x83f74c0, expected_relkind=114 'r') at vacuum.c:1086
#13 0x08140127 in vacuum (vacstmt=0x83f74c0, relids=0x4601) at vacuum.c:397
#14 0x081da588 in PortalRunUtility (portal=0x841c160, query=0x83f7510, dest=0x83f73b8, completionTag=0xbfc2b2a0 "") at pquery.c:1063
#15 0x081da833 in PortalRunMulti (portal=0x841c160, dest=0x83f73b8, altdest=0x83f73b8, completionTag=0xbfc2b2a0 "") at pquery.c:1131
#16 0x081d9f7b in PortalRun (portal=0x841c160, count=2147483647, dest=0x83f73b8, altdest=0x83f73b8, completionTag=0xbfc2b2a0 "") at pquery.c:700
#17 0x081d526d in exec_simple_query (query_string=0x83f71a8 "VACUUM FULL verbose analyze adverts;") at postgres.c:939
#18 0x081d8725 in PostgresMain (argc=4, argv=0x836c368, username=0x836c328 "trader_ru_tomcat") at postgres.c:3419
#19 0x081b0216 in BackendRun (port=0x839b858) at postmaster.c:2909
#20 0x081af9ef in BackendStartup (port=0x839b858) at postmaster.c:2536
#21 0x081adaba in ServerLoop () at postmaster.c:1206
#22 0x081acf5a in PostmasterMain (argc=1, argv=0x836a508) at postmaster.c:958
#23 0x0816b3d4 in main (argc=1, argv=0x836a508) at main.c:188
(gdb)
--
http://www.depesz.com/ - nowy, lepszy depesz
> #1 0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020", offset=53719) at gindatapage.c:291
> #2 0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194, leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') at ginvacuum.c:268
Are you sure about your hardware? myoff in ginDeletePage() and offset in PageDeletePostingItem() are the same variable, yet the backtrace shows different values for them (351 vs. 53719).
Please send me the postgres binary itself; the core file alone isn't very useful for debugging.
--
Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes:
>> #1 0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020", offset=53719) at gindatapage.c:291
>> #2 0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194, leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') at ginvacuum.c:268
> Are you sure about your hardware? myoff in ginDeletePage() and offset in
> PageDeletePostingItem are the same variable...
That sort of thing isn't unusual when looking at dumps with an optimized executable. gdb has only a limited view of what the compiler is doing, and frequently will think that register N contains a variable when in fact that register gets re-used for several different purposes within the function.
regards, tom lane
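A build with optimization turned off usually gives gdb a more trustworthy view of argument values like the ones discussed here. The following is a minimal sketch, assuming a PostgreSQL source tree with its standard configure script; configure_debug_build is a hypothetical helper name, and the paths in the usage comment are examples only.

```shell
# Sketch only: the helper name is hypothetical.

# Configure the source tree in $1 for debugging, installing under $2.
configure_debug_build() {
    src="$1"
    prefix="$2"
    # --enable-debug compiles with debug symbols; CFLAGS=-O0 disables
    # optimization so registers are not re-used for several variables.
    ( cd "$src" && ./configure --prefix="$prefix" --enable-debug CFLAGS=-O0 )
}

# Usage (example paths), followed by the usual build and a core-enabled restart:
#   configure_debug_build "$HOME/src/postgresql" /home/pgdba/work
#   make && make install
#   ulimit -c unlimited    # in the shell that starts the postmaster
```

An unoptimized server is slower, so this is something for a test machine rather than production.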
I reproduced the problem with a small script:

print <<EOT;
drop table if exists qq;
create table qq ( i int, ii int[] );
COPY qq FROM stdin;
EOT
for ($i=0;$i<1000000;$i++) {
    print "$i\t{1}\n";
}
print <<EOT;
\\.
CREATE INDEX qqidx ON qq USING gin (ii);
DELETE FROM qq WHERE i>5000 and i<400000;
VACUUM FULL ANALYZE qq;
EOT

So, I'm digging now...
--
Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
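The same reproduction can be written as a shell function whose output is piped straight into psql. This is a sketch under assumptions: the function name gen_repro_sql and the optional row-count argument are additions for illustration (the script above always generates 1000000 rows), and the database name in the usage comment is hypothetical.

```shell
# Sketch only: emits the reproduction SQL on stdout.
# $1 is the number of rows to load; defaults to the 1000000 used above.
gen_repro_sql() {
    rows="${1:-1000000}"
    cat <<'EOT'
drop table if exists qq;
create table qq ( i int, ii int[] );
COPY qq FROM stdin;
EOT
    # One COPY row per line: the integer, a tab, and the one-element array {1}.
    awk -v n="$rows" 'BEGIN { for (i = 0; i < n; i++) printf "%d\t{1}\n", i }'
    cat <<'EOT'
\.
CREATE INDEX qqidx ON qq USING gin (ii);
DELETE FROM qq WHERE i>5000 and i<400000;
VACUUM FULL ANALYZE qq;
EOT
}

# Usage (hypothetical database name):
#   gen_repro_sql | psql testdb
```

With a much smaller row count the DELETE range may not remove enough rows to exercise the same vacuum path, so the default is the safer choice when trying to reproduce the crash.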
Fixed, thank you. The changes are committed to CVS; please try it. (I think the index is corrupted, so you will need to recreate it.)
--
Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 11/30/06, Teodor Sigaev <teodor@sigaev.ru> wrote:
great. thanks. i will retry. a full retry will take some time - i estimate that i will be able to reply tomorrow evening (my evening) - let's say in 24 hours.
hubert
--
http://www.depesz.com/ - nowy, lepszy depesz
On 11/30/06, hubert depesz lubaczewski <depesz@gmail.com> wrote:
confirmed. everything works fine.
thanks for the very quick patch.
depesz
--
http://www.depesz.com/ - nowy, lepszy depesz