Thread: postgresql 8.2 rc1 - crash

postgresql 8.2 rc1 - crash

From
"hubert depesz lubaczewski"
Date:
hi,
i have been testing 8.2 rc1, and i ran into this problem.

base data:
linux, 32bit, kernel: 2.6.18.3; debian
postgresql version:
PostgreSQL 8.2rc1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)
the problematic table is over 2 gigabytes in size and has several indices; one of them is a gin index.

problem:
when i issue vacuum full verbose analyze it works, but then crashes with signal 11.
always in the same situation.

i was not able to find out the reason.
i did:
recompile with debug, set ulimit -c unlimited, and rerun the query.
it crashed.
i bundled:
1. logs
2. core file
3. config of postgresql
4. saved output of vacuum
all of this can be fetched from: http://depesz.com/various/crash.data.tar.bz2
unfortunately i'm not a c programmer, so i don't know gdb, but i hope you will be able to make sense of it.

the bz2 file is > 20mb in size.

any help? is it a hardware problem? or a missed bug in the code?

if i can provide you with more information - please tell me what i should tell you.

depesz

--
http://www.depesz.com/ - nowy, lepszy depesz

Re: postgresql 8.2 rc1 - crash

From
Teodor Sigaev
Date:
While I'm downloading your file, please do the following:
gdb /usr/local/pgsql/bin/postgres your_core_file

If needed, change the path to the postgres binary.

In gdb, type
bt
and send the output
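
For anyone who would rather not type into gdb interactively, the same backtrace can probably be captured in one shot. The sketch below is in Python and mostly just builds the command line. It assumes a gdb recent enough to accept the -batch and -ex options, and the function names gdb_backtrace_command and run_backtrace are made up for illustration.

```python
import subprocess


def gdb_backtrace_command(postgres_binary, core_file):
    """Return an argv list asking gdb to print a backtrace and exit.

    Assumes a gdb that supports -batch (exit after processing) and
    -ex (run one gdb command); older builds may lack -ex.
    """
    return ["gdb", "-batch", "-ex", "bt", postgres_binary, core_file]


def run_backtrace(postgres_binary, core_file):
    """Run gdb and return whatever it printed (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(postgres_binary, core_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling run_backtrace("/usr/local/pgsql/bin/postgres", "your_core_file") should return the text to paste into a reply, provided the paths are adjusted to the local installation.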

hubert depesz lubaczewski wrote:
> hi,
> i have been testing 8.2 rc1, and i ran into this problem.
>
> base data:
> linux, 32bit, kernel: 2.6.18.3; debian
> postgresql version:
> PostgreSQL 8.2rc1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.5
> (Debian 1:3.3.5-13)
> the problematic table is over 2 gigabytes in size and has several indices;
> one of them is a gin index.
>
> problem:
> when i issue vacuum full verbose analyze it works, but then crashes with
> signal 11.
> always in the same situation.
>
> i was not able to find out the reason.
> i did:
> recompile with debug, set ulimit -c unlimited, and rerun the query.
> it crashed.
> i bundled:
> 1. logs
> 2. core file
> 3. config of postgresql
> 4. saved output of vacuum
> all of this can be fetched from:
> http://depesz.com/various/crash.data.tar.bz2
> unfortunately i'm not a c programmer, so i don't know gdb, but i hope you
> will be able to make sense of it.
>
> the bz2 file is > 20mb in size.
>
> any help? is it a hardware problem? or a missed bug in the code?
>
> if i can provide you with more information - please tell me what i
> should tell you.
>
> depesz
>
> --
> http://www.depesz.com/ - nowy, lepszy depesz

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: postgresql 8.2 rc1 - crash

From
"hubert depesz lubaczewski"
Date:
On 11/30/06, Teodor Sigaev <teodor@sigaev.ru> wrote:
> gdb /usr/local/pgsql/bin/postgres your_core_file
> If needed, change the path to the postgres binary.
> In gdb, type
> bt
> and send the output

sure, here you have:
$ gdb /home/pgdba/work/bin/postgres /home/pgdba/data/core
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

Core was generated by `postgres: trader_ru_tomcat tra'.
Program terminated with signal 11, Segmentation fault.

warning: current_sos: Can't read pathname for load map: Input/output error

Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.7
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.7
Reading symbols from /lib/tls/libcrypt.so.1...done.
Loaded symbols for /lib/tls/libcrypt.so.1
Reading symbols from /lib/tls/libdl.so.2...done.
Loaded symbols for /lib/tls/libdl.so.2
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/libnss_files.so.2...done.
Loaded symbols for /lib/tls/libnss_files.so.2
Reading symbols from /home/pgdba/work/lib/postgresql/tsearch2.so...done.
Loaded symbols for /home/pgdba/work/lib/postgresql/tsearch2.so
#0  0xb7ce4a85 in memmove () from /lib/tls/libc.so.6
(gdb) bt
#0  0xb7ce4a85 in memmove () from /lib/tls/libc.so.6
#1  0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020", offset=53719) at gindatapage.c:291
#2  0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194, leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') at ginvacuum.c:268
#3  0x080bf95c in ginScanToDelete (gvs=0xbfc2ab80, blkno=29194, isRoot=0 '\0', parent=0xb2df39a0, myoff=351) at ginvacuum.c:412
#4  0x080bf8f2 in ginScanToDelete (gvs=0xbfc2ab80, blkno=99489, isRoot=0 '\0', parent=0xb2b359a0, myoff=2) at ginvacuum.c:399
#5  0x080bf8f2 in ginScanToDelete (gvs=0xbfc2ab80, blkno=43, isRoot=1 '\001', parent=0xb28019a0, myoff=0) at ginvacuum.c:399
#6  0x080bfa83 in ginVacuumPostingTree (gvs=0xbfc2ab80, rootBlkno=43) at ginvacuum.c:446
#7  0x080bffd0 in ginbulkdelete (fcinfo=0xb2804768) at ginvacuum.c:638
#8  0x08259717 in FunctionCall4 (flinfo=0xfffffff6, arg1=2994751336, arg2=2994751336, arg3=2994751336, arg4=2994751336) at fmgr.c:1206
#9  0x08097005 in index_bulk_delete (info=0xbfc2af10, stats=0xb2804768, callback=0xb2804768, callback_state=0xb2804768) at indexam.c:573
#10 0x08143d42 in vacuum_index (vacpagelist=0xb2804768, indrel=0xa788c6a8, num_tuples=1675710, keep_tuples=0) at vacuum.c:3029
#11 0x08140c00 in full_vacuum_rel (onerel=0xa787aba0, vacstmt=0x83f74c0) at vacuum.c:1172
#12 0x08140a21 in vacuum_rel (relid=2994751336, vacstmt=0x83f74c0, expected_relkind=114 'r') at vacuum.c:1086
#13 0x08140127 in vacuum (vacstmt=0x83f74c0, relids=0x4601) at vacuum.c:397
#14 0x081da588 in PortalRunUtility (portal=0x841c160, query=0x83f7510, dest=0x83f73b8, completionTag=0xbfc2b2a0 "") at pquery.c:1063
#15 0x081da833 in PortalRunMulti (portal=0x841c160, dest=0x83f73b8, altdest=0x83f73b8, completionTag=0xbfc2b2a0 "") at pquery.c:1131
#16 0x081d9f7b in PortalRun (portal=0x841c160, count=2147483647, dest=0x83f73b8, altdest=0x83f73b8, completionTag=0xbfc2b2a0 "") at pquery.c:700
#17 0x081d526d in exec_simple_query (query_string=0x83f71a8 "VACUUM FULL verbose analyze adverts;") at postgres.c:939
#18 0x081d8725 in PostgresMain (argc=4, argv=0x836c368, username=0x836c328 "trader_ru_tomcat") at postgres.c:3419
#19 0x081b0216 in BackendRun (port=0x839b858) at postmaster.c:2909
#20 0x081af9ef in BackendStartup (port=0x839b858) at postmaster.c:2536
#21 0x081adaba in ServerLoop () at postmaster.c:1206
#22 0x081acf5a in PostmasterMain (argc=1, argv=0x836a508) at postmaster.c:958
#23 0x0816b3d4 in main (argc=1, argv=0x836a508) at main.c:188
(gdb)


--
http://www.depesz.com/ - nowy, lepszy depesz

Re: postgresql 8.2 rc1 - crash

From
Teodor Sigaev
Date:
> #1  0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020",
> offset=53719) at gindatapage.c:291
> #2  0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194,
> leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') at
> ginvacuum.c:268

Are you sure about your hardware? myoff in ginDeletePage() and offset in
PageDeletePostingItem are the same variable...

Please send me the postgres binary itself; the core file alone isn't very useful for debugging.

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: postgresql 8.2 rc1 - crash

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
>> #1  0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020",
>> offset=53719) at gindatapage.c:291
>> #2  0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194,
>> leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') at
>> ginvacuum.c:268

> Are you sure about your hardware? myoff in ginDeletePage() and offset in
> PageDeletePostingItem are the same variable...

That sort of thing isn't unusual when looking at dumps with an optimized
executable.  gdb has only a limited view of what the compiler is doing,
and frequently will think that register N contains a variable when in
fact that register gets re-used for several different purposes within
the function.
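
The mismatch under discussion can be read straight out of the two quoted frames. Below is a rough Python sketch that pulls the name=value arguments from a gdb frame line and compares what the caller shows as myoff with what the callee shows as offset. The regular expressions and the parse_frame helper are assumptions made for this illustration and only cover frames shaped like the ones in this thread.

```python
import re

# Matches lines such as: #1  0x080bc224 in func (a=1, b=2) at file.c:291
FRAME_RE = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \((.*)\) at ([\w.]+):(\d+)"
)
# Captures name=value pairs; a value ends at a comma, space or ")".
ARG_RE = re.compile(r"(\w+)=([^,\s)]+)")


def parse_frame(line):
    """Parse one backtrace frame line into a dict, or None if it is not one."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    number, function, args, source, lineno = m.groups()
    return {
        "frame": int(number),
        "function": function,
        "args": dict(ARG_RE.findall(args)),
        "location": (source, int(lineno)),
    }


callee = parse_frame(
    r'#1  0x080bc224 in PageDeletePostingItem (page=0xb28039a0 "\020", '
    r"offset=53719) at gindatapage.c:291"
)
caller = parse_frame(
    r"#2  0x080bf558 in ginDeletePage (gvs=0xbfc2ab80, deleteBlkno=29194, "
    r"leftBlkno=29059, parentBlkno=70274, myoff=351, isParentRoot=0 '\0') "
    r"at ginvacuum.c:268"
)

passed = caller["args"]["myoff"]
received = callee["args"]["offset"]
print(f"caller myoff={passed}, callee offset={received}, match={passed == received}")
# prints: caller myoff=351, callee offset=53719, match=False
```

A difference like this only shows what gdb reports for each frame; it does not by itself tell whether the values in memory really differed.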

            regards, tom lane

Re: postgresql 8.2 rc1 - crash

From
Teodor Sigaev
Date:
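I reproduced the problem with a small script:
#!/usr/bin/perl
use strict;
use warnings;

# Emit the table definition and open a COPY block.
print <<EOT;
drop table if exists qq;
create table qq (
     i int,
     ii  int[]
);
COPY qq FROM stdin;
EOT

# One million rows, each carrying the same one-element array {1}.
for (my $i = 0; $i < 1000000; $i++) {
     print "$i\t{1}\n";
}

# End the COPY data, build the GIN index, delete a large range, then vacuum.
print <<EOT;
\\.

CREATE INDEX qqidx ON qq USING gin (ii);
DELETE FROM qq WHERE i>5000 and i<400000;
VACUUM FULL ANALYZE qq;

EOT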

So, I'm digging now...
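
For those without Perl at hand, here is an approximate Python equivalent of the generator. It is a sketch meant to emit the same SQL text; the function name write_repro_sql and its rows parameter are inventions for this example, and the output still has to be fed to psql against an affected 8.2rc1 server to reproduce anything.

```python
import sys


def write_repro_sql(out, rows=1_000_000):
    """Write the table setup, COPY data and the index/delete/vacuum steps."""
    out.write("drop table if exists qq;\n")
    out.write("create table qq (\n     i int,\n     ii  int[]\n);\n")
    out.write("COPY qq FROM stdin;\n")
    # Every row holds the same one-element array, so all heap rows
    # are indexed under a single GIN key.
    for i in range(rows):
        out.write(f"{i}\t{{1}}\n")
    # "\." on its own line ends the COPY data.
    out.write("\\.\n\n")
    out.write("CREATE INDEX qqidx ON qq USING gin (ii);\n")
    out.write("DELETE FROM qq WHERE i>5000 and i<400000;\n")
    out.write("VACUUM FULL ANALYZE qq;\n")


# Small preview; use the default row count to match the original script.
write_repro_sql(sys.stdout, rows=3)
```

With rows=3 the DELETE matches nothing, so the preview only shows the shape of the output. The crash scenario needs the full million rows so that the deleted range 5000 < i < 400000 actually exists.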
--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: postgresql 8.2 rc1 - crash

From
Teodor Sigaev
Date:
Fixed, thank you. The changes are committed to CVS; please try it. (I think the
index is corrupted, so you will need to recreate it.)



--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: postgresql 8.2 rc1 - crash

From
"hubert depesz lubaczewski"
Date:
On 11/30/06, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Fixed, thank you. The changes are committed to CVS; please try it. (I think the
> index is corrupted, so you will need to recreate it.)

great. thanks. i will retry. full retry will take some time - i can estimate that i will be able to reply tomorrow in the evening (my evening) - let's say - in 24 hours.

hubert

--
http://www.depesz.com/ - nowy, lepszy depesz

Re: postgresql 8.2 rc1 - crash

From
"hubert depesz lubaczewski"
Date:
On 11/30/06, hubert depesz lubaczewski <depesz@gmail.com> wrote:
> On 11/30/06, Teodor Sigaev <teodor@sigaev.ru> wrote:
>> Fixed, thank you. The changes are committed to CVS; please try it. (I think the
>> index is corrupted, so you will need to recreate it.)
> great. thanks. i will retry. full retry will take some time - i can estimate that i will be able to reply tomorrow in the evening (my evening) - let's say - in 24 hours.

confirmed. everything works fine.

thanks for very quick patch.

depesz

--
http://www.depesz.com/ - nowy, lepszy depesz