Re: [HACKERS] 6.5 cvs: can't drop table - Mailing list pgsql-hackers

From Oleg Bartunov
Subject Re: [HACKERS] 6.5 cvs: can't drop table
Date
Msg-id Pine.GSO.3.96.SK.990525185621.12127D-100000@ra
In response to Re: [HACKERS] 6.5 cvs: can't drop table  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 25 May 1999, Tom Lane wrote:

> Date: Tue, 25 May 1999 10:15:43 -0400
> From: Tom Lane <tgl@sss.pgh.pa.us>
> To: Oleg Bartunov <oleg@sai.msu.su>
> Cc: hackers@postgreSQL.org
> Subject: Re: [HACKERS] 6.5 cvs: can't drop table 
> 
> Oleg Bartunov <oleg@sai.msu.su> writes:
> > today I did some tests with current 6.5 from cvs and multiple joins.
> > I got unpredictable server crashes, i.e. sometimes the query works,
> > sometimes it crashes.
> 
> I have a theory about why the results are random: the GEQO planner
> deliberately uses random numbers to generate plans, so you don't
> always get the same plan out of it.  Whatever bug you are seeing
> occurs only for a particular plan path.  (I haven't had any luck
> repeating your crash here, so the bug may be platform-specific.)
> 
> It bothers me that the GEQO results are not reliably reproducible
> across platforms; that complicates debugging.  I have been thinking
> about suggesting that we ought to change GEQO to use a fixed random
> seed value by default, with the variable random seed being available
> only as a *non default* option.  Comments anyone?
> 
> In the meantime, you could try setting up a pgsql/data/pg_geqo file
> with a specific Random_Seed NNN line, and try different NNN values
> until you find one that will reliably trigger the failure.  That
> would help in reproducing the problem elsewhere.
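
To illustrate the point about seeds: the toy randomized search below returns the same ordering every time it is given the same seed, which is what makes a plan-dependent failure repeatable. This is only a sketch and not GEQO itself; the function name, the cost function, and the seed value are all made up for illustration.

```python
import random


def random_join_order(tables, seed=None, tries=50):
    """Toy randomized search: sample `tries` random orderings, keep the cheapest.

    Not GEQO. The cost function is an arbitrary stand-in (assumption).
    """
    # seed=None lets Python pick an unpredictable seed, so results may vary per run.
    rng = random.Random(seed)

    def cost(order):
        # Hypothetical cost: penalize tables that land far from their original slot.
        return sum(abs(i - tables.index(t)) * (i + 1) for i, t in enumerate(order))

    best = None
    for _ in range(tries):
        order = tables[:]
        rng.shuffle(order)
        if best is None or cost(order) < cost(best):
            best = order
    return best


tables = [f"t{i}" for i in range(8)]
# Same seed, same sequence of random choices, same resulting order.
assert random_join_order(tables, seed=12345) == random_join_order(tables, seed=12345)
```

With a fixed seed, anyone running the same search on the same input walks the same path, so a failure tied to one particular path can be triggered on demand.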

I get a fairly reproducible crash under Linux 2.0.37, see below.

> 
> > After about an hour of experiments I can't drop a table in
> > my test database:
> 
> If you crash the backend enough times, you shouldn't be too surprised
> that your database gets corrupted ... I think this is just collateral
> damage.

I did a cvs update, reinstalled pgsql, ran my test, and after several
successful runs got the same crash :-) You are probably right - this could be
connected with the OS - Linux 2.0.37. I installed the new kernel (the old one
was 2.0.36) several days ago. I'll move back to 2.0.36 and see what happens.
Interestingly, I never get a crash with the same test (even with 20 tables)
on my home machine, which is running 2.2.9! I also ran the test under
FreeBSD 3.1 release (elf), also with no problems.
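
For anyone who wants to try this on another platform, a query of the same shape can be generated with a short script. This is a rough sketch: the where clause follows the one visible in the session log below, but the select list and from clause are not shown there, so `select *`, the comma-separated from list, and the function name are assumptions.

```python
def star_join_query(n):
    """Build a query joining t1..tn to t0 on id.

    Assumption: 'select *' and a plain comma-separated from list; only the
    where clause is visible in the original session.
    """
    if n < 1:
        raise ValueError("need at least one joined table")
    tables = ", ".join(f"t{i}" for i in range(n + 1))
    conds = " and ".join(f"t{i}.id = t0.id" for i in range(1, n + 1))
    return f"select * from {tables} where {conds};"


print(star_join_query(3))
# select * from t0, t1, t2, t3 where t1.id = t0.id and t2.id = t0.id and t3.id = t0.id;
```

Calling `star_join_query(20)` gives the 21-table case; smaller values of `n` make it easy to check at which join count the behavior changes.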

As usual, here is a backtrace :-)
Regards,    Oleg

PS. By the way, it seems Jan fixed the bug with pg_dump and views!
where t1.id = t0.id and t2.id=t0.id and t3.id=t0.id and t4.id=t0.id and t5.id=t0.id
and t6.id=t0.id and t7.id=t0.id and t8.id=t0.id and t9.id=t0.id and t10.id=t0.id
and t11.id=t0.id and t12.id=t0.id and t13.id=t0.id and t14.id=t0.id and t15.id=t0.id
and t16.id=t0.id and t17.id=t0.id and t18.id=t0.id and t19.id=t0.id and t20.id=t0.id ;
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
We have lost the connection to the backend, so further processing is impossible.
  Terminating.
mira:/usr/local/pgsql/data/base/test$ l core
-rw-------   1 postgres users    11784192 May 25 19:07 core
mira:/usr/local/pgsql/data/base/test$ gdb /usr/local/pgsql/bin/postmaster core
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i486-slackware-linux), Copyright 1996 Free Software Foundation, Inc...
Core was generated by `/usr/local/pgsql/bin/postgres localhost megera test idle                     '.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libdl.so.1...done.
Reading symbols from /lib/libm.so.5...done.
Reading symbols from /usr/lib/libreadline.so...done.
Reading symbols from /usr/lib/libhistory.so...done.
Reading symbols from /lib/libtermcap.so.2...done.
Reading symbols from /lib/libncurses.so.3.0...done.
Reading symbols from /usr/lib/libc.so.5...done.
Reading symbols from /lib/ld-linux.so.1...done.
Reading symbols from /usr/lib/libc.so.5...done.
Reading symbols from /lib/ld-linux.so.1...done.
#0  0x80c76af in pathorder_match ()
(gdb) bt
#0  0x80c76af in pathorder_match ()
#1  0x80c7900 in better_path ()
#2  0x80c7863 in add_pathlist ()
#3  0x80bf515 in update_rels_pathlist_for_joins ()
#4  0x80c8b8b in gimme_tree ()
#5  0x80c8ae7 in geqo_eval ()
#6  0x80c8d12 in geqo ()
#7  0x80bd6e6 in make_one_rel_by_joins ()
#8  0x80bd5ee in make_one_rel ()
#9  0x80c1e81 in subplanner ()
#10 0x80c1dff in query_planner ()
#11 0x80c2173 in union_planner ()
#12 0x80c1f55 in planner ()
#13 0x80e2497 in pg_parse_and_plan ()
#14 0x80e25bb in pg_exec_query_dest ()
#15 0x80e257c in pg_exec_query ()
#16 0x80e36c8 in PostgresMain ()
#17 0x80cc72c in DoBackend ()
#18 0x80cc26b in BackendStartup ()
#19 0x80cb9e7 in ServerLoop ()
#20 0x80cb573 in PostmasterMain ()
#21 0x80a2999 in main ()
#22 0x806131e in _start ()
(gdb) 


> 
>             regards, tom lane
> 

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


