segfault in geqo on experimental gcc animal - Mailing list pgsql-hackers

From Andres Freund
Subject segfault in geqo on experimental gcc animal
Date
Msg-id 20191109221919.5mgumlpnvifj6xkr@alap3.anarazel.de
Responses Re: segfault in geqo on experimental gcc animal  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Hi,

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=moonjelly&dt=2019-11-09%2010%3A17%3A06

shows a failure, including a backtrace:

======-=-====== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ======-=-======
[New LWP 42902]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: fabien regression [local] SELECT                                    '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000006d962b in gimme_tour (root=root@entry=0x1cfb4b0, edge_table=edge_table@entry=0x1d3afc0,
new_gene=<optimized out>, num_gene=5) at geqo_erx.c:209
 
209            remove_gene(root, new_gene[i - 1], edge_table[(int) new_gene[i - 1]], edge_table);
#0  0x00000000006d962b in gimme_tour (root=root@entry=0x1cfb4b0, edge_table=edge_table@entry=0x1d3afc0,
new_gene=<optimized out>, num_gene=5) at geqo_erx.c:209
 
#1  0x00000000006da0a8 in geqo (root=0x1cfb4b0, number_of_rels=<optimized out>, initial_rels=<optimized out>) at
geqo_main.c:190
#2  0x00000000006de084 in make_one_rel (root=root@entry=0x1cfb4b0, joinlist=joinlist@entry=0x1d0a868) at
allpaths.c:227
#3  0x0000000000701d19 in query_planner (root=root@entry=0x1cfb4b0, qp_callback=qp_callback@entry=0x702300
<standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffd46b55a60) at planmain.c:269
 
#4  0x0000000000706844 in grouping_planner () at planner.c:2054
#5  0x00000000007093c7 in subquery_planner (glob=glob@entry=0x1cfb418, parse=parse@entry=0x1cd77b8,
parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at
planner.c:1014
#6  0x000000000070a803 in standard_planner (parse=0x1cd77b8, cursorOptions=256, boundParams=<optimized out>) at
planner.c:406
#7  0x00000000007cb1dc in pg_plan_query (querytree=0x1cd77b8, cursorOptions=256, boundParams=0x0) at postgres.c:873
#8  0x00000000007cb2be in pg_plan_queries (querytrees=0x1cfb3c0, cursorOptions=cursorOptions@entry=256,
boundParams=boundParams@entry=0x0) at postgres.c:963
 
#9  0x00000000007cb618 in exec_simple_query () at postgres.c:1154
#10 0x00000000007cd384 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1c23058, dbname=<optimized out>,
username=<optimized out>) at postgres.c:4278
 
#11 0x000000000074b574 in BackendRun (port=0x1c1c650) at postmaster.c:4498
#12 BackendStartup (port=0x1c1c650) at postmaster.c:4189
#13 ServerLoop () at postmaster.c:1727
#14 0x000000000074c34d in PostmasterMain (argc=argc@entry=8, argv=argv@entry=0x1bf35b0) at postmaster.c:1400
#15 0x0000000000491f41 in main (argc=8, argv=0x1bf35b0) at main.c:210
$1 = {si_signo = 11, si_errno = 0, si_code = 1, _sifields = {_pad = {30650304, -12, 0 <repeats 26 times>}, _kill =
{si_pid = 30650304, si_uid = 4294967284}, _timer = {si_tid = 30650304, si_overrun = -12, si_sigval = {sival_int = 0,
sival_ptr = 0x0}}, _rt = {si_pid = 30650304, si_uid = 4294967284, si_sigval = {sival_int = 0, sival_ptr = 0x0}},
_sigchld = {si_pid = 30650304, si_uid = 4294967284, si_status = 0, si_utime = 0, si_stime = 0}, _sigfault = {si_addr =
0xfffffff401d3afc0, _addr_lsb = 0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {si_band = -51508957248, si_fd
= 0}}}
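
As a reading aid for the siginfo dump above (not a diagnosis), the faulting address can be split into its two 32-bit halves. The sketch below uses only values copied from the gdb output; the helper name split_words is made up for illustration. It suggests that the low half of si_addr equals the edge_table pointer from frame #0 and that the high half is -12, which matches the first two _pad entries. What produced that high half is not something the trace shows.

```python
# Values copied verbatim from the gdb output in this message.
EDGE_TABLE = 0x1D3AFC0          # edge_table argument in frame #0
SI_ADDR = 0xFFFFFFF401D3AFC0    # _sigfault.si_addr


def split_words(addr):
    """Split a 64-bit address into (low, high) 32-bit halves.

    The high half is returned as a signed 32-bit value, the same way
    gdb prints the second element of _pad.
    """
    low = addr & 0xFFFFFFFF
    high = (addr >> 32) & 0xFFFFFFFF
    if high >= 0x80000000:
        high -= 1 << 32
    return low, high


low, high = split_words(SI_ADDR)
print(hex(low), high)        # 0x1d3afc0 -12
print(low == EDGE_TABLE)     # True
```

Under this reading, the access went to the edge_table address with the upper 32 bits of the pointer replaced by 0xfffffff4, which would be consistent with either a miscompiled address computation or a bad index; the dump alone does not distinguish the two.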
 

I don't think there have been any relevant code changes since the last
success.

last success:
2019-11-09 09:20:28.346 CET [28785:1] LOG:  starting PostgreSQL 13devel on x86_64-pc-linux-gnu, compiled by gcc (GCC)
10.0.0 20191102 (experimental), 64-bit
 

first failure:
2019-11-09 11:19:36.277 CET [42512:1] LOG:  starting PostgreSQL 13devel on x86_64-pc-linux-gnu, compiled by gcc (GCC)
10.0.0 20191109 (experimental), 64-bit
 


so it sure looks like a gcc upgrade caused the failure. But it's not
clear whether it's a compiler bug, or some undefined behaviour that
triggers the bug.

Fabien, any chance to either bisect or get a bit more information on the
backtrace?


Greetings,

Andres Freund


