Thread: [HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
[HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
From
Christoph Berg
Date:
10beta3 and 9.6.4 are both failing during initdb on mips64el on
Debian/sid (unstable):

https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.6&arch=mips64el&ver=9.6.4-1&stamp=1502374949&raw=0
https://buildd.debian.org/status/fetch.php?pkg=postgresql-10&arch=mips64el&ver=10%7Ebeta3-1&stamp=1502535836&raw=0

All other architectures have succeeded, as well as the 9.6.4 build for
Debian/stretch (stable) on mips64el. The difference might be the
compiler version (6.3.0 vs 7.1.0).

Command was: "initdb" -D "/home/myon/postgresql-9.6/postgresql-9.6-9.6.3/build/src/test/regress/./tmp_check/data" --noclean --nosync > "/home/myon/postgresql-9.6/postgresql-9.6-9.6.3/build/src/test/regress/log/initdb.log" 2>&1

******** build/src/test/regress/log/initdb.log ********
Running in noclean mode.  Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "myon".
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  de_DE.utf8
  CTYPE:    de_DE.utf8
  MESSAGES: C
  MONETARY: de_DE.utf8
  NUMERIC:  de_DE.utf8
  TIME:     de_DE.utf8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "german".

Data page checksums are disabled.

creating directory /home/myon/postgresql-9.6/postgresql-9.6-9.6.3/build/src/test/regress/./tmp_check/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... Segmentation fault (core dumped)
child process exited with exit code 139

$ gdb build/tmp_install/usr/lib/postgresql/9.6/bin/postgres build/src/test/regress/tmp_check/data/core
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64el-linux-gnuabi64".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from build/tmp_install/usr/lib/postgresql/9.6/bin/postgres...done.
[New LWP 24217]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/mips64el-linux-gnuabi64/libthread_db.so.1".
Core was generated by `/home/myon/postgresql-9.6/postgresql-9.6-9.6.3/build/tmp_install/usr/lib/postgr'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000aababa6634 in EventTriggerEndCompleteQuery ()
    at ./build/../src/backend/commands/event_trigger.c:1263
1263            MemoryContextDelete(currentEventTriggerState->cxt);
(gdb) bt full
#0  0x000000aababa6634 in EventTriggerEndCompleteQuery ()
    at ./build/../src/backend/commands/event_trigger.c:1263
        prevstate = <optimized out>
#1  0x000000aabad6d508 in ProcessUtilitySlow (
    parsetree=parsetree@entry=0xaac3688428,
    queryString=queryString@entry=0xaac3687888 "REVOKE ALL on pg_authid FROM public;\n",
    context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
    completionTag=completionTag@entry=0xffff985218 "",
    dest=0xaabb0a0378 <debugtupDR>)
    at ./build/../src/backend/tcop/utility.c:1582
        isTopLevel = 1 '\001'
        isCompleteQuery = 1 '\001'
        needCleanup = 0 '\000'
        commandCollected = <optimized out>
        address = {classId = 2, objectId = 0, objectSubId = 0}
        secondaryObject = {classId = 0, objectId = 0, objectSubId = 0}
#2  0x000000aabad6c6cc in standard_ProcessUtility (parsetree=0xaac3688428,
    queryString=0xaac3687888 "REVOKE ALL on pg_authid FROM public;\n",
    context=<optimized out>, params=0x0, dest=0xaabb0a0378 <debugtupDR>,
    completionTag=0xffff985218 "")
    at ./build/../src/backend/tcop/utility.c:907
        isTopLevel = 1 '\001'
        __func__ = "standard_ProcessUtility"
#3  0x000000aabad6d33c in ProcessUtility (parsetree=<optimized out>,
    queryString=<optimized out>, context=<optimized out>,
    params=<optimized out>, dest=<optimized out>,
    completionTag=<optimized out>)
    at ./build/../src/backend/tcop/utility.c:336
No locals.
#4  0x000000aabad68e80 in PortalRunUtility (portal=portal@entry=0xaac368a8a8,
    utilityStmt=utilityStmt@entry=0xaac3688428,
    isTopLevel=isTopLevel@entry=1 '\001',
    setHoldSnapshot=setHoldSnapshot@entry=0 '\000',
    dest=0xaabb0a0378 <debugtupDR>, completionTag=0xffff985218 "")
    at ./build/../src/backend/tcop/pquery.c:1193
        snapshot = 0xaac368c8e8
        __func__ = "PortalRunUtility"
#5  0x000000aabad69d70 in PortalRunMulti (portal=portal@entry=0xaac368a8a8,
    isTopLevel=isTopLevel@entry=1 '\001',
    setHoldSnapshot=setHoldSnapshot@entry=0 '\000',
    dest=dest@entry=0xaabb0a0378 <debugtupDR>,
    altdest=altdest@entry=0xaabb0a0378 <debugtupDR>,
    completionTag=completionTag@entry=0xffff985218 "")
    at ./build/../src/backend/tcop/pquery.c:1349
        stmt = 0xaac3688428
        active_snapshot_set = 0 '\000'
        stmtlist_item = 0xaac3688738
#6  0x000000aabad6ac44 in PortalRun (portal=portal@entry=0xaac368a8a8,
    count=count@entry=9223372036854775807,
    isTopLevel=isTopLevel@entry=1 '\001',
    dest=dest@entry=0xaabb0a0378 <debugtupDR>,
    altdest=altdest@entry=0xaabb0a0378 <debugtupDR>,
    completionTag=completionTag@entry=0xffff985218 "")
---Type <return> to continue, or q <return> to quit---
    at ./build/../src/backend/tcop/pquery.c:815
        save_exception_stack = 0xffff985068
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {{__pc = 733279071168,
                __sp = 1099504831664, __regs = {733422847016, 733422856360,
                  733422847096, 1, 733282435960, 733422844040, 733282649704,
                  733422428264}, __fp = 733422847832, __gp = 733282512816,
                __glibc_reserved1 = -1, __fpregs = {-nan(0xfffffffffffff),
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff),
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff),
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff),
                  -nan(0xfffffffffffff)}}}, __mask_was_saved = 0,
            __saved_mask = {__val = {733282649704, 733422428264,
                733282512816, 733422847832, 733280473476, 733422856360,
                1099165163056, 733282073632, 1, 733282512816, 733280604384,
                733282512816, 3618681818990495232, 733282512816,
                3618681818990495232, 733422847016}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0xaac361d088
        saveTopTransactionContext = 0xaac3622068
        saveActivePortal = 0x0
        saveResourceOwner = 0xaac361d088
        savePortalContext = 0x0
        saveMemoryContext = 0xaac3622068
        __func__ = "PortalRun"
#7  0x000000aabad6802c in exec_simple_query (
    query_string=0xaac3687888 "REVOKE ALL on pg_authid FROM public;\n")
    at ./build/../src/backend/tcop/postgres.c:1094
        parsetree = 0xaac3688428
        portal = 0xaac368a8a8
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = "\000g\345\352\377\000\000\000w\245\252\377\377\000\000\000\034\b\327\352\377\000\000\000\060)\345\352\377\000\000\000\360g\345\352\377\000\000\000\320R\230\377\377\000\000\000\370\332\344\352\377\000\000\000\020\232\323\352\377\000\000"
        querytree_list = <optimized out>
        plantree_list = 0xaac3688758
        receiver = 0xaabb0a0378 <debugtupDR>
        format = 0
        dest = DestDebug
        parsetree_list = 0xaac3688498
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str = "\020\326^ê\000\000\000\070|\337\352\377\000\000\000\370\226\335\352\377\000\000\000\000\000\000\000\000\000\000"
        parsetree_item = 0xaac3688478
        isTopLevel = 1 '\001'
#8  PostgresMain (argc=<optimized out>, argv=<optimized out>,
---Type <return> to continue, or q <return> to quit---
    dbname=<optimized out>, username=<optimized out>)
    at ./build/../src/backend/tcop/postgres.c:4076
        query_string = 0xaac3687888 "REVOKE ALL on pg_authid FROM public;\n"
        input_message = {
          data = 0xaac3687888 "REVOKE ALL on pg_authid FROM public;\n",
          len = 38, maxlen = 1024, cursor = 38}
        local_sigjmp_buf = {{__jmpbuf = {{__pc = 733279051620,
                __sp = 1099504832144, __regs = {733282651656, 10,
                  733422077472, 733282527592, 0, 733422071296, 0,
                  1099506026728}, __fp = 1099506034039,
                __gp = 733282512816, __glibc_reserved1 = 0, __fpregs = {
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff),
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff),
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff),
                  -nan(0xfffffffffffff), -nan(0xfffffffffffff)}}},
            __mask_was_saved = 1, __saved_mask = {__val = {0, 0,
                733422206784, 1099506026728, 1099111271600, 0,
                1099157352184, 1099157547312, 1099111271600, 0,
                1099157352184, 1099157547312, 1024, 1099157563376,
                1099156624308, 1099109578624}}}}
        send_ready_for_query = 0 '\000'
        disable_idle_in_transaction_timeout = 0 '\000'
        __func__ = "PostgresMain"
#9  0x000000aabaa65658 in main (argc=<optimized out>, argv=0xaac35e7160)
    at ./build/../src/backend/main/main.c:224
No locals.
(gdb) l
1258            EventTriggerQueryState *prevstate;
1259
1260            prevstate = currentEventTriggerState->previous;
1261
1262            /* this avoids the need for retail pfree of SQLDropList items: */
1263            MemoryContextDelete(currentEventTriggerState->cxt);
1264
1265            currentEventTriggerState = prevstate;
1266    }
1267

$ gcc --version
gcc (Debian 7.1.0-11) 7.1.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Christoph
Re: [HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
From
Christoph Berg
Date:
Re: To PostgreSQL Hackers 2017-08-13 <20170813130127.g3tcyzzvuvlpzcxy@msg.df7cb.de>
> 10beta3 and 9.6.4 are both failing during initdb on mips64el on
> Debian/sid (unstable):
>
> https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.6&arch=mips64el&ver=9.6.4-1&stamp=1502374949&raw=0
> https://buildd.debian.org/status/fetch.php?pkg=postgresql-10&arch=mips64el&ver=10%7Ebeta3-1&stamp=1502535836&raw=0
>
> All other architectures have succeeded, as well as the 9.6.4 build for
> Debian/stretch (stable) on mips64el. The difference might be the
> compiler version (6.3.0 vs 7.1.0).

Seems to be a gcc-7 problem affecting several packages on mips64el:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=871514

Christoph
Re: [HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
From
Tom Lane
Date:
Christoph Berg <myon@debian.org> writes:
> 10beta3 and 9.6.4 are both failing during initdb on mips64el on
> Debian/sid (unstable):
> All other architectures have succeeded, as well as the 9.6.4 build for
> Debian/stretch (stable) on mips64el. The difference might be the
> compiler version (6.3.0 vs 7.1.0).

It's hard to explain that stack trace other than as a compiler bug.
There shouldn't be any event triggers active here, so
EventTriggerBeginCompleteQuery should have done nothing and returned
false. I don't put complete faith in gdb reports of local variable
values, but it says

        needCleanup = 0 '\000'

which agrees with that. Also the core dump appears to be because
currentEventTriggerState is NULL (please check that), which is
expected if EventTriggerBeginCompleteQuery did nothing. However,
then EventTriggerEndCompleteQuery should not have gotten called
at all.

I suspect you could work around this with

        bool            isCompleteQuery = (context <= PROCESS_UTILITY_QUERY);
-       bool            needCleanup;
+       volatile bool needCleanup;
        bool            commandCollected = false;

If that fixes it, it's definitely a compiler bug. That function does
not change needCleanup after the sigsetjmp call, so per POSIX it
should not have to label the variable volatile. This is far from
being the first such bug we've seen though.

                        regards, tom lane
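
[Editorial illustration, not part of the original message.] The control
flow described above can be sketched in plain C. This is only a
stand-in: PostgreSQL uses its PG_TRY/PG_CATCH macros around sigsetjmp,
and every name below (QueryState, begin_complete_query,
end_complete_query, run_utility, end_calls) is hypothetical. The flag
is assigned before sigsetjmp and never changed afterwards, so it
should survive a siglongjmp even without "volatile"; the qualifier is
shown only because it is the workaround suggested in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-query event trigger state. */
typedef struct QueryState
{
    struct QueryState *previous;
} QueryState;

static QueryState *current_state = NULL;
int end_calls = 0;              /* counts cleanup calls, for checking */

/* Pushes a state only when triggers are active; otherwise does nothing. */
bool
begin_complete_query(QueryState *fresh, bool triggers_active)
{
    if (!triggers_active)
        return false;
    fresh->previous = current_state;
    current_state = fresh;
    return true;
}

/* Must only run after a successful begin: it dereferences current_state. */
void
end_complete_query(void)
{
    assert(current_state != NULL);
    current_state = current_state->previous;
    end_calls++;
}

/* Returns 0 on the normal path and 1 when the simulated error fired. */
int
run_utility(bool triggers_active, bool raise_error)
{
    sigjmp_buf  env;
    QueryState  state;
    volatile bool need_cleanup; /* "volatile" is the suggested workaround */

    need_cleanup = begin_complete_query(&state, triggers_active);

    if (sigsetjmp(env, 0) != 0)
    {
        /* Error path: clean up only if begin actually did something. */
        if (need_cleanup)
            end_complete_query();
        return 1;
    }

    if (raise_error)
        siglongjmp(env, 1);     /* simulated error inside the guarded block */

    if (need_cleanup)
        end_complete_query();
    return 0;
}
```

With no triggers active, run_utility(false, false) and
run_utility(false, true) never reach end_complete_query, which is the
behaviour the message expects from a correctly compiled build.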
Re: [HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
From
Tom Lane
Date:
Christoph Berg <myon@debian.org> writes:
> Seems to be a gcc-7 problem affecting several packages on mips64el:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=871514

Hm, unless there is a use of sigsetjmp earlier in that clamav
routine, I would not assume that that's the same issue. The bug
I suspect we are looking at here is very specific to sigsetjmp
callers: it usually amounts to the compiler unsafely trying to
use the same temporary location for multiple purposes.

                        regards, tom lane
Re: [HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
From
Christoph Berg
Date:
Re: Tom Lane 2017-08-13 <14517.1502638417@sss.pgh.pa.us>
> I suspect you could work around this with
>
>         bool            isCompleteQuery = (context <= PROCESS_UTILITY_QUERY);
> -       bool            needCleanup;
> +       volatile bool needCleanup;
>         bool            commandCollected = false;
>
> If that fixes it, it's definitely a compiler bug. That function does
> not change needCleanup after the sigsetjmp call, so per POSIX it
> should not have to label the variable volatile. This is far from
> being the first such bug we've seen though.

In the meantime, gcc-7 is at version 7.2.0-1, so I gave 9.6 on
mips64el a new try. It's still failing at initdb time, and indeed
adding "volatile" makes initdb proceed, but then the rest of the
testsuite fails in various ways:

DETAIL:  Failed process was running: CREATE TABLE enumtest_child (parent rainbow REFERENCES enumtest_parent);
DETAIL:  Failed process was running: create table trigtest2 (i int references trigtest(i) on delete cascade);
DETAIL:  Failed process was running: CREATE TABLE trunc_b (a int REFERENCES truncate_a);
DETAIL:  Failed process was running: CREATE SCHEMA evttrig
        CREATE TABLE one (col_a SERIAL PRIMARY KEY, col_b text DEFAULT 'forty two')
        CREATE INDEX one_idx ON one (col_b)
        CREATE TABLE two (col_c INTEGER CHECK (col_c > 0) REFERENCES one DEFAULT 42);

Hopefully the compiler gets fixed soonish on mips64el...

Thanks for the analysis,
Christoph
Re: [HACKERS] initdb failure on Debian sid/mips64el in EventTriggerEndCompleteQuery
From
Christoph Berg
Date:
Re: Tom Lane 2017-08-13 <14677.1502638689@sss.pgh.pa.us>
> Christoph Berg <myon@debian.org> writes:
> > Seems to be a gcc-7 problem affecting several packages on mips64el:
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=871514
>
> Hm, unless there is a use of sigsetjmp earlier in that clamav
> routine, I would not assume that that's the same issue. The bug
> I suspect we are looking at here is very specific to sigsetjmp
> callers: it usually amounts to the compiler unsafely trying to
> use the same temporary location for multiple purposes.

It appears to have been the same issue - non-long ints spilled on the
stack and loaded back as long int:

Changes:
 gcc-7 (7.2.0-3) unstable; urgency=high
 .
   * Update to SVN 20170901 (r251583) from the gcc-7-branch.
     - Fix PR target/81504 (PPC), PR c++/82040.
   * Apply proposed patch for PR target/81803 (James Cowgill), conditionally
     for mips* targets. Closes: #871514.

The package built successfully on mips64el now.

Christoph
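
[Editorial illustration, not part of the original message.] The
failure mode summarized above, a 32-bit value spilled to the stack and
then reloaded as a 64-bit one, can be imitated on any host with
memcpy. This is a simulation under stated assumptions, not a
reproduction of the gcc bug: the 8-byte buffer stands in for a stack
slot, the stale pattern stands in for whatever was left there before,
and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill an 8-byte "stack slot" with stale data, then spill a 32-bit value
 * into its first 4 bytes, as a narrow store would.
 */
static void
spill_int(unsigned char slot[8], int32_t value, uint64_t stale)
{
    memcpy(slot, &stale, sizeof stale);
    memcpy(slot, &value, sizeof value);
}

/* Correct reload: read back exactly the 4 bytes that were stored. */
int32_t
reload_as_int(int32_t value, uint64_t stale)
{
    unsigned char slot[8];
    int32_t     out;

    spill_int(slot, value, stale);
    memcpy(&out, slot, sizeof out);
    return out;
}

/* Faulty reload: read all 8 bytes, picking up 4 bytes of stale data. */
int64_t
reload_as_long(int32_t value, uint64_t stale)
{
    unsigned char slot[8];
    int64_t     out;

    spill_int(slot, value, stale);
    memcpy(&out, slot, sizeof out);
    return out;
}

/* Small self-check showing the difference between the two reloads. */
void
spill_demo(void)
{
    const uint64_t stale = 0xDEADBEEFCAFEF00DULL;

    assert(reload_as_int(42, stale) == 42);
    assert(reload_as_long(42, stale) != 42);
}
```

The narrow reload always returns the stored value, while the wide
reload mixes in the stale half of the slot, so a flag or pointer
handled this way can come back with a different value than the one
that was saved.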