Thread: 15beta1 crash on mips64el in pg_regress/triggers
Debian unstable mips64el:

2022-05-18 22:57:34.436 UTC client backend[19222] pg_regress/triggers STATEMENT: drop trigger trg1 on trigpart3;
...
2022-05-18 22:57:39.110 UTC postmaster[7864] LOG: server process (PID 19222) was terminated by signal 11: Segmentation fault
2022-05-18 22:57:39.110 UTC postmaster[7864] DETAIL: Failed process was running: SELECT a.attname,
	  pg_catalog.format_type(a.atttypid, a.atttypmod),
	  (SELECT pg_catalog.pg_get_expr(d.adbin, d.adrelid, true)
	   FROM pg_catalog.pg_attrdef d
	   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
	  a.attnotnull,
	  (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t
	   WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> t.typcollation) AS attcollation,
	  a.attidentity,
	  a.attgenerated
	FROM pg_catalog.pg_attribute a
	WHERE a.attrelid = '21816' AND a.attnum > 0 AND NOT a.attisdropped
	ORDER BY a.attnum;

******** build/src/test/regress/tmp_check/data/core ********
warning: Can't open file /dev/shm/PostgreSQL.387042440 during file-backed mapping note processing
warning: Can't open file /dev/shm/PostgreSQL.4014890228 during file-backed mapping note processing
warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
warning: Can't open file /SYSV035e8a2e (deleted) during file-backed mapping note processing
[New LWP 19222]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/mips64el-linux-gnuabi64/libthread_db.so.1".
Core was generated by `postgres: buildd regression [local] SELECT '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000ffd000565c in ?? ()
#0  0x000000ffd000565c in ?? ()
No symbol table info available.
#1  0x000000aaad76b730 in ExecEvalExprSwitchContext (isNull=0xfffb9e85e7, econtext=0xaab20e9f90, state=0xaab20ea108) at ./build/../src/include/executor/executor.h:343
        retDatum = <optimized out>
        oldContext = 0xaab1fabb10
        retDatum = <optimized out>
        oldContext = <optimized out>
#2  ExecProject (projInfo=0xaab20ea100) at ./build/../src/include/executor/executor.h:377
        econtext = 0xaab20e9f90
        state = 0xaab20ea108
        slot = 0xaab20ea5b0
        isnull = false
#3  ExecScan (node=0xaab20ea100, accessMtd=0xaaad78b6d0 <IndexNext>, recheckMtd=0xaaad78bf08 <IndexRecheck>) at ./build/../src/backend/executor/execScan.c:238
        slot = <optimized out>
        econtext = <optimized out>
        qual = <optimized out>
        projInfo = 0xaab20ea100
#4  0x000000aaad76b730 in ExecEvalExprSwitchContext (isNull=0xaab20ea450, econtext=0xaab20e9f90, state=0x1208) at ./build/../src/include/executor/executor.h:343
        retDatum = <optimized out>
        oldContext = 0xaab20ea6c8
        retDatum = <optimized out>
        oldContext = <optimized out>
#5  ExecProject (projInfo=0x1200) at ./build/../src/include/executor/executor.h:377
        econtext = 0xaab20e9f90
        state = 0x1208
        slot = 0xaab20ea638
        isnull = false
#6  ExecScan (node=0xaab20ea6d0, accessMtd=0xaab20ea6cd, recheckMtd=0xaab20ea310) at ./build/../src/backend/executor/execScan.c:238
        slot = <optimized out>
        econtext = <optimized out>
        qual = <optimized out>
        projInfo = 0x1200
#7  0xffffffffffffffff in ?? ()
No symbol table info available.
Backtrace stopped: frame did not save the PC

Full build log:
https://buildd.debian.org/status/fetch.php?pkg=postgresql-15&arch=mips64el&ver=15%7Ebeta1-1&stamp=1652916002&raw=0

Christoph

Christoph Berg <myon@debian.org> writes:
> Debian unstable mips64el:

Hmm, so what's different between this and buildfarm member topminnow?

Is the crash 100% reproducible for you?

regards, tom lane

Re: Tom Lane
> Christoph Berg <myon@debian.org> writes:
> > Debian unstable mips64el:
>
> Hmm, so what's different between this and buildfarm member topminnow?
>
> Is the crash 100% reproducible for you?

I have scheduled a rebuild now, we'll know in a few hours...

Christoph

Re: Tom Lane
> Christoph Berg <myon@debian.org> writes:
> > Debian unstable mips64el:
>
> Hmm, so what's different between this and buildfarm member topminnow?

That one is running Debian jessie (aka oldoldoldoldstable), uses
-mabi=32 with gcc 4.9, and runs a kernel from 2015.

The Debian buildd is this: https://db.debian.org/machines.cgi?host=mipsel-aql-01

The host should be running Debian buster, with the build done in an
unstable chroot. I don't know what "LS3A-RS780-1w (Quad Core Loongson 3A)"
means, but it's probably much newer hardware than the other one.

Christoph

Christoph Berg <myon@debian.org> writes:
> Re: Tom Lane
>> Hmm, so what's different between this and buildfarm member topminnow?

> That one is running Debian jessie (aka oldoldoldoldstable), uses
> -mabi=32 with gcc 4.9, and runs a kernel from 2015.

> The Debian buildd is this: https://db.debian.org/machines.cgi?host=mipsel-aql-01

> The host should be running Debian buster, with the build done in an
> unstable chroot. I don't know what "LS3A-RS780-1w (Quad Core Loongson 3A)"
> means, but it's probably much newer hardware than the other one.

I see that the gcc farm[1] has another mips64 machine running Debian
buster, so I've started a build there to see what happens.

regards, tom lane

[1] https://cfarm.tetaneutral.net/machines/list/

Re: To Tom Lane
> > Is the crash 100% reproducible for you?
>
> I have scheduled a rebuild now, we'll know in a few hours...

The build was much faster this time (different machine), and worked.

https://buildd.debian.org/status/logs.php?pkg=postgresql-15&arch=mips64el

I'll also start a test build on the mips64el porterbox I have access to.

Christoph

I wrote:
> I see that the gcc farm[1] has another mips64 machine running Debian
> buster, so I've started a build there to see what happens.

Many kilowatt-hours later, I've entirely failed to reproduce this
on gcc230. Not sure how to investigate further. Given that your
original build machine is so slow, could it be timing-related?
Hard to see how, given the location of the crash, but ...

regards, tom lane

Re: Tom Lane
> Many kilowatt-hours later, I've entirely failed to reproduce this
> on gcc230. Not sure how to investigate further. Given that your
> original build machine is so slow, could it be timing-related?
> Hard to see how, given the location of the crash, but ...

My other rebuild (on yet another machine) also passed fine, so we can
possibly attribute that to some hardware glitch on the original
machine. But it's being used as a regular buildd for Debian, so I
guess it would have already been noticed if there was any general
problem with it. I'll try reaching out to the buildd folks if they
know anything.

https://buildd.debian.org/status/recent.php?bad_results_only=on&a=mips64el&suite=experimental

Christoph