Re: test_shm_mq failing on mips* - Mailing list pgsql-hackers

From Christoph Berg
Subject Re: test_shm_mq failing on mips*
Date
Msg-id 20141125154210.GB21475@msg.df7cb.de
In response to Re: test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)  (Christoph Berg <cb@df7cb.de>)
Responses Re: test_shm_mq failing on mips*  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Re: To Robert Haas 2014-11-24 <20141124200824.GA22662@msg.df7cb.de>
> > Does it fail every time when run on a machine where it fails sometimes?
> 
> So far there's a consistent host -> fail-or-not mapping, but most
> mips/mipsel build hosts have seen only one attempt so far that
> actually got far enough to run the shm_mq test.

I got the build rescheduled on the same machine and it's hanging
again.

> Atm I don't have access to the boxes where it was failing (the builds
> succeed on the mips(el) porter hosts available to Debian developers).
> I'll see if I can arrange access there and run a test.

Julien Cristau was kind enough to poke at the hanging processes. The
build has now been stuck for about 4h, while the postgres backend has
consumed only 4s of CPU time (according to plain "ps"). The currently
executing query is:

SELECT test_shm_mq_pipelined(16384,
    (select string_agg(chr(32+(random()*95)::int), '') from generate_series(1,270000)),
    200, 3);

(In pg_stat_activity: waiting = f, state = active, backend_start 6s
older than xact_start, xact_start same as query_start, state_change
19µs newer, all 4h old)

Backtrace:

Reading symbols from /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/contrib/test_shm_mq/tmp_check/install/usr/lib/postgresql/9.4/lib/test_shm_mq.so...done.
Loaded symbols for /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/contrib/test_shm_mq/tmp_check/install/usr/lib/postgresql/9.4/lib/test_shm_mq.so
0x76ea1d98 in poll () from /lib/mipsel-linux-gnu/libc.so.6
(gdb) where
#0  0x76ea1d98 in poll () from /lib/mipsel-linux-gnu/libc.so.6
#1  0x00683044 in poll (__timeout=<optimized out>, __nfds=1, __fds=0x7fe2d084)
    at /usr/include/mipsel-linux-gnu/bits/poll2.h:41
#2  WaitLatchOrSocket (latch=0x766cae7c, wakeEvents=1, sock=-1, timeout=0) at pg_latch.c:325
#3  0x6dc91968 in test_shm_mq_pipelined (fcinfo=<optimized out>)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../contrib/test_shm_mq/test.c:231
#4  0x005d448c in ExecMakeFunctionResultNoSets (fcache=0xda4658, econtext=0xda4590, isNull=0xda4c80 "hL\332", isDone=<optimized out>)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/execQual.c:2023
#5  0x005db0fc in ExecTargetList (isDone=0x7fe2d210, itemIsDone=0xda4d18, isnull=0xda4c80 "hL\332", values=0xda4c70, econtext=0xda4590, targetlist=0xda4d00)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/execQual.c:5304
#6  ExecProject (projInfo=<optimized out>, isDone=0x7fe2d210)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/execQual.c:5519
#7  0x005f08b8 in ExecResult (node=0xda4508)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/nodeResult.c:155
#8  0x005d308c in ExecProcNode (node=0xda4508)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/execProcnode.c:373
#9  0x005cfb5c in ExecutePlan (dest=0xd9e0a0, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, planstate=0xda4508, estate=0xda1640)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/execMain.c:1475
#10 standard_ExecutorRun (queryDesc=0xd18240, direction=<optimized out>, count=0)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/executor/execMain.c:308
#11 0x00707bc0 in PortalRunSelect (portal=portal@entry=0xd85518, forward=forward@entry=1 '\001', count=0, count@entry=2147483647, dest=dest@entry=0xd9e0a0)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/tcop/pquery.c:946
#12 0x0070959c in PortalRun (portal=0xd85518, count=2147483647, isTopLevel=<optimized out>, dest=0xd9e0a0, altdest=0xd9e0a0, completionTag=0x7fe2d6bc "")
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/tcop/pquery.c:790
#13 0x007067b0 in exec_simple_query (query_string=0xd594d0 "SELECT test_shm_mq_pipelined(16384, (select string_agg(chr(32+(random()*95)::int),'') from generate_series(1,270000)), 200, 3);")
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/tcop/postgres.c:1045
#14 PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0xcfa180 "contrib_regression", username=<optimized out>)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/tcop/postgres.c:4016
#15 0x00450888 in BackendRun (port=0xd16288)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/postmaster/postmaster.c:4123
#16 BackendStartup (port=0xd16288)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/postmaster/postmaster.c:3797
#17 ServerLoop ()
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/postmaster/postmaster.c:1576
#18 0x006983b8 in PostmasterMain (argc=8, argv=<optimized out>)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/postmaster/postmaster.c:1223
#19 0x0045256c in main (argc=8, argv=0xcf9890)
    at /build/postgresql-9.4-cq5GAt/postgresql-9.4-9.4~rc1/build/../src/backend/main/main.c:227

Kernel stack:

sudo cat /proc/29001/stack
[<ffffffff805cb7c0>] __schedule+0x300/0x9c0
[<ffffffff805cd598>] schedule_hrtimeout_range_clock+0xd8/0x100
[<ffffffff80238ca4>] poll_schedule_timeout+0x34/0x60
[<ffffffff8023a168>] do_sys_poll+0x320/0x470
[<ffffffff8023a408>] SyS_poll+0xe0/0x168
[<ffffffff801142b4>] handle_sys+0x114/0x134
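
For anyone not familiar with the latch code: both stacks show the
backend asleep in poll() on a single descriptor. Assuming wakeEvents=1
means "wake on latch set" only, with no timeout flag, that sleep can
end only when some other process sets the latch, which would fit 4s of
CPU time in 4h. The following is a minimal self-pipe sketch of that
kind of wait, not PostgreSQL's actual implementation; DemoLatch and all
demo_* names are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for a latch: a pipe whose read end is polled. */
typedef struct DemoLatch
{
    int read_fd;   /* waiter polls this end */
    int write_fd;  /* setter writes one byte to this end */
} DemoLatch;

/* Create the underlying pipe. Returns 0 on success, -1 on failure. */
int demo_latch_init(DemoLatch *latch)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    latch->read_fd = fds[0];
    latch->write_fd = fds[1];
    return 0;
}

/* "Set" the latch by writing a single wake-up byte. */
int demo_latch_set(DemoLatch *latch)
{
    char byte = 0;

    return write(latch->write_fd, &byte, 1) == 1 ? 0 : -1;
}

/*
 * Wait for the latch. timeout_ms = -1 blocks until the latch is set,
 * which is the situation the backtrace appears to show.
 * Returns 1 if the latch was set, 0 on timeout, -1 on error.
 */
int demo_latch_wait(DemoLatch *latch, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = latch->read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the sleep. */
    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc > 0 && (pfd.revents & POLLIN))
    {
        char byte;

        /* Consume the wake-up byte so the next wait sleeps again. */
        if (read(latch->read_fd, &byte, 1) != 1)
            return -1;
        return 1;
    }
    return rc == 0 ? 0 : -1;
}
```

With a finite timeout, demo_latch_wait() returns 0 as long as nobody
calls demo_latch_set(); with -1 it simply never returns in that case.
If the wake-up from the worker side is lost or never sent, the waiter
stays in poll() exactly like frame #0 above.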

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/


