Re: test_shm_mq failing on mips* - Mailing list pgsql-hackers

From Robert Haas
Subject Re: test_shm_mq failing on mips*
Date
Msg-id CA+Tgmoa9HE=1GObU7MKuGLDBjBNSNRW0bDE4P3JA=P=9mqGgqw@mail.gmail.com
In response to Re: test_shm_mq failing on mips*  (Christoph Berg <cb@df7cb.de>)
Responses Re: test_shm_mq failing on mips*  (Christoph Berg <cb@df7cb.de>)
List pgsql-hackers
On Tue, Nov 25, 2014 at 10:42 AM, Christoph Berg <cb@df7cb.de> wrote:
> Re: To Robert Haas 2014-11-24 <20141124200824.GA22662@msg.df7cb.de>
>> > Does it fail every time when run on a machine where it fails sometimes?
>>
>> So far there's a consistent host -> fail-or-not mapping, but most
>> mips/mipsel build hosts have seen only one attempt so far which
>> actually came so far to actually run the shm_mq test.
>
> I got the build rescheduled on the same machine and it's hanging
> again.
>
>> Atm I don't have access to the boxes where it was failing (the builds
>> succeed on the mips(el) porter hosts available to Debian developers).
>> I'll see if I can arrange access there and run a test.
>
> Julien Cristau was so kind to poke into the hanging processes. The
> build has been stuck now for about 4h, while that postgres backend has
> only consumed 4s of CPU time (according to plain "ps"). The currently
> executing query is:
>
> SELECT test_shm_mq_pipelined(16384, (select string_agg(chr(32+(random()*95)::int), '') from generate_series(1,270000)),200, 3);
>
> (Waiting f, active, backend_start 6s older than xact_start, xact_start
> same as query_start, state_change 19µs newer, all 4h old)

I can't tell from this exactly what's going wrong.  Questions:

1. Are there any background worker processes running at the same time?
If so, how many?  We'd expect to see 3.
2. Can we get a printout of the following variables in stack frame 3
(test_shm_mq_pipelined)?  send_count, loop_count, *outqh, *inqh,
inqh->mqh_queue[0], outqh->mqh_queue[0]
3. What does a backtrace of each background worker process look like?
If they are stuck inside copy_messages(), can you print *outqh, *inqh,
inqh->mqh_queue[0], outqh->mqh_queue[0] from that stack frame?
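
In case it helps, items 2 and 3 could be collected non-interactively
with something like the sketch below.  It is only an illustration: the
function name dump_shm_mq_state and its arguments are made up here, and
it assumes gdb is installed on the build host and the binaries have
debug symbols.  Frame 3 is the test_shm_mq_pipelined frame mentioned
above; for the background workers the right frame has to be read off
the "bt" output first, and send_count and loop_count exist only in
test_shm_mq_pipelined, so gdb will report them as unknown symbols
elsewhere.

```shell
#!/bin/sh
# Sketch only: dump_shm_mq_state is a hypothetical helper, not part of PostgreSQL.
# Usage: dump_shm_mq_state PID [FRAME]
#   PID   - process ID of the stuck backend or background worker (e.g. from ps)
#   FRAME - stack frame to inspect; defaults to 3 (an assumption taken from the
#           backend backtrace discussed above, verify it against "bt")
# Set GDB="echo gdb" to print the gdb command line instead of running it.
dump_shm_mq_state() {
    pid=$1
    frame=${2:-3}
    # Attach to the process, print a backtrace, select the frame, then
    # print each requested variable; gdb detaches when the batch ends.
    ${GDB:-gdb} -p "$pid" -batch \
        -ex 'bt' \
        -ex "frame $frame" \
        -ex 'print send_count' \
        -ex 'print loop_count' \
        -ex 'print *outqh' \
        -ex 'print *inqh' \
        -ex 'print inqh->mqh_queue[0]' \
        -ex 'print outqh->mqh_queue[0]'
}
```

Run it once against the backend PID and once against each background
worker PID, passing the copy_messages() frame number for the workers.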

Sorry for the hassle; I just don't have a better idea how to debug this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


