Re: Why is parula failing? - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: Why is parula failing?
Msg-id: 130dde50-a2bf-4373-ac1a-ab98fad4c890@enterprisedb.com
In response to: Re: Why is parula failing? (David Rowley <dgrowleyml@gmail.com>)
List: pgsql-hackers

On 4/9/24 05:48, David Rowley wrote:
> On Mon, 8 Apr 2024 at 23:56, Robins Tharakan <tharakan@gmail.com> wrote:
>> #3  0x000000000083ed84 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=41, timeout=600000, wait_event_info=wait_event_info@entry=150994946) at latch.c:538
>> #4  0x0000000000907404 in pg_sleep (fcinfo=<optimized out>) at misc.c:406
> 
>> #17 0x000000000086a944 in exec_simple_query (query_string=query_string@entry=0x28171c90 "SELECT pg_sleep(0.1);") at postgres.c:1274
> 
> I have no idea why WaitLatch has timeout=600000.  That should be no
> higher than timeout=100 for "SELECT pg_sleep(0.1);".  I have no
> theories aside from a failing RAM module, cosmic ray or a well-timed
> clock change between the first call to gettimeofday() in pg_sleep()
> and the next one.
> 
> I know this animal is running debug_parallel_query = regress, so that
> 0.1 Const did have to get serialized and copied to the worker, so
> there's another opportunity for the sleep duration to be stomped on,
> but that seems pretty unlikely.
> 

AFAIK that GUC is set only for HEAD, so it would not explain the
failures on the other branches.
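
As for the timeout itself, here is a rough sketch of how the value passed to
WaitLatch could end up as 600000. It assumes pg_sleep() recomputes the
remaining delay from the current time on every loop iteration and caps a
single wait at 600 seconds; that is my recollection of misc.c, so please
check it against the branch in question. The helper name sleep_timeout_ms
is made up for illustration and does not exist in the tree.

```c
/*
 * Hypothetical model of the timeout calculation, NOT the actual
 * PostgreSQL code. Both arguments are timestamps in seconds.
 * Returns the wait timeout in milliseconds, or -1 once the sleep
 * is complete.
 */
long
sleep_timeout_ms(double endtime, double now)
{
    double delay = endtime - now;   /* remaining sleep, in seconds */

    if (delay >= 600.0)
        return 600000;              /* assumed cap: 10 minutes per wait */

    if (delay > 0.0)
    {
        double scaled = delay * 1000.0;
        long   ms = (long) scaled;

        /* round up, equivalent to ceil() for positive values */
        if ((double) ms < scaled)
            ms++;
        return ms;
    }

    return -1;                      /* nothing left to wait for */
}
```

Under those assumptions, a 0.5 second sleep gives sleep_timeout_ms(0.5, 0.0)
= 500, while a clock that steps back by 1000 seconds after endtime was
computed gives sleep_timeout_ms(1000.5, 0.0) = 600000. So timeout=600000
would mean the remaining delay was computed as at least 600 seconds, which
fits the clock change theory better than a stomped-on Const.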

> I can't think of a reason why the erroneous reltuples=48 would be
> consistent over 2 failing runs if it were failing RAM or a cosmic ray.
> 

Yeah, that seems very unlikely.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


