Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders
Date
Msg-id 9852.1492878455@sss.pgh.pa.us
In response to Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] [COMMITTERS] pgsql: Replication lag tracking for walsenders
List pgsql-hackers
I wrote:
> Taking a quick census of other buildfarm machines that are known to be
> running the recovery test, it appears that most (not all) are seeing
> one or both traps.  But the test is reporting success anyway, everywhere
> except on Noah's 32-bit AIX critters.

Or, to be a bit more scientific, let's dig into the buildfarm database.
A couple more critters have started running the recovery test since
yesterday; these are the latest reports we have:

pgbfprod=> select sysname, max(snapshot) as newest, count(*) from build_status_log where log_stage = 'recovery-check.log' group by 1 order by 2;
    sysname    |       newest        | count
---------------+---------------------+-------
 hamster       | 2016-09-24 16:00:07 |   182
 jacana        | 2017-04-20 21:00:20 |     3
 skink         | 2017-04-22 05:00:01 |     2
 sungazer      | 2017-04-22 06:07:17 |     7
 tern          | 2017-04-22 06:38:09 |     8
 hornet        | 2017-04-22 06:41:12 |     7
 mandrill      | 2017-04-22 08:44:09 |     8
 nightjar      | 2017-04-22 13:54:24 |    55
 longfin       | 2017-04-22 14:29:17 |    13
 calliphoridae | 2017-04-22 14:30:01 |     4
 piculet       | 2017-04-22 14:30:01 |     3
 culicidae     | 2017-04-22 14:30:01 |     5
 francolin     | 2017-04-22 14:30:01 |     3
 prion         | 2017-04-22 14:33:05 |    12
 crake         | 2017-04-22 14:37:21 |    86
(15 rows)

Grepping those specific reports for "TRAP" yields
    sysname    |                                   l
---------------+------------------------------------------------------------------------
 jacana        | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 sungazer      | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 sungazer      | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 tern          | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 tern          | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 hornet        | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 hornet        | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 mandrill      | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 mandrill      | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 nightjar      | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "/pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/replication/walsender.c", Line: 3331)
 nightjar      | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/pgbuild/root/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 longfin       | TRAP: FailedAssertion("!(lsn >= prev.lsn)", File: "walsender.c", Line: 3331)
 longfin       | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "subtrans.c", Line: 92)
 calliphoridae | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 piculet       | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-piculet/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 culicidae     | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 francolin     | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/andres/build/buildfarm-francolin/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
 prion         | TRAP: FailedAssertion("!(*ptr == ((TransactionId) 0) || (*ptr == parent && overwriteOK))", File: "/home/ec2-user/bf/root/HEAD/pgsql.build/../pgsql/src/backend/access/transam/subtrans.c", Line: 92)
(18 rows)

So 6 of 15 critters are getting the walsender.c assertion,
and those six plus six more are seeing the subtrans.c one,
and three are seeing neither one.  There's probably a pattern
to that, don't know what it is.

(Actually, it looks like hamster stopped running this test
a long time ago, so whatever is in its last report is probably
not very relevant.  So more like 12 of 14 critters are seeing
one or both traps.)
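
For anyone wanting to re-check that arithmetic, here is a minimal sketch in Python. It only restates the critter names from the two query results above as hard-coded sets; the names census, all_critters, walsender_trap and subtrans_trap are made up for illustration, and nothing is fetched from the buildfarm database.

```python
# Critter names copied from the two query results above (hard-coded;
# this sketch does not query the buildfarm database).
all_critters = {
    "hamster", "jacana", "skink", "sungazer", "tern", "hornet", "mandrill",
    "nightjar", "longfin", "calliphoridae", "piculet", "culicidae",
    "francolin", "prion", "crake",
}

# Critters whose latest report contains the walsender.c assertion.
walsender_trap = {"sungazer", "tern", "hornet", "mandrill", "nightjar", "longfin"}

# Critters whose latest report contains the subtrans.c assertion:
# the six above plus six more.
subtrans_trap = walsender_trap | {
    "jacana", "calliphoridae", "piculet", "culicidae", "francolin", "prion",
}


def census(critters):
    """Return (walsender, subtrans, either, neither) counts for a set of critters."""
    ws = walsender_trap & critters
    st = subtrans_trap & critters
    either = ws | st
    return len(ws), len(st), len(either), len(critters - either)


print(census(all_critters))                # all 15 reporting critters
print(census(all_critters - {"hamster"}))  # ignoring hamster's stale report
```

The first call gives the 6 / 12 / 3 split over all 15 critters, and the second gives the 12-of-14 figure once hamster is left out.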
        regards, tom lane


