Thread: HS/SR Assert server crash

HS/SR Assert server crash

From
Bruce Momjian
Date:
I was able to easily crash the standby server today just by starting it
and connecting to it via psql.  The master was idle.  The failure was:
    LOG:  streaming replication successfully connected to primary
    TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "procarray.c", Line: 1211)
    LOG:  server process (PID 12761) was terminated by signal 6: Abort trap
    LOG:  terminating any other active server processes
 

My master postgresql.conf was:
    wal_level = hot_standby                 # minimal, archive, or hot_standby
    archive_mode = on               # allows archiving to be done
    archive_command = 'cp -i %p /u/pg/archive/%f < /dev/null '  # command to use to archive a logfile segment
    max_wal_senders = 1             # max number of walsender processes
 

My slave postgresql.conf was:
    port = 5433                             # (change requires restart)
    wal_level = hot_standby                 # minimal, archive, or hot_standby
    archive_mode = off              # allows archiving to be done
    archive_command = 'cp -i %p /u/pg/archive/%f < /dev/null '      # command to use to archive a logfile segment
    hot_standby = on                # allows queries during recovery
    max_wal_senders = 1             # max number of walsender processes
 

and my slave recovery.conf was:
    restore_command = 'cp /u/pg/archive/%f %p'              # e.g. 'cp /mnt/server/archivedir/%f %p'
    standby_mode = 'on'
    primary_conninfo = 'host=localhost port=5432'           # e.g. 'host=localhost port=5432'
 

Let me know what additional information I can supply.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com


Re: HS/SR Assert server crash

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> I was able to easily crash the standby server today just by starting it
> and connecting to it via psql.  The master was idle.  The failure was:
> 
>     LOG:  streaming replication successfully connected to primary
>     TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "procarray.c", Line: 1211)
>     LOG:  server process (PID 12761) was terminated by signal 6: Abort trap
>     LOG:  terminating any other active server processes
> 
> My master postgresql.conf was:
> 
>     wal_level = hot_standby                 # minimal, archive, or hot_standby
>     archive_mode = on               # allows archiving to be done
>     archive_command = 'cp -i %p /u/pg/archive/%f < /dev/null '  # command to use to archive a logfile segment
>     max_wal_senders = 1             # max number of walsender processes
> 
> My slave postgresql.conf was:
> 
>     port = 5433                             # (change requires restart)
>     wal_level = hot_standby                 # minimal, archive, or hot_standby
>     archive_mode = off              # allows archiving to be done
>     archive_command = 'cp -i %p /u/pg/archive/%f < /dev/null '      # command to use to archive a logfile segment
>     hot_standby = on                # allows queries during recovery
>     max_wal_senders = 1             # max number of walsender processes
> 
> and my slave recovery.conf was:
> 
>     restore_command = 'cp /u/pg/archive/%f %p'              # e.g. 'cp /mnt/server/archivedir/%f %p'
>     standby_mode = 'on'
>     primary_conninfo = 'host=localhost port=5432'           # e.g. 'host=localhost port=5432'
> 
> Let me know what additional information I can supply.

I saw Simon's commit fixing this bug.  Another good reason we didn't
bundle 9.0 beta2 yesterday.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com


Re: HS/SR Assert server crash

From
Simon Riggs
Date:
On Thu, 2010-05-13 at 18:01 -0400, Bruce Momjian wrote:

> I was able to easily crash the standby server today just by starting it
> and connecting to it via psql.  The master was idle.  The failure was:
> 
>     LOG:  streaming replication successfully connected to primary
>     TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "procarray.c", Line: 1211)
>     LOG:  server process (PID 12761) was terminated by signal 6: Abort trap
>     LOG:  terminating any other active server processes

Thanks for the report. Fix applied. (Sorry for delay in replying)

-- Simon Riggs           www.2ndQuadrant.com