Re: Critical failure of standby - Mailing list pgsql-general

From	Alvaro Herrera
Subject	Re: Critical failure of standby
Date	August 12, 2016 19:20:43
Msg-id	20160812192035.GA703905@alvherre.pgsql
In response to	Critical failure of standby (James Sewell <james.sewell@jirotech.com>)
Responses	Re: Critical failure of standby Re: Critical failure of standby Re: Critical failure of standby
List	pgsql-general

James Sewell wrote:

> 2016-08-12 04:43:53 GMT [23614]: [5-1] user=,db=,client=  (0:00000)LOG:  consistent recovery state reached at 3/8811DFF0
> 2016-08-12 04:43:53 GMT [23614]: [6-1] user=,db=,client=  (0:XX000)FATAL:  invalid memory alloc request size 3445219328
> 2016-08-12 04:43:53 GMT [23612]: [3-1] user=,db=,client=  (0:00000)LOG:  database system is ready to accept read only connections
> 2016-08-12 04:43:53 GMT [23612]: [4-1] user=,db=,client=  (0:00000)LOG:  startup process (PID 23614) exited with exit code 1
> 2016-08-12 04:43:53 GMT [23612]: [5-1] user=,db=,client=  (0:00000)LOG:  terminating any other active server processes
> 2016-08-12 04:43:53 GMT [23612]: [6-1] user=,db=,client=  (0:00000)LOG:  archiver process (PID 23627) exited with exit code 1

What version is this?

Hm, so the startup process finds the consistent point (which signals
postmaster, hence line [3-1] from PID 23612 saying "ready to accept read
only connections") and immediately dies because of the invalid memory
alloc error.  I suppose that error must occur while trying to process
some xlog record, but without an xlog address it's difficult to say
anything.  You could try running pg_xlogdump on the WAL starting at the
last known good address, 3/8811DFF0, but I wouldn't know what to look for.
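
To locate the WAL segment that holds 3/8811DFF0 and see roughly what a
pg_xlogdump invocation could look like, here is a small sketch.  It
assumes things the report does not tell us: a server of 9.3 or later
(where pg_xlogdump exists), the default 16 MB segment size, timeline 1,
and a made-up pg_xlog path.  The helper names are illustrative only.

```python
# Sketch only: segment size, timeline and WAL directory are assumptions,
# not facts taken from the report.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default 16 MB segments (assumed)


def parse_lsn(text):
    """Turn an LSN written as 'X/Y' (two hex numbers) into one integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(lsn, timeline=1, segment_size=WAL_SEGMENT_SIZE):
    """Name of the WAL segment file containing the given LSN."""
    segments_per_xlogid = 0x100000000 // segment_size
    segno = lsn // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


def xlogdump_command(start_lsn, wal_dir, timeline=1):
    """Argument list for pg_xlogdump starting at start_lsn."""
    return ["pg_xlogdump", "-p", wal_dir, "-t", str(timeline), "-s", start_lsn]


if __name__ == "__main__":
    start = "3/8811DFF0"
    print(wal_file_name(parse_lsn(start)))
    # Hypothetical WAL directory; substitute the standby's real one.
    print(" ".join(xlogdump_command(start, "/var/lib/pgsql/data/pg_xlog")))
```

Under those assumptions the segment comes out as
000000010000000300000088; a different timeline changes only the first
eight hex digits.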

One strange thing is that xlog replay sets up an error context, so you
would have had a line like "xlog redo HEAP" etc, but there's nothing
here.  So maybe the allocation is not exactly in xlog replay, but
something different.  We'd need to see a backtrace in order to see what.
Since this occurs in the startup process, probably the easiest way is to
patch the source to turn that error into PANIC, then re-run and examine
the resulting core file.
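
For context on the error itself: ordinary allocations in the server are
refused when the requested size exceeds MaxAllocSize (0x3fffffff, one
byte short of 1 GB).  The sketch below mirrors that check in Python
rather than quoting the server source, and the function name is only
illustrative.  It says nothing about which caller asked for the memory;
that still needs the backtrace.

```python
# Sketch of the allocator's size limit; not the server's actual code.
MAX_ALLOC_SIZE = 0x3FFFFFFF  # MaxAllocSize: 1 GB minus 1 byte


def alloc_size_is_valid(size):
    """True if an ordinary allocation of this many bytes would be allowed."""
    return 0 <= size <= MAX_ALLOC_SIZE


if __name__ == "__main__":
    requested = 3445219328  # size from the FATAL line in the log
    print("request allowed:", alloc_size_is_valid(requested))
    print("request in GiB: %.2f" % (requested / 2**30))
    print("limit in bytes:", MAX_ALLOC_SIZE)
```

The request in the log is a bit over 3 GiB, well past the limit, which is
why it is rejected outright instead of being attempted.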

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
