Home > mailing lists

Re: Critical failure of standby - Mailing list pgsql-general

From	James Sewell
Subject	Re: Critical failure of standby
Date	August 13, 2016 00:57:01
Msg-id	CAANVwEso9qL8h0xyk5sPgHz5+SpcyMJZJc1O5kdbGkX+-=2kLA@mail.gmail.com Whole thread Raw
In response to	Re: Critical failure of standby (Alvaro Herrera <alvherre@2ndquadrant.com>)
List	pgsql-general

Tree view

Hello,

I double posted this (posted once from an unregistered email and assumed it would be junked).

I'm continuing all discussion on the other thread now.

Cheers,

James Sewell,

PostgreSQL Team Lead / Solutions Architect

Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009

P (+61) 2 8099 9000 W www.jirotech.com F (+61) 2 8099 9099

On Sat, Aug 13, 2016 at 5:20 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

James Sewell wrote:

> 2016-08-12 04:43:53 GMT [23614]: [5-1] user=,db=,client= (0:00000)LOG: consistent recovery state reached at 3/8811DFF0
> 2016-08-12 04:43:53 GMT [23614]: [6-1] user=,db=,client= (0:XX000)FATAL: invalid memory alloc request size 3445219328
> 2016-08-12 04:43:53 GMT [23612]: [3-1] user=,db=,client= (0:00000)LOG: database system is ready to accept read only connections
> 2016-08-12 04:43:53 GMT [23612]: [4-1] user=,db=,client= (0:00000)LOG: startup process (PID 23614) exited with exit code 1
> 2016-08-12 04:43:53 GMT [23612]: [5-1] user=,db=,client= (0:00000)LOG: terminating any other active server processes
> 2016-08-12 04:43:53 GMT [23612]: [6-1] user=,db=,client= (0:00000)LOG: archiver process (PID 23627) exited with exit code 1

What version is this?

Hm, so the startup process finds the consistent point (which signals
postmaster so that line 23612/3 says "ready to accept read-only conns")
and immediately dies because of the invalid memory alloc error. I
suppose that error must be while trying to process some xlog record, but
without a xlog address it's difficult to say anything. I suppose you
could try to pg_xlogdump WAL starting at the last known good address
3/8811DFF0 but I wouldn't know what to look for.

One strange thing is that xlog replay sets up an error context, so you
would have had a line like "xlog redo HEAP" etc, but there's nothing
here. So maybe the allocation is not exactly in xlog replay, but
something different. We'd need to see a backtrace in order to see what.
Since this occurs in the startup process, probably the easiest way is to
patch the source to turn that error into PANIC, then re-run and examine
the resulting core file.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.

pgsql-general by date:

From: James Sewell
Date: 13 August 2016, 00:54:08
Subject: Re: Critical failure of standby

From: Patrick B
Date: 13 August 2016, 09:51:02
Subject: Re: PK Index - Removal

Re: Critical failure of standby - Mailing list pgsql-general

Previous

Next