Thread: standby replication server throws invalid memory alloc request size, does not start up

Hi All,

 

This is my first postgres query to the admin list, so if I am not following the right standards for asking the question, please let me know 😊

 

 

The problem:

 

I have a postgres cluster as

 

A (primary) -> streaming replication -> B (hot_standby=on)

 

We had a power outage in one of the data centers, and when we came back up, one of the database servers (B, the standby node) started showing weird errors and would not start. A recovered fine and is running fine.

 

--------- logs

2018-06-25 10:57:04 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-06-25 10:57:04 UTC WARNING:  terminating connection because of crash of another server process
2018-06-25 10:57:04 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-06-25 10:57:04 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-06-25 10:57:04 UTC LOG:  database system is shut down
2018-06-27 16:59:28 UTC LOG:  listening on IPv4 address "0.0.0.0", port 5432
2018-06-27 16:59:28 UTC LOG:  listening on IPv6 address "::", port 5432
2018-06-27 16:59:28 UTC LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-06-27 16:59:28 UTC LOG:  database system was interrupted while in recovery at log time 2018-06-25 10:52:21 UTC
2018-06-27 16:59:28 UTC HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2018-06-27 16:59:28 UTC LOG:  entering standby mode
2018-06-27 16:59:28 UTC LOG:  recovered replication state of node 1 to 63/89B52A98
2018-06-27 16:59:28 UTC LOG:  redo starts at 9A/6F1B3888
2018-06-27 16:59:28 UTC LOG:  consistent recovery state reached at 9A/6F25BFF0
2018-06-27 16:59:28 UTC FATAL:  invalid memory alloc request size 1768185856
2018-06-27 16:59:28 UTC LOG:  database system is ready to accept read only connections
2018-06-27 16:59:28 UTC LOG:  startup process (PID 11829) exited with exit code 1
2018-06-27 16:59:28 UTC LOG:  terminating any other active server processes
2018-06-27 16:59:28 UTC LOG:  database system is shut down
pg_ctl: could not start server
Examine the log output.

-------what I tried

I did a lot of searching around “invalid memory alloc request”; it mostly points to some kind of data error, e.g.

https://www.postgresql.org/message-id/AANLkTik88wGQ6h-xFHJBU_Fj_CnXK5LX7aBws_2oANKP%40mail.gmail.com

 

Memory checks and disk checks are OK.

 

I took a filesystem snapshot, synced it onto another server, and tried to reload it; it gave the same error.
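
(Roughly what the sync looked like; rsync here is just illustrative and the paths are placeholders:)

$ rsync -a /var/lib/postgresql/10/main/ test-server:/var/lib/postgresql/10/main/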

 

Now, to rule out bad data causing this issue, I took a full dump of the primary db (A) using pg_dumpall and restored it onto the test server. I was able to load the db and query the tables without issues.
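
(Roughly like this, with hostnames as placeholders:)

$ pg_dumpall -h primary-a -U postgres -f /tmp/full_dump.sql
$ psql -h test-server -U postgres -d postgres -f /tmp/full_dump.sql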

So I am confused as to why the standby is misbehaving after the power outage.

 

Just FYI, I spun up a new node (C) as a replacement for the problem standby (B) with pg_basebackup, and it is working fine. I can query all tables without problems. My only concern is: do I need to worry about this error showing up again?
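
(For reference, the new standby was seeded roughly like this; hostnames and paths are placeholders:)

$ pg_basebackup -h primary-a -U replication -D /var/lib/postgresql/10/main -X stream -R -P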

 

 

Most of the googling points to errors showing up during some operation and then tracking down the bad data behind them, e.g.:

https://blog.dob.sk/2012/05/19/fixing-pg_dump-invalid-memory-alloc-request-size/         -   1

https://confluence.atlassian.com/jirakb/invalid-memory-alloc-request-size-440107132.html   -   2

 

I ran that function (from link 1), but could not find bad data rows.
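
(Not the exact script from link 1, just the gist of that kind of check: walk a suspect table row by row, force each row to be detoasted, and log the ctids that error out. The table name is a placeholder.)

DO $$
DECLARE
  r record;
BEGIN
  FOR r IN SELECT ctid FROM some_suspect_table LOOP
    BEGIN
      -- casting the whole row to text forces TOASTed columns to be read back
      PERFORM md5(t::text) FROM some_suspect_table t WHERE ctid = r.ctid;
    EXCEPTION WHEN OTHERS THEN
      RAISE NOTICE 'bad row at ctid %', r.ctid;
    END;
  END LOOP;
END $$;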

 

 

Also, I tried bumping things up (memory to 32GB, CPUs to 8, and shmmax to 18446744073692774399) to figure out whether there was any resource constraint that I could temporarily relieve and then debug what went wrong, but it still did not start up.

 

 

 

$ /usr/lib/postgresql/10/bin/postgres -V

postgres (PostgreSQL) 10.4 (Ubuntu 10.4-2.pgdg16.04+1)

 

 

$ grep -v "#" /etc/postgresql/10/main/postgresql.conf | sed -e '/^$/d'
ssl = on
ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem'
ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key'
log_timezone = UTC
stats_temp_directory = '/var/run/postgresql/10-main.pg_stat_tmp'
datestyle = 'iso, mdy'
timezone = UTC
default_text_search_config = 'pg_catalog.english'
log_directory = pg_log
archive_mode = on
maintenance_work_mem = 16MB
max_wal_size = 1GB
max_stack_depth = 2MB
autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_scale_factor = 0.1
shared_preload_libraries = pg_stat_statements
work_mem = 4MB
max_wal_senders = 10
temp_buffers = 16MB
archive_command = '/bin/true'
effective_cache_size = 256MB
max_replication_slots = 10
hot_standby = on
wal_keep_segments = 64
wal_level = hot_standby
listen_addresses = '*'
log_min_duration_statement = 10000
max_worker_processes = 18

 

 

 

 

sysctl:

kernel.shmall = 18446744073692774399
kernel.shmmax = 18446744073692774399

$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16047         134       14469           8        1442       15620
Swap:           975           0         975

$ nproc
4

 

 

Thanks,

Vijay

You need to worry, very much, that your setup did not preserve data write ordering when the power went out.

- you might have fsync turned off in PG
- you might be using a non-journaled filesystem
- your disks might be cheating on sync commands (consumer grade disks are notorious for this)
- you might be using consumer-grade flash that can't flush its cache on power loss

This is highly unlikely to be a PG bug, and you do need to worry that it could have been the master if you had been less
lucky.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/



> On Jun 28, 2018, at 7:17 AM, Vijaykumar Jain <vjain@opentable.com> wrote:
>
> My only concern was do I need to worry about this error showing up again?



> On Jun 28, 2018, at 7:26 AM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
>
> you do need to worry that it could have been the master

Correcting myself: you do need to worry that all 3 could have been corrupted if you had been unlucky, and that you
could have experienced corruption that was silent and you wouldn't know it--in fact you still need to worry about that.

Thanks Scott.

fsync is on.

postgres=# show fsync;
 fsync
-------
 on
(1 row)
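
(Related crash-safety settings that could be checked the same way, for completeness:)

postgres=# show full_page_writes;
postgres=# show synchronous_commit;
postgres=# show wal_sync_method;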


This is storage in QA, which is less powerful than prod (it's on an Ubuntu 16.04 VM, vSAN/SSD, and the database is on an LVM
partition).

It is an ext4 filesystem.
/path/vgx-root ext4      108G  6.3G   97G   7% /
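
(If it helps, one thing that can be checked is that write barriers have not been disabled on that mount; 'nobarrier' or 'barrier=0' in the options would undermine fsync ordering:)

$ findmnt -no OPTIONS /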

Wrt:
 - your disks might be cheating on sync commands (consumer grade disks are notorious for this)    
 - you might be using consumer-grade flash that can't flush its cache on power loss

OK, I am not very aware of the quality of the hardware; I'll check with the team. We have not seen this issue in at least
the 2 years since I have been here.
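
(One way to sanity-check the storage, I suppose, is the pg_test_fsync tool that ships with Postgres, pointed at a file on the same filesystem as the data directory; suspiciously high flushes-per-second usually means a cache is absorbing the syncs:)

$ /usr/lib/postgresql/10/bin/pg_test_fsync -f /var/lib/postgresql/test.fsync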
 

We do have multiple points of backup (S3 daily, barman, and 2 standbys for each primary).

---- though not relevant to the bug ...

This particular cluster is one of a kind, with a multi-master mesh setup (if you could call it that).
Each region has one table that is logically replicated to all the other regions' db servers.

And we spin up only one db per cluster for the same reason: it is easy to isolate the problem and use cheap
hardware (if that really is __)
 

But I'll check with the team on what we should expect from the storage that we get.


Thanks Scott. I just wanted to confirm that this has nothing to do with a PG bug.


Vijay




> On Jun 28, 2018, at 9:47 AM, Vijaykumar Jain <vjain@opentable.com> wrote:
>
> Thanks Scott.
>
> fsync is on.
>
> postgres=# show fsync;
> fsync
> -------
> on
> (1 row)
>

Yeah, I agree with Scott; it sounds like fsync() is being shorted somewhere in the I/O path.  Are you using a RAID
controller; if so, is the cache setting — write through or write back?  If it's write back, then writes are being cached
and a power outage will lead to data corruption — the only saving grace at that point would be the battery backup logic
on the controller, which may or may not be trustworthy.




Thanks guys.
We are better off with backups, as we have split the dbs across different availability groups with different power supplies etc.
So it is expected that servers will crash and that we need to fail the db over onto another region's availability group if required;
servers crash all the time, and we have been asked to make arrangements with that in mind.
So as long as this is a h/w failure, I guess only one db server gets impacted and we should be good, which is good.
If this was a PG bug, all the nodes would get impacted and we would be in deep waters :)
So thanks.

Regards,
Vijay





On 2018-06-28 10:43:16 -0400, Rui DeSousa wrote:
> 
> 
> > On Jun 28, 2018, at 9:47 AM, Vijaykumar Jain <vjain@opentable.com> wrote:
> > 
> > Thanks Scott.
> > 
> > fsync is on.
> > 
> > postgres=# show fsync;
> > fsync
> > -------
> > on
> > (1 row)
> > 
> 
> Yeah, I agree with Scott; it sounds like fsync() is being shorted
> somewhere in the I/O path.  Are you using a RAID controller; if so is
> the cache setting — write through or write back?  If it’s write back
> then writes are being cached and a power outage will lead to data
> corruption — the only saving grace at that point would be the battery
> backup logic on the controller which may or nay not be trustworthy.

How are you making the inference that that's the problem? It could very
well be a postgres bug or somebody having used pg_resetxlog or.. ?

Greetings,

Andres Freund


Hi,

On 2018-06-28 13:17:52 +0000, Vijaykumar Jain wrote:
> Hi All,
> 
> This is my first postgres query to admin list, so if I am not following the right standards for asking the question,
> pls let me know 😊
 
> 
> 
> The problem:
> 
> I have a postgres cluster as
> 
> A (primary)-> streaming replication -> B(hot_standby=on)
> 
> We had a power outage in one of the data centers, and when we got back, one of the databases servers (B the standby
> node) seem to show weird errors and is not starting up. A recovered fine, and it running fine.
 
> 
> --------- logs
> 
> 2018-06-25 10:57:04 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2018-06-25 10:57:04 UTC WARNING:  terminating connection because of crash of another server process
> 2018-06-25 10:57:04 UTC DETAIL:  The postmaster has commanded this server process to roll back the current
> transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
 
> 2018-06-25 10:57:04 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This sounds like there either was a crash or you shut down the server with force.  Could you post the
preceding lines?


> 2018-06-25 10:57:04 UTC LOG:  database system is shut down
> 2018-06-27 16:59:28 UTC LOG:  listening on IPv4 address "0.0.0.0", port 5432
> 2018-06-27 16:59:28 UTC LOG:  listening on IPv6 address "::", port 5432
> 2018-06-27 16:59:28 UTC LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
> 2018-06-27 16:59:28 UTC LOG:  database system was interrupted while in recovery at log time 2018-06-25 10:52:21 UTC
> 2018-06-27 16:59:28 UTC HINT:  If this has occurred more than once some data might be corrupted and you might need to
> choose an earlier recovery target.
 
> 2018-06-27 16:59:28 UTC LOG:  entering standby mode
> 2018-06-27 16:59:28 UTC LOG:  recovered replication state of node 1 to 63/89B52A98
> 2018-06-27 16:59:28 UTC LOG:  redo starts at 9A/6F1B3888
> 2018-06-27 16:59:28 UTC LOG:  consistent recovery state reached at 9A/6F25BFF0
> 2018-06-27 16:59:28 UTC FATAL:  invalid memory alloc request size 1768185856
> 2018-06-27 16:59:28 UTC LOG:  database system is ready to accept read only connections
> 2018-06-27 16:59:28 UTC LOG:  startup process (PID 11829) exited with exit code 1
> 2018-06-27 16:59:28 UTC LOG:  terminating any other active server processes
> 2018-06-27 16:59:28 UTC LOG:  database system is shut down


Could you restart the server with log_min_messages set to debug3, and
change log_line_prefix so it includes the pid?
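
For example, something like this in postgresql.conf (%m is the timestamp, %p the process id):

log_min_messages = debug3
log_line_prefix = '%m [%p] '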

Greetings,

Andres Freund


> On Jun 28, 2018, at 10:36 AM, Andres Freund <andres@anarazel.de> wrote:
>
> How are you making the inference that that's the problem? It could very
> well be a postgres bug or somebody having used pg_resetxlog or.. ?

There could always be a postgres bug; however, misconfigured cache layers are a very common cause of this kind of
symptom, and postgres bugs are very rare.

I don't believe pg_resetxlog would cause this particular problem. Failure to recover, yes, but not an illegal memory
operation.

Hi,

On 2018-06-28 10:43:16 -0600, Scott Ribe wrote:
> > On Jun 28, 2018, at 10:36 AM, Andres Freund <andres@anarazel.de> wrote:
> > 
> > How are you making the inference that that's the problem? It could very
> > well be a postgres bug or somebody having used pg_resetxlog or.. ?
> 
> There could always be a postgres bug, however misconfigured cache
> layers are a very common cause of this kind of symptom, and postgres
> bugs very rare.

Sure, but it's also easy to not find postgres bugs if you don't
investigate. If the error comes from parsing WAL - which is likely but
not guaranteed from the log output - then it's unlikely the error is
caused by on-disk corruption. The WAL is protected by a CRC checksum,
and if it fails, you don't get to the memory allocation bit.
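
(If it is WAL parsing, one way to look at the records around that point is pg_waldump, e.g. something like the following sketch, using the LSN from the log above; adjust the pg_wal path to the data directory in question:)

$ /usr/lib/postgresql/10/bin/pg_waldump -p /var/lib/postgresql/10/main/pg_wal -s 9A/6F25BFF0 -n 10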


> I don't believe pg_resetxlog would cause this particular
> problem. Failure to recovery yes, but not illegal memory operation.

It can. Not as a result of reading WAL itself, but as a result of the
resulting corruption. That's why I asked for debugging output.

- Andres


The server was destroyed.

But I did copy the pg_data folder and restored it on the test server.
It failed with a similar error. Will that do?

I do have some very high (DEBUG5) level logs in my console, along with strace logs of the postgresql start.
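
(For what it's worth, a sketch of how such a startup trace can be captured; the paths here are from the standard Ubuntu layout and may differ:)

$ strace -f -tt -o /tmp/pg_start.strace /usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c config_file=/etc/postgresql/10/main/postgresql.conf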

Not sure if it is ok to paste these in the mail (they will not be readable).




On 2018-06-28 17:00:23 +0000, Vijaykumar Jain wrote:
> The server was destroyed.
> 
> But did copy the pg_data folder and restored it on the test server.
> It failed with similar error. Will that do.
> 
> I do have some very high (DEBUG5) level logs in my console along with strace (postgresql start) logs.
> 
> Not sure if this is ok to paste in the mail (it will not be readable) 

Can't you attach files containing them?

Thanks,

Andres




On Jun 28, 2018, at 12:36 PM, Andres Freund <andres@anarazel.de> wrote:

How are you making the inference that that's the problem? It could very
well be a postgres bug or somebody having used pg_resetxlog or.. ?


Is this to me? Most RAID controllers I've seen have been set up with write back for performance, and they also often seem to have faulty batteries.

I'm merely stating that the I/O path needs to be validated to ensure fsync() actually works — otherwise a power failure will introduce data corruption without doubt.

We know the event occurred after a power failure, and that event would not have corrupted Postgres in a properly configured system; thus it points to a misconfiguration in the system.
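
(Purely as an illustration: on a plain SATA/SAS disk the drive-level write cache can at least be inspected with hdparm; behind a RAID controller or vSAN the vendor tooling would be needed instead.)

$ sudo hdparm -W /dev/sda      # reports whether the drive's volatile write cache is enabled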



 

I do not have earlier system logs; the machine was wiped.
I only got access to the db dump which I took from that machine.
These logs are from when I restored the /var/lib/postgresql of the problem server onto the test server.

I have enabled most of the debugging flags; it may be verbose, but I have the logs attached to this mail.

Let me know if anything else is needed.


Thanks,
Vijay
