Re: ERROR: out of memory DETAIL: Failed on request of size ??? - Mailing list pgsql-general

From Christofer C. Bell
Subject Re: ERROR: out of memory DETAIL: Failed on request of size ???
Date
Msg-id CAOEVnYsBCXJ=vYd2bo8xqRwnE9rq_PTOQa2OKsUWec59YMRr-Q@mail.gmail.com
Whole thread Raw
In response to Re: ERROR: out of memory DETAIL: Failed on request of size ???  (Edson Richter <edsonrichter@hotmail.com>)
List pgsql-general
On Fri, Nov 22, 2013 at 1:09 PM, Edson Richter <edsonrichter@hotmail.com> wrote:
Em 19/11/2013 02:30, Brian Wong escreveu:
I've tried any work_mem value from 1gb all the way up to 40gb, with no effect on the error.  I'd like to think of this problem as a server process memory (not the server's buffers) or client process memory issue, primarily because when we tested the error there was no other load whatsoever.  Unfortunately,  the error doesn't say what kinda memory ran out.

--- Original Message ---

From: "bricklen" <bricklen@gmail.com>
Sent: November 18, 2013 7:25 PM
To: "Brian Wong" <bwong64@hotmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] ERROR: out of memory DETAIL: Failed on request of size ???

On Mon, Nov 18, 2013 at 12:40 PM, Brian Wong <bwong64@hotmail.com> wrote:
We'd like to seek out your expertise on postgresql regarding this error that we're getting in an analytical database.

Some specs:
proc: Intel Xeon X5650 @ 2.67Ghz dual procs 6-core, hyperthreading on.
memory: 48GB
OS: Oracle Enterprise Linux 6.3
postgresql version: 9.1.9
shared_buffers: 18GB

After doing a lot of googling, I've tried setting FETCH_COUNT on psql AND/OR setting work_mem.  I'm just not able to work around this issue, unless if I take most of the MAX() functions out but just one.

Excuse me (or just ignore me) if it is a stupid question, but have you configured sysctl.conf accordingly?
For instance, to use larget memory settings, I had to configure my EL as follows:

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

On most systems:
kernel.shmmax = 68719476736 (64 GB) implies a kernel.shmall = 16777216 (4k pages)

While:
kernel.shmall = 4294967296 (4k pages) implies a kernel.shmmax = 17592186044416 (16 TB)

kernel.shmmax is the amount of memory you want to reserve in bytes.  The kernel.shmmax is the amount of memory you want to reserve in pages (normally 4096 bytes each unless you are using big pages or huge pages, the rest of this assumes 4096, adjust accordingly for your setup).  So generally, kernel.shmall will be kernel.shmmax divided by 4096 (and likewise, kernel.shmmax will be kernel.shmall multiplied by 4096).

To find out your page size, use this command:
$ getconf PAGE_SIZE

To find out the maximum physical pages available in the system, use this command:
$ getconf_PHYS_PAGES

To use the 8G recommended by Tomáš Vondra earlier, and assuming 4kb pages, your settings should be:

kernel.shmmax = 8589934592
kernel.shmall = 2097152

If the database does not start (assuming you're set shared_buffers to exactly 8GB in postgresql.conf), it will give you an error message with a corrected (slightly higher) value.  Use that value for kernel.shmmax and use that number divided by your page size (again, generally 4096) for kernel.shmall.

For example, for my small database, I am using 256 MB of shared buffers.  So I initially try to use these settings:

kernel.shmmax = 268435456
kernel.shmall = 65536

However, my database does not start.  The suggested kernel.shmmax giving by PostgresSQL is 298156032 (~284 MB).  So I use replace the above with these values:

kernel.shmmax = 288940032
kernel.shmall = 70542

I'm open to correction where I've misspoken and I hope this is helpful to you.

Good luck!
 
--
Chris

"If you wish to make an apple pie from scratch, you must first invent the Universe." -- Carl Sagan


pgsql-general by date:

Previous
From: Ken Tanzer
Date:
Subject: Re: Getting non_NULL right-side values on a non-matching join?
Next
From: Ben Chobot
Date:
Subject: Re: 9.1.9 -> 9.1.10 causing corruption