Re: Is there a significant difference in Memory settings between 9.5 and 12 - Mailing list pgsql-general
From | Tory M Blue |
---|---|
Subject | Re: Is there a significant difference in Memory settings between 9.5 and 12 |
Date | |
Msg-id | CAEaSS0b5gSQXq-vV0RFEpfzzB4swKT=8FqS8PF7duaRtAxq3Og@mail.gmail.com |
In response to | Re: Is there a significant difference in Memory settings between 9.5 and 12 (Thomas Munro <thomas.munro@gmail.com>) |
List | pgsql-general |
On Mon, May 11, 2020 at 9:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, May 12, 2020 at 2:52 PM Tory M Blue <tmblue@gmail.com> wrote:
> It took the change but didn't help. So 10GB of shared_buffers in 12 is still a no-go. I'm down to 5GB and it works, but this is the same hardware and the exact same 9.5 configuration, so I'm missing something. We have not had to mess with kernel memory settings since 9.4, so this is an odd one.
>
> I'll keep digging, but I'm hesitant to upgrade my multi-TB DBs with half of their shared buffer configs until I understand what 12 is doing differently from 9.5.
Which exact version of 9.5.x are you coming from? What's the exact
error message on 12 (you showed the shared_memory_type=sysv error, but
with the default value (mmap) how does it look)? What's your
huge_pages setting?
9.5.20
postgresql95-9.5.20-2PGDG.rhel7.x86_64
postgresql95-contrib-9.5.20-2PGDG.rhel7.x86_64
postgresql95-libs-9.5.20-2PGDG.rhel7.x86_64
postgresql95-server-9.5.20-2PGDG.rhel7.x86_64
I don't use huge_pages
And this error is actually from the default mmap
May 08 12:33:58 qdb01.prod.ca postmaster[8790]: < 2020-05-08 12:33:58.324 PDT >HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 11026235392 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
The above error is with 12 trying to start with shared_buffers = 10GB...
9.5 starts fine with the same configuration file. That kind of started me down this path.
And just to repeat: same exact hardware, same kernel, nothing more than installing the latest postgres12, copying my config files from 9.5 to 12, and running pg_upgrade.
9.5 has been running for years with the same configuration file, so something changed somewhere along the line that is preventing 12 from starting with the same config file. The allocation error occurs with either sysv or mmap on 12: it will start with 5GB allocated, but not 10GB, on a 15GB box (a dedicated postgres server).
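For scale, the request size in the HINT can be compared with the shared_buffers setting. This is only a rough sketch: it assumes shared_buffers = 10GB means 10 × 1024³ bytes (PostgreSQL memory units are multiples of 1024), and the helper name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Request size reported in the HINT line of the startup failure.
REQUEST_BYTES = 11026235392


def overhead_mib(request_bytes: int, shared_buffers_gib: int) -> float:
    """MiB requested beyond shared_buffers itself (hypothetical helper)."""
    return (request_bytes - shared_buffers_gib * GIB) / MIB


if __name__ == "__main__":
    print(f"total request: {REQUEST_BYTES / GIB:.2f} GiB")
    print(f"beyond shared_buffers: {overhead_mib(REQUEST_BYTES, 10):.1f} MiB")
```

So the segment is the 10GB of shared_buffers plus a few hundred MiB of other shared structures, which is what one would expect rather than a sign of something asking for far more memory.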
Can you reproduce the problem with a freshly created test cluster? As
a regular user, assuming regular RHEL packaging, something like
/usr/pgsql-12/bin/initdb -D test_pgdata, and then
/usr/pgsql-12/bin/postgres -D test_pgdata -c shared_buffers=10GB (then
^C to stop it). If that fails to start in the same way, it'd be
interesting to see the output of the second command with strace in
front of it, in the part where it allocates shared memory. And
perhaps it'd be interesting to see the same output with
/usr/pgsql-9.5/bin/XXX (if you still have the packages). For example,
on my random dev laptop that looks like:
openat(AT_FDCWD, "/proc/meminfo", O_RDONLY) = 6
fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(6, "MemTotal: 16178852 kB\nMemF"..., 1024) = 1024
read(6, ": 903168 kB\nShmemHugePages: "..., 1024) = 311
close(6) = 0
mmap(NULL, 11016339456, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 11016003584, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7ff74e579000
shmget(0x52e2c1, 56, IPC_CREAT|IPC_EXCL|0600) = 3244038
shmat(3244038, NULL, 0) = 0x7ff9df5ad000
The output is about the same on REL9_5_STABLE and REL_12_STABLE for
me, only slightly different sizes. If that doesn't fail in the same
way on your system with 12, perhaps there are some more settings from
your real clusters required to make it fail. You could add them one
by one with -c foo=bar or in the throwaway
test_pgdata/postgresql.conf, and perhaps that process might shed some
light?
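One way to organize the "add them one by one" approach is to generate the command lines up front, so each run differs from the previous one by exactly one setting. A minimal sketch: the binary path and data directory are the ones suggested above, while the settings list and its values are placeholders, not the real cluster's configuration.

```python
import shlex


def incremental_commands(postgres_bin, data_dir, settings):
    """Return one command line per step, each adding one more -c name=value."""
    commands = []
    args = [postgres_bin, "-D", data_dir]
    for name, value in settings:
        args = args + ["-c", f"{name}={value}"]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Placeholder settings; substitute the ones from the real postgresql.conf.
    candidate_settings = [
        ("shared_buffers", "10GB"),
        ("max_connections", "500"),
        ("work_mem", "64MB"),
    ]
    for cmd in incremental_commands(
        "/usr/pgsql-12/bin/postgres", "test_pgdata", candidate_settings
    ):
        print(cmd)
```

The script only prints the commands; each would be run by hand (and stopped with ^C), and the first one that fails to start points at the setting worth a closer look.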
I was going to ask if it might be a preloaded extension that is asking
for gobs of extra memory in 12, but we can see from your "Failed
system call was shmget(key=5432001, size=11026235392, 03600)" that
it's in the same ballpark as my total above for shared_buffers=10GB.
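To make the "same ballpark" comparison concrete, the mmap sizes can be pulled out of strace output and set against the failed request. This is a sketch that assumes strace prints mmap calls in the format shown above; the regular expression and function name are illustrative only.

```python
import re

# Matches e.g. "mmap(NULL, 11016003584, ..., -1, 0) = 0x7ff74e579000"
MMAP_RE = re.compile(r"mmap\(NULL, (\d+), [^)]*\)\s*=\s*(\S+)")

SAMPLE = """\
mmap(NULL, 11016339456, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 11016003584, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7ff74e579000
"""


def mmap_requests(strace_text):
    """Return (size_in_bytes, succeeded) for each mmap(NULL, ...) call."""
    # Collapse whitespace so calls wrapped across lines still match.
    flat = " ".join(strace_text.split())
    return [
        (int(m.group(1)), not m.group(2).startswith("-1"))
        for m in MMAP_RE.finditer(flat)
    ]


if __name__ == "__main__":
    failed_request = 11026235392  # size from the "Failed system call" message
    for size, ok in mmap_requests(SAMPLE):
        status = "ok" if ok else "failed"
        diff_mib = (failed_request - size) / 1024 ** 2
        print(f"{size} bytes ({status}); failed request is {diff_mib:.1f} MiB larger")
```

Running the same extraction over strace output from the 9.5 and 12 binaries would show directly whether the two versions request noticeably different sizes.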
I'd be more than happy to test this out. I'll see what I can pull tomorrow and provide some data :) I know it's not ideal to reuse the same config file; I know that various settings are added or changed (usually added), but the defaults are typically safe. After spending some time dialing in the settings for our use case, I've just kept carrying them forward.
But let me do some more testing tomorrow, since I'm trying to get to the bottom of this before I attempt my big DB upgrades. I'll spend some time testing, see whether I get similar failures, and go from there.
Appreciate the ideas!
Tory