Thread: Too many logs are written on Windows (LOG: could not reserveshared memory region (addr=%p) for child %p:)

Hi,


My customer uses PostgreSQL on Windows and hits the problem that following log is written to the server logs too
frequently(250 thousand times per day).
 
"LOG:  could not reserve shared memory region (addr=%p) for child %p:"

This log is written when pgwin32_ReserveSharedMemoryRegion() in win32_shmem.c fails. If fails, internal_forkexec() in
postmaster.cretries up to 100 times. In most of my customer cases, internal_forkexec() finally succeeded to
pgwin32_ReserveSharedMemoryRegion()by retrying.
 

According to the comment of internal_forkexec(), pgwin32_ReserveSharedMemoryRegion() sometimes fails if ASLR is active.
Ifso, I think administrators are not interested in this log since it is a normal event. 
 

I think the log level should not be "LOG", but should be lower level such as "DEBUG1".


Regards,
Ryohei Takahashi




On Tue, Nov 20, 2018 at 1:21 AM Takahashi, Ryohei
<r.takahashi_2@jp.fujitsu.com> wrote:
> My customer uses PostgreSQL on Windows and hits the problem that following log is written to the server logs too
frequently(250 thousand times per day). 
> "LOG:  could not reserve shared memory region (addr=%p) for child %p:"
>
> This log is written when pgwin32_ReserveSharedMemoryRegion() in win32_shmem.c fails. If fails, internal_forkexec() in
postmaster.cretries up to 100 times. In most of my customer cases, internal_forkexec() finally succeeded to
pgwin32_ReserveSharedMemoryRegion()by retrying. 
>
> According to the comment of internal_forkexec(), pgwin32_ReserveSharedMemoryRegion() sometimes fails if ASLR is
active.If so, I think administrators are not interested in this log since it is a normal event. 

You might be right, but maybe we should first try to understand why
this is happening so frequently.  Maybe it's not that normal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Hi,


Thank you for replying.


> You might be right, but maybe we should first try to understand why
> this is happening so frequently.  Maybe it's not that normal.

I found past threads [1] and [2].

In thread [1], the error is mentioned as an ASLR problem.
In thread [2], the patch is provided which retries pgwin32_ReserveSharedMemoryRegion() to resolve the ASLR problem.

Therefore, I think our community regard that the error is an ASLR problem and is already resolved by the patch.


[1] -
https://www.postgresql.org/message-id/flat/CACmJi2JyDLMtxE3i_Krp%3DhrK4KKZ%3DD83s6eMw9q%3DHeM_srQdXg%40mail.gmail.com
[2] -
https://www.postgresql.org/message-id/flat/CAA4eK1%2BR6hSx6t_yvwtx%2BNRzneVp%2BMRqXAdGJZChcau8Uij-8g%40mail.gmail.com



Regards,
Ryohei Takahashi

Hi,


I found the reason of the message.

My customer uses "F-secure" antivirus software.
There are several pages that indicate F-secure causes this message such as [1].
I told my customer to stop F-secure and report to the vendor.


Anyway, I think this message is not helpful to administrators and should be lower level such as "DEBUG1".


[1] - https://www.postgresql.org/message-id/flat/feb75b80-aa4e-0b45-fbe7-0c2335904fcc%40evolu-s.it


Regards,
Ryohei Takahashi


On Wed, Nov 21, 2018 at 10:05:34AM -0500, Robert Haas wrote:
> On Tue, Nov 20, 2018 at 1:21 AM Takahashi, Ryohei <r.takahashi_2@jp.fujitsu.com> wrote:
> > My customer uses PostgreSQL on Windows and hits the problem that following log is written to the server logs too
frequently(250 thousand times per day).
 
> > "LOG:  could not reserve shared memory region (addr=%p) for child %p:"

> > According to the comment of internal_forkexec(), pgwin32_ReserveSharedMemoryRegion() sometimes fails if ASLR is
active.If so, I think administrators are not interested in this log since it is a normal event.
 

Windows ASLR can't explain your system's behavior.

> You might be right, but maybe we should first try to understand why
> this is happening so frequently.  Maybe it's not that normal.

When PostgreSQL installed that workaround, we weren't aware of any system
where this message was common enough to make the log volume a problem.  I
agree the message is harmful on your system, but I'm not inclined to change it
to DEBUG1 for the benefit of one system.  Can you adopt pgbouncer, to reduce
the frequency of starting new backend processes?  That should improve your
performance, too.

Could you collect the information
http://postgr.es/m/20181203053506.GB2860387@rfd.leadboat.com requests?  That
may help us understand your system's unusual behavior.  (The issue in that
thread is related but not identical.)




On Thu, Dec 6, 2018 at 7:04 AM Takahashi, Ryohei <r.takahashi_2@jp.fujitsu.com> wrote:
Hi,


I found the reason of the message.

My customer uses "F-secure" antivirus software.
There are several pages that indicate F-secure causes this message such as [1].
I told my customer to stop F-secure and report to the vendor.


Anyway, I think this message is not helpful to administrators and should be lower level such as "DEBUG1".


In this particular case, it seems it *was* helpful, no? That's how you found out the customer used a broken antivirus product, which may certainly also cause *other* issues.

Some sort of rate limiting to reduce the volume might help, but the message itself seems to have clearly been useful. 


--
From: Magnus Hagander [mailto:magnus@hagander.net]
> In this particular case, it seems it *was* helpful, no? That's how you found
> out the customer used a broken antivirus product, which may certainly also
> cause *other* issues.
> 
> Some sort of rate limiting to reduce the volume might help, but the message
> itself seems to have clearly been useful.

+1
We can change pgwin32_ReserveSharedMemoryRegion() to take an argument "int loglevel".  Then the caller first calls it
withLOG, and DEBUGx afterwards.  It may also be helpful for the caller to output "LOG:  tried %d times to reserve
sharedmemory region" when the caller ran the function twice or more before success.  That explains the possibly long
timeor CPU spikes of connection establishment.
 


Regards
Takayuki Tsunakawa



Hi Noah, Magnus and Tsunakawa-san,


Thank you for replying.


> Can you adopt pgbouncer, to reduce
> the frequency of starting new backend processes?  That should improve your
> performance, too.

Actually, before I found that F-secure causes this message,
I recommend my customer to use connection pooling to reduce the number of connection times.


> Could you collect the information
> http://postgr.es/m/20181203053506.GB2860387@rfd.leadboat.com requests?
> That may help us understand your system's unusual behavior.  (The issue in that
> thread is related but not identical.)

Sorry. Since my customer uses PostgreSQL in production environment,
I cannot deploy debug modules.


> In this particular case, it seems it *was* helpful, no? That's how you found
> out the customer used a broken antivirus product, which may certainly also
> cause *other* issues.
> 
> Some sort of rate limiting to reduce the volume might help, but the message
> itself seems to have clearly been useful.

Yes. You are right.
The message itself was useful.
Your idea to reduce the volume seems good.


> We can change pgwin32_ReserveSharedMemoryRegion() to take an argument "int
> loglevel".  Then the caller first calls it with LOG, and DEBUGx afterwards.
> It may also be helpful for the caller to output "LOG:  tried %d times to
> reserve shared memory region" when the caller ran the function twice or
> more before success.  That explains the possibly long time or CPU spikes
> of connection establishment.

It seems good idea for me.


Regards,
Ryohei Takahashi


On Mon, Dec 10, 2018 at 02:15:07AM +0000, Takahashi, Ryohei wrote:
>> We can change pgwin32_ReserveSharedMemoryRegion() to take an argument "int
>> loglevel".  Then the caller first calls it with LOG, and DEBUGx afterwards.
>> It may also be helpful for the caller to output "LOG:  tried %d times to
>> reserve shared memory region" when the caller ran the function twice or
>> more before success.  That explains the possibly long time or CPU spikes
>> of connection establishment.
>
> It seems good idea for me.

Like a couple of others on this thread I doubt that lowering the elevel
here would be a good thing, as keeping it noisy has been what allows to
know that something has gone wrong, no?  The current behavior is
useful.
--
Michael

Attachment
Hi,


Thank you for replying.


> Like a couple of others on this thread I doubt that lowering the elevel
> here would be a good thing, as keeping it noisy has been what allows to
> know that something has gone wrong, no?  The current behavior is
> useful.

Currently, I agree with you.
I cancel my proposal.


Regards,
Ryohei Takahashi