Re: [GENERAL] Causeless CPU load waves in backend, on windows, 9.5.5(EDB binary). - Mailing list pgsql-general

From Nikolai Zhubr
Subject Re: [GENERAL] Causeless CPU load waves in backend, on windows, 9.5.5(EDB binary).
Date
Msg-id 589460EA.2010404@yandex.ru
In response to Re: [GENERAL] Causeless CPU load waves in backend, on windows, 9.5.5(EDB binary).  (Nikolai Zhubr <n-a-zhubr@yandex.ru>)
Responses Re: [GENERAL] Causeless CPU load waves in backend, on windows, 9.5.5(EDB binary).  (Nikolai Zhubr <n-a-zhubr@yandex.ru>)
List pgsql-general
02.02.2017 2:14, I wrote:
> 01.02.2017 1:02, I wrote:
> [...]
>>> Could you use process monitor or such to see what the process is doing
>>> while using a lot of CPU?
>>
>> I'm not sure how to do this, especially considering that the process in
>> question is running as a service?
>>
>> Now, some more input:
>>
>> * 9.5.2 server running on linux x86_64 - unaffected! (What a relief! We
>> are moving to Centos soon anyway!)
>>
>> * 9.4.4 server running on win7 32-bit - affected, same thing as on XP.
>
> I've managed to create a "fix" (see diff below).
> It looks like the wait logic is somehow broken on windows currently,
> though I can not find the problem myself yet.
> It would be great if someone more familiar with the (windows-specific)
> code came up with ideas.
> I have a build environment ready so I could do more tests then.

Some update.

Adding this "Sleep(15)" before "goto retry" in secure_read() has
apparently eliminated the effect on our production server too. That is,
my load-bug-detector has been quiet for more than 24 hours.
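
For illustration, here is a minimal, portable sketch of the general shape of such a throttled retry loop. It is not the PostgreSQL code: throttled_read(), fake_read(), fake_sleep() and struct fake_sock are hypothetical names invented for this example, and the sleep is passed in as a callback so the loop can be exercised without a real socket or the Windows Sleep() call.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the real socket read and Sleep(). */
typedef long (*read_fn)(void *ctx, char *buf, size_t len);
typedef void (*sleep_fn)(void *ctx, unsigned ms);

/*
 * Retry a non-blocking read until it returns data, EOF, or a hard error.
 * When the read would block, pause for delay_ms before retrying, so that
 * a wait primitive returning too early cannot turn this into a busy loop.
 * Returns the read result; *retries receives the number of retries taken.
 */
static long
throttled_read(read_fn rd, sleep_fn sl, void *ctx,
               char *buf, size_t len, unsigned delay_ms, unsigned *retries)
{
    *retries = 0;
    for (;;)
    {
        long n;

        errno = 0;
        n = rd(ctx, buf, len);
        if (n >= 0 || (errno != EWOULDBLOCK && errno != EAGAIN))
            return n;           /* data, EOF, or a real error */
        (*retries)++;
        sl(ctx, delay_ms);      /* the patch does Sleep(15) at this point */
    }
}

/* Fake socket for demonstration: would block a few times, then yields 1 byte. */
struct fake_sock
{
    int      empty_reads_left;  /* reads that still report "would block" */
    unsigned slept_ms;          /* total time "slept" so far */
};

static long
fake_read(void *ctx, char *buf, size_t len)
{
    struct fake_sock *s = ctx;

    if (s->empty_reads_left > 0)
    {
        s->empty_reads_left--;
        errno = EWOULDBLOCK;
        return -1;
    }
    if (len == 0)
        return 0;
    buf[0] = 'x';
    return 1;
}

static void
fake_sleep(void *ctx, unsigned ms)
{
    struct fake_sock *s = ctx;

    s->slept_ms += ms;          /* record the delay instead of sleeping */
}
```

Note that a fixed delay like this only bounds how fast the loop can spin. It does not explain why the wait returns early in the first place, which is why I call it a "fix" in quotes.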

Now, by adding more debugging output to secure_read() and secure_write(),
I've found that:

* secure_write() is likely irrelevant, as "goto retry" there was never
actually hit yet;

* in secure_read(), during the intervals of excessive CPU load,
WaitLatchOrSocket() was never observed to indicate a latch event. It was
also never observed to (erroneously) indicate socket readiness more than
once in a row (with a socket read attempt in between), which is what I
had suspected. So I cannot blame secure_read() itself, and this all
makes me wonder even more...
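
The kind of bookkeeping described above can be sketched roughly as follows. This is only an assumed illustration, not the actual debugging code added to be-secure.c: struct retry_stats and record_iteration() are hypothetical names, and the three flags are assumed to be derived by the caller from the WaitLatchOrSocket() result and from the outcome of the read attempt that follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters collected across iterations of the retry loop (hypothetical). */
struct retry_stats
{
    unsigned latch_events;        /* wait reported the latch as set */
    unsigned spurious_ready;      /* socket reported readable, read would block */
    unsigned cur_streak;          /* current run of spurious readiness reports */
    unsigned max_spurious_streak; /* longest such run seen so far */
};

/*
 * Record one loop iteration: what the wait reported, and whether the read
 * attempt that followed it would have blocked.
 */
static void
record_iteration(struct retry_stats *rs, bool latch_set,
                 bool sock_readable, bool read_would_block)
{
    assert(rs != NULL);

    if (latch_set)
        rs->latch_events++;

    if (sock_readable && read_would_block)
    {
        /* readiness was reported, but no data was actually available */
        rs->spurious_ready++;
        rs->cur_streak++;
        if (rs->cur_streak > rs->max_spurious_streak)
            rs->max_spurious_streak = rs->cur_streak;
    }
    else if (sock_readable)
    {
        /* readiness was genuine, so the streak ends */
        rs->cur_streak = 0;
    }
}
```

In these terms, the observation is that latch_events stayed at 0 and max_spurious_streak never exceeded 1 during the load intervals.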

Note: I'm testing with SSL off now.

As always, any hints are greatly appreciated!


Thank you.
Nikolai

>
> --- be-secure.c.orig 2017-02-01 22:37:37.228032608 +0300
> +++ be-secure.c 2017-02-01 22:51:17.655751292 +0300
> @@ -159,6 +159,7 @@
> * socket to become ready again.
> */
> }
> + Sleep(15); /* n.zhubr */
> goto retry;
> }
>
> @@ -238,6 +239,7 @@
> * socket to become ready again.
> */
> }
> + Sleep(15); /* n.zhubr */
> goto retry;
> }
>
>
> Thank you.
>
> Nikolai
>
>>
>>
>> Thank you.
>>
>> Nikolai
>>
>>>
>>> Regards,
>>>
>>> Andres
>>>
>>
>>
>>
>
>
>


