Thread: [bug fix] Stats collector is not restarted on the standby

[bug fix] Stats collector is not restarted on the standby

From
"Tsunakawa, Takayuki"
Date:
Hello,

If the stats collector is forcibly terminated on the standby in streaming replication configuration, it won't be
restarted until the standby is promoted to the primary.  The attached patch restarts the stats collector on the standby.

FYI, when the stats collector is down, SELECTs against the statistics views get stale data with the following message.

LOG:  using stale statistics instead of current ones because stats collector is not responding
STATEMENT:  select * from pg_stat_user_tables

Regards
Takayuki Tsunakawa



Re: [bug fix] Stats collector is not restarted on the standby

From
Michael Paquier
Date:
On Wed, Oct 26, 2016 at 2:46 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
> If the stats collector is forcibly terminated on the standby in streaming replication configuration, it won't be
> restarted until the standby is promoted to the primary.  The attached patch restarts the stats collector on the
> standby.
>
> FYI, when the stats collector is down, SELECTs against the statistics views get stale data with the following
> message.
>
> LOG:  using stale statistics instead of current ones because stats collector is not responding
> STATEMENT:  select * from pg_stat_user_tables

Oops. This could be a problem for some applications... As far as I can
see and after playing with it, your patch looks correct.
-- 
Michael



Re: [bug fix] Stats collector is not restarted on the standby

From
"Tsunakawa, Takayuki"
Date:
From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
> Oops. This could be a problem for some applications... As far as I can see
> and after playing with it, your patch looks correct.

Thank you for checking the patch.  I'm relieved.

Regards
Takayuki Tsunakawa


Re: [bug fix] Stats collector is not restarted on the standby

From
Michael Paquier
Date:
On Wed, Oct 26, 2016 at 4:10 PM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
>> Oops. This could be a problem for some applications... As far as I can see
>> and after playing with it, your patch looks correct.
>
> Thank you for checking the patch.  I'm relieved.

It would be a good idea to add that to the next CF if nobody pops into the
thread, so that we don't forget about it.
-- 
Michael



Re: [bug fix] Stats collector is not restarted on the standby

From
"Tsunakawa, Takayuki"
Date:
From: Michael Paquier [mailto:michael.paquier@gmail.com]
> It would be a good idea to add that to the next CF if nobody pops into the thread,
> so that we don't forget about it.

Thanks for the notice, done.

Regards
Takayuki Tsunakawa


Re: [bug fix] Stats collector is not restarted on the standby

From
Kuntal Ghosh
Date:
On Wed, Oct 26, 2016 at 12:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Oct 26, 2016 at 2:46 PM, Tsunakawa, Takayuki
> <tsunakawa.takay@jp.fujitsu.com> wrote:
>> If the stats collector is forcibly terminated on the standby in streaming replication configuration, it won't be
>> restarted until the standby is promoted to the primary.  The attached patch restarts the stats collector on the
>> standby.
>>
>> FYI, when the stats collector is down, SELECTs against the statistics views get stale data with the following
>> message.
>>
>> LOG:  using stale statistics instead of current ones because stats collector is not responding
>> STATEMENT:  select * from pg_stat_user_tables
>
> Oops. This could be a problem for some applications... As far as I can
> see and after playing with it, your patch looks correct.
> --
I've tested with the patch. The patch doesn't solve the problem
completely. On the standby, after forcible termination, the statistics
collector process takes some time to get restarted. In between, if
somebody SELECTs against the statistics views, they will still get stale
data with the above LOG message.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com



Re: [bug fix] Stats collector is not restarted on the standby

From
Michael Paquier
Date:
On Wed, Oct 26, 2016 at 7:12 PM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
> On Wed, Oct 26, 2016 at 12:10 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Wed, Oct 26, 2016 at 2:46 PM, Tsunakawa, Takayuki
>> <tsunakawa.takay@jp.fujitsu.com> wrote:
>>> If the stats collector is forcibly terminated on the standby in streaming replication configuration, it won't be
>>> restarted until the standby is promoted to the primary.  The attached patch restarts the stats collector on the
>>> standby.
>>>
>>> FYI, when the stats collector is down, SELECTs against the statistics views get stale data with the following
>>> message.
>>>
>>> LOG:  using stale statistics instead of current ones because stats collector is not responding
>>> STATEMENT:  select * from pg_stat_user_tables
>>
>> Oops. This could be a problem for some applications... As far as I can
>> see and after playing with it, your patch looks correct.
>> --
> I've tested with the patch. The patch doesn't solve the problem
> completely. On the standby, after forcible termination, the statistics
> collector process takes some time to get restarted. In between, if
> somebody SELECTs against the statistics views, they will still get stale
> data with the above LOG message.

If you test on a master node that would be the same: there is a delay
until the stats process restarts. I have not looked at the code closely
enough in this area (reaper()?) to determine if there are ways to
improve the responsiveness of this process restart (it is a
non-auxiliary process, btw). Still, improving this behavior is something I
feel would be invasive, and something that would be dedicated to HEAD.
The patch proposed here by Tsunakawa-san at least makes sure that a
node in PM_HOT_STANDBY state restarts it.
-- 
Michael
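
For readers following along, the condition being discussed can be sketched as a small standalone check. This is only an illustration under stated assumptions, not the patch attached to this thread: the enum, its SKETCH_ prefix, and collector_needs_start() are hypothetical names, and the only state taken from the message above is PM_HOT_STANDBY. Treating a normally running primary as a second state that restarts the collector is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified set of postmaster states for this sketch.
 * Only the hot-standby state is named in the thread; the others are
 * illustrative assumptions.
 */
typedef enum
{
    SKETCH_PM_STARTUP,
    SKETCH_PM_RECOVERY,
    SKETCH_PM_HOT_STANDBY,
    SKETCH_PM_RUN,
    SKETCH_PM_SHUTDOWN
} SketchPMState;

/*
 * Decide whether the collector should be (re)started.
 * collector_pid == 0 means "no collector process is running".
 */
bool
collector_needs_start(SketchPMState state, int collector_pid)
{
    assert(collector_pid >= 0);

    /* A collector already exists, so there is nothing to do. */
    if (collector_pid != 0)
        return false;

    /*
     * The behavior described in the thread: a hot standby should
     * restart the collector too, not only a promoted primary.
     */
    return state == SKETCH_PM_RUN || state == SKETCH_PM_HOT_STANDBY;
}
```

With a check shaped like this, a standby whose collector was killed would report that a restart is needed without waiting for promotion, which is the behavior the patch is described as providing.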



Re: [bug fix] Stats collector is not restarted on the standby

From
Kuntal Ghosh
Date:
On Wed, Oct 26, 2016 at 4:47 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>
> If you test on a master node that would be the same: there is a delay
> until the stats process restarts. I have not looked at the code closely
> enough in this area (reaper()?) to determine if there are ways to
> improve the responsiveness of this process restart (it is a
> non-auxiliary process, btw). Still, improving this behavior is something I
> feel would be invasive, and something that would be dedicated to HEAD.
> The patch proposed here by Tsunakawa-san at least makes sure that a
> node in PM_HOT_STANDBY state restarts it.
Yes, you are right. Master also has some delay for restarting the
process. Otherwise, the patch solves the problem.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com



Re: [bug fix] Stats collector is not restarted on the standby

From
Robert Haas
Date:
On Wed, Oct 26, 2016 at 8:41 AM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
> On Wed, Oct 26, 2016 at 4:47 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>
>> If you test on a master node that would be the same: there is a delay
>> until the stats process restarts. I have not looked at the code closely
>> enough in this area (reaper()?) to determine if there are ways to
>> improve the responsiveness of this process restart (it is a
>> non-auxiliary process, btw). Still, improving this behavior is something I
>> feel would be invasive, and something that would be dedicated to HEAD.
>> The patch proposed here by Tsunakawa-san at least makes sure that a
>> node in PM_HOT_STANDBY state restarts it.
> Yes, you are right. Master also has some delay for restarting the
> process. Otherwise, the patch solves the problem.

The delay is intentional.  Per pgstat_start():
    /*
     * Do nothing if too soon since last collector start.  This is a safety
     * valve to protect against continuous respawn attempts if the collector
     * is dying immediately at launch.  Note that since we will be re-called
     * from the postmaster main loop, we will get another chance later.
     */
    curtime = time(NULL);
    if ((unsigned int) (curtime - last_pgstat_start_time) <
        (unsigned int) PGSTAT_RESTART_INTERVAL)
        return 0;
    last_pgstat_start_time = curtime;
 

Committed and back-patched all the way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
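
The safety valve quoted from pgstat_start() can be tried in isolation with a sketch like the one below. It is a minimal illustration, not the PostgreSQL source: RestartThrottle, throttle_allow_start(), and RESTART_INTERVAL_SECS are hypothetical names, and the 60-second interval is an assumed value, since the thread only names PGSTAT_RESTART_INTERVAL without giving its value. The current time is passed in as a parameter so the behavior can be checked without waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Assumed interval for this sketch; the thread does not state the value. */
#define RESTART_INTERVAL_SECS 60

/* Hypothetical holder for the time of the last start attempt. */
typedef struct
{
    time_t last_start_time;     /* 0 means no attempt has been made yet */
} RestartThrottle;

/*
 * Return true and record the attempt if a start is allowed at "now".
 * Return false if the previous attempt was too recent; the caller is
 * expected to call again later, as the postmaster main loop does.
 */
bool
throttle_allow_start(RestartThrottle *t, time_t now)
{
    assert(t != NULL);

    /* Same comparison shape as the excerpt quoted in the message above. */
    if ((unsigned int) (now - t->last_start_time) <
        (unsigned int) RESTART_INTERVAL_SECS)
        return false;

    t->last_start_time = now;
    return true;
}
```

A refused attempt does not update the recorded time, so repeated calls from a polling loop eventually succeed once the interval has elapsed. That matches the delay observed earlier in the thread on both the master and the standby.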



Re: [bug fix] Stats collector is not restarted on the standby

From
"Tsunakawa, Takayuki"
Date:
From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
> The delay is intentional.  Per pgstat_start():

It's kind of you to tell the reason.


> Committed and back-patched all the way.

Thanks again!

Regards
Takayuki Tsunakawa