proposal: add 'waiting for replication' to pg_stat_activity.state - Mailing list pgsql-hackers

From Julian Schauder
Subject proposal: add 'waiting for replication' to pg_stat_activity.state
Msg-id 565F52EF.6060200@credativ.de
Responses Re: proposal: add 'waiting for replication' to pg_stat_activity.state  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
Hello Hackers,


I recently analyzed an incident where a major lag in synchronous replication
blocked a number of synchronous backends. I found myself looking at backends
that, according to pg_stat_activity, were neither waiting nor idle, yet did not
finish their work.

As it turns out, the main wait loop for syncrep updates the process title, but
is otherwise silent within postgres, including pg_stat_activity. It seems
misleading that committed but waiting backends are shown as 'active' although
they do little apart from waiting.

> # select pid, waiting, state, substr(query,1,6) from pg_stat_activity ;
>   pid  | waiting | state  | substr
> -------+---------+--------+--------
>  26294 | f       | active | END;
>  26318 | f       | active | create
>  26323 | f       | active | insert
>  26336 | f       | active | insert
(output of waiting statements [vanilla])

While 'active' is technically correct for a backend that has committed but is
waiting for replication, in the sense of 'not being available for new tasks', it
also implies that the backend is dealing with the issue at hand. The remote
host, however, is outside our cluster's control, hence all signs should point
to the standby host.
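
For illustration, the standby's side of such an incident can be inspected from
the primary through pg_stat_replication. This is only a sketch, assuming the
9.x column and function names (flush_location, pg_current_xlog_location(),
pg_xlog_location_diff()); the alias flush_lag_bytes is made up for this example.

```sql
-- Sketch only: how far each standby's flush position trails the primary.
-- Assumes 9.x naming; later releases renamed these columns and functions.
SELECT application_name,
       client_addr,
       state,
       sync_state,
       -- bytes of WAL generated on the primary but not yet flushed remotely
       pg_xlog_location_diff(pg_current_xlog_location(),
                             flush_location) AS flush_lag_bytes
FROM pg_stat_replication
ORDER BY sync_priority;
```

A large flush_lag_bytes on the standby with sync_state = 'sync' would be
consistent with backends stuck waiting after commit.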


I suggest adding a new state to pg_stat_activity.state for backends that are
waiting for their synchronous commit to be flushed on the remote host.
I chose 'waiting for synchronous replication' for now.

I would refrain from setting the waiting flag here, as the backend is not
waiting on anything internal to the server. Instead it waits for factors beyond
our cluster's control to change.


> # select pid, waiting, state, substr(query,1,6) from pg_stat_activity ;
>  pid  | waiting |                state                | substr
> ------+---------+-------------------------------------+--------
>  3360 | f       | waiting for synchronous replication | END;
>  3465 | f       | waiting for synchronous replication | create
>  3477 | f       | waiting for synchronous replication | insert
>  3489 | f       | waiting for synchronous replication | insert
(output of waiting statements [patched])
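
With such a state, finding the affected backends becomes a simple filter. A
sketch, assuming the patched state string shown above and that state_change is
updated when a backend enters the new state (I have not verified this against
the patch); the aliases are made up for this example.

```sql
-- Sketch only: list backends stuck in the proposed state, longest wait first.
-- The state string is the one proposed in this mail, not a stock value.
SELECT pid,
       now() - state_change AS time_in_state,   -- assumes state_change is reset on entry
       substr(query, 1, 40)  AS query_start
FROM pg_stat_activity
WHERE state = 'waiting for synchronous replication'
ORDER BY state_change;
```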


patch attached


regards,

Julian Schauder
