Re: Vacuum Issues - Mailing list pgsql-admin

From Rui DeSousa
Subject Re: Vacuum Issues
Date
Msg-id 8D9390C5-E192-4232-B18D-BDF9F658DFEA@crazybean.net
In response to Re: Vacuum Issues  (Darron Harrison <darron@realtyserver.com>)
Responses Re: Vacuum Issues  (Darron Harrison <darron@realtyserver.com>)
List pgsql-admin


On Mar 26, 2020, at 5:40 PM, Darron Harrison <darron@realtyserver.com> wrote:

There is no current lag on the replicas. Replication does traverse a firewall, but we have made no changes recently.

I will say that one of the hot standbys was only recently attached, and it seems like the issues started when we began sending some longer-running queries its way. We have since placed those queries back on the master, but the vacuum issues remain.

One side effect of whatever is happening is that nightly backups are taking twice as long as normal.

It could still be a bad/stale replication session.  If the firewall drops the replication stream without sending a reset packet (bad practice), then the replication session might still be lingering on the Postgres server and holding on to a very old xmin.  Do you know if you have TCP/IP keepalive enabled? I don’t think replication sessions are listed in pg_stat_activity in 9.2; thus, you’ll have to look for TCP/IP connections to the replicas that should not exist.
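
To put a rough number on the keepalive question, here is a minimal sketch in Python (not from the original message). It assumes keepalive is active on the socket, in which case a silent peer is declared dead after roughly the idle time plus the probe interval times the probe count. The function name is hypothetical; the sample values are the stock Linux kernel defaults, which are what applies when the tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count settings in postgresql.conf are left at 0.

```python
def dead_peer_detection_seconds(idle, interval, count):
    """Approximate seconds before the kernel gives up on a silent peer.

    idle     -- seconds of inactivity before the first keepalive probe
    interval -- seconds between unanswered probes
    count    -- unanswered probes allowed before the connection is dropped
    """
    return idle + interval * count


if __name__ == "__main__":
    # Stock Linux defaults: tcp_keepalive_time=7200, tcp_keepalive_intvl=75,
    # tcp_keepalive_probes=9.
    print(dead_peer_detection_seconds(7200, 75, 9))  # 7875
```

With those defaults a dropped session can linger for over two hours before the server notices, so lower values may be worth considering when replication crosses a firewall.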

Check all the upstream servers for stale TCP/IP replication connections using netstat.  I would also look at the system’s process list for walsender processes to see if there are more than there should be, e.g. two of them for a single replica.
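
The two checks above could be scripted roughly as follows. This is only a sketch under stated assumptions, not something from the original message: it assumes Linux-style `netstat -tn` output (address and port separated by a colon), the default Postgres port 5432, and a process title containing "wal sender process" (or "walsender" on newer releases). The function names are hypothetical. It counts every established connection on that port, so the result still has to be compared by hand against the addresses of the known replicas.

```python
from collections import Counter


def established_peers(netstat_output, local_port=5432):
    """Count ESTABLISHED TCP connections to local_port, grouped by peer address.

    Expects text in the format printed by `netstat -tn` on Linux (assumption).
    """
    peers = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        # The port is whatever follows the last colon of the local address.
        if local.rsplit(":", 1)[-1] != str(local_port):
            continue
        peers[foreign.rsplit(":", 1)[0]] += 1
    return peers


def count_walsenders(ps_output):
    """Count lines of `ps` output that look like WAL sender processes."""
    markers = ("wal sender process", "walsender")
    return sum(
        1
        for line in ps_output.splitlines()
        if any(marker in line for marker in markers)
    )
```

The functions take text rather than running the commands themselves, so they can be fed from, for example, `subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout` and the equivalent call for `ps ax`. A replica address that appears more often than expected, or more walsender processes than replicas, would point to a lingering session.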

If you do find one, the best way to terminate it is to drop the TCP/IP connection.  On FreeBSD the command would be “tcpdrop”; for Linux there are a few utilities that do the same, but I just don’t recall their names.


