Re: Replication slots and footguns - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Replication slots and footguns
Msg-id 5320FB0F.40805@agliodbs.com
In response to Replication slots and footguns  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On 03/12/2014 04:52 PM, Thom Brown wrote:
> On 12 March 2014 23:17, Michael Paquier <michael.paquier@gmail.com> wrote:
>> On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown <thom@linux.com> wrote:
>>> I'm not clear on why would dropping an active replication slot would
>>> solve disk space problems related to WAL.  I thought it was inactive
>>> slots that were the problem in this regard?
>> You could still have an active slot with a standby that is not able to
>> catch up AFAIK.
> 
> In that scenario, why would one wish to drop the replication slot?  If
> it can't keep up, dropping the replication slot would likely mean
> you'd orphan the standby due to the primary no longer holding on to
> the necessary WAL, and the standby is then useless.  In which case, if
> the standby is causing such problems, why not shut down that standby,
> thereby effectively decommissioning it, then delete the slot?

The problem I'm anticipating is that the replica server is actually
offline, but the master doesn't know it yet.  So here's the situation:

1. a replica with a slot dies
2. WAL segments start piling up and the master is running low on disk space
3. the replica is still marked "active" because we're waiting for the default
TCP timeout (3+ hours) or for the proxy to kill the connection (forever).
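
To make the scenario above concrete, here is a rough sketch of how one might spot a slot that is holding back a lot of WAL whether or not it is still flagged "active". It works on plain tuples rather than a live connection; on a real server the values would presumably come from the pg_replication_slots view (slot_name, active, restart_lsn) and the current WAL write location. The Slot type, the function names, and the 16 MB threshold are all made up for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Slot(NamedTuple):
    """Hypothetical stand-in for one row of pg_replication_slots."""
    name: str
    active: bool
    restart_lsn: Optional[str]  # e.g. "0/1000000"; None if nothing is reserved


def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN ("hi/lo", both hexadecimal) to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def retained_bytes(current_lsn: str, slot: Slot) -> int:
    """Bytes of WAL the master must keep because of this slot."""
    if slot.restart_lsn is None:
        return 0
    return max(0, lsn_to_int(current_lsn) - lsn_to_int(slot.restart_lsn))


def slots_over_threshold(
    current_lsn: str, slots: List[Slot], threshold_bytes: int
) -> List[Tuple[str, bool, int]]:
    """Return (name, active, retained) for slots retaining more than the threshold.

    The "active" flag is reported but deliberately not used as a filter:
    in the scenario above a dead replica can still show up as active.
    """
    flagged = []
    for slot in slots:
        retained = retained_bytes(current_lsn, slot)
        if retained > threshold_bytes:
            flagged.append((slot.name, slot.active, retained))
    return flagged


if __name__ == "__main__":
    # Invented sample data: replica1 is far behind but still marked active.
    sample = [
        Slot("replica1", True, "0/1000000"),
        Slot("replica2", True, "0/4F00000"),
    ]
    for name, active, retained in slots_over_threshold(
        "0/5000000", sample, 16 * 1024 * 1024
    ):
        print(name, "active" if active else "inactive", retained)
```

The point of reporting on retained WAL rather than on the active flag is exactly step 3: the flag only says the master still believes a connection exists.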

But as Andres has shown, there are two ways to fix the above.  So we're
in good shape.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


