Re: Hard limit on WAL space used (because PANIC sucks) - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Hard limit on WAL space used (because PANIC sucks)
Date	January 23, 2014 12:57:02
Msg-id	CA+U5nMKaZGHofGY6O=ZUvz_V+n=ooh3CmO4cQhy=2dKrcuNiRg@mail.gmail.com
In response to	Re: Hard limit on WAL space used (because PANIC sucks) (Jim Nasby <jim@nasby.net>)
Responses	Re: Hard limit on WAL space used (because PANIC sucks)
List	pgsql-hackers

On 23 January 2014 01:19, Jim Nasby <jim@nasby.net> wrote:
> On 1/21/14, 6:46 PM, Andres Freund wrote:
>>
>> On 2014-01-21 16:34:45 -0800, Peter Geoghegan wrote:
>>>
>>> On Tue, Jan 21, 2014 at 3:43 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>>>>
>>>> I personally think this isn't worth complicating the code for.
>>>
>>> You're probably right. However, I don't see why the bar has to be very
>>> high when we're considering the trade-off between taking some
>>> emergency precaution against having a PANIC shutdown, and an assured
>>> PANIC shutdown.
>>
>> Well, the problem is that the tradeoff would very likely include making
>> already complex code even more complex. None of the proposals, even the
>> one just decreasing the likelihood of a PANIC, look like they'd end up
>> being simple implementation-wise.
>> And that additional complexity would hurt robustness and prevent things
>> I find much more important than this.
>
>
> If we're not looking for perfection, what's wrong with Peter's idea of a
> ballast file? Presumably the check to see if that file still exists would be
> cheap so we can do that before entering the appropriate critical section.
>
> There's still a small chance that we'd end up panicking, but it's better than
> today. I'd argue that even if it doesn't work for CoW filesystems it'd still
> be a win.

I grant that it does sound simple enough for a partial stop gap.

My concern is that it provides only a short delay before the eventual
disk-full situation, which it doesn't actually prevent.
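
To make the idea and its limitation concrete, here is a minimal sketch in C of what a ballast file could look like. The function names, the use of posix_fallocate(), and the error handling are all assumptions for illustration; none of this comes from a patch in this thread or from the PostgreSQL source.

```c
/* Sketch only: hypothetical helper names, not PostgreSQL source. */
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reserve `size` bytes of disk blocks in a ballast file at `path`.
 * Returns 0 on success, or an errno-style error number on failure.
 */
static int
ballast_create(const char *path, off_t size)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0600);
    int rc;

    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly, not via errno */
    rc = posix_fallocate(fd, 0, size);
    if (rc == 0 && fsync(fd) != 0)
        rc = errno;
    if (close(fd) != 0 && rc == 0)
        rc = errno;
    return rc;
}

/* Cheap check, meant to run before entering a critical section. */
static bool
ballast_present(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 && st.st_size > 0;
}

/*
 * Emergency release: unlink the ballast so its blocks return to the
 * filesystem. Returns 0 on success, or errno on failure.
 */
static int
ballast_release(const char *path)
{
    return unlink(path) == 0 ? 0 : errno;
}
```

Releasing the ballast hands back a fixed amount of space exactly once. After that the system is back where it started, only with a little more headroom, which is why this delays the disk-full condition rather than preventing it. On a CoW filesystem the preallocation may not guarantee that later writes succeed, as already noted upthread.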

IMHO the main issue now is how we clear down old WAL files. We need to
perform a checkpoint to do that - and as has been pointed out in
relation to my proposal, we cannot complete that because of locks that
will be held for some time when we do eventually lock up.

That issue is not solved by having ballast file(s).

IMHO we need to resolve the deadlock inherent in the
disk-full/WAL lock-up/checkpoint situation. My view is that this can be
solved in a similar way to the way the buffer pin deadlock was
resolved for Hot Standby.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
