Re: Hard limit on WAL space used (because PANIC sucks) - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Hard limit on WAL space used (because PANIC sucks)
Date	June 7, 2013 09:03:18
Msg-id	51B1A1C1.7050206@vmware.com
In response to	Re: Hard limit on WAL space used (because PANIC sucks) (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: Hard limit on WAL space used (because PANIC sucks) Re: Hard limit on WAL space used (because PANIC sucks)
List	pgsql-hackers

On 07.06.2013 00:38, Andres Freund wrote:
> On 2013-06-06 23:28:19 +0200, Christian Ullrich wrote:
>> * Heikki Linnakangas wrote:
>>
>>> The current situation is that if you run out of disk space while writing
>>> WAL, you get a PANIC, and the server shuts down. That's awful. We can
>>
>>> So we need to somehow stop new WAL insertions from happening, before
>>> it's too late.
>>
>>> A naive idea is to check if there's enough preallocated WAL space, just
>>> before inserting the WAL record. However, it's too late to check that in
>>
>> There is a database engine, Microsoft's "Jet Blue" aka the Extensible
>> Storage Engine, that just keeps some preallocated log files around,
>> specifically so it can get consistent and halt cleanly if it runs out of
>> disk space.
>>
>> In other words, the idea is not to check over and over again that there is
>> enough already-reserved WAL space, but to make sure there always is by
>> having a preallocated segment that is never used outside a disk space
>> emergency.
>
> That's not a bad technique. I wonder how reliable it would be in
> postgres.

That's no different from just having a bit more WAL space in the first 
place. We need a mechanism to stop backends from writing WAL, before you 
run out of it completely. It doesn't matter if the reservation is done 
by stashing away a WAL segment for emergency use, or by a variable in 
shared memory. Either way, backends need to stop using it up, by 
blocking or throwing an error before they enter the critical section.
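
To illustrate the shared-memory variant, here is a rough sketch of taking the
reservation before the critical section. It is not PostgreSQL code: WalBudget,
wal_budget_reserve, wal_budget_release and wal_insert are names invented for
this example, and a plain C11 atomic stands in for a counter that would really
live in shared memory. It shows only the "throw an error" flavour, not the
blocking one.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter kept in shared memory. */
typedef struct WalBudget {
    _Atomic int64_t free_bytes;   /* WAL space not yet promised to any backend */
} WalBudget;

static void wal_budget_init(WalBudget *b, int64_t total_bytes)
{
    atomic_init(&b->free_bytes, total_bytes);
}

/*
 * Try to reserve nbytes of WAL space. Must be called BEFORE the critical
 * section. Returns false and leaves the budget untouched if there is not
 * enough space left.
 */
static bool wal_budget_reserve(WalBudget *b, int64_t nbytes)
{
    int64_t cur = atomic_load(&b->free_bytes);

    while (cur >= nbytes) {
        /* On failure the CAS reloads cur, so the loop re-checks the limit. */
        if (atomic_compare_exchange_weak(&b->free_bytes, &cur, cur - nbytes))
            return true;
    }
    return false;
}

/* Give space back, e.g. once old segments have been recycled. */
static void wal_budget_release(WalBudget *b, int64_t nbytes)
{
    atomic_fetch_add(&b->free_bytes, nbytes);
}

typedef enum { WAL_INSERT_OK, WAL_INSERT_NO_SPACE } WalInsertResult;

/* Outline of an insertion path that fails cleanly instead of PANICing. */
static WalInsertResult wal_insert(WalBudget *b, int64_t record_len)
{
    if (!wal_budget_reserve(b, record_len))
        return WAL_INSERT_NO_SPACE;   /* ordinary error, nothing modified yet */

    /* The critical section would start here; the space is already reserved. */
    return WAL_INSERT_OK;
}
```

The point is the ordering: the check and the decrement happen atomically and
outside the critical section, so a backend that loses the race gets an
ordinary error rather than discovering the shortage when it is too late.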

I guess you could use the stashed-away segment to ensure that you can
recover after a PANIC. At recovery, there are no other backends that could
use up the emergency segment. But that's not much better than what we
have now.

- Heikki