Thread: BUG #4089: When available disk space is low pg_stop_backup() fails, as do subsequent recovery attempts.

The following bug has been logged online:

Bug reference:      4089
Logged by:          John Smith
Email address:      sodgodofall@gmail.com
PostgreSQL version: 8.3.0
Operating system:   Linux 2.6.20-gentoo-r8
Description:        When available disk space is low pg_stop_backup() fails,
as do subsequent recovery attempts.
Details:

Steps to reproduce:
 -- start with a running PG instance with WAL archiving enabled
 -- select pg_start_backup('test');
 -- Fill up the disk on which the data directory is present
 -- select pg_stop_backup();
    -- fails with: ERROR:  could not write file
"pg_xlog/000000010000000000000000.004989E8.backup": No space left on device
    -- at this point there is a 0-byte file
pg_xlog/000000010000000000000000.004989E8.backup present on disk
 -- stop and start PG
 -- recovery fails with: FATAL:  invalid data in file
"000000010000000000000000.004989E8.backup"
 -- NOTE: At this point removing 000000010000000000000000.004989E8.backup
allows PG to start successfully
On Fri, Apr 4, 2008 at 7:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "John Smith" <sodgodofall@gmail.com> writes:
>  > Steps to reproduce:
>  >  -- start with a running PG instance with WAL archiving enabled
>  >  -- select pg_start_backup('test');
>  >  -- Fill up the disk on which the data directory is present
>  >  -- select pg_stop_backup();
>  >     -- fails with: ERROR:  could not write file
>  > "pg_xlog/000000010000000000000000.004989E8.backup": No space left on device
>  >     -- at this point there is a 0-byte file
>  > pg_xlog/000000010000000000000000.004989E8.backup present on disk
>  >  -- stop and start PG
>  >  -- recovery fails with: FATAL:  invalid data in file
>  > "000000010000000000000000.004989E8.backup"
>  >  -- NOTE: At this point removing 000000010000000000000000.004989E8.backup
>  > allows PG to start successfully
>
>  What do you see as the bug here?  Seems like reasonable behavior to me.
>
>                         regards, tom lane
>

I was expecting one of two things:
1. The zero-byte file is removed upon failure to write during
pg_stop_backup(), or
2. The zero-byte file is ignored or deleted on startup, since the
administrator has no choice but to delete the file upon a failed
startup.
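
To illustrate the manual workaround described above, here is a minimal Python sketch that lists zero-byte backup history files in a pg_xlog directory. It is not part of PostgreSQL. The function name find_empty_backup_files and the file-name pattern (24 hex digits, a dot, 8 hex digits, ".backup") are assumptions inferred from the file name in this report. The sketch only reports candidates and deletes nothing.

```python
import os
import re

# Assumed naming pattern, inferred from the file in this report:
# 000000010000000000000000.004989E8.backup
BACKUP_RE = re.compile(r"^[0-9A-F]{24}\.[0-9A-F]{8}\.backup$")


def find_empty_backup_files(xlog_dir):
    """Return the sorted names of zero-byte backup history files in xlog_dir."""
    empty = []
    for name in os.listdir(xlog_dir):
        path = os.path.join(xlog_dir, name)
        # Only regular files that match the pattern and have size 0 qualify.
        if (BACKUP_RE.match(name)
                and os.path.isfile(path)
                and os.path.getsize(path) == 0):
            empty.append(name)
    return sorted(empty)
```

Calling find_empty_backup_files("/path/to/data/pg_xlog") (path hypothetical) with the server stopped would return the names an administrator could then inspect and remove by hand.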

- John
"John Smith" <sodgodofall@gmail.com> writes:
> Steps to reproduce:
>  -- start with a running PG instance with WAL archiving enabled
>  -- select pg_start_backup('test');
>  -- Fill up the disk on which the data directory is present
>  -- select pg_stop_backup();
>     -- fails with: ERROR:  could not write file
> "pg_xlog/000000010000000000000000.004989E8.backup": No space left on device
>     -- at this point there is a 0-byte file
> pg_xlog/000000010000000000000000.004989E8.backup present on disk
>  -- stop and start PG
>  -- recovery fails with: FATAL:  invalid data in file
> "000000010000000000000000.004989E8.backup"
>  -- NOTE: At this point removing 000000010000000000000000.004989E8.backup
> allows PG to start successfully

What do you see as the bug here?  Seems like reasonable behavior to me.

            regards, tom lane