Thread: [PATCH] `pg_dump -Fd` doesn't check write return status...

[PATCH] `pg_dump -Fd` doesn't check write return status...

From
Sean Chittenden
Date:
The attached patch fixes the case when `pg_dump -Fd …` is called on a partition where write(2) fails for some reason or
another.In this case, backup jobs were returning with a successful exit code even though most of the files in the dump
directorywere all zero length. 

I haven’t tested this patch’s failure conditions but the fix seems simple enough: cfwrite() needs to have its return
statuschecked everywhere and exit_horribly() upon any failure. In this case, callers of _WriteData() were not checking
thereturn status and were discarding the negative return status (e.g. ENOSPC). 

I made a cursory pass over the code and found one other instance where write status wasn’t being checked and also
includedthat. 

-sc




--
Sean Chittenden
sean@chittenden.org


Attachment

Re: [PATCH] `pg_dump -Fd` doesn't check write return status...

From
Bruce Momjian
Date:
On Sat, Mar 1, 2014 at 12:27:19PM -0800, Sean Chittenden wrote:
> The attached patch fixes the case when `pg_dump -Fd …` is called
> on a partition where write(2) fails for some reason or another. In
> this case, backup jobs were returning with a successful exit code even
> though most of the files in the dump directory were all zero length.
>
> I haven’t tested this patch’s failure conditions but the fix seems
> simple enough: cfwrite() needs to have its return status checked
> everywhere and exit_horribly() upon any failure. In this case,
> callers of _WriteData() were not checking the return status and were
> discarding the negative return status (e.g. ENOSPC).
>
> I made a cursory pass over the code and found one other instance where
> write status wasn’t being checked and also included that.

As is often the case with pg_dump, the problems you saw are a small part
of a larger set of problems in that code --- there is general ignoring of
read and write errors.

I have developed a comprehensive patch that addresses all the issues I
could find.  The use of function pointers and the calling of functions
directly and through function pointers makes the fix quite complex.  I
ended up placing checks at the lowest level and removing checks at
higher layers where they were sporadically placed.

Patch attached.  I have tested dump/restore of all four dump output
formats with the 9.3 regression database and all tests passed.  I
believe this patch is too complex to backpatch, and I don't know how it
could be fixed more simply.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment

Re: [PATCH] `pg_dump -Fd` doesn't check write return status...

From
Bruce Momjian
Date:
On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote:
> On Sat, Mar 1, 2014 at 12:27:19PM -0800, Sean Chittenden wrote:
> > The attached patch fixes the case when `pg_dump -Fd …` is called
> > on a partition where write(2) fails for some reason or another. In
> > this case, backup jobs were returning with a successful exit code even
> > though most of the files in the dump directory were all zero length.
> >
> > I haven’t tested this patch’s failure conditions but the fix seems
> > simple enough: cfwrite() needs to have its return status checked
> > everywhere and exit_horribly() upon any failure. In this case,
> > callers of _WriteData() were not checking the return status and were
> > discarding the negative return status (e.g. ENOSPC).
> >
> > I made a cursory pass over the code and found one other instance where
> > write status wasn’t being checked and also included that.
> 
> As is often the case with pg_dump, the problems you saw are a small part
> of a larger set of problems in that code --- there is general ignoring of
> read and write errors.
> 
> I have developed a comprehensive patch that addresses all the issues I
> could find.  The use of function pointers and the calling of functions
> directly and through function pointers makes the fix quite complex.  I
> ended up placing checks at the lowest level and removing checks at
> higher layers where they were sporadically placed.  
> 
> Patch attached.  I have tested dump/restore of all four dump output
> formats with the 9.3 regression database and all tests passed.  I
> believe this patch is too complex to backpatch, and I don't know how it
> could be fixed more simply.

Patch applied.  Thanks for the report.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: [PATCH] `pg_dump -Fd` doesn't check write return status...

From
Noah Misch
Date:
On Mon, May 05, 2014 at 08:29:56PM -0400, Bruce Momjian wrote:
> On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote:
> > As is often the case with pg_dump, the problems you saw are a small part
> > of a larger set of problems in that code --- there is general ignoring of
> > read and write errors.
> > 
> > I have developed a comprehensive patch that addresses all the issues I
> > could find.  The use of function pointers and the calling of functions
> > directly and through function pointers makes the fix quite complex.  I
> > ended up placing checks at the lowest level and removing checks at
> > higher layers where they were sporadically placed.  
> > 
> > Patch attached.  I have tested dump/restore of all four dump output
> > formats with the 9.3 regression database and all tests passed.  I
> > believe this patch is too complex to backpatch, and I don't know how it
> > could be fixed more simply.
> 
> Patch applied.  Thanks for the report.

This broke automatic detection of tar-format dumps:

$ pg_dump -Ft -f tardump
$ pg_restore tardump
pg_restore: [archiver] could not read from input file: Success
$ pg_restore -Ft tardump | wc -c
736

Stack trace:

#0  vwrite_msg (modulename=0x4166c9 "archiver", fmt=0x41aa08 "could not read from input file: %s\n", ap=0x7fffffffde38)
atpg_backup_utils.c:85
 
#1  0x000000000040f820 in exit_horribly (modulename=0x4166c9 "archiver", fmt=0x41aa08 "could not read from input file:
%s\n")at parallel.c:190
 
#2  0x000000000040557d in _discoverArchiveFormat (AH=0x425340) at pg_backup_archiver.c:2040
#3  0x0000000000405756 in _allocAH (FileSpec=0x7fffffffecbb "tardump", fmt=archUnknown, compression=0,
mode=archModeRead,setupWorkerPtr=0x4044f0 <setupRestoreWorker>) at pg_backup_archiver.c:2160
 
#4  0x0000000000405819 in OpenArchive (FileSpec=<optimized out>, fmt=<optimized out>) at pg_backup_archiver.c:148
#5  0x0000000000404331 in main (argc=2, argv=0x7fffffffea28) at pg_restore.c:375

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: [PATCH] `pg_dump -Fd` doesn't check write return status...

From
Bruce Momjian
Date:
On Tue, May  6, 2014 at 09:22:20AM -0400, Noah Misch wrote:
> On Mon, May 05, 2014 at 08:29:56PM -0400, Bruce Momjian wrote:
> > On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote:
> > > As is often the case with pg_dump, the problems you saw are a small part
> > > of a larger set of problems in that code --- there is general ignoring of
> > > read and write errors.
> > > 
> > > I have developed a comprehensive patch that addresses all the issues I
> > > could find.  The use of function pointers and the calling of functions
> > > directly and through function pointers makes the fix quite complex.  I
> > > ended up placing checks at the lowest level and removing checks at
> > > higher layers where they were sporadically placed.  
> > > 
> > > Patch attached.  I have tested dump/restore of all four dump output
> > > formats with the 9.3 regression database and all tests passed.  I
> > > believe this patch is too complex to backpatch, and I don't know how it
> > > could be fixed more simply.
> > 
> > Patch applied.  Thanks for the report.
> 
> This broke automatic detection of tar-format dumps:
> 
> $ pg_dump -Ft -f tardump
> $ pg_restore tardump
> pg_restore: [archiver] could not read from input file: Success
> $ pg_restore -Ft tardump | wc -c
> 736

OK, thanks for the report.  Fixed.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +