Re: pg_dump and large files - is this a problem? - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: pg_dump and large files - is this a problem?
Date
Msg-id 200210230502.g9N52Ms21420@candle.pha.pa.us
Whole thread Raw
In response to Re: pg_dump and large files - is this a problem?  (Philip Warner <pjw@rhyme.com.au>)
Responses Re: pg_dump and large files - is this a problem?
List pgsql-hackers
OK, you are saying if we don't have fseeko(), there is no reason to use
off_t, and we may as well use long.  What limitations does that impose,
and are the limitations clear to the user.

What has me confused is that I only see two places that use a non-zero
fseeko, and in those cases, there is a non-fseeko code path that does
the same thing, or the call isn't actually required.  Both cases are in
pg_dump/pg_dump_custom.c.  It appears seeking in the file is an
optimization that prevents all the blocks from being read.  That is
fine, but we shouldn't introduce failure cases to do that.

If BSD/OS is the only problem OS, I can deal with that, but I have no
idea if other OS's have the same limitation, and because of the way our
code exists now, we are not even checking to see if there is a problem.

I did some poking around, and on BSD/OS, fgetpos/fsetpos use fpos_t,
which is actually off_t, and interestingly, lseek() uses off_t too. 
Seems only fseek/ftell is limited to long.  I can easily implemnt
fseeko/ftello using fgetpos/fsetpos, but that is only one OS.

One idea would be to patch up BSD/OS in backend/port/bsdi and add a
configure tests that actually fails if fseeko doesn't exist _and_
sizeof(off_t) > sizeof(long).  That would at least catch OS's before
they make >2gig backups that can't be restored.

---------------------------------------------------------------------------

Philip Warner wrote:
> At 10:46 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >Uh, not exactly.  I have off_t as a quad, and I don't have fseeko, so
> >the above conditional doesn't work. I want to use off_t, but can't use
> >fseek().
> 
> Then when you create dumps, they will be invalid since I assume that ftello 
> is also broken in the same way. You need to fix _getFilePos as well. And 
> any other place that uses an off_t needs to be looked at very carefully. 
> The code was written assuming that if 'hasSeek' was set, then we could 
> trust it.
> 
> Given that you say you do have support for some kind of 64 bt offset, I 
> would be a lot happier with these changes if you did something akin to my 
> original sauggestion:
> 
> #if defined(HAVE_FSEEKO)
> #define FILE_OFFSET off_t
> #define FSEEK fseeko
> #elseif defined(HAVE_SOME_OTHER_FSEEK)
> #define FILE_OFFSET some_other_offset
> #define FSEEK some_other_fseek
> #else
> #define FILE_OFFSET long
> #define FSEEK fseek
> #end if
> 
> ...assuming you have a non-broken 64 bit fseek/tell pair, then this will 
> work in all cases, and make the code a lot less ugly (assuming of course 
> the non-broken version can be shifted).
> 
> 
> 
> ----------------------------------------------------------------
> Philip Warner                    |     __---_____
> Albatross Consulting Pty. Ltd.   |----/       -  \
> (A.B.N. 75 008 659 498)          |          /(@)   ______---_
> Tel: (+61) 0500 83 82 81         |                 _________  \
> Fax: (+61) 0500 83 82 82         |                 ___________ |
> Http://www.rhyme.com.au          |                /           \|
>                                   |    --________--
> PGP key available upon request,  |  /
> and from pgp5.ai.mit.edu:11371   |/
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


pgsql-hackers by date:

Previous
From: Philip Warner
Date:
Subject: Re: pg_dump and large files - is this a problem?
Next
From: Philip Warner
Date:
Subject: Re: pg_dump and large files - is this a problem?