Re: pg_dump and large files - is this a problem? - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: pg_dump and large files - is this a problem? |
Date | |
Msg-id | 200210230502.g9N52Ms21420@candle.pha.pa.us |
In response to | Re: pg_dump and large files - is this a problem? (Philip Warner <pjw@rhyme.com.au>) |
Responses | Re: pg_dump and large files - is this a problem? |
List | pgsql-hackers |
OK, you are saying if we don't have fseeko(), there is no reason to use
off_t, and we may as well use long.  What limitations does that impose,
and are the limitations clear to the user?

What has me confused is that I only see two places that use a non-zero
fseeko, and in those cases, there is a non-fseeko code path that does
the same thing, or the call isn't actually required.  Both cases are in
pg_dump/pg_dump_custom.c.  It appears seeking in the file is an
optimization that prevents all the blocks from being read.  That is
fine, but we shouldn't introduce failure cases to do that.

If BSD/OS is the only problem OS, I can deal with that, but I have no
idea if other OS's have the same limitation, and because of the way our
code exists now, we are not even checking to see if there is a problem.

I did some poking around, and on BSD/OS, fgetpos/fsetpos use fpos_t,
which is actually off_t, and interestingly, lseek() uses off_t too.
Seems only fseek/ftell is limited to long.  I can easily implement
fseeko/ftello using fgetpos/fsetpos, but that is only one OS.

One idea would be to patch up BSD/OS in backend/port/bsdi and add a
configure test that actually fails if fseeko doesn't exist _and_
sizeof(off_t) > sizeof(long).  That would at least catch OS's before
they make >2gig backups that can't be restored.

---------------------------------------------------------------------------

Philip Warner wrote:
> At 10:46 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >Uh, not exactly.  I have off_t as a quad, and I don't have fseeko, so
> >the above conditional doesn't work.  I want to use off_t, but can't use
> >fseek().
>
> Then when you create dumps, they will be invalid since I assume that ftello
> is also broken in the same way. You need to fix _getFilePos as well. And
> any other place that uses an off_t needs to be looked at very carefully.
> The code was written assuming that if 'hasSeek' was set, then we could
> trust it.
>
> Given that you say you do have support for some kind of 64 bit offset, I
> would be a lot happier with these changes if you did something akin to my
> original suggestion:
>
> #if defined(HAVE_FSEEKO)
> #define FILE_OFFSET off_t
> #define FSEEK fseeko
> #elif defined(HAVE_SOME_OTHER_FSEEK)
> #define FILE_OFFSET some_other_offset
> #define FSEEK some_other_fseek
> #else
> #define FILE_OFFSET long
> #define FSEEK fseek
> #endif
>
> ...assuming you have a non-broken 64 bit fseek/tell pair, then this will
> work in all cases, and make the code a lot less ugly (assuming of course
> the non-broken version can be shifted).
>
> ----------------------------------------------------------------
> Philip Warner
> Albatross Consulting Pty. Ltd.
> (A.B.N. 75 008 659 498)
> Tel: (+61) 0500 83 82 81
> Fax: (+61) 0500 83 82 82
> Http://www.rhyme.com.au
> PGP key available upon request,
> and from pgp5.ai.mit.edu:11371

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
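
For readers following the thread, the two ideas above (Philip's FILE_OFFSET/FSEEK selection and a check that fails when off_t is wider than long but fseeko() is missing) could be sketched roughly as follows. This is only an illustration under assumptions, not the pg_dump code: HAVE_FSEEKO is assumed to be set by configure, and FTELL, offsets_are_safe() and seek_round_trip() are hypothetical names introduced here.

```c
#include <stdio.h>
#include <stddef.h>
#include <sys/types.h>   /* off_t */

/* Assumption: configure defines HAVE_FSEEKO when fseeko()/ftello() exist. */
#if defined(HAVE_FSEEKO)
#define FILE_OFFSET off_t
#define FSEEK fseeko
#define FTELL ftello     /* hypothetical companion macro: tell must match seek */
#else
#define FILE_OFFSET long
#define FSEEK fseek
#define FTELL ftell
#endif

/*
 * Hypothetical form of the proposed configure check.
 * Returns 0 (unsafe) only when fseeko() is missing AND off_t is wider than
 * long, i.e. the OS can write files that fseek()/ftell() cannot address.
 */
int offsets_are_safe(int have_fseeko, size_t off_t_size, size_t long_size)
{
    return have_fseeko || off_t_size <= long_size;
}

/*
 * Hypothetical helper: write a few bytes to a temporary file, seek to `pos`
 * with FSEEK, and report the position FTELL sees.  Returns -1 on any error.
 */
FILE_OFFSET seek_round_trip(FILE_OFFSET pos)
{
    FILE_OFFSET result = -1;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return -1;
    if (fputs("0123456789", fp) != EOF && FSEEK(fp, pos, SEEK_SET) == 0)
        result = FTELL(fp);
    fclose(fp);
    return result;
}
```

A configure test along these lines would presumably abort the build when offsets_are_safe(have_fseeko, sizeof(off_t), sizeof(long)) is 0, which is the BSD/OS situation described above (off_t is a quad, long is smaller, no fseeko). Pairing FTELL with FSEEK reflects Philip's point that the position-reporting side has to use the same offset type as the seeking side.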