Thread: Re: [HACKERS] 2 gig file size limit
On Friday 06 July 2001 18:51, Naomi Walker wrote:
> If PostgreSQL is run on a system that has a file size limit (2 gig?), where
> might cause us to hit the limit?

Since PostgreSQL automatically segments its internal data files to get around
such limits, the only place you will hit this limit will be when making
backups using pg_dump or pg_dumpall.  You may need to pipe the output of those
commands into a file-splitting utility, and then pipe the pieces back through
a reassembly utility to restore.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
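[A minimal sketch of the split-and-reassemble pipeline described above; the
database name "mydb" and the 1 GB chunk size are illustrative choices, not
anything the thread specifies.]

# Dump and split the output into numbered pieces small enough for the
# filesystem's file size limit.
pg_dump mydb | split -b 1000m - mydb.dump.

# To restore, reassemble the pieces in order and feed them back to psql.
cat mydb.dump.* | psql mydb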
Lamar Owen wrote:
>
> On Friday 06 July 2001 18:51, Naomi Walker wrote:
> > If PostgreSQL is run on a system that has a file size limit (2 gig?), where
> > might cause us to hit the limit?
>
> Since PostgreSQL automatically segments its internal data files to get around
> such limits, the only place you will hit this limit will be when making
> backups using pg_dump or pg_dumpall.  You may need to pipe the output of

Speaking of which: doing a dumpall for a backup is taking a long time, and a
restore from the dump files doesn't leave the database in its original state.
Could a command be added that locks all the files, quickly tars them up, then
releases the lock?
--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com
[HACKERS removed from CC: list]

Joseph Shraibman <jks@selectacast.net> writes:

> Doing a dumpall for a backup is taking a long time, and a restore from
> the dump files doesn't leave the database in its original state.  Could
> a command be added that locks all the files, quickly tars them up, then
> releases the lock?

As I understand it, pg_dump runs inside a transaction, so the output reflects
a consistent snapshot of the database as of the time the dump starts (thanks
to MVCC); restoring will put the database back to where it was at the start
of the dump.

Have you observed otherwise?

-Doug
--
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time...
                                                --Dylan
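[A quick way to see the snapshot behavior Doug describes; a sketch only. The
database "mydb", table "accounts", and column "balance" are made-up names, and
on a tiny database the dump may finish before the concurrent update starts.]

# Start a dump in the background, then commit a change while it runs.
pg_dump mydb > snapshot.sql &
psql mydb -c "UPDATE accounts SET balance = balance + 1"
wait

# snapshot.sql still reflects mydb as of the moment pg_dump began, because
# the whole dump runs inside one transaction and sees one MVCC snapshot.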
Doug McNaught wrote:
>
> [HACKERS removed from CC: list]
>
> Joseph Shraibman <jks@selectacast.net> writes:
>
> > Doing a dumpall for a backup is taking a long time, and a restore from
> > the dump files doesn't leave the database in its original state.  Could
> > a command be added that locks all the files, quickly tars them up, then
> > releases the lock?
>
> As I understand it, pg_dump runs inside a transaction, so the output
> reflects a consistent snapshot of the database as of the time the dump
> starts (thanks to MVCC); restoring will put the database back to where
> it was at the start of the dump.
>
In theory.

> Have you observed otherwise?

Yes.  Specifically, timestamps are dumped in a way that (1) loses precision
and (2) sometimes puts 60 in the seconds field, which prevents the dump from
being restored.  And I suspect any statistics generated by VACUUM ANALYZE are
lost.

If a database got corrupted somehow, then in order to restore from the dump
you would have to delete the original database and then restore from the
dump.  Untarring would be much easier (especially as the database grows).
Obviously this wouldn't replace dumps, but for quick backups it would be
great.
--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com
On Mon, Jul 09, 2001 at 08:59:59PM -0400, Joseph Shraibman wrote:
> If a database got corrupted somehow, then in order to restore from the dump
> you would have to delete the original database and then restore from the
> dump.  Untarring would be much easier (especially as the

You could always shut the system down and tar on your own.  Of course,
tarring up several gigabytes is going to take a while.  Better to fix the
dump/restore process than to hack in a workaround that has very limited
benefit.

mrc
--
Mike Castle      dalgoda@ix.netcom.com      www.netcom.com/~dalgoda/
    We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc
Joseph Shraibman <jks@selectacast.net> writes:
> Could a command be added that locks all the files, quickly tars them
> up, then releases the lock?

pg_ctl stop
tar cfz - $PGDATA >someplace
pg_ctl start

There is no possibility of anything less drastic, if you want to ensure that
the database files are consistent and not changing.  Don't even think about
doing a partial dump of the $PGDATA tree, either.  If you don't have a pg_log
that matches your data files, you've got nothing.

			regards, tom lane
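[Wrapped into a script, that filesystem-level backup might look like the
sketch below; the PGDATA path and archive name are examples, and the -D option
simply points pg_ctl at the data directory.]

#!/bin/sh
# Stop the postmaster, archive the entire data directory (pg_log included,
# per Tom's warning about partial copies), then restart.
PGDATA=/var/lib/pgsql/data
BACKUP=/backups/pgdata-$(date +%Y%m%d).tar.gz

pg_ctl stop -D "$PGDATA"
tar cfz "$BACKUP" "$PGDATA"
pg_ctl start -D "$PGDATA"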
> > Have you observed otherwise?
> Yes.  Specifically, timestamps are dumped in a way that (1) they lose
> precision and (2) sometimes have 60 in the seconds field, which prevents
> the dump from being restored.

The loss of precision for timestamp data stems from conservative attempts to
get consistent behavior from the data type.  It is certainly not entirely
successful, but changes would have to solve some of these problems without
introducing more.

I've only seen the "60 seconds problem" with earlier Mandrake distros which
combined normal compiler optimizations with a "fast math" optimization,
against the apparent advice of the gcc developers.  What kind of system are
you on, and how did you build PostgreSQL?

Regards.

                      - Thomas
Tom Lane wrote:
>
> Joseph Shraibman <jks@selectacast.net> writes:
> > Could a command be added that locks all the files, quickly tars them
> > up, then releases the lock?
>
> pg_ctl stop
> tar cfz - $PGDATA >someplace
> pg_ctl start
>
But that would mean I would have to have all my programs detect that the
database went down and make new connections.  I would rather that postgres
just lock all the files and do the tar.
--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com
I mentioned this on general a while ago.  I had the problem when I dumped my
7.0.3 db to upgrade to 7.1.  I had to modify the dump because there were some
60s in the seconds fields.  It was obvious from the code in
backend/utils/adt/datetime that it was using sprintf to do the formatting,
and sprintf was taking the float that represented the seconds and rounding
it.

select '2001-07-10 15:39:59.999'::timestamp;
         ?column?
---------------------------
 2001-07-10 15:39:60.00-04
(1 row)

Thomas Lockhart wrote:
>
> > > Have you observed otherwise?
> > Yes.  Specifically, timestamps are dumped in a way that (1) they lose
> > precision and (2) sometimes have 60 in the seconds field, which prevents
> > the dump from being restored.
>
> The loss of precision for timestamp data stems from conservative
> attempts to get consistent behavior from the data type.  It is certainly
> not entirely successful, but changes would have to solve some of these
> problems without introducing more.
>
> I've only seen the "60 seconds problem" with earlier Mandrake distros
> which combined normal compiler optimizations with a "fast math"
> optimization, against the apparent advice of the gcc developers.  What
> kind of system are you on, and how did you build PostgreSQL?
>
> Regards.
>
>                       - Thomas
--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio.  http://www.targabot.com
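[The rounding described above is easy to reproduce outside the backend:
formatting 59.999 seconds to two decimal places, as in the psql output shown,
rounds it up past the legal range.  A sketch using the shell's printf rather
than the actual backend sprintf call:]

# 59.999 formatted with two decimal places rounds up to an illegal 60.
printf '%05.2f\n' 59.999      # prints 60.00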