Thread: PostgreSQL theoretical maximums.
How many tables can PostgreSQL theoretically, and then practically, handle? What is the largest database size possible? What was the biggest database you've ever had on PostgreSQL? What were the challenges, and what kind of hardware and OS works best?

What is an effective way to predict database size when designing tables?

regards,
Karen Hill
Karen Hill wrote:
> How many tables can PostgreSQL theoretically and then practically
> handle? What is the largest database size possible? What was the
> biggest database you've ever had on PostgreSQL? What were the
> challenges and what kind of hardware and OS works best?

Maximum number of tables etc. is a FAQ:
http://www.postgresql.org/docs/faqs.FAQ.html#item4.4

It's been running on unix-like systems for much longer than Windows. Apart from that, the best system is probably determined by your experience.

> What is an effective way to predict database size when designing
> tables?

Rule of thumb: assume 3-5 times the size of the raw data to allow for overhead, indexes, etc. Other than that, you can find details of the on-disk format for rows towards the end of the manuals.

--
Richard Huxton
Archonet Ltd
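To put rough numbers on the 3-5x rule of thumb, here is a minimal sizing sketch in Python. The table layout, per-column byte widths, row count, and per-row overhead figure are all hypothetical assumptions for illustration, not measurements from a real schema:

    # Rough capacity estimate following the 3-5x rule of thumb above.
    # Column widths, row count and per-row overhead are illustrative
    # assumptions -- substitute figures from your own schema.

    ROW_OVERHEAD = 28            # approx. per-row header bytes on disk (assumption)
    columns = {                  # hypothetical "orders" table
        "id": 4,                 # integer
        "customer_id": 4,        # integer
        "placed_at": 8,          # timestamp
        "status": 12,            # short varchar, average payload
        "notes": 120,            # text, average payload
    }

    raw_row = ROW_OVERHEAD + sum(columns.values())
    rows = 50 * 1000 * 1000      # expected row count

    raw_bytes = raw_row * rows
    low, high = raw_bytes * 3, raw_bytes * 5     # the 3-5x rule of thumb

    gib = 1024 ** 3
    print("raw data : %.1f GiB" % (raw_bytes / float(gib)))
    print("plan for : %.1f - %.1f GiB with indexes and overhead"
          % (low / float(gib), high / float(gib)))

With these sample numbers the raw data comes to roughly 8 GiB, so the rule of thumb says to plan for roughly 25-41 GiB on disk.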
Hi,

Have a look at: http://www.postgresql.org/about/

/David

Karen Hill wrote:
> How many tables can PostgreSQL theoretically and then practically
> handle? What is the largest database size possible? What was the
> biggest database you've ever had on PostgreSQL? What were the
> challenges and what kind of hardware and OS works best?
>
> What is an effective way to predict database size when designing
> tables?
>
> regards,
Richard Huxton wrote:
> Karen Hill wrote:
>> How many tables can PostgreSQL theoretically and then practically
>> handle? What is the largest database size possible?
>
> Maximum number of tables etc. is a FAQ:
> http://www.postgresql.org/docs/faqs.FAQ.html#item4.4
>
> It's been running on unix-like systems for much longer than Windows.
> Apart from that, the best system is probably determined by your
> experience.
>
>> What is an effective way to predict database size when designing
>> tables?
>
> Rule of thumb: assume 3-5 times the size of the raw data to allow for
> overhead, indexes, etc.

I think I've read that there are multi-hundred GB PostgreSQL databases.

Since pg_dump is single-threaded, how long does it take to back up such a database? (Obviously it would need modern LTO, SuperDLT, etc. drives, either FC or U320.)

Are there any plans of making a multi-threaded, or even multi-process pg_dump?

--
Ron Johnson, Jr.
Jefferson LA USA
Ron Johnson wrote:
> I think I've read that there are multi-hundred GB PostgreSQL
> databases.
>
> Since pg_dump is single-threaded, how long does it take to back up
> such a database?

The evasive answer is that you probably don't run regular full pg_dump on such databases.

> Are there any plans of making a multi-threaded, or even
> multi-process pg_dump?

What do you hope to accomplish by that? pg_dump is not CPU bound.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut wrote:
> Ron Johnson wrote:
>> Since pg_dump is single-threaded, how long does it take to back up
>> such a database?
>
> The evasive answer is that you probably don't run regular full pg_dump
> on such databases.

Hmmm.

>> Are there any plans of making a multi-threaded, or even
>> multi-process pg_dump?
>
> What do you hope to accomplish by that? pg_dump is not CPU bound.

Write to multiple tape drives at the same time, thereby reducing the total wall time of the backup process.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-07-31 at 09:53 -0500, Ron Johnson wrote:
>> The evasive answer is that you probably don't run regular full pg_dump
>> on such databases.
>
> Hmmm.

You might want to use PITR for incremental backup, or maintain a standby system using Slony-I ( www.slony.info ).

>>> Are there any plans of making a multi-threaded, or even
>>> multi-process pg_dump?
>>
>> What do you hope to accomplish by that? pg_dump is not CPU bound.
>
> Write to multiple tape drives at the same time, thereby reducing the
> total wall time of the backup process.

pg_dump just produces output. You could pretty easily stripe that output across multiple devices just by using some scripts. Just make sure to write a script that can reconstruct the data again when you need to restore. You don't need multi-threaded pg_dump, you just need to use a script that produces multiple output streams. Multi-threaded design is only useful for CPU-bound applications.

Doing full backups of that much data is always a challenge, and I don't think PostgreSQL has limitations that another database doesn't.

Regards,
Jeff Davis
Jeff Davis wrote:
> You might want to use PITR for incremental backup, or maintain a standby
> system using Slony-I ( www.slony.info ).

We want PITR, Incremental & Full. Currently we do:
  Full        - every other day, directly to 4 tape drives.
  Incremental - on "the other" days, to 1 tape drive.
  PITR        - every 8 hours, to disk.

>> Write to multiple tape drives at the same time, thereby reducing the
>> total wall time of the backup process.
>
> pg_dump just produces output. You could pretty easily stripe that output
> across multiple devices just by using some scripts. Just make sure to
> write a script that can reconstruct the data again when you need to
> restore.

But doesn't that mean that pg_dump must dump to disk?

With a *big* database, that's a whole lot of extra kit (not just spindles) to buy.

> You don't need multi-threaded pg_dump, you just need to use a
> script that produces multiple output streams. Multi-threaded design is
> only useful for CPU-bound applications.
>
> Doing full backups of that much data is always a challenge, and I don't
> think PostgreSQL has limitations that another database doesn't.

I dunno about that. With Rdb/VMS it is trivial to back up a database directly to multiple tape drives. The Rdb backup utility has an algorithm which determines which tablespace data is copied to which tape drive.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 13:51, Ron Johnson wrote:
> Jeff Davis wrote:
>> pg_dump just produces output. You could pretty easily stripe that output
>> across multiple devices just by using some scripts. Just make sure to
>> write a script that can reconstruct the data again when you need to
>> restore.
>
> But doesn't that mean that pg_dump must dump to disk?
>
> With a *big* database, that's a whole lot of extra kit (not just
> spindles) to buy.

I'm certain you can back up direct to tape, as I've done it in the past with postgresql. This was in the day of 4 gig 4 mm tape drives...
Scott Marlowe wrote:
> On Mon, 2006-08-07 at 13:51, Ron Johnson wrote:
>> But doesn't that mean that pg_dump must dump to disk?
>>
>> With a *big* database, that's a whole lot of extra kit (not just
>> spindles) to buy.
>
> I'm certain you can back up direct to tape, as I've done it in the past
> with postgresql. This was in the day of 4 gig 4 mm tape drives...

Sure. That's why tar is named *Tape* ARchive.

Going along with the example of the 4GB 4mm tape drive, and to simplify, we are *not* using compression (neither on the drive nor s/w gzip or bzip):

Say we have a 7GB database. Is there a way to use 2 tape drives... Argh, I guess not, since the Unix pipe mentality presumes that the mode of operation will be:

  $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz

What would be darned useful (but only, I think, with heavy usage of tablespaces) is:

  $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

--
Ron Johnson, Jr.
Jefferson LA USA
> Say we have a 7GB database. Is there a way to use 2 tape drives...
> Argh, I guess not, since the Unix pipe mentality presumes that the
> mode of operation will be:
>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz
>
> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

Would passing the --multi-volume option be useful to you?

http://www.apl.jhu.edu/Misc/Unix-info/tar/tar_97.html

Regards,
Richard Broersma Jr.
On Mon, 2006-08-07 at 14:51, Ron Johnson wrote:
> Say we have a 7GB database. Is there a way to use 2 tape drives...
> Argh, I guess not, since the Unix pipe mentality presumes that the
> mode of operation will be:
>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz
>
> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

Oh, I see where you were headed.

I've found a bit here and there googling about for multiplex and tar, and found an IO::Multiplex module for perl.

I don't see why someone couldn't make use of it to create some kind of system where you'd do:

  pg_dump dbname | perl multiplexscript device1,device2,device3,devicen

No need for postgresql to support it directly. Restoring would likewise just be a reverse operation.
Richard Broersma Jr wrote:
>> Say we have a 7GB database. Is there a way to use 2 tape drives...
>> Argh, I guess not, since the Unix pipe mentality presumes that the
>> mode of operation will be:
>>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz
>>
>> What would be darned useful (but only, I think, with heavy usage of
>> tablespaces) is:
>>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3
>
> Would passing the --multi-volume option be useful to you?
>
> http://www.apl.jhu.edu/Misc/Unix-info/tar/tar_97.html

Mostly no, because it serializes access. Fill up one tape, go to the next, then the 3rd, etc.

--
Ron Johnson, Jr.
Jefferson LA USA
Ron Johnson <ron.l.johnson@cox.net> writes:
> Say we have a 7GB database. Is there a way to use 2 tape drives...
> Argh, I guess not, since the Unix pipe mentality presumes that the
> mode of operation will be:
>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz

If you use "z" you're compressing, and then using multiple volumes is not possible. If you can forgo that, then you'll be able to use the "m" option and it will ask for another tape.

> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

You can create multiple files using the "m" option and then copy those to each individual tape drive. That's the only way I can think of to redirect your output to multiple tape drives. For multiple tapes, tar will ask you for the next, and all of them will be /dev/st<drive_number>.

--
Jorge Godoy <jgodoy@gmail.com>
Scott Marlowe wrote:
> Oh, I see where you were headed.
>
> I've found a bit here and there googling about for multiplex and tar,
> and found an IO::Multiplex module for perl.
>
> I don't see why someone couldn't make use of it to create some kind of
> system where you'd do:
>
>   pg_dump dbname | perl multiplexscript device1,device2,device3,devicen
>
> No need for postgresql to support it directly. Restoring would likewise
> just be a reverse operation.

Interesting. Many thanks. Scary, though, that it hasn't been touched in 30 months.

With multiple SCSI cards, each with its own tape drive, this immensely speeds up the backup operation.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 15:18, Ron Johnson wrote:
> Scott Marlowe wrote:
>> I've found a bit here and there googling about for multiplex and tar,
>> and found an IO::Multiplex module for perl.
>>
>> I don't see why someone couldn't make use of it to create some kind of
>> system where you'd do:
>>
>>   pg_dump dbname | perl multiplexscript device1,device2,device3,devicen
>>
>> No need for postgresql to support it directly. Restoring would likewise
>> just be a reverse operation.
>
> Interesting. Many thanks. Scary, though, that it hasn't been touched
> in 30 months.
>
> With multiple SCSI cards, each with its own tape drive, this
> immensely speeds up the backup operation.

Yeah, I read the description, and I think it's just a fancy name for tee. Sigh. Not REAL multiplexing, but stream duplication.
Scott Marlowe wrote:
> Yeah, I read the description, and I think it's just a fancy name for
> tee. Sigh. Not REAL multiplexing, but stream duplication.

This is where a multi-threaded pg_tapedump would be more effective, since it would be able to, for example, have 4 threads reading (different parts of) the database and writing to 4 separate tape drives.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 14:51 -0500, Ron Johnson wrote:
> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

I must be missing something. What is stopping you from doing something along the lines of:

  // mux.script pseudocode
  // X is the size of a stripe of data
  top:
    read X bytes from stdin, or exit if EOF
    asynchronously write those X bytes to /dev/st0
    read X bytes from stdin, or exit if EOF
    asynchronously write those X bytes to /dev/st1
    goto top

And then make an inverse script called demux.script.

Then:

  $ pg_dump somedatabase | gzip -c | mux.script

To restore:

  $ demux.script | gunzip -c | psql somedatabase

Would that work? Obviously you'd have to document the process well to make sure that someone didn't get confused 12 months later trying to restore. You may have to do something a little more sophisticated to make it work more generally, like adding header information to each tape that says "I am the 2nd tape of 3".

Regards,
Jeff Davis
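To make the pseudocode above concrete, here is a minimal Python sketch of such a mux script. The script name, stripe size, and device paths are illustrative assumptions, and a restore-side demux would be the same loop with reads and writes swapped:

    #!/usr/bin/env python
    # mux.py -- sketch of the mux.script idea above, not an existing tool.
    # Read stdin and round-robin fixed-size stripes across the devices
    # named on the command line.
    import sys

    STRIPE = 1024 * 1024   # 1 MiB per stripe; tune to your tape block size

    def mux(device_paths):
        outputs = [open(p, "wb") for p in device_paths]
        i = 0
        while True:
            chunk = sys.stdin.buffer.read(STRIPE)   # full stripe, or short only at EOF
            if not chunk:
                break
            outputs[i].write(chunk)                 # this stripe goes to device i
            i = (i + 1) % len(outputs)              # round-robin to the next device
        for f in outputs:
            f.close()

    if __name__ == "__main__":
        mux(sys.argv[1:])

Usage would then look something like: $ pg_dump somedatabase | gzip -c | python mux.py /dev/st0 /dev/st1, with the matching demux reading the stripes back in the same round-robin order and concatenating them to stdout for psql.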
Ron Johnson wrote:
> This is where a multi-threaded pg_tapedump would be more effective,
> since it would be able to, for example, have 4 threads reading
> (different parts of) the database and writing to 4 separate tape drives.

It will be difficult to have a consistent dump though. You can't do that with separate transactions. (And you can't have multiple simultaneous readers without separate transactions.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, 2006-08-07 at 16:09, Ron Johnson wrote:
> Scott Marlowe wrote:
>> Yeah, I read the description, and I think it's just a fancy name for
>> tee. Sigh. Not REAL multiplexing, but stream duplication.
>
> This is where a multi-threaded pg_tapedump would be more effective,
> since it would be able to, for example, have 4 threads reading
> (different parts of) the database and writing to 4 separate tape drives.

Actually, I kinda prefer the idea of creating multiple streams using something like RAID, where if you have two streams, it's a mirror, and if you have three or more then you rotate around a parity stripe like RAID 5. Then, any error on any one tape drive could be recovered. Veritas has something like that for tape drives.

Heck, the more I think about it, the more I think it would be an interesting project for a device driver that was like /dev/mdt or something, and re-used the md libs from the hard drive universe.

Attach X tape drives, put in a bunch of tapes, and just pg_dump > mdt0 and you're backing up. Restore the other way around.
Jeff Davis wrote:
> On Mon, 2006-08-07 at 14:51 -0500, Ron Johnson wrote:
>> What would be darned useful (but only, I think, with heavy usage of
>> tablespaces) is:
>>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3
>
> I must be missing something. What is stopping you from doing something
> along the lines of:

Missing nothing. :)

>   // mux.script pseudocode
>   // X is the size of a stripe of data
>   top:
>     read X bytes from stdin, or exit if EOF
>     asynchronously write those X bytes to /dev/st0
>     read X bytes from stdin, or exit if EOF
>     asynchronously write those X bytes to /dev/st1
>     goto top

Python queues and threads would make that very simple. A master thread reads from the pg_dump pipe, passing X bytes at a time to a queue, and each writer thread performs this loop:
  - read from the queue
  - synchronously write to /dev/stN

> And then make an inverse script called demux.script.
>
> Then:
>   $ pg_dump somedatabase | gzip -c | mux.script
> To restore:
>   $ demux.script | gunzip -c | psql somedatabase
>
> Would that work? Obviously you'd have to document the process well to
> make sure that someone didn't get confused 12 months later trying to
> restore.

The disk system would have to be fast enough to keep all X tape drives full.

Depending on the speed of the tape systems and disk system, having X "threads" reading from the database/writing to tape, all at the same time, is much faster.

> You may have to do something a little more sophisticated to make it work
> more generally, like adding header information to each tape that says "I
> am the 2nd tape of 3".

Depending on the capabilities of your tape drive: CRC, error checking, load-next-tape, etc., etc.

--
Ron Johnson, Jr.
Jefferson LA USA
Alvaro Herrera wrote:
> Ron Johnson wrote:
>> This is where a multi-threaded pg_tapedump would be more effective,
>> since it would be able to, for example, have 4 threads reading
>> (different parts of) the database and writing to 4 separate tape drives.
>
> It will be difficult to have a consistent dump though. You can't do
> that with separate transactions. (And you can't have multiple
> simultaneous readers without separate transactions.)

Absolutely, you're right. All "threads" must run from within the same read-only transaction.

--
Ron Johnson, Jr.
Jefferson LA USA
Scott Marlowe wrote:
> Actually, I kinda prefer the idea of creating multiple streams using
> something like RAID, where if you have two streams, it's a mirror, and
> if you have three or more then you rotate around a parity stripe like
> RAID 5. Then, any error on any one tape drive could be recovered.
> Veritas has something like that for tape drives.
>
> Heck, the more I think about it, the more I think it would be an
> interesting project for a device driver that was like /dev/mdt or
> something, and re-used the md libs from the hard drive universe.
>
> Attach X tape drives, put in a bunch of tapes, and just pg_dump > mdt0
> and you're backing up. Restore the other way around.

Tape RAID? I like that!!!!

If the disk system can keep the tape system full, though.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 16:51 -0500, Ron Johnson wrote:
> Python queues and threads would make that very simple. A master thread
> reads from the pg_dump pipe, passing X bytes at a time to a queue, and
> each writer thread performs this loop:
>   - read from the queue
>   - synchronously write to /dev/stN

Right, you can use threads also. You're doing synchronous writes from different threads, I was doing asynchronous writes from the same thread. It's just a matter of preference.

> The disk system would have to be fast enough to keep all X tape
> drives full.

The whole point of this is that pg_dump was producing output (from the disk) faster than you could store it (to the tape), right? So if you use multiple tapes and stripe with a multiplexing script, that should solve the problem, right?

Regards,
Jeff Davis
On Mon, 2006-08-07 at 16:53 -0500, Ron Johnson wrote:
>> It will be difficult to have a consistent dump though. You can't do
>> that with separate transactions. (And you can't have multiple
>> simultaneous readers without separate transactions.)
>
> Absolutely, you're right. All "threads" must run from within the
> same read-only transaction.

The idea is that pg_dump already works and already creates a good backup. Why not split up the data after pg_dump produces it? Of course it should be split up in a stream fashion, like I suggested before in this thread with my multiplexing script.

Regards,
Jeff Davis
On Mon, 2006-08-07 at 16:57 -0500, Ron Johnson wrote:
>> Attach X tape drives, put in a bunch of tapes, and just pg_dump > mdt0
>> and you're backing up. Restore the other way around.
>
> Tape RAID? I like that!!!!
>
> If the disk system can keep the tape system full, though.

What situation are you worried about here, exactly?

Regards,
Jeff Davis
Jeff Davis wrote:
> On Mon, 2006-08-07 at 16:57 -0500, Ron Johnson wrote:
>> Tape RAID? I like that!!!!
>>
>> If the disk system can keep the tape system full, though.
>
> What situation are you worried about here, exactly?

Backing up a large database in a limited amount of time.

If the DB is large enough, and the time is short enough, then the single-reader pg_dump cannot read the data fast enough (especially if other operations are pounding the disk system) to meet the time limit.

Thus, the need (sometimes) for multiple readers.

But, you say: do PITR and weekly full backups. OK. Say you do Saturday night backups and nightly PITR backups, and the disk system barfs hard on Friday. If the Tuesday night PITR tape has a parity error, you're hosed and have lost 3 days of data.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 21:10 -0500, Ron Johnson wrote:
>> What situation are you worried about here, exactly?
>
> Backing up a large database in a limited amount of time.
>
> If the DB is large enough, and the time is short enough, then the
> single-reader pg_dump cannot read the data fast enough (especially
> if other operations are pounding the disk system) to meet the time
> limit.
>
> Thus, the need (sometimes) for multiple readers.

Oh, I see. You're not only worried about write speed (to tape), but also about the ability of pg_dump to read from all your disk spindles at once to get a fast enough read speed.

If you use PITR, then during the backup you can copy the data directory in any order that you want. It should be pretty easy to get it to copy from multiple streams that way. You can use tar to create a single stream out of each tablespace, and then multiplex each stream to multiple tapes so that it can write quickly enough.

It's not as easy to create multiple reading streams using pg_dump, because each transaction can have at most one COPY going at a time. You might be able to hack up something, but I think PITR is the way to go.

> But, you say: do PITR and weekly full backups. OK. Say you do
> Saturday night backups and nightly PITR backups, and the disk
> system barfs hard on Friday. If the Tuesday night PITR tape has a
> parity error, you're hosed and have lost 3 days of data.

By modifying the scripts you use to multiplex the data, you can employ some redundancy. I don't think it would be hard to include duplicate blocks or parity blocks in the multiplex script.

By creating multiple reading streams, one for each tablespace, and then writing each stream to multiple tapes (with parity blocks), you should be able to back up at the speed your hardware allows, and restore at the speed your hardware allows. Furthermore, with PITR you can do incremental backups continuously and wait 'til the weekend to do a full backup.

I'm imagining something similar to the following:

  mydb=# select pg_start_backup('mybackup');

  $ tar zcf - /path/to/pgdata | mux.script /dev/st0 /dev/st1 /dev/st2
  $ tar zcf - /path/to/tablespace1 | mux.script /dev/st3 /dev/st4 /dev/st5
  $ tar zcf - /path/to/tablespace2 | mux.script /dev/st6 /dev/st7 /dev/st8

  mydb=# select pg_stop_backup(); -- in same connection as pg_start_backup

Where mux.script can write to three tapes and use parity blocks. Adjust based on the number of tapes you actually have.

Hope this helps,
Jeff Davis
On Tue, 2006-08-08 at 09:47 -0700, Jeff Davis wrote:
> I'm imagining something similar to the following:
>   mydb=# select pg_start_backup('mybackup');
>
>   $ tar zcf - /path/to/pgdata | mux.script /dev/st0 /dev/st1 /dev/st2
>   $ tar zcf - /path/to/tablespace1 | mux.script /dev/st3 /dev/st4 /dev/st5
>   $ tar zcf - /path/to/tablespace2 | mux.script /dev/st6 /dev/st7 /dev/st8

Oops, let me clarify. Those tar commands are independent and can be run in parallel, which was the point I was trying to make. So, just run them in the background in whatever way is most convenient.

Regards,
Jeff Davis
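As a purely illustrative sketch of the parity-block idea mentioned above (the script name, stripe size, and padding scheme are assumptions, not an existing tool): every group of N-1 data stripes gets one XOR parity stripe, and the parity rotates across the tapes RAID-5 style, so the contents of any single lost or unreadable tape can be reconstructed from the others. A real script would also record the original byte count somewhere so the restore side can strip the zero padding:

    #!/usr/bin/env python
    # parity_mux.py -- hypothetical sketch, not an existing tool.
    # Stripe stdin across N tapes with one rotating XOR parity stripe
    # per group of N-1 data stripes (RAID-5 style).
    import sys

    STRIPE = 1024 * 1024

    def parity_mux(device_paths):
        outs = [open(p, "wb") for p in device_paths]
        n = len(outs)
        group = 0
        while True:
            raw = [sys.stdin.buffer.read(STRIPE) for _ in range(n - 1)]
            if not any(raw):
                break                                     # input ended on a group boundary
            done = any(len(c) < STRIPE for c in raw)      # end of input inside this group
            data = [c.ljust(STRIPE, b"\0") for c in raw]  # pad so stripes stay aligned
            parity = bytearray(STRIPE)
            for chunk in data:                            # XOR parity over the group
                for i, b in enumerate(chunk):
                    parity[i] ^= b
            pdev = group % n                              # rotate which tape gets parity
            it = iter(data)
            for d in range(n):
                outs[d].write(bytes(parity) if d == pdev else next(it))
            group += 1
            if done:
                break
        for f in outs:
            f.close()

    if __name__ == "__main__":
        parity_mux(sys.argv[1:])

Used in place of mux.script above (e.g. tar zcf - /path/to/pgdata | python parity_mux.py /dev/st0 /dev/st1 /dev/st2), a three-tape group stores two data stripes plus one parity stripe, so any one bad tape in the set can be reconstructed at restore time.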
Jeff Davis wrote:
> On Tue, 2006-08-08 at 09:47 -0700, Jeff Davis wrote:
>> I'm imagining something similar to the following:
>>   mydb=# select pg_start_backup('mybackup');
>>
>>   $ tar zcf - /path/to/pgdata | mux.script /dev/st0 /dev/st1 /dev/st2
>>   $ tar zcf - /path/to/tablespace1 | mux.script /dev/st3 /dev/st4 /dev/st5
>>   $ tar zcf - /path/to/tablespace2 | mux.script /dev/st6 /dev/st7 /dev/st8
>
> Oops, let me clarify. Those tar commands are independent and can be run
> in parallel, which was the point I was trying to make. So, just run them
> in the background in whatever way is most convenient.

Open database or closed database?

--
Ron Johnson, Jr.
Jefferson LA USA
On Tue, 2006-08-08 at 13:48 -0500, Ron Johnson wrote:
>> Oops, let me clarify. Those tar commands are independent and can be run
>> in parallel, which was the point I was trying to make. So, just run them
>> in the background in whatever way is most convenient.
>
> Open database or closed database?

See http://www.postgresql.org/docs/8.1/static/backup-online.html

In section 23.3.2, it suggests that you _don't_ need the same connection to do pg_start_backup('mybackup') as to do pg_stop_backup(). So, it appears you really need no database connections at all open during the process, but clients can freely connect if you want, with no interruption. This backup procedure will not affect your applications or stop your database.

All you need to do is:

(1) Connect, issue pg_start_backup('mybackup'), and disconnect.

(2) Back up the data dir + all tablespaces. You can do this in parallel if you want. You don't have to make a filesystem snapshot to do this; the restoration will fix any internal inconsistencies as long as you have archived the WAL files according to the above docs. So long as the backup makes it onto permanent media, and so have the WAL files, it should be fine.

(3) Connect, issue pg_stop_backup(), and disconnect.

I recommend reading the above link carefully and doing a few tests to make sure it works as you expect. But you certainly can do online backups (hot backups) with PITR, and the restoration as well. With PITR, the large portion of the data is moved using standard filesystem interaction, so you can do it in parallel with no problems.

Hope this helps,
Jeff Davis