Thread: Block size with pg_dump?
When I make a backup of a database, I put the output file directly on magnetic tape; i.e., my command looks like this:

    pg_dump --file=/dev/st0 ....

This way I do not have to worry if the total backup exceeds the size of a file system, and it saves me the trouble of copying it to the tape as a separate step. My current tapes will hold 20 GBytes raw or 40 GBytes if I enable hardware compression (assuming 2:1 compression happens). Now it says in the documentation that if I use format c it will compress the data in software, so I doubt the hardware compression will do much.

I do not know what blocksize pg_dump uses, or if it insists on a particular blocksize on input.

Now my tape drive will work with any blocksize, but prefers 65536-byte blocks. I do not see any options for this in pg_dump, but I suppose I could pipe the output of pg_dump through dd to make any blocksize I want.

On the way back, likewise, I could pipe the tape through dd before giving it to pg_restore.

Does pg_dump care what blocksize it gets? If so, what is it?

--
Jean-David Beyer
Registered Linux User 85642. Registered Machine 241939.
PGP-Key: 9A2FC99A
Shrewsbury, New Jersey
http://counter.li.org
Jean-David Beyer wrote:
> Now my tape drive will work with any blocksize, but prefers 65536-byte
> blocks. I do not see any options for this in pg_dump, but I could pipe the
> output of pg_dump through dd I suppose to make any blocksize I want.
>
> On the way back, likewise I could pipe the tape through dd before giving it
> to pg_restore.
>
> Does pg_dump care what blocksize it gets? If so, what is it?

I assume you could pipe pg_dump into dd and specify the block size in dd.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
> I assume you could pipe pg_dump into dd and specify the block size in
> dd.

Of course on the way out I can do that.

The main question is: if I present pg_restore with a 65536-byte blocksize and it is expecting, e.g., 1024 bytes, will the rest of each block get skipped? I.e., do I have to use dd on the way back too? And if so, what should the blocksize be?

--
Jean-David Beyer
On Aug 26, 2007, at 8:09 PM, Jean-David Beyer wrote:
> The main question is, If I present pg_restore with a 65536-byte
> blocksize
> and it is expecting, e.g., 1024-bytes, will the rest of each block get
> skipped? I.e., do I have to use dd on the way back too? And if so,
> what
> should the blocksize be?

Postgres (by default) uses 8K blocks.

Erik Jones
Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)
Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
Erik Jones wrote:
> Postgres (by default) uses 8K blocks.

That is true of the internal storage, but not of pg_dump's output: pg_dump uses libpq to pull rows and writes them out as a stream, so there is no blocking in pg_dump's output itself.

--
Bruce Momjian
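As a rough illustration of the point above (a plain byte stream has no record structure of its own), the following sketch pushes a stand-in stream through dd with 65536-byte output blocks and checks that the bytes come out unchanged. The temporary files and the random data are assumptions made for the demonstration; they stand in for the tape device and for real pg_dump output. A regular file does not preserve block boundaries the way a tape drive does, so this only shows that reblocking with dd leaves the content intact, not how a tape will behave on read.

```shell
#!/bin/bash
# Stand-in for pg_dump output: 200000 bytes of arbitrary data (demo assumption).
src=$(mktemp)
out=$(mktemp)
head -c 200000 /dev/urandom > "$src"

# Reblock the stream into 65536-byte output blocks, as one might do in front
# of a tape device. The destination here is a temp file, not /dev/st0.
cat "$src" | dd obs=65536 of="$out" 2>/dev/null

# cksum on stdin prints "<crc> <byte count>", so this compares content and size.
src_sum=$(cksum < "$src")
out_sum=$(cksum < "$out")
if [ "$src_sum" = "$out_sum" ]; then
    result="identical"
else
    result="different"
fi
echo "$result"

rm -f "$src" "$out"
```

With a real dump, the producer would be pg_dump instead of cat and the dd destination would be the tape device.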
Bruce Momjian wrote:
> Erik Jones wrote:
>> Postgres (by default) uses 8K blocks.
>
> That is true of the internal storage, but not of pg_dump's output
> because it is using libpq to pull rows and output them in a stream,
> meaning there is no blocking in pg_dumps output itself.

Is that true for both input and output (i.e., pg_restore and pg_dump)? That is, can I use dd to write 65536-byte blocks to tape and then do nothing special when running pg_restore? In other words, will pg_restore accept any block size I choose to offer it?

--
Jean-David Beyer
Jean-David Beyer wrote:
> Is that true for both input and output (i.e., pg_restore and pg_dump)?
> I.e., can I use dd to write 65536-byte blocks to tape, and then do nothing
> on running pg_restore? I.e., that pg_restore will accept any block size I
> choose to offer it?

Yes.

--
Bruce Momjian
Bruce Momjian wrote:
> Jean-David Beyer wrote:
>> Is that true for both input and output (i.e., pg_restore and pg_dump)?
>> I.e., can I use dd to write 65536-byte blocks to tape, and then do nothing
>> on running pg_restore? I.e., that pg_restore will accept any block size I
>> choose to offer it?
>
> Yes.

Did not work at first:

    ...
    pg_dump: dumping contents of table vl_ranks
    51448+2 records in
    401+1 records out
    26341760 bytes (26 MB) copied, 122.931 seconds, 214 kB/s

So I suppose that worked. (This database just has some small initial tables loaded. The biggest one is still empty.) But then

    trillian:postgres[~]$ ./restore.db
    pg_restore: [archiver] did not find magic string in file header
    trillian:postgres[~]$

I fixed it by changing my backup script as follows:

    $ cat backup.db
    #!/bin/bash
    #
    # This is to backup the postgreSQL database, stock.
    #
    DD=/bin/dd
    DD_OPTIONS="obs=65536 of=/dev/st0"
    MT=/bin/mt
    MT_OPTIONS="-f /dev/st0 setblk 0"
    PG_OPTIONS="--format=c --username=postgres --verbose"
    PG_DUMP=/usr/bin/pg_dump
    $PG_DUMP $PG_OPTIONS stock | $DD $DD_OPTIONS

and it still would not restore until I changed the restore script to this:

    $ cat restore.db
    #!/bin/bash
    # This is to restore database stock.
    FILENAME=/dev/st0
    DD=/bin/dd
    DD_OPTIONS="ibs=65536 if=$FILENAME"
    MT=/bin/mt
    MT_OPTIONS="-f $FILENAME setblk 0"
    PG_OPTIONS="--clean --dbname=stock --format=c --username=postgres --verbose"
    PG_RESTORE=/usr/bin/pg_restore
    $MT $MT_OPTIONS
    $DD $DD_OPTIONS | $PG_RESTORE $PG_OPTIONS

It appears that I must read with the same blocksize as I wrote. My normal backup program (BRU) can infer the blocksize from the first record, but apparently pg_restore does not. But dd will read it if I tell it the size. Hence the above.

The MT stuff is to tell the tape driver to accept variable block size so that the program that opens the device can set it. dd can do that, but I infer that pg_restore does not.

--
Jean-David Beyer
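The dd record counts quoted in the message above can be cross-checked with a little shell arithmetic. This sketch assumes the input side used dd's default 512-byte input block size (the script for that first run is not shown in the thread) and a 65536-byte output block size; the variable names are mine, not the poster's.

```shell
#!/bin/bash
total=26341760   # bytes reported by dd in the thread
ibs=512          # assumed: dd's default input block size
obs=65536        # output block size used for the tape

# Full records and leftover bytes on each side of dd.
in_full=$(( total / ibs ));   in_part=$(( total % ibs ))
out_full=$(( total / obs ));  out_part=$(( total % obs ))

echo "in:  ${in_full} full records, ${in_part} bytes in partial records"
echo "out: ${out_full} full records, ${out_part} bytes in the final partial record"
```

Under those assumptions the full-record counts agree with the "51448+2 records in" and "401+1 records out" that dd printed. The last tape block is therefore shorter than 65536 bytes, which is worth keeping in mind when choosing how to read the tape back.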