Thread: Block size with pg_dump?

Block size with pg_dump?

From
Jean-David Beyer
Date:

When I make a backup of a database, I put the output file directly on
magnetic tape; i.e., my command looks like this:

pg_dump --file=/dev/st0 ....

This way I do not have to worry if the total backup exceeds the size of a
file system, and it saves me the trouble of copying it to the tape as a
separate step. My current tapes will hold 20 GBytes raw, or 40 GBytes if I
enable hardware compression (assuming 2:1 compression happens). The
documentation says that if I use format c it will compress the data in
software, so I doubt the hardware compression will do much.

I do not know what blocksize pg_dump uses, or if it insists on a particular
blocksize on input.

Now my tape drive will work with any blocksize, but prefers 65536-byte
blocks. I do not see any options for this in pg_dump, but I could pipe the
output of pg_dump through dd I suppose to make any blocksize I want.
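
The dd step can be sketched with ordinary files standing in for the dump
stream and the tape device (dump.bin and tape.bin below are stand-ins for
illustration, not real paths from this setup):

```shell
# Re-block a byte stream into 64 KiB output blocks with dd.
# dump.bin stands in for pg_dump's output, tape.bin for /dev/st0.
head -c 100000 /dev/urandom > dump.bin
dd if=dump.bin of=tape.bin obs=65536 2>/dev/null
# dd changes only the size of the writes, not the bytes themselves:
cmp dump.bin tape.bin && echo "payload unchanged"
```

With obs=65536, dd gathers its input into 65536-byte output writes (the
final one short); on a tape device, each such write becomes one physical
block.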

On the way back, likewise I could pipe the tape through dd before giving it
to pg_restore.

Does pg_dump care what blocksize it gets? If so, what is it?

--
 .~.   Jean-David Beyer          Registered Linux User 85642.
 /V\   PGP-Key: 9A2FC99A         Registered Machine 241939.
/( )\  Shrewsbury, New Jersey    http://counter.li.org
^^-^^  17:20:01 up 17 days, 20:42, 5 users, load average: 5.12, 5.26, 5.21
 


Re: Block size with pg_dump?

From
Bruce Momjian
Date:
Jean-David Beyer wrote:
> Now my tape drive will work with any blocksize, but prefers 65536-byte
> blocks. I do not see any options for this in pg_dump, but I could pipe the
> output of pg_dump through dd I suppose to make any blocksize I want.
>
> On the way back, likewise I could pipe the tape through dd before giving it
> to pg_restore.
>
> Does pg_dump care what blocksize it gets? If so, what is it?

I assume you could pipe pg_dump into dd and specify the block size in
dd.

--
  Bruce Momjian  <bruce@momjian.us>   http://momjian.us
  EnterpriseDB                        http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Block size with pg_dump?

From
Jean-David Beyer
Date:
Bruce Momjian wrote:
> I assume you could pipe pg_dump into dd and specify the block size in
> dd.
>
Of course on the way out I can do that.

The main question is: if I present pg_restore with a 65536-byte blocksize
and it is expecting, e.g., 1024 bytes, will the rest of each block get
skipped? I.e., do I have to use dd on the way back too? And if so, what
should the blocksize be?
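
The read direction can be sketched the same way, again with ordinary files
as stand-ins (cmp plays the role of the program consuming stdin):

```shell
# Read back with a 64 KiB input block size and pass the stream downstream.
# tape.bin stands in for /dev/st0; cmp stands in for pg_restore on stdin.
head -c 100000 /dev/urandom > dump.bin
dd if=dump.bin of=tape.bin obs=65536 2>/dev/null
dd if=tape.bin ibs=65536 2>/dev/null | cmp - dump.bin && echo "stream identical"
```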



Re: Block size with pg_dump?

From
Erik Jones
Date:
On Aug 26, 2007, at 8:09 PM, Jean-David Beyer wrote:

> The main question is: if I present pg_restore with a 65536-byte blocksize
> and it is expecting, e.g., 1024 bytes, will the rest of each block get
> skipped? I.e., do I have to use dd on the way back too? And if so, what
> should the blocksize be?

Postgres (by default) uses 8K blocks.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)





Re: Block size with pg_dump?

From
Bruce Momjian
Date:
Erik Jones wrote:
> > The main question is: if I present pg_restore with a 65536-byte blocksize
> > and it is expecting, e.g., 1024 bytes, will the rest of each block get
> > skipped? I.e., do I have to use dd on the way back too? And if so, what
> > should the blocksize be?
>
> Postgres (by default) uses 8K blocks.

That is true of the internal storage, but not of pg_dump's output:
pg_dump uses libpq to pull rows and writes them out as a stream, so
there is no blocking in pg_dump's output itself.
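
That point can be illustrated with a stand-in stream: because the archive is
only a byte stream, the block size used to carry it does not affect its
content (stream.bin below is a stand-in for pg_dump's output):

```shell
# Copy the same stream with two different block sizes; the results match.
head -c 100000 /dev/urandom > stream.bin
dd if=stream.bin of=a.bin bs=512   2>/dev/null
dd if=stream.bin of=b.bin bs=65536 2>/dev/null
cmp a.bin b.bin && echo "block size does not affect the stream"
```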



Re: Block size with pg_dump?

From
Jean-David Beyer
Date:
Bruce Momjian wrote:
> That is true of the internal storage, but not of pg_dump's output:
> pg_dump uses libpq to pull rows and writes them out as a stream, so
> there is no blocking in pg_dump's output itself.
>
Is that true for both input and output (i.e., pg_restore as well as pg_dump)?
Can I use dd to write 65536-byte blocks to tape, and then do nothing special
when running pg_restore? That is, will pg_restore accept any block size I
choose to offer it?



Re: Block size with pg_dump?

From
Bruce Momjian
Date:
Jean-David Beyer wrote:
> Is that true for both input and output (i.e., pg_restore as well as pg_dump)?
> Can I use dd to write 65536-byte blocks to tape, and then do nothing special
> when running pg_restore? That is, will pg_restore accept any block size I
> choose to offer it?

Yes.



Re: Block size with pg_dump?

From
Jean-David Beyer
Date:
Bruce Momjian wrote:
>> Is that true for both input and output (i.e., pg_restore as well as pg_dump)?
>> Can I use dd to write 65536-byte blocks to tape, and then do nothing special
>> when running pg_restore? That is, will pg_restore accept any block size I
>> choose to offer it?
>
> Yes.
>
It did not work at first:

...
pg_dump: dumping contents of table vl_ranks
51448+2 records in
401+1 records out
26341760 bytes (26 MB) copied, 122.931 seconds, 214 kB/s

So I suppose that worked. (This database just has some small initial tables
loaded. The biggest one is still empty.) But then

trillian:postgres[~]$ ./restore.db
pg_restore: [archiver] did not find magic string in file header
trillian:postgres[~]$
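
(A note on that error: a custom-format archive begins with the 5-byte magic
string "PGDMP", and pg_restore reports "did not find magic string in file
header" when the first bytes it reads are something else. A quick stand-in
check, using a fabricated dump.bin rather than a real archive:)

```shell
# Custom-format (-Fc) archives start with the magic bytes "PGDMP".
# Fabricate a file with that header just to illustrate the check;
# on a real archive you would run head -c 5 on the dump itself.
printf 'PGDMP' > dump.bin
[ "$(head -c 5 dump.bin)" = "PGDMP" ] && echo "header looks like a -Fc archive"
```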

I fixed it by changing my backup script as follows:

$ cat backup.db
#!/bin/bash
#
#       This is to backup the postgreSQL database, stock.
#
DD=/bin/dd
DD_OPTIONS="obs=65536 of=/dev/st0"
MT=/bin/mt
MT_OPTIONS="-f /dev/st0 setblk 0"
PG_OPTIONS="--format=c --username=postgres --verbose"
PG_DUMP=/usr/bin/pg_dump

$PG_DUMP $PG_OPTIONS stock | $DD $DD_OPTIONS

and it still would not restore until I changed the restore script to this:

$ cat restore.db
#!/bin/bash

#       This is to restore database stock.
FILENAME=/dev/st0

DD=/bin/dd
DD_OPTIONS="ibs=65536 if=$FILENAME"
MT=/bin/mt
MT_OPTIONS="-f $FILENAME setblk 0"
PG_OPTIONS="--clean --dbname=stock --format=c --username=postgres --verbose"
PG_RESTORE=/usr/bin/pg_restore

$MT $MT_OPTIONS
$DD $DD_OPTIONS | $PG_RESTORE $PG_OPTIONS

It appears that I must read with the same blocksize I wrote. My normal
backup program (BRU) can infer the blocksize from the first record, but
apparently pg_restore cannot; dd will read it, though, if I tell it the
size. Hence the above.

The mt invocation tells the tape driver to accept a variable block size,
so that the program opening the device can set it. dd can do that, but I
infer that pg_restore does not.

