Thread: pg_dump fundamental question

pg_dump fundamental question

From
"J. Cassidy"
Date:
Hello all,

I have hopefully an "easy" question.

If I issue the pg_dump command with no switches or options i.e.

/usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd

Is there any "default" compression involved or not? Does pg_dump talk to zlib during the dump process given that I have not specified any compression on the command line? (see above).

Your considered opinions would be much appreciated.


Regards,


Henk
 

Re: pg_dump fundamental question

From
Adrian Klaver
Date:
On 07/05/2016 07:54 AM, J. Cassidy wrote:
> Hello all,
>
> I have hopefully an "easy" question.
>
> If I issue the pg_dump command with no switches or options i.e.
>
> /usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd
>
> Is there any "default" compression involved or not? Does pg_dump talk to
> zlib during the dump process given that I have not specified any
> compression on the command line? (see above).
>
> Your considered opinions would be much appreciated.

https://www.postgresql.org/docs/9.5/static/app-pgdump.html

"
-F format
--format=format

     Selects the format of the output. format can be one of the following:

     p
     plain

         Output a plain-text SQL script file (the default).
"

......

In line tag:

"-Z 0..9
--compress=0..9

     Specify the compression level to use. Zero means no compression.
For the custom archive format, this specifies compression of individual
table-data segments, and the default is to compress at a moderate level.
<* SEE HERE For plain text output, setting a nonzero compression level
causes the entire output file to be compressed, as though it had been
fed through gzip; but the default is not to compress. SEE HERE *> The
tar archive format currently does not support compression at all.
"


>
>
> Regards,
>
>
> Henk
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_dump fundamental question

From
"David G. Johnston"
Date:
On Tue, Jul 5, 2016 at 10:54 AM, J. Cassidy <sean@jdcassidy.eu> wrote:
Hello all,

I have hopefully an "easy" question.

If I issue the pg_dump command with no switches or options i.e.

/usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd

Is there any "default" compression involved or not? Does pg_dump talk to zlib during the dump process given that I have not specified any compression on the command line? (see above).
In general, the easiest way to answer the question of whether something (well, a text file at least) is compressed or not is to attempt to open it in a text editor - if you can read the contents, it's not compressed.

Reading the Description section at the following location provides the answer you seek - no, there is no compression by default.


Re: pg_dump fundamental question

From
Rob Sargent
Date:


On 07/05/2016 10:54 AM, David G. Johnston wrote:
On Tue, Jul 5, 2016 at 10:54 AM, J. Cassidy <sean@jdcassidy.eu> wrote:
Hello all,

I have hopefully an "easy" question.

If I issue the pg_dump command with no switches or options i.e.

/usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd

Is there any "default" compression involved or not? Does pg_dump talk to zlib during the dump process given that I have not specified any compression on the command line? (see above).

In general, the easiest way to answer the question of whether something (well, a text file at least) is compressed or not is to attempt to open it in a text editor - if you can read the contents, it's not compressed.

Reading the Description section at the following location provides the answer you seek - no, there is no compression by default.


I like the 'file' command, and now that MS is shipping bash, it should be available to all.
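A minimal sketch of both checks, using a small stand-in file (the real dump path from the thread would be /data3TB/Backup/dbabcd; `file` may not be installed everywhere, so the magic-byte check with `od` serves as a fallback):

```shell
# A plain-text pg_dump output is readable SQL; gzip output starts with
# the magic bytes 1f 8b. Demonstrated on a small stand-in file:
printf 'CREATE TABLE t (id int);\n' > /tmp/demo.sql
gzip -c /tmp/demo.sql > /tmp/demo.sql.gz

# 'file' identifies the compressed copy (guarded in case file(1) is absent):
command -v file >/dev/null && file /tmp/demo.sql.gz || true

# Fallback: inspect the first two bytes for the gzip magic number:
head -c 2 /tmp/demo.sql.gz | od -An -tx1
```

Either check on the actual 324GB dump file reads only the first bytes, so it is instant regardless of file size.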

Re: pg_dump fundamental question

From
"J. Cassidy"
Date:
Hello Adrian,

appreciate the prompt reply, thank you.

As stated in the original email, I want to know whether compression (at whatever level) is on by default or not, if I supply NO extra switches/options.
I have read the documentation and it is unclear in this respect. I am a Mainframer and perhaps have a different world view on how to explain things...

TIA and regards,


Henk.




Re: pg_dump fundamental question

From
"J. Cassidy"
Date:
Hello David,

Good point, but I would prefer NOT to open a 324GB backup file in a text editor. I can, however, cat/less/head/tail the file in Linux.

Regards,


Henk

 

Re: pg_dump fundamental question

From
"J. Cassidy"
Date:
Rob,


appreciate the reply, but I have never used and never will use "that" os (small capitals intentional).


Regards,


Henk

Re: pg_dump fundamental question

From
Alvaro Herrera
Date:
J. Cassidy wrote:

> As stated in the original email, I want to know whether compression
> (whatever level) is on by default (or not) -  if I supply NO extra
> switches/options.  I have read the documentation and it is unclear in
> this respect. I am a Mainframer and perhaps have a different world
> view on how to explain things...

I disagree on it being unclear.  It says, plain and simple, "the default
is not to compress" when using the text output, which it also says is
the default output format.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pg_dump fundamental question

From
"Joshua D. Drake"
Date:
On 07/05/2016 10:10 AM, J. Cassidy wrote:
> Hello Adrian,
>
> appreciate the prompt reply, thank you.
>
> As stated in the original email, I want to know whether compression
> (whatever level) is on by default (or not) -  if I supply NO extra
> switches/options.

There is no compression by default.


--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


Re: pg_dump fundamental question

From
Francisco Olarte
Date:
On Tue, Jul 5, 2016 at 4:54 PM, J. Cassidy <sean@jdcassidy.eu> wrote:
> I have hopefully an "easy" question.
> If I issue the pg_dump command with no switches or options i.e.
> /usr/local/pgsql/bin/pg_dump -v dbabcd > /data3TB/Backup/dbabcd
> Is there any "default" compression involved or not? Does pg_dump talk to
> zlib during the dump process given that I have not specified any compression
> on the command line? (see above).

IIRC, no options means you are requesting an SQL script. Those are not
compressed; just pipe them through your favorite compressor. (In a
later message you stated you were on Linux and had a 324GB file, and
could head/tail it; have you done so?)

> Your considered opinions would be much appreciated.

OTOH, with backups that big I would recommend using the custom format
(-Fc); it's much more flexible (and you can still get the SQL script by
asking pg_restore to generate it if you need it, but not the other way
round).
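The two suggestions can be sketched as follows (dbabcd and the paths are the names from the thread; the pg_dump/pg_restore lines need a running server, so they are shown as comments, with the compress-on-the-fly pipeline shape demonstrated by a stand-in producer):

```shell
# Plain script, compressed on the fly:
#   pg_dump dbabcd | gzip > /data3TB/Backup/dbabcd.sql.gz
# Custom format (compresses table data by default):
#   pg_dump -Fc dbabcd > /data3TB/Backup/dbabcd.dump
# Regenerate the plain SQL script from a custom-format archive later:
#   pg_restore -f dbabcd.sql /data3TB/Backup/dbabcd.dump

# The same pipeline shape, with a stand-in producer instead of pg_dump:
printf 'CREATE TABLE t (id int);\n' | gzip > /tmp/pipe.sql.gz
gunzip -c /tmp/pipe.sql.gz
```

The pipe keeps only the compressed bytes on disk, which matters when the uncompressed script is hundreds of gigabytes.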


Francisco Olarte.


Re: pg_dump fundamental question

From
"J. Cassidy"
Date:
Francisco,

appreciate the tips/hints.

My input (source) DB is 1TB in size; using the options as stated in my original email (i.e. no compression, it would seem), the output file size is "only" 324GB.
I presume all of the formatting/indices have been omitted. As I said before, I can browse the backup file with less/head/cat/tail etc.

Regards,

Henk




Re: pg_dump fundamental question

From
"J. Cassidy"
Date:

Thank you all for the information.


Re: pg_dump fundamental question

From
Sameer Kumar
Date:


On Wed, 6 Jul 2016, 1:39 a.m. J. Cassidy, <sean@jdcassidy.eu> wrote:
Francisco,

appreciate the tips/hints.

My input (source) DB is 1TB in size; using the options as stated in my original email (i.e. no compression, it would seem), the output file size is "only" 324GB.

That would be because indexes do not take any space in the backup, since each is just an SQL statement. Some space might also be saved because of bloat in your db (which is not copied over in the SQL backup).


I presume all of the formatting/indices have been omitted. As I said before, I can browse the backup file with less/head/cat/tail etc.

Of course you can, but consider using the custom or directory format (what is your version? It is good practice to state that along with your query) and use -j to specify multiple jobs to copy the data. Use -Z for the compression level.

pg_dump -d prod_db -Fd -j6 -f /mybackups/20160706-prod-db -Z9

If ever you want to browse the backup or extract a specific table from the backup, pg_restore with -f will be helpful, e.g.

pg_restore -f emp-from-backup.sql -Fd /mybackups/20160706-prod-db




Regards,

Henk



--
Best Regards
Sameer Kumar | DB Solution Architect 
ASHNIK PTE. LTD.

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com

Re: pg_dump fundamental question

From
Paul Linehan
Date:
> a good point, but I would prefer NOT to open a 324GB backup file in a text
> editor. I can however cat/less/head/tail the file in Linux.

Use vi (or flavour thereof) - it doesn't load the entire file in order to
read the contents of lines 1 - 100 (say).


Paul...


Re: pg_dump fundamental question

From
"Christofer C. Bell"
Date:
On Tue, Jul 5, 2016 at 3:17 PM, Paul Linehan <linehanp@tcd.ie> wrote:
> a good point, but I would prefer NOT to open a 324GB backup file in a text
> editor. I can however cat/less/head/tail the file in Linux.

Use vi (or flavour thereof) - it doesn't load the entire file in order to
read the contents of lines 1 - 100 (say).

What Paul says is very true, but if you are doing this, do be cognizant of where your fingers are.  If you press any key that will put vi into edit mode, it will then make a copy of the file for backup purposes.  Your 324GB file will become 648GB of disk usage and you'll have to wait while the backup copy is written out to disk.  It will not load the whole file into memory (ever) nor will it make a copy of the file as long as you stay in command mode.

If you want to use vi or equivalent for viewing the file without any danger of accidentally putting the editor into edit mode (and thus triggering the backup copy), you may want to invoke vi as "view(1)".  This opens vi read-only and you'll be able to use vi style hotkeys for navigation.
 
Paul...
 
Chris

"If you wish to make an apple pie from scratch, you must first invent the Universe." -- Carl Sagan


Re: pg_dump fundamental question

From
trafdev
Date:
"less" is much better for opening huge text files in *nix for reading.

On 07/05/16 15:13, Christofer C. Bell wrote:
> On Tue, Jul 5, 2016 at 3:17 PM, Paul Linehan <linehanp@tcd.ie
> <mailto:linehanp@tcd.ie>> wrote:
>
>     > a good point, but I would prefer NOT to open a 324GB backup file in a text
>     > editor. I can however cat/less/head/tail the file in Linux.
>
>     Use vi (or flavour thereof) - it doesn't load the entire file in
>     order to
>     read the contents of lines 1 - 100 (say).
>
>
> What Paul says is very true, but if you are doing this, do be cognizant
> of where your fingers are.  If you press any key that will put vi into
> edit mode, it will then make a copy of the file for backup purposes.
> Your 324GB file will become 648GB of disk usage and you'll have to wait
> while the backup copy is written out to disk.  It will not load the
> whole file into memory (ever) nor will it make a copy of the file as
> long as you stay in command mode.
>
> If you want to use vi or equivalent for viewing the file without any
> danger of accidentally putting the editor into edit mode (and thus
> triggering the backup copy), you may want to invoke vi as "view(1)".
> This opens vi read-only and you'll be able to use vi style hotkeys for
> navigation.
>
>
>     Paul...
>
>
> Chris
>
> "If you wish to make an apple pie from scratch, you must first invent
> the Universe." -- Carl Sagan
>
>



Re: pg_dump fundamental question

From
Francisco Olarte
Date:
On Tue, Jul 5, 2016 at 7:39 PM, J. Cassidy <sean@jdcassidy.eu> wrote:
> My input (source) DB is  1TB in size, using the options as stated in my
> original email (i.e. no compression it would seem) the output file size is
> "only" 324GB.
> I presume all of the formatting/indices have been ommited. As I said before,
> I can browse the backup file with less/heat/cat/tail etc.

As has been said, you are nearly right. It's normal for a backup to
be about a third of the database size, even less on busy or heavily
indexed databases. Several effects come into play:

- In the backup, each index is, approximately, a 'create index' line.
- Data in the real db is stored in pages, which have some overhead and
some free space in them.
- Data in the backup is normally stored in 'copy' format, which is
usually more compact than the binary format used in the database
pages (but slower and less flexible).

Also, all the backup formats contain more or less the same information
as the 'plain' format, and occupy more or less the same space WHEN
UNCOMPRESSED. The main advantage of the custom (and, to some extent,
the tar) format is that it stores every piece of information separately
(and potentially compressed; it's a lot like a zip file) and so can
perform selective restores (you can select what to restore, and by
playing with the -l / -L options even in what order, which gives you a
lot of flexibility).
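The -l / -L workflow can be sketched like this (the archive, database, and table names here are hypothetical; the pg_restore lines need a real archive and server, so they are shown as comments, while the table-of-contents editing step, which is plain text filtering, is demonstrated directly):

```shell
# Write the archive's table of contents, then restore only selected entries:
#   pg_restore -l backup.dump > toc.lst
#   pg_restore -L toc.edit -d newdb backup.dump

# Editing the TOC is ordinary text filtering; e.g. dropping one table's data:
printf '1; TABLE DATA public big_log\n2; TABLE DATA public users\n' > /tmp/toc.lst
grep -v 'big_log' /tmp/toc.lst > /tmp/toc.edit
cat /tmp/toc.edit
```

Reordering lines in the edited list likewise controls the restore order.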

Francisco Olarte.