Thread: Compressed Backup too big

Compressed Backup too big

From
"Andrus"
Date:
"PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"
Database size in disk returned by  pg_database_size() is 210 MB

Database compressesed  backup file size is now 125 MB.
This seems too much. I expect compression to decrease size 10 times, also
indexes are not backed up. A year ago compressed backup size was 9 MB only.

I created query returning biggest tables with and without indexes and found:

      1  pg_toast_22185                      95 MB           96 MB
      2  rid                                 21 MB           27 MB
      3  klient                              13 MB           19 MB
      4  mailbox                             10 MB           11 MB
      5  dok                                 7640 kB         12 MB
      6  desktop                             8080 kB         8200 kB
      7  strings                             5536 kB         6584 kB
      8  pg_toast_22338                      5232 kB         5368 kB

...

Questions:

1. Tables are relatively small and thus cannot create 125 MB compressed
backup file.
Why backup file sis so big ?

2. How to determine what data is containing in  pg_toast_22185  ?
Why this is so big ?

Andrus.



Re: Compressed Backup too big

From
"Filip Rembiałkowski"
Date:
2007/11/15, Andrus <kobruleht2@hot.ee>:
> "PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
> (mingw-special)"
> Database size in disk returned by  pg_database_size() is 210 MB
>
> Database compressesed  backup file size is now 125 MB.

How do you produce this dump? pg_dump -Fc ?

If you create plain dump ( pg_dump DBNAME > dump.file ) and zip it
(normal deflate, like winzip or sth like that), what's the size?

If this is much smaller than 125MB, maybe you got some corner case
with postgres builtin  compression.

You cannot *always* expect 10x ratio... it depends on data. Maybe
usage patterns of your database changed and now the data is less
compressable?


--
Filip Rembiałkowski

Re: Compressed Backup too big

From
MaXX
Date:
On Thu, 15 Nov 2007 20:35:35 +0200
"Andrus" <kobruleht2@hot.ee> wrote:

> "PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
> (mingw-special)"
> Database size in disk returned by  pg_database_size() is 210 MB
>
> Database compressesed  backup file size is now 125 MB.
> This seems too much. I expect compression to decrease size 10 times, also
> indexes are not backed up. A year ago compressed backup size was 9 MB only.
>
> I created query returning biggest tables with and without indexes and found:
>
>       1  pg_toast_22185                      95 MB           96 MB
>       2  rid                                 21 MB           27 MB
>       3  klient                              13 MB           19 MB
>       4  mailbox                             10 MB           11 MB
>       5  dok                                 7640 kB         12 MB
>       6  desktop                             8080 kB         8200 kB
>       7  strings                             5536 kB         6584 kB
>       8  pg_toast_22338                      5232 kB         5368 kB
>
> ...
>
> Questions:
>
> 1. Tables are relatively small and thus cannot create 125 MB compressed
> backup file.
> Why backup file sis so big ?
I assume that "mailbox" contains e-mail messages as the name suggest... One can assume that those emails also have
attachmentslike pdf, jpg and other mostly incompressible file format. 

In my understanding of postgresql, when you a have a variable length field (like text or bytea), pg will only stores
thefirst bytes (hundreds or so) of that field in the main table and then put the rest in an "hidden[1]" TOAST table. 

> 2. How to determine what data is containing in  pg_toast_22185  ?
If you have PGAdmin at hand or you can install it, its quite easy, just enable "Show System Objects" under the "View"
menu[2]. Based on my assumptions, I guess it's the TOAST of the field which contains the body of your emails... 

Enough wild guesses,
HTH,
--
MaXX

[1] hidden: transparent from an user point of view
[2] you can also obtain those information from the information schema IIRC

Re: Compressed Backup too big

From
Shane Ambler
Date:
Andrus wrote:

> 2. How to determine what data is containing in  pg_toast_22185  ?
> Why this is so big ?
>

That will be determined by the columns you define and the data you insert.

Read http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
to get an explanation of TOAST and how you can change the settings if
you wish.


This is most likely the reason for your "lack of compression" - the data
in the toast tables is most likely already compressed data so will be
close to a 1-1 export size.




--

Shane Ambler
pgSQL@Sheeky.Biz

Get Sheeky @ http://Sheeky.Biz

Re: Compressed Backup too big

From
Tom Lane
Date:
"Andrus" <kobruleht2@hot.ee> writes:
> I created query returning biggest tables with and without indexes and found:

>       1  pg_toast_22185                      95 MB           96 MB

> 1. Tables are relatively small and thus cannot create 125 MB compressed
> backup file.

You have failed to consider that data that goes into a toast table is
usually compressed first.  The above table is probably accounting for
close to 100MB of compressed dump output.

            regards, tom lane

Re: Compressed Backup too big

From
Magnus Hagander
Date:
On Thu, 2007-11-15 at 20:35 +0200, Andrus wrote:
> "PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
> (mingw-special)"
> Database size in disk returned by  pg_database_size() is 210 MB
>
> Database compressesed  backup file size is now 125 MB.
> This seems too much. I expect compression to decrease size 10 times, also
> indexes are not backed up. A year ago compressed backup size was 9 MB only.
>
> I created query returning biggest tables with and without indexes and found:
>
>       1  pg_toast_22185                      95 MB           96 MB
>       2  rid                                 21 MB           27 MB
>       3  klient                              13 MB           19 MB
>       4  mailbox                             10 MB           11 MB
>       5  dok                                 7640 kB         12 MB
>       6  desktop                             8080 kB         8200 kB
>       7  strings                             5536 kB         6584 kB
>       8  pg_toast_22338                      5232 kB         5368 kB
>
> ...
>
> Questions:
>
> 1. Tables are relatively small and thus cannot create 125 MB compressed
> backup file.
> Why backup file sis so big ?

Is this a pg_dump backup or a PITR style backup?
If it's a pg_dump backup, you can open it up in an editor to find out
what's taking so much space.

> 2. How to determine what data is containing in  pg_toast_22185  ?
> Why this is so big ?

Could it be that you haven't been VACUUMing properly? Possibly you need
to run a VACUUM FULL if you haven't kept up. If it's a PITR style backup
on 1, that could be the same reason.


To find what table has pg_toast_22185, try:
SELECT relname FROM pg_class WHERE oid=22185


//Magnus


Re: Compressed Backup too big

From
Stuart Bishop
Date:
Filip Rembiałkowski wrote:
> 2007/11/15, Andrus <kobruleht2@hot.ee>:
>> "PostgreSQL 8.2.3 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
>> (mingw-special)"
>> Database size in disk returned by  pg_database_size() is 210 MB
>>
>> Database compressesed  backup file size is now 125 MB.
>
> How do you produce this dump? pg_dump -Fc ?
>
> If you create plain dump ( pg_dump DBNAME > dump.file ) and zip it
> (normal deflate, like winzip or sth like that), what's the size?
>
> If this is much smaller than 125MB, maybe you got some corner case
> with postgres builtin  compression.
>
> You cannot *always* expect 10x ratio... it depends on data. Maybe
> usage patterns of your database changed and now the data is less
> compressable?

We find we get much, much better compression using bzip2.

(It would be wonderful if pg_restore supported this natively, but I don't
know enough to know if it is possible or not.)

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/


Attachment