Re: more about pg_toast growth - Mailing list pgsql-general

From Jeffrey W. Baker
Subject Re: more about pg_toast growth
Msg-id 1016051755.5255.26.camel@heat
In response to Re: more about pg_toast growth  (Jan Wieck <janwieck@yahoo.com>)
Responses Re: more about pg_toast growth
List pgsql-general

On Wed, 2002-03-13 at 12:16, Jan Wieck wrote:
> Jeffrey W. Baker wrote:
> > On Wed, 2002-03-13 at 07:22, Jan Wieck wrote:
> > > [...]
> > >
> > >     Remember,  TOAST  doesn't  only  come  in  slices,  don't you
> > >     usually brown it?  Meaning, the data gets compressed (with  a
> > >     lousy  but  really  fast  algorithm).   What  kind of data is
> > >     resp_body? 50% compression  ratio  ...  I  guess  it's  html,
> > >     right?
> >
> > It is gzipped and base64-encoded text.  It's somewhat strange that a
> > fast LZ would deflate it very much, but I guess it must be an artifact
> > of the base64.  The initial gzip tends to deflate the data by about 90%.
>
>     Now  THAT is very surprising to me! The SLZ algorithm used in
>     TOAST will for sure not be able to squeeze anything out of  a
>     gzip  compressed  stream.   The result would be bigger again.
>     B64 changes the file size basically to 4/3rd, but  since  the
>     input  stream  is gzipped, the resulting B64 stream shouldn't
>     contain patterns that SLZ can use to reduce the size again.
>
>     Are you sure you're B64-encoding the gzipped  text?

I am positive:

rupert=# select substr(body, 0, 200) from resp_body where resp = (select
max(resp) from resp_body);

eJztfXt34riy799hrf4OGuZMJ1k3BL949SScRQhJmCbAAbp7z75zV5bAAjxtbI5tkjB75rvfkiwb
GxxDHt0dgvtBjC2VpFLVr6qkknMydiZ6+WRMsFo+6dV7jVqZnOE5ami2oxkjG31ALWdMLLgxIIZN
UFvHDrFPsm7Z1MmEOBiNHWeaIf87025P07X7qWYRO40Gp

rupert=# select min(length(body)), max(length(body)), avg(length(body))
from resp_body;
 min |  max   |       avg
-----+--------+------------------
   0 | 261948 | 21529.5282897281
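As a cross-check on the sample above: the first characters of the output, "eJzt...", base64-decode to the bytes 0x78 0x9c, which is a standard zlib (RFC 1950) stream header, so the column really does appear to hold deflate-compressed data under the base64. A minimal sketch of that check follows; the helper name looks_like_zlib is made up for illustration and is not part of PostgreSQL or any library.

```python
import base64


def looks_like_zlib(b64_prefix: str) -> bool:
    """Return True if the base64 text starts with a valid zlib (RFC 1950) header.

    Hypothetical helper for illustration only.
    """
    # base64 decodes in groups of 4 characters, so drop any trailing partial group.
    usable = len(b64_prefix) - len(b64_prefix) % 4
    raw = base64.b64decode(b64_prefix[:usable])
    if len(raw) < 2:
        return False
    cmf, flg = raw[0], raw[1]
    # Low nibble of CMF must be 8 (deflate), and the 16-bit header
    # value must be a multiple of 31, per RFC 1950.
    return (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0


# First eight characters of the substr() output shown above.
print(looks_like_zlib("eJztfXt3"))  # True
```

This only inspects the two header bytes; it does not prove the whole value inflates cleanly.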

>     I  mean,
>     you  have  an  average  body size of 23K "gzipped", so you're
>     telling that the average  uncompressed  body  size  is  about
>     230K?  You  are  storing  230  Megabytes of raw body data per
>     hour? Man, who is writing all that text?

Reuters.
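
For the size arithmetic being discussed, here is a rough sketch under stated assumptions: the stored text is pure base64 (4 output characters per 3 input bytes, ignoring padding and any line breaks), one character is one byte, and the initial compression removes about 90% as mentioned earlier. The function name estimate_sizes is made up for illustration. Note that length(body) reports the logical length of the value, not what TOAST actually keeps on disk.

```python
def estimate_sizes(stored_b64_bytes: float, gzip_reduction: float = 0.90):
    """Estimate compressed and raw sizes from the base64 length.

    Hypothetical helper; gzip_reduction is the assumed fraction removed
    by the initial compression (about 90% per the thread).
    """
    # base64 expands data to 4/3 of its size, so undo that first.
    compressed = stored_b64_bytes * 3 / 4
    # If compression removed 90%, the compressed form is 10% of the raw text.
    raw = compressed / (1 - gzip_reduction)
    return compressed, raw


# Average length(body) reported by the query above.
compressed, raw = estimate_sizes(21529.5282897281)
print(f"compressed ~{compressed / 1000:.0f} KB, raw ~{raw / 1000:.0f} KB")
```

With the average from the query this gives roughly 16 KB compressed and on the order of 160 KB of raw text per row, somewhat below the 230K figure quoted above but the same order of magnitude.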

I have increased the free space map and will be able to restart the
postmaster today at around midnight GMT.

Thanks for your help,
Jeffrey

