Thread: more about pg_toast growth

more about pg_toast growth

From: "Jeffrey W. Baker"
I have more data about unchecked growth of pg_toast tables.  After
another day's worth of inserting and deleting, the table in question has
39504 tuples.  The pg_toast table has 234773 tuples, and 126697 are
marked as unused.  The sum(length(chunk_data)) from pg_toast is
433165242.  The sum(length(resp_body)) -- the actual data in the table
-- is much bigger: 921615964.  How is that possible?
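For reference, those figures come from queries along these lines (names as
used later in this thread: table resp_body, column body; pg_toast_NNNNN is a
placeholder for the table's actual toast relation):

    -- tuples and logical column size in the main table
    SELECT count(*), sum(length(body)) FROM resp_body;

    -- tuples and bytes actually stored in the toast relation
    SELECT count(*), sum(length(chunk_data)) FROM pg_toast_NNNNN;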

In any case it is clear that the table is just growing again.  The file
increased from 420MB to 730MB overnight, without a corresponding
increase in tuples.

The free space map settings in postgresql.conf are commented out.

I'd be very interested to find out how the sum of the length of the
tuples can be much larger than both the sum of lengths from the toast
table and the actual size of the file.

Regards,
Jeffrey Baker





Re: more about pg_toast growth

From: John Gray
Caveat: Because it's otherwise quiet, I'm trying to be helpful. If you
already have someone more knowledgeable than I on the case, please don't
feel you need to respond!

On Tue, 2002-03-12 at 21:30, Jeffrey W. Baker wrote:
> I have more data about unchecked growth of pg_toast tables.  After
> another day's worth of inserting and deleting, the table in question has
> 39504 tuples.  The pg_toast table has 234773 tuples, and 126697 are
> marked as unused.  The sum(length(chunk_data)) from pg_toast is
> 433165242.  The sum(length(resp_body)) -- the actual data in the table
> -- is much bigger: 921615964.  How is that possible?
>
TOAST compresses data where possible. If your data is English text, then
that seems like a reasonable compression factor.


> In any case it is clear that the table is just growing again.  The file
> increased from 420MB to 730MB overnight, without a corresponding
> increase in tuples.
>

This implies to me that there are some updates going on against the
table. Presumably the sum(length(chunk_data)) hasn't increased during
the same period?

I should point out that although I know how TOAST is implemented, I have
no real knowledge of lazy vacuum and the free space map.

I notice that ordinary 'lazy' vacuum will only attempt a truncation if
it thinks it can reclaim at least 1/16 of the relation size.
Furthermore, it does need an exclusive lock to do this. You don't have
any process that performs lengthy operations on resp_body which would
prevent vacuum from getting its lock, do you? If it can't get the lock,
it will just give up rather than blocking.

Perhaps in this case a message could be printed in the vacuum verbose
output - "Could not get exclusive lock on relation - not truncating" - to
distinguish this case from the case where the relation is not seen as a
good candidate for attempting truncation.
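
One way to watch whether the truncation actually happens is a rough
before/after check like this (relpages in pg_class is refreshed by vacuum,
so it only gives a crude picture):

    -- toast relation page counts before and after a plain vacuum
    SELECT relname, relpages FROM pg_class WHERE relname LIKE 'pg_toast%';
    VACUUM VERBOSE resp_body;   -- also vacuums the table's toast relation
    SELECT relname, relpages FROM pg_class WHERE relname LIKE 'pg_toast%';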

> The free space map settings in postgresql.conf are commented out.
>

This implies they'll have their default values. How many tables in your
database get touched in your update process? What is the ballpark volume
of data that gets updated/inserted per day?

> I'd be very interested to find out how the sum of the length of the
> tuples can be much larger than both the sum of lengths from the toast
> table and the actual size of the file.
>
LZ Compression, as mentioned above.

You may wish to wait for a better answer before doing anything drastic
with your DB!

Regards

John


Re: more about pg_toast growth

From: Jan Wieck
Jeffrey W. Baker wrote:
> I have more data about unchecked growth of pg_toast tables.  After
> another day's worth of inserting and deleting, the table in question has
> 39504 tuples.  The pg_toast table has 234773 tuples, and 126697 are
> marked as unused.  The sum(length(chunk_data)) from pg_toast is
> 433165242.  The sum(length(resp_body)) -- the actual data in the table
> -- is much bigger: 921615964.  How is that possible?
>
> In any case it is clear that the table is just growing again.  The file
> increased from 420MB to 730MB overnight, without a corresponding
> increase in tuples.
>
> The free space map settings in postgresql.conf are commented out.
>
> I'd be very interested to find out how the sum of the length of the
> tuples can be much larger than both the sum of lengths from the toast
> table and the actual size of the file.

    Remember,  TOAST  doesn't  only  come  in  slices,  don't you
    usually brown it?  Meaning, the data gets compressed (with  a
    lousy  but  really  fast  algorithm).   What  kind of data is
    resp_body? 50% compression  ratio  ...  I  guess  it's  html,
    right?

    Anyway,  I  would  suggest  you  increase  the  max_fsm_pages
    parameter. Commented-out parameters in the postgresql.conf
    file mean "default". You said you're doing about 1,000
    inserts an hour and a daily bulk delete of approx. 24,000.
    Assuming most of the toast tuples are contiguous, that'd mean
    you are freeing something like 35,000 toast  pages.  I  would
    suggest  a freespace map size of 50,000 pages, to start with.
    That should at least lower the growth rate. If you still  see
    growth, go ahead and increase it further.
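
    As a rough sketch, that means uncommenting the two FSM lines in
    postgresql.conf along these lines (illustrative values; the
    postmaster has to be restarted for them to take effect):

    max_fsm_relations = 1000   # relations tracked in the free space map
    max_fsm_pages = 50000      # pages with free space, shared across all relations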


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #





Re: more about pg_toast growth

From: "Jeffrey W. Baker"
On Wed, 2002-03-13 at 07:22, Jan Wieck wrote:
> Jeffrey W. Baker wrote:
> > I have more data about unchecked growth of pg_toast tables.  After
> > another day's worth of inserting and deleting, the table in question has
> > 39504 tuples.  The pg_toast table has 234773 tuples, and 126697 are
> > marked as unused.  The sum(length(chunk_data)) from pg_toast is
> > 433165242.  The sum(length(resp_body)) -- the actual data in the table
> > -- is much bigger: 921615964.  How is that possible?
> >
> > In any case it is clear that the table is just growing again.  The file
> > increased from 420MB to 730MB overnight, without a corresponding
> > increase in tuples.
> >
> > The free space map settings in postgresql.conf are commented out.
> >
> > I'd be very interested to find out how the sum of the length of the
> > tuples can be much larger than both the sum of lengths from the toast
> > table and the actual size of the file.
>
>     Remember,  TOAST  doesn't  only  come  in  slices,  don't you
>     usually brown it?  Meaning, the data gets compressed (with  a
>     lousy  but  really  fast  algorithm).   What  kind of data is
>     resp_body? 50% compression  ratio  ...  I  guess  it's  html,
>     right?

It is gzipped and base64-encoded text.  It's somewhat strange that a
fast LZ would deflate it very much, but I guess it must be an artifact
of the base64.  The initial gzip tends to deflate the data by about 90%.

>     Anyway,  I  would  suggest  you  increase  the  max_fsm_pages
>     parameter. Commented-out parameters in the postgresql.conf
>     file mean "default". You said you're doing about 1,000
>     inserts an hour and a daily bulk delete of approx. 24,000.
>     Assuming most of the toast tuples are contiguous, that'd mean
>     you are freeing something like 35,000 toast  pages.  I  would
>     suggest  a freespace map size of 50,000 pages, to start with.
>     That should at least lower the growth rate. If you still  see
>     growth, go ahead and increase it further.

I will happily do so.  What is the argument against increasing the free
space map?  Does it consume more memory?  The machine has 4GB main
memory, and I wouldn't notice increased consumption of a few megabytes.

What triggers storage in the toast tables?  Is it because of a tuple
longer than some threshold?

-jwb



Re: more about pg_toast growth

From: Jan Wieck
Jeffrey W. Baker wrote:
> On Wed, 2002-03-13 at 07:22, Jan Wieck wrote:
> > [...]
> >
> >     Remember,  TOAST  doesn't  only  come  in  slices,  don't you
> >     usually brown it?  Meaning, the data gets compressed (with  a
> >     lousy  but  really  fast  algorithm).   What  kind of data is
> >     resp_body? 50% compression  ratio  ...  I  guess  it's  html,
> >     right?
>
> It is gzipped and base64-encoded text.  It's somewhat strange that a
> fast LZ would deflate it very much, but I guess it must be an artifact
> of the base64.  The initial gzip tends to deflate the data by about 90%.

    Now  THAT is very surprising to me! The SLZ algorithm used in
    TOAST will for sure not be able to squeeze anything out of  a
    gzip  compressed  stream.   The result would be bigger again.
    B64 changes the file size basically to 4/3rd, but  since  the
    input  stream  is gzipped, the resulting B64 stream shouldn't
    contain patterns that SLZ can use to reduce the size again.

    Are you sure you're B64-encoding the gzipped  text?  I  mean,
    you have an average body size of 23K "gzipped", so you're
    saying that the average uncompressed body size is about
    230K?  You  are  storing  230  Megabytes of raw body data per
    hour? Man, who is writing all that text?

>
> >     Anyway,  I  would  suggest  you  increase  the  max_fsm_pages
> >     parameter. Commented-out parameters in the postgresql.conf
> >     file mean "default". You said you're doing about 1,000
> >     inserts an hour and a daily bulk delete of approx. 24,000.
> >     Assuming most of the toast tuples are contiguous, that'd mean
> >     you are freeing something like 35,000 toast  pages.  I  would
> >     suggest  a freespace map size of 50,000 pages, to start with.
> >     That should at least lower the growth rate. If you still  see
> >     growth, go ahead and increase it further.
>
> I will happily do so.  What is the argument against increasing the free
> space map?  Does it consume more memory?  The machine has 4GB main
> memory, and I wouldn't notice increased consumption of a few megabytes.

    It allocates some more shared  memory.  It's  surely  in  the
    range of a few megabytes, so no need to worry in this case.

> What triggers storage in the toast tables?  Is it because of a tuple
> longer than some threshold?

    It is triggered by the attempt to store a row bigger than 1/4
    of the blocksize. Blocksize defaults to 8K. The toaster tries
    to shrink down the row size by first compressing attribute by
    attribute, then by moving them out to  the  toast  table.  As
    soon as the row fits into the 2K it'll stop and the storage
    happens.

    The 1/4 blocksize is something I chose because it had the
    best  performance  in  my tests. The theory behind it is that
    your key fields are likely to be the small ones  that  remain
    uncompressed  in  the  main-row.   If more such main rows fit
    into fewer blocks, you have better caching of key attributes,
    thus faster scans and joins. And access to the huge values is
    only done after the final result set is collected, which
    usually becomes a smaller percentage of the entire data the
    larger the data is. So it all reduces I/O.
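
    If you want to see which toast relation backs resp_body and how big
    it currently is, a rough catalog query (assuming the default 8K
    block size) would be:

    SELECT t.relname AS toast_table,
           t.relpages,
           t.relpages * 8 AS approx_kb        -- 8K pages
      FROM pg_class c
      JOIN pg_class t ON t.oid = c.reltoastrelid
     WHERE c.relname = 'resp_body';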


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #





Re: more about pg_toast growth

From: "Jeffrey W. Baker"
On Wed, 2002-03-13 at 12:16, Jan Wieck wrote:
> Jeffrey W. Baker wrote:
> > On Wed, 2002-03-13 at 07:22, Jan Wieck wrote:
> > > [...]
> > >
> > >     Remember,  TOAST  doesn't  only  come  in  slices,  don't you
> > >     usually brown it?  Meaning, the data gets compressed (with  a
> > >     lousy  but  really  fast  algorithm).   What  kind of data is
> > >     resp_body? 50% compression  ratio  ...  I  guess  it's  html,
> > >     right?
> >
> > It is gzipped and base64-encoded text.  It's somewhat strange that a
> > fast LZ would deflate it very much, but I guess it must be an artifact
> > of the base64.  The initial gzip tends to deflate the data by about 90%.
>
>     Now  THAT is very surprising to me! The SLZ algorithm used in
>     TOAST will for sure not be able to squeeze anything out of  a
>     gzip  compressed  stream.   The result would be bigger again.
>     B64 changes the file size basically to 4/3rd, but  since  the
>     input  stream  is gzipped, the resulting B64 stream shouldn't
>     contain patterns that SLZ can use to reduce the size again.
>
>     Are you sure you're B64-encoding the gzipped  text?

I am positive:

rupert=# select substr(body, 0, 200) from resp_body where resp = (select
max(resp) from resp_body);

eJztfXt34riy799hrf4OGuZMJ1k3BL949SScRQhJmCbAAbp7z75zV5bAAjxtbI5tkjB75rvfkiwb
GxxDHt0dgvtBjC2VpFLVr6qkknMydiZ6+WRMsFo+6dV7jVqZnOE5ami2oxkjG31ALWdMLLgxIIZN
UFvHDrFPsm7Z1MmEOBiNHWeaIf87025P07X7qWYRO40Gp

rupert=# select min(length(body)), max(length(body)), avg(length(body))
from resp_body;
 min |  max   |       avg
-----+--------+------------------
   0 | 261948 | 21529.5282897281

>     I  mean,
>     you have an average body size of 23K "gzipped", so you're
>     saying that the average uncompressed body size is about
>     230K?  You  are  storing  230  Megabytes of raw body data per
>     hour? Man, who is writing all that text?

Reuters.

I have increased the free space map and will be able to restart the
postmaster today at around midnight GMT.

Thanks for your help,
Jeffrey


Re: more about pg_toast growth

From: Jan Wieck
Jeffrey W. Baker wrote:
> On Wed, 2002-03-13 at 12:16, Jan Wieck wrote:
> >
> >     Are you sure you're B64-encoding the gzipped  text?
>
> I am positive:
>
> rupert=# select substr(body, 0, 200) from resp_body where resp = (select
> max(resp) from resp_body);
>
> eJztfXt34riy799hrf4OGuZMJ1k3BL949SScRQhJmCbAAbp7z75zV5bAAjxtbI5tkjB75rvfkiwb
> GxxDHt0dgvtBjC2VpFLVr6qkknMydiZ6+WRMsFo+6dV7jVqZnOE5ami2oxkjG31ALWdMLLgxIIZN
> UFvHDrFPsm7Z1MmEOBiNHWeaIf87025P07X7qWYRO40Gp

    Hmmm, that doesn't decode to a gzip compressed stream. But it
    doesn't decode to readable text either, so I have no clue
    what you encoded. Note that the emphasis in the above
    question was on "gzipped", not on "B64-encoding".

>
> > [...]
> >     hour? Man, who is writing all that text?
>
> Reuters.

    Ah :-)

> I have increased the free space map and will be able to restart the
> postmaster today at around midnight GMT.

    Any news?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #





Re: more about pg_toast growth

From: "Jeffrey W. Baker"
On Fri, 2002-03-15 at 06:47, Jan Wieck wrote:
> Jeffrey W. Baker wrote:
> > On Wed, 2002-03-13 at 12:16, Jan Wieck wrote:
> > >
> > >     Are you sure you're B64-encoding the gzipped  text?
> >
> > I am positive:
> >
> > rupert=# select substr(body, 0, 200) from resp_body where resp = (select
> > max(resp) from resp_body);
> >
> > eJztfXt34riy799hrf4OGuZMJ1k3BL949SScRQhJmCbAAbp7z75zV5bAAjxtbI5tkjB75rvfkiwb
> > GxxDHt0dgvtBjC2VpFLVr6qkknMydiZ6+WRMsFo+6dV7jVqZnOE5ami2oxkjG31ALWdMLLgxIIZN
> > UFvHDrFPsm7Z1MmEOBiNHWeaIf87025P07X7qWYRO40Gp
>
>     Hmmm, that doesn't decode to a gzip compressed stream. But it
>     doesn't decode to readable text either, so I have no clue
>     what you encoded. Note that the emphasis in the above
>     question was on "gzipped", not on "B64-encoding".

You can't just throw a gzipped stream into a file and try to gunzip it.
I'm positive the data is gzipped, then base-64 encoded, because I use
this exact code to get it in there:

    # Insert the response body if any

    $zbody = MIME::Base64::encode_base64(Compress::Zlib::compress($resp->content()));

    $sth = $dbh->prepare(qq{
        INSERT INTO resp_body (resp, body) VALUES ($id, ?)
    });
    $sth->execute($zbody);

And I use the inverse to retrieve it.


> > I have increased the free space map and will be able to restart the
> > postmaster today at around midnight GMT.
>
>     Any news?

I couldn't work it in to the last maintenance window.  I'll give it
another run tonight.

Thanks again,
Jeffrey


Re: more about pg_toast growth

From: "Jeffrey W. Baker"
On Fri, 2002-03-15 at 10:35, Jeffrey W. Baker wrote:
> On Fri, 2002-03-15 at 06:47, Jan Wieck wrote:
>
> > > I have increased the free space map and will be able to restart the
> > > postmaster today at around midnight GMT.
> >
> >     Any news?
>
> I couldn't work it in to the last maintenance window.  I'll give it
> another run tonight.

I increased the FSM and restarted postgres several weeks ago, and the
toast tables continue to just grow and grow.  I set it to:

max_fsm_relations = 1000    # min 10, fsm is free space map
max_fsm_pages = 100000      # min 1000, fsm is free space map

Now, my table with only 107490 rows takes 20GB on disk.  I dumped and
reloaded the database and got 18.5GB free space back.

I really think this is a bug in the implementation.

-jwb


Re: more about pg_toast growth

From: Jan Wieck
Jeffrey W. Baker wrote:
> On Fri, 2002-03-15 at 10:35, Jeffrey W. Baker wrote:
> > On Fri, 2002-03-15 at 06:47, Jan Wieck wrote:
> >
> > > > I have increased the free space map and will be able to restart the
> > > > postmaster today at around midnight GMT.
> > >
> > >     Any news?
> >
> > I couldn't work it in to the last maintenance window.  I'll give it
> > another run tonight.
>
> I increased the FSM and restarted postgres several weeks ago, and the
> toast tables continue to just grow and grow.  I set it to:
>
> max_fsm_relations = 1000    # min 10, fsm is free space map
> max_fsm_pages = 100000      # min 1000, fsm is free space map
>
> Now, my table with only 107490 rows takes 20GB on disk.  I dumped and
> reloaded the database and got 18.5GB free space back.

    Dump  and  reload  is  a  bit  drastic.  A  "VACUUM FULL" and
    "REINDEX" should've done that as well.

> I really think this is a bug in the implementation.

    Come on, Jeffrey, do a little math - it's not *that* hard  to
    understand.  A  free  space  map  of 100,000 entries can hold
    information about 100,000  pages  where  new  tuples  can  be
    stored. That's 100,000 maybe partially filled 8K pages, so
    we're talking about 800MB here.

    As soon as your table has more than 100,000 partially
    filled or entirely free pages due to deleted rows, you start
    losing pages. Now, how much data were you pumping through
    that table by the hour?

    I  really  think this time the bug is in front of the screen,
    not behind it :-) Give it a chance and increase max_fsm_pages
    to 10 million.
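
    To put rough numbers on that from the catalogs (each page is 8K, so
    100,000 FSM slots cover about 800MB worth of pages):

    SELECT relname,
           relpages,
           relpages * 8 / 1024 AS approx_mb   -- 8K pages
      FROM pg_class
     WHERE relname LIKE 'pg_toast_%'
     ORDER BY relpages DESC;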


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: more about pg_toast growth

From: "Jeffrey W. Baker"
On Tue, 2002-04-09 at 06:06, Jan Wieck wrote:
> Jeffrey W. Baker wrote:
> > On Fri, 2002-03-15 at 10:35, Jeffrey W. Baker wrote:
> > > On Fri, 2002-03-15 at 06:47, Jan Wieck wrote:
> > >
> > > > > I have increased the free space map and will be able to restart the
> > > > > postmaster today at around midnight GMT.
> > > >
> > > >     Any news?
> > >
> > > I couldn't work it in to the last maintenance window.  I'll give it
> > > another run tonight.
> >
> > I increased the FSM and restarted postgres several weeks ago, and the
> > toast tables continue to just grow and grow.  I set it to:
> >
> > max_fsm_relations = 1000    # min 10, fsm is free space map
> > max_fsm_pages = 100000      # min 1000, fsm is free space map
> >
> > Now, my table with only 107490 rows takes 20GB on disk.  I dumped and
> > reloaded the database and got 18.5GB free space back.
>
>     Dump  and  reload  is  a  bit  drastic.  A  "VACUUM FULL" and
>     "REINDEX" should've done that as well.

VACUUM FULL takes an extremely long time, while dump and reload takes
about 5 minutes.  They both require exclusive access.

> > I really think this is a bug in the implementation.
>
>     Come on, Jeffrey, do a little math - it's not *that* hard  to
>     understand.  A  free  space  map  of 100,000 entries can hold
>     information about 100,000  pages  where  new  tuples  can  be
> >     stored. That's 100,000 maybe partially filled 8K pages, so
>     we're talking about 800MB here.

Please post a URL to this documentation.  Please post a URL which
clarifies whether the FSM is shared for all tables or if another is
allocated for each table.

I have "done a little math", and the peak amount of data in the table is
approximately 450MB.  A free space map tracking 800MB of space should be
plenty.

>     As soon as your table has more than 100,000 partially
>     filled or entirely free pages due to deleted rows, you start
>     losing pages. Now, how much data were you pumping through
>     that table by the hour?
>
>     I  really  think this time the bug is in front of the screen,

You haven't earned the right to insult me.

>     not behind it :-) Give it a chance and increase max_fsm_pages
>     to 10 million.

Your previous advice:

On Wed, 2002-03-13 at 07:22, Jan Wieck wrote:
>     Anyway,  I  would  suggest  you  increase  the  max_fsm_pages
>     parameter. Commented-out parameters in the postgresql.conf
>     file mean "default". You said you're doing about 1,000
>     inserts an hour and a daily bulk delete of approx. 24,000.
>     Assuming most of the toast tuples are contiguous, that'd mean
>     you are freeing something like 35,000 toast  pages.  I  would
>     suggest  a freespace map size of 50,000 pages, to start with.

I doubled that, and it still doesn't work.  You are suggesting I
increase your previous estimate by a factor of 200.  Your email of
2002-03-13 at 15:16 -0500 suggests a FSM of 50,000 pages allocates "some
more shared memory.  It's  surely  in  the range of a few megabytes..."
Will a FSM map 200 times larger require 200 times more memory, or is the
growth nonlinear?  How can I calculate this requirement?  Without some
documentation this database is inoperable.

I stand behind my previous statement: if PostgreSQL's unchecked table
growth can only be prevented by changing an undocumented configuration
key using an undocumented formula producing undocumented system impact,
the implementation is flawed.

I thank you for your advice, but your insults are not welcome.

-jwb


Re: more about pg_toast growth

From: Bruce Momjian
> I doubled that, and it still doesn't work.  You are suggesting I
> increase your previous estimate by a factor of 200.  Your email of
> 2002-03-13 at 15:16 -0500 suggests a FSM of 50,000 pages allocates "some
> more shared memory.  It's  surely  in  the range of a few megabytes..."
> Will a FSM map 200 times larger require 200 times more memory, or is the
> growth nonlinear?  How can I calculate this requirement?  Without some
> documentation this database is inoperable.
>
> I stand behind my previous statement: if PostgreSQL's unchecked table
> growth can only be prevented by changing an undocumented configuration
> key using an undocumented formula producing undocumented system impact,
> the implementation is flawed.

This does bring up a point that VACUUM alone does not handle all cases
of reusing tuple space.  VACUUM FULL is needed occasionally.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: more about pg_toast growth

From: Jan Wieck
Bruce Momjian wrote:
> > I doubled that, and it still doesn't work.  You are suggesting I
> > increase your previous estimate by a factor of 200.  Your email of
> > 2002-03-13 at 15:16 -0500 suggests a FSM of 50,000 pages allocates "some
> > more shared memory.  It's  surely  in  the range of a few megabytes..."
> > Will a FSM map 200 times larger require 200 times more memory, or is the
> > growth nonlinear?  How can I calculate this requirement?  Without some
> > documentation this database is inoperable.
> >
> > I stand behind my previous statement: if PostgreSQL's unchecked table
> > growth can only be prevented by changing an undocumented configuration
> > key using an undocumented formula producing undocumented system impact,
> > the implementation is flawed.
>
> This does bring up a point that VACUUM alone does not handle all cases
> of reusing tuple space.  VACUUM FULL is needed occasionally.

    I still believe it's due to the massive amount of data pumped
    through that table between vacuums and inappropriate settings
    for the freespace map size for this particular case.

    Initially  I suggested an FSM size of 50,000 "to start with".
    That was meant as an introduction to play around  with  these
    parameters a little, figuring out what the right settings are
    in his case, and reporting back the result. What we got  back
    after  a  week  or longer, was a lax "still doesn't work". It
    seemed to me he had not spent a lot of time to understand the
    underlying  concepts,  nor has he ever taken a single look at
    the code. Pumping multiple  gigabytes  every  day  through  a
    database is not the occational DB usage, where you can expect
    default settings to be appropriate. This is  clearly  a  case
    where  someone has to "learn" the finer details about tuning.

    This is an open source project.  Getting that pi**ed about my
    response, and asking that snobbishly for URLs to the appropriate
    documentation, finally telling "this database is inoperable",
    well,  maybe  he's  better  off  with  a support contract for
    Oracle or SQL-Server.  At  least  he'll  not  get  any  picky
    comments from those people.

    I  will  look  into  it another day, but without someone else
    running into the same problem, I don't feel much pressure
    to do so right now.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: more about pg_toast growth

From: "Jeffrey W. Baker"
On Tue, 2002-04-09 at 11:52, Jan Wieck wrote:
> Bruce Momjian wrote:
> > > I doubled that, and it still doesn't work.  You are suggesting I
> > > increase your previous estimate by a factor of 200.  Your email of
> > > 2002-03-13 at 15:16 -0500 suggests a FSM of 50,000 pages allocates "some
> > > more shared memory.  It's  surely  in  the range of a few megabytes..."
> > > Will a FSM map 200 times larger require 200 times more memory, or is the
> > > growth nonlinear?  How can I calculate this requirement?  Without some
> > > documentation this database is inoperable.
> > >
> > > I stand behind my previous statement: if PostgreSQL's unchecked table
> > > growth can only be prevented by changing an undocumented configuration
> > > key using an undocumented formula producing undocumented system impact,
> > > the implementation is flawed.
> >
> > This does bring up a point that VACUUM alone does not handle all cases
> > of reusing tuple space.  VACUUM FULL is needed occasionally.
>
>     I still believe it's due to the massive amount of data pumped
>     through that table between vacuums and inappropriate settings
>     for the freespace map size for this particular case.
>
>     Initially  I suggested an FSM size of 50,000 "to start with".
>     That was meant as an introduction to play around  with  these
>     parameters a little, figuring out what the right settings are
>     in his case, and reporting back the result. What we got  back
>     after  a  week  or longer, was a lax "still doesn't work". It
>     seemed to me he had not spent a lot of time to understand the
>     underlying  concepts,  nor has he ever taken a single look at
>     the code.

I don't need this abuse.  I'm perfectly capable of reading the source
code for PostgreSQL.  I helped diagnose a spinlock contention problem in
the 7.2beta series and I maintain DBD::Pg.  And I've contributed source
code to several other projects which I need not list here.

Surely I'd be willing to learn how to tune Pg, since I've staked my
operation on it.  But, "max_fsm_pages" produces fewer pages on Google
than "deep fried orangutan pancreas", and I do not often have the time
to leaf through 460,000 lines of source.  Some documentation would be
swell.

-jwb


Re: more about pg_toast growth

From: Jan Wieck
Jeffrey W. Baker wrote:
> On Tue, 2002-04-09 at 11:52, Jan Wieck wrote:
> >     I still believe it's due to the massive amount of data pumped
> >     through that table between vacuums and inappropriate settings
> >     for the freespace map size for this particular case.
> >
> >     Initially  I suggested an FSM size of 50,000 "to start with".
> >     That was meant as an introduction to play around  with  these
> >     parameters a little, figuring out what the right settings are
> >     in his case, and reporting back the result. What we got  back
> >     after  a  week  or longer, was a lax "still doesn't work". It
> >     seemed to me he had not spent a lot of time to understand the
> >     underlying  concepts,  nor has he ever taken a single look at
> >     the code.
>
> I don't need this abuse.  I'm perfectly capable of reading the source
> code for PostgreSQL.  I helped diagnose a spinlock contention problem in
> the 7.2beta series and I maintain DBD::Pg.  And I've contributed source
> code to several other projects which I need not list here.

    Abuse of what?

    I did not mean to offend you in any way. Maybe I wasn't
    careful enough in choosing my words, maybe you missed the
    smiley in my earlier posting (the one about doing a little
    math), maybe whatever.

    Anyway, it seems we don't get along well enough to work
    together. Thus, it's better someone else helps you diagnose
    your problem. Good luck.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: more about pg_toast growth

From: Martijn van Oosterhout
On Tue, Apr 09, 2002 at 12:26:38PM -0700, Jeffrey W. Baker wrote:
> Surely I'd be willing to learn how to tune Pg, since I've staked my
> operation on it.  But, "max_fsm_pages" produces fewer pages on Google
> than "deep fried orangutan pancreas", and I do not often have the time
> to leaf through 460,000 lines of source.  Some documentation would be
> swell.

In case anyone thinks he's kidding, he's not. max_fsm_pages produces a
total of 5 hits: one under runtime configuration in the 7.3devel docs (which
is interesting since it's not runtime configurable), and one in the
postgresql.conf sample file.

--
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> Ignorance continues to thrive when intelligent people choose to do
> nothing.  Speaking out against censorship and ignorance is the imperative
> of all intelligent people.

Re: more about pg_toast growth

From: Richard A Lough
Martijn van Oosterhout wrote:
>
> On Tue, Apr 09, 2002 at 12:26:38PM -0700, Jeffrey W. Baker wrote:
> > Surely I'd be willing to learn how to tune Pg, since I've staked my
> > operation on it.  But, "max_fsm_pages" produces fewer pages on Google
> > than "deep fried orangutan pancreas", and I do not often have the time
> > to leaf through 460,000 lines of source.  Some documentation would be
> > swell.
>
> In case anyone thinks he's kidding, he's not. max_fsm_pages produces a
> total of 5 hits: one under runtime configuration in the 7.3devel docs (which
> is interesting since it's not runtime configurable), and one in the
> postgresql.conf sample file.
>

I'd be interested in trying to reproduce the problem. I'm trying to put
some figures together on how postgres uses disk space.

Unless I've missed something, there isn't anything in the earlier thread
on whether the table gets dropped at any point, or whether the records get
changed by INSERTs or whatever.

TIA

Richard A Lough

Re: more about pg_toast growth

From: Martijn van Oosterhout
On Wed, Apr 10, 2002 at 07:59:53AM +0100, Richard A Lough wrote:
> Martijn van Oosterhout wrote:
> >
> > On Tue, Apr 09, 2002 at 12:26:38PM -0700, Jeffrey W. Baker wrote:
> > > Surely I'd be willing to learn how to tune Pg, since I've staked my
> > > operation on it.  But, "max_fsm_pages" produces fewer pages on Google
> > > than "deep fried orangutan pancreas", and I do not often have the time
> > > to leaf through 460,000 lines of source.  Some documentation would be
> > > swell.
> >
> > In case anyone thinks he's kidding, he's not. max_fsm_pages produces a
> > total of 5 hits: one under runtime configuration in the 7.3devel docs (which
> > is interesting since it's not runtime configurable), and one in the
> > postgresql.conf sample file.
> >
>
> I'd be interested in trying to reproduce the problem. I'm trying to put
> some figures together on how postgres uses disk space.
>
> Unless I've missed something, there isn't anything in the earlier thread
> on whether the table gets dropped anytime, or whether the records get
> changed by INSERT's or whatever.

It seems he's inserting and deleting around 300MB of data per day. The
pg_toast table seems to be full of tuples averaging 10k in size: 1,000
inserts an hour, and around 24,000 deletes per day.

The problem with the fsm seems to stem from too many deletions between
vacuums, causing it to lose track. This shouldn't be too hard to test,
assuming you have the disk space :).
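
A minimal sketch of such a test (toast_test and the sizes are made up and
would need scaling up to match his workload):

    -- a table with a column large enough to be toasted
    CREATE TABLE toast_test (id serial, body text);

    -- seed one big row, then double the table a few times; the string
    -- should not compress well (e.g. base64 of compressed data), or it
    -- will stay inline in compressed form instead of being toasted
    INSERT INTO toast_test (body) VALUES ('...some 20KB string...');
    INSERT INTO toast_test (body) SELECT body FROM toast_test;   -- repeat to taste

    -- bulk delete, vacuum, and watch the toast relation's page count
    DELETE FROM toast_test WHERE id % 10 <> 0;
    VACUUM VERBOSE toast_test;
    SELECT relname, relpages FROM pg_class
     WHERE relname LIKE 'pg_toast%' ORDER BY relpages DESC;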
--
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> Ignorance continues to thrive when intelligent people choose to do
> nothing.  Speaking out against censorship and ignorance is the imperative
> of all intelligent people.