More harrowing tales of TOAST growth - Mailing list pgsql-general

From Jeffrey Baker
Subject More harrowing tales of TOAST growth
Date
Msg-id 20020505011057.GA2248@noodles
Whole thread Raw
Responses Re: More harrowing tales of TOAST growth
List pgsql-general
(This isn't the full data I someday hope to have, but it is the
limit of my current understanding.)

I've been whining here about my databases growing for a while.  I
manage a database that I have to drop and reload every two weeks
simply because it grows to fill the available storage device, with
the associated performance penalty.

I have reduced the problem to its essence by writing a few programs
that trigger it.  My environment is Debian GNU/Linux unstable
distribution, PostgreSQL 7.2.1, x86 host with 256MB main memory and
a 2-disk stripe set of 10,000RPM disks used exclusively by the
database.

I create the simple database with these commands:

createdb grow
psql -c "create table grow (body text)" grow

I then start 10 copies of the "churn" program, which is attached
here.  Churn inserts tuples of approximately 13KB which are
compressible to 10KB.  It does this continuously in a tight loop.
The ten processes simulate my situation of many clients writing into
the database.

I also start one copy of the vac program.  This program deletes all
tuples from grow and vacuums the database continuously.  This should
free all the space in the tables, which will be filled up again by
the churn programs.  I expect a sawtooth effect of database size
over time, with the database reaching an upper limit steady state
size.

I ran this experiment for 10 minutes.  At time 0 the database size
was 2MB.  At 60 seconds the size was 22MB.  The size continued
upward:

Time | Size
0      2
60     7
120    60
180    109
240    181
300    282
360    284
420    363
480    442
540    549
600    557

I stopped the experiment by killing all churn and vac processes,
which left the final 'delete from grow' command still running.
After it finished, there were 5020 tuples in grow and 35,140 tuples
in its TOAST table.  The physical size on disk was dominated by
the TOAST table at 539MB and its index of 14MB.  The actual grow
table was a mere 2MB.

The amount of useful data in the TOAST table was a scant 65MB
according to select sum(length(chunk_data)) from <table>.  This
jives with the 5000 * 10KB tuples in the grow table.   The TOAST
data claimed 19939 pages.

I don't have a complete conclusion at this point but it seems
obvious to me that after a delete and vacuum, new tuples aren't
stuck into the freed space, but are rather tacked on the end, or
something conceptually similar.

-jwb

Attachment

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_dump -C doesn't capture encoding
Next
From: Feng-Cheng Chang
Date:
Subject: Fatal error after a system crash