Thread: Memory bloating

Memory bloating

From: Micah Anderson
Hello,

We are using postgresql exclusively as the database backend for a
high-traffic site, sustaining 3-4 pages per second and in many cases
bursting well over that. At least half of those accesses are pgsql
SELECTs; we rarely, if ever, use DELETE. It seems worst on tables with
more than about 1000 rows or 1000 hits an hour, or both.

Recently our load has been threatening us, fluctuating between 5 and
50 (yes, we've even been forced to reboot the machine it was so bad). The
load has been isolated to postgres, and we can't seem to tame it. If we
cannot find a way to make postgres behave, we will be forced to move to
mysql. We don't want to, but it has been discussed since we can't seem to
isolate this problem. This is a last-ditch effort to find out what can be
done before we dump postgresql.

The bloat is essentially the raw database file filling up VERY rapidly,
up to 300 megs of bloat over 8 hours; when we vacuum, it drops back down
to 2 megs and the CPU load is reduced considerably. VACUUM VERBOSE says
things like 3000 pages of data, 45000 pages freed. The VACUUM VERBOSE I
include below is particularly striking (but not the best one I've seen):
it reduced the 14 meg database file down to 3 megs, 2136 pages to 356,
a factor of 7!

Does this have anything to do with how we use varchar/text? Should we
switch to fixed-size text fields?

Is it that 3-4 hits/sec is too much for postgresql to handle?

btw - using 7.0.2

Following are our tables and the output of a VACUUM VERBOSE:

 List of relations
   Name    |   Type   |  Owner
-----------+----------+----------
 groupid   | sequence | postgres
 groups    | table    | postgres
 webcast   | table    | postgres
 webcastid | sequence | postgres
 weblink   | table    | postgres
 weblinkid | sequence | postgres
(6 rows)

active_prague=# \d webcast
            Table "webcast"
  Attribute   |     Type     | Modifier
--------------+--------------+----------
 id           | integer      | not null
 heading      | char(90)     |
 author       | char(45)     |
 date_entered | char(45)     |
 article      | text         |
 contact      | varchar(80)  |
 link         | varchar(160) |
 address      | varchar(160) |
 phone        | varchar(80)  |
 parent_id    | integer      |
 mime_type    | char(50)     |
 summary      | text         |
 numcomment   | smallint     |
 arttype      | char(50)     |
 html_file    | char(160)    |
 created      | timestamp    |
 modified     | timestamp    |
 linked_file  | char(160)    |
 mirrored     | boolean      |
 display      | boolean      |


The following VACUUM reduced the 14 meg database file down to 3
megs, that is, 2136 pages to 356, a factor of 7!

NOTICE:  --Relation webcast--
NOTICE:  Pages 2136: Changed 0, reaped 1781, Empty 0, New 0; Tup 1913:
Vac 9561, Keep/VTL 0/0, Crash 0, UnUsed 1433, MinLen 726, MaxLen 6858;
Re-using: Free/Avail. Space 14514420/14514420; EndEmpty/Avail. Pages
0/1781. CPU 0.14s/0.08u sec.
NOTICE:  Rel webcast: Pages: 2136 --> 356; Tuple(s) moved: 1913. CPU
0.29s/0.05u sec.
VACUUM



--
Micah Anderson
Collective Technologies
www.colltech.com

"To be and not to do is not to be at all"

Re: Memory bloating

From: Stephan Szabo
Are you using a fair number of updates?  Postgres uses a
non-overwriting storage manager, so updates effectively create
a new row and mark the old one as no longer valid (a great
simplification of reality, but...)
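A minimal sketch of what that means in practice (the table and values
here are made up, not from your schema):

```sql
-- Hypothetical example: every UPDATE writes a new row version and leaves
-- the old one behind as a dead tuple until a VACUUM reclaims it.
CREATE TABLE counter (page varchar(80), hits integer);
INSERT INTO counter VALUES ('index.html', 0);

-- Two updates = two dead row versions left on disk, even though only
-- one row is visible to queries.
UPDATE counter SET hits = hits + 1 WHERE page = 'index.html';
UPDATE counter SET hits = hits + 1 WHERE page = 'index.html';

-- Reclaim the dead versions; VERBOSE reports how many tuples were vacuumed.
VACUUM VERBOSE counter;
```

A high-traffic site doing this on every page view accumulates dead
tuples very quickly, which matches the file growth you describe.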

A useful thing would be an example of the queries that are giving
problems along with their explain output and any indexes that are on the
table in question.

Stephan Szabo
sszabo@bigpanda.com



Updates

From: "Adam Lang"
So if you update a row, you are still keeping a copy of the old row?  And
the way to get rid of that is to use vacuum?


Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
----- Original Message -----
From: "Stephan Szabo" <sszabo@megazone23.bigpanda.com>
To: "Micah Anderson" <micah@colltech.com>
Cc: <pgsql-general@postgresql.org>
Sent: Tuesday, October 03, 2000 1:10 PM
Subject: Re: [GENERAL] Memory bloating


>
> Are you using a fair number of updates?  Postgres uses a
> non-overwriting storage manager, so updates effectively create
> a new row and mark the old one as no longer valid (a great
> simplification of reality, but...)
>
> A useful thing would be an example of the queries that are giving
> problems along with their explain output and any indexes that are on the
> table in question.
>
> Stephan Szabo
> sszabo@bigpanda.com


Re: Memory bloating

From: "Ross J. Reedstrom"
On Tue, Oct 03, 2000 at 11:55:45AM -0500, Micah Anderson wrote:
> Hello,
>
> We are using postgresql exclusively for our database backend for a
> high-traffic site, sustaining 3-4 pages per second, in many cases
> bursting well over that. At least half of those accesses are pgsql
> SELECT, we rarely, if at all, use DELETE. It seems worst on tables with
> more than about 1000 rows or 1000 hits an hour, or both.
>

Hmm, how about UPDATE? The exploding files and VACUUM stats indicate that
you're creating no longer valid tuples _somehow_. Any recent changes in
SQL scripts? I'd be suspicious of a 'view' page that got cloned from an
'edit' page, and is updating the 'modified' field, as an example.

You could turn on a higher level of logging, and capture the actual
queries, for a time, and find the culprit that way. Use of varchar isn't
the problem: there's got to be either UPDATE or DELETE going on.

Ross
--
Open source code is like a natural resource, it's the result of providing
food and sunshine to programmers, and then staying out of their way.
[...] [It] is not going away because it has utility for both the developers
and users independent of economic motivations.  Jim Flynn, Sunnyvale, Calif.

Re: Memory bloating

From: "Kane Tao"
Are you saving usage and navigation statistics via PostgreSQL?  Updates to
summary records such as a page counter cause a new record to be written and
the old one to be flagged for deletion/cleanup, as I understand it.  I think
the solution is to vacuum certain tables more frequently, or to write a
second-tier application that aggregates the update requests and writes them
less frequently...
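The frequent-vacuum idea can be as simple as this (the schedule and
database name below are only examples, adjust to taste):

```sql
-- Vacuum just the hot table rather than the whole database; this could
-- be run from cron, e.g. every 15 minutes:
--   */15 * * * *  psql -d active_prague -c "VACUUM ANALYZE webcast;"
VACUUM ANALYZE webcast;
```

Per-table vacuums on small hot tables finish quickly, so running them
often keeps the file size and CPU cost from building up between the big
nightly vacuums.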

- k
----- Original Message -----
From: "Micah Anderson" <micah@colltech.com>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, October 03, 2000 11:55 AM
Subject: [GENERAL] Memory bloating






Re: Memory bloating

From: Micah Anderson
Ross,

I set the debug level to 2 and let it run for a cycle (which I define as the
CPU load rising to at least 20 and then dropping back down), about 45
minutes or so. The logfile is pretty big so I don't want to send it to you in
email, but I didn't really see anything unusual (then again, I don't know
what unusual would look like); almost everything was SELECTs. I counted 19
UPDATE lines, which is VERY few, and 25 DELETEs, amidst about 2500 SELECTs.

Micah






Re: Memory bloating

From: Tom Lane
Micah Anderson <micah@colltech.com> writes:
> NOTICE:  --Relation webcast--
> NOTICE:  Pages 2136: Changed 0, reaped 1781, Empty 0, New 0; Tup 1913:
> Vac 9561, Keep/VTL 0/0, Crash 0, UnUsed 1433, MinLen 726, MaxLen 6858;
> Re-using: Free/Avail. Space 14514420/14514420; EndEmpty/Avail. Pages
> 0/1781. CPU 0.14s/0.08u sec.
> NOTICE:  Rel webcast: Pages: 2136 --> 356; Tuple(s) moved: 1913. CPU
> 0.29s/0.05u sec.
> VACUUM

Given the evident lack of any index on this table, it's not surprising
that your searches get slower and slower as more rows are added (or
updated).  Think about adding appropriate indexes.
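For instance, something along these lines (the column choices here are
guesses; index whatever columns your WHERE clauses actually use):

```sql
-- Hypothetical indexes on the webcast table; pick columns that actually
-- appear in your queries.
CREATE UNIQUE INDEX webcast_id_idx ON webcast (id);
CREATE INDEX webcast_parent_idx ON webcast (parent_id);

-- Verify the planner now uses an index scan instead of a sequential scan.
EXPLAIN SELECT * FROM webcast WHERE id = 42;
```

Without an index, every lookup scans all 2000+ pages, including the dead
tuples, which is why CPU load climbs as the file bloats.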

            regards, tom lane

Re: Memory bloating

From: "Ross J. Reedstrom"
On Tue, Oct 03, 2000 at 02:01:43PM -0500, Micah Anderson wrote:
> Ross,
>
> I set the debug level to 2 and let it run for a cycle (which I define as the
> CPU load rising to at least 20 and then dropping back down), about 45
> minutes or so. The logfile is pretty big so I don't want to send it to you in
> email, but I didn't really see anything unusual (then again, I don't know
> what unusual would look like); almost everything was SELECTs. I counted 19
> UPDATE lines, which is VERY few, and 25 DELETEs, amidst about 2500 SELECTs.
>
> Micah

Hmm, let's see, that's 44 new tuples in 45 min, almost exactly a tuple
a minute.  The ~2000 pages you mentioned looked to contain about 10k
tuples, so you're fitting on average 5 tuples per page. At that rate, it
should have taken about 6 days to create the 14MB file you mentioned.

BTW, what leads you to believe the large tables are causing the load
increase? And the subject line says something about memory bloat. Three
different resources: disk space, memory, and CPU cycles. Which is the
underlying problem? The vacuum stats could be a red herring.

Can you figure out which queries are running that blow the load up
so high? SELECTs that grab a lot of data and sort it can be a problem.
What parameters are you launching the backend with? (number of buffers,
sort memory, etc.)

Ross